Image generated by Stable Diffusion 2.1, based on a prompt from Bob Buzzard
Introduction
After the go-live of Service and Sales GPT, I felt that I had to revisit my
Salesforce CLI Open AI Plug-in and connect it up to the GPT-4 Large Language
Model. I didn't succeed in this quest: while I am a paying customer of OpenAI,
I haven't yet satisfied the requirement of making a successful payment of at
least $1. The API function I'm hitting, Create Chat Completion, supports
gpt-3.5-turbo and the gpt-4 variants, so once I've racked up enough cost using
the earlier models I can switch over by changing one parameter. At my current
rate of spending it looks like it will take me a few months to get there, but
such is life with competitively priced APIs.
The Use Case
The first incarnation of the plug-in asks the model to describe Apex, CLI or
Salesforce concepts, but I wanted something that was more of a tool than a
content generator, so I decided on creating test records. The new command
takes parameters listing the field names, the desired output format, and the
number of records required, and folds these into the messages passed to the
API function. Like the Completion API, the interface is very simple:
const response = await openai.createChatCompletion({
    model: 'gpt-3.5-turbo',
    messages,
    temperature: 1,
    max_tokens: maxTokens,
    top_p: 1,
    frequency_penalty: 0,
    presence_penalty: 0,
});
// The generated text is the message content of the first choice.
result = response.data.choices[0].message?.content as string;
There are a few more parameters than the Completion API:
-
model - the Large Language Model that I send the request to. Right
now I've hardcoded this to the latest I can access
-
messages - the collection of messages to send. The messages build on
each other, and each message has a content (the instruction/request)
and a role (who the message is attributed to). This allows me to
separate the instructions to the model (when the role is
assistant, I'm giving it constraints about how to behave) from the
request (when the role is user, this is the task/request I'm
asking it to carry out). The API also defines a system role, which is the
conventional place for instructions of this kind.
-
max_tokens is the maximum number of tokens (a token is approximately 4
characters of text) that the model can generate in its response. The tokens
in my request plus max_tokens have to fit inside the model's context window,
which is 4,096 tokens for gpt-3.5-turbo at the time of writing. I've set this
to 3,500, which is approaching that limit. If you have a lot of fields you'll
have to generate a smaller number of records to avoid breaching this. I was
able to create 50 records with 4-5 fields inside this limit, but your mileage
may vary.
-
temperature and top_p control how random the model's token sampling is, and
so whether I get precise, repeatable responses (lower values) or more varied,
creative ones (higher values).
-
frequency_penalty and presence_penalty indicate whether I want the model to
be free to keep repeating tokens it has already used (values at or near 0),
or to be pushed towards new tokens and topics (positive values).
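The token budget behind max_tokens can be sanity-checked before making a call. This is a rough sketch only: it relies on the 4-characters-per-token rule of thumb mentioned above, and it assumes a 4,096 token context window for gpt-3.5-turbo. The function and constant names are hypothetical and are not part of the plug-in.

```typescript
// Rule of thumb from the text: one token is approximately 4 characters.
const CHARS_PER_TOKEN = 4;
// Assumed context window for gpt-3.5-turbo; check the model documentation.
const CONTEXT_WINDOW = 4096;

// Approximate token count for a piece of text (not a real tokenizer).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Largest max_tokens value that still leaves room for the prompt.
function responseBudget(prompt: string, contextWindow: number = CONTEXT_WINDOW): number {
  return Math.max(0, contextWindow - estimateTokens(prompt));
}

const prompt =
  'Create 50 records with the following fields: Name (Text), Amount (Number)';
console.log(estimateTokens(prompt), responseBudget(prompt));
```

An estimate like this only shows whether a request is in the right ballpark. A real tokenizer would give exact counts, and the generated records consume the response budget too, so more fields per record means fewer records fit.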
As this is an asynchronous API, I await the response, then pick the first
element in the choices array.
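To make the "folding parameters into messages" step more concrete, here is a minimal sketch of how the field list, output format and record count might be turned into the messages array. The prompt wording, the buildMessages name and the types are all assumptions for illustration and are not the plug-in's actual code. It also swaps in the system role for the behavioural constraints, rather than the assistant role described above.

```typescript
// Hypothetical message builder: names and prompt wording are illustrative only.
type ChatRole = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: ChatRole;
  content: string;
}

function buildMessages(fields: string, format: string, count: number): ChatMessage[] {
  return [
    {
      // Constraints on how the model should behave.
      role: 'system',
      content:
        'You are a test data generator. Generate realistic data. ' +
        `Respond with ${format} only, with no explanation or commentary.`,
    },
    {
      // The task itself, built from the command's parameters.
      role: 'user',
      content: `Create ${count} records with the following fields: ${fields}`,
    },
  ];
}

const messages = buildMessages('FirstName (Text), LastName (Text)', 'json', 4);
console.log(messages);
```

Keeping the constraints and the task in separate messages means the field list can change from run to run while the behavioural instructions stay fixed.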
Here are a few executions to show it in action. Line breaks have been added to
the commands to aid legibility; remove these if you copy/paste the commands.
> sf bbai data testdata -f 'Index (count starting at 1), Name (Text, product name),
Amount (Number), CloseDate (Date yyyy-mm-dd),
StageName (One of these values : Negotiating, Closed Lost, Closed Won)'
-r csv
Here are the records you requested
Index,Name,Amount,CloseDate,StageName
1,Product A,1000,2022-01-15,Closed Lost
2,Product B,2500,2022-02-28,Closed Won
3,Product C,500,2022-03-10,Closed Lost
4,Product D,800,2022-04-05,Closed Won
5,Product E,1500,2022-05-20,Negotiating
> sf bbai data testdata -f 'FirstName (Text), LastName (Text), Company (Text),
Email (Email), Rating__c (1-10)'
-n 4 -r json
Here are the records you requested
[
{
"FirstName": "John",
"LastName": "Doe",
"Company": "ABC Inc.",
"Email": "john.doe@example.com",
"Rating__c": 8
},
{
"FirstName": "Jane",
"LastName": "Smith",
"Company": "XYZ Corp.",
"Email": "jane.smith@example.com",
"Rating__c": 5
},
{
"FirstName": "Michael",
"LastName": "Johnson",
"Company": "123 Co.",
"Email": "michael.johnson@example.com",
"Rating__c": 9
},
{
"FirstName": "Sarah",
"LastName": "Williams",
"Company": "Acme Ltd.",
"Email": "sarah.williams@example.com",
"Rating__c": 7
}
]
There are a few interesting points to note here:
-
Formatting field data is conversational - e.g. when I use
Date yyyy-mm-dd the model knows that I want the date in ISO 8601 format. For
picklist values, I just tell it 'One of these values' and it does the rest.
-
In the messages I asked it to generate realistic data, and while it's very
good at this for First Name, Last Name, Email and Company, it's not when told
a Name field should be a product name, just giving me Product A, Product B,
etc.
-
It sometimes takes it a couple of requests to generate the output in a
format suitable for dropping into a file - I'm guessing this is because I
instruct the model and make the request in a single API call.
-
I've generated probably close to 500 records while testing this, and that
has cost me the princely sum of $0.04. If you want to play around with the
GPT models, it really is dirt cheap.
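Since the output isn't always file-ready first time, one option is to check the response and ask again if it doesn't pass. The sketch below is an assumption about how that could be done and is not something the plug-in does: isUsableJson and generateWithRetry are hypothetical names, and generate stands in for whatever function makes the API call.

```typescript
// Hypothetical check: is the output a bare JSON array of records with the
// expected count and field names, so it can be written straight to a file?
function isUsableJson(output: string, fieldNames: string[], count: number): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(output.trim());
  } catch {
    return false; // any prose around the JSON makes parsing fail
  }
  if (!Array.isArray(parsed) || parsed.length !== count) {
    return false;
  }
  return parsed.every(
    (record) =>
      typeof record === 'object' &&
      record !== null &&
      fieldNames.every((name) => name in record)
  );
}

// Retry wrapper: `generate` is a stand-in for the API call (an assumption).
async function generateWithRetry(
  generate: () => Promise<string>,
  isUsable: (output: string) => boolean,
  maxAttempts = 3
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await generate();
    if (isUsable(output)) {
      return output;
    }
  }
  throw new Error(`No usable output after ${maxAttempts} attempts`);
}

// Usage with a stubbed generator in place of a real request.
generateWithRetry(
  async () => '[{"Name":"Widget"}]',
  (out) => isUsableJson(out, ['Name'], 1)
).then((out) => console.log(out));
```

Each retry is another paid request, although at the prices mentioned above a handful of extra attempts is unlikely to register.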
The final point I'll make, as I did in the last post, is how simple the code
is. All the effort went into the messages: asking the model to generate the
data in the correct format, to leave out any additional text saying that it
was responding to the request, and to generate realistic data. Truly the key
programming language for Generative AI is the supported language that you
speak - English in my case!
As before, you can install the plug-in via :
> sf plugins install bbai
or if you have already installed it, upgrade via :
> sf plugins update
More Information