Create Model Factory #2

Open
gregpriday opened this issue Jun 29, 2021 · 0 comments

gregpriday commented Jun 29, 2021

One of the most interesting features of GPT-3 is its ability to create new instances of an item. This is fairly easy to do: give it existing instances and let it predict the next items in the sequence. We should make this easier for Laravel users by creating a custom factory class.

While normal Laravel factories are mainly for testing, our goal is to make a class that can be used in production. Someone might use it to build an online generator, such as a business idea generator or a name generator.

Our base factory class would just be a child class of the core Eloquent Factory class. A developer using this package would extend our factory class in much the same way they would extend the core class.

What follows is a rough guide to how this factory class should function. I'm sure things will change a lot during development.

Workflow

To generate new instances of a model, a developer will need to follow these steps:

  1. Set the seed models.
  2. Configure transforms/casts to get the data and fields into a pure text version for GPT-3.
  3. Text data is sent to GPT-3 and the completion is returned.
  4. Text from GPT-3 is parsed into a new Laravel model.

Step 1: Seed Models

This can be either an Eloquent model query or a Collection of models. An Eloquent query should be re-executed for each new model generated, so that fresh data is sent to GPT-3 each time and the generated models have some variety.

/**
 * Set the seed data used to generate the new models.
 *
 * @param Collection|Query $models
 */
public function withSeed(Collection|Query $models){}
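The re-execution rule can be pictured without Eloquent. The following is only a minimal sketch in plain PHP: `resolveSeed` is a hypothetical helper, a Closure stands in for an Eloquent query, and a plain array stands in for a Collection.

```php
<?php

/**
 * Resolve the seed into a plain list of records.
 *
 * Hypothetical helper: a Closure stands in for an Eloquent query and is
 * re-run on every call; an array stands in for a fixed Collection.
 */
function resolveSeed(array|Closure $seed): array
{
    return $seed instanceof Closure ? $seed() : $seed;
}

// Usage: the "query" runs once per generated model, so each call can
// return different seed records.
$runs = 0;
$query = function () use (&$runs): array {
    $runs++; // count how often the stand-in query is executed
    return [['name' => "Idea {$runs}"]];
};

$first = resolveSeed($query);
$second = resolveSeed($query);
```

A fixed array is returned unchanged, which matches the idea that a Collection is used as given.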

Step 2: Configure transforms for GPT-3

Next, we need to transform the seed models into a text format that encourages GPT-3 to generate a new model.

[this is the prompt text]

[model 1 title text]
attr1: value1
attr2: value2

[model 2 title text]
attr1: value1
attr2: value2

...

To keep things simple, each title and each attribute should appear on a single line with no line breaks. We'll need to decide whether to encode line breaks or to drop any text after the first line break.

The prompt text is a hint for GPT-3 as to what we're generating.

The title for each model is flexible and defined at the factory level. It doesn't need to be tied to an actual title field. The title is important as a way to prompt GPT-3's completions.

public function withPrompt(string $text) {}
public function withAttributes(array $fields){}
public function withAttributeCast(string $field, Cast $cast){}
public function withTitle(Closure|TitleCast|string|null $title){}
public function withModelSeparator(string $separator="\n\n"){}
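As a rough illustration of the text format described in this step, here is a sketch in plain PHP that works on associative arrays rather than Eloquent models. The function names `firstLine` and `buildPromptText` are hypothetical, and the sketch assumes the "drop everything after the first line break" option; encoding line breaks would be an equally valid choice.

```php
<?php

/**
 * Collapse a value onto one line by keeping only the text before the
 * first line break.
 */
function firstLine(string $value): string
{
    $parts = preg_split('/\R/', trim($value), 2);
    return trim($parts[0]);
}

/**
 * Render seed records as prompt text.
 *
 * @param string   $prompt     Hint describing what is being generated.
 * @param array    $records    List of associative arrays (stand-ins for models).
 * @param callable $title      Returns the title line for one record.
 * @param string[] $attributes Attribute names to include, in order.
 */
function buildPromptText(
    string $prompt,
    array $records,
    callable $title,
    array $attributes,
    string $separator = "\n\n"
): string {
    $blocks = [firstLine($prompt)];
    foreach ($records as $record) {
        $lines = [firstLine($title($record))];
        foreach ($attributes as $name) {
            $lines[] = $name . ': ' . firstLine((string) ($record[$name] ?? ''));
        }
        $blocks[] = implode("\n", $lines);
    }
    // End with a separator so the completion starts at the next title line.
    return implode($separator, $blocks) . $separator;
}

$text = buildPromptText(
    'Business ideas',
    [['name' => 'Dog Cafe', 'market' => "Pet owners\nand their dogs"]],
    fn (array $record): string => $record['name'],
    ['market']
);
```

Here the second line of the `market` value is discarded, so every attribute stays on a single line.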

We could also give a Model a way to cast its attributes to and from a GPT-3 format. A good example of this would be a Model casting tags to a comma-separated list, and then back to an array.
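The tags example might look something like the sketch below. The class name `CommaListCast` and its two method names are hypothetical; the issue does not define what the `Cast` type looks like, so this class deliberately does not implement it.

```php
<?php

/** Hypothetical two-way cast between an array of tags and one line of text. */
final class CommaListCast
{
    /** Array to text, for use in the prompt. */
    public function toText(array $values): string
    {
        return implode(', ', array_map('trim', $values));
    }

    /** Text back to an array, dropping empty entries left by stray commas. */
    public function fromText(string $text): array
    {
        $items = array_map('trim', explode(',', $text));
        return array_values(
            array_filter($items, fn (string $item): bool => $item !== '')
        );
    }
}

$cast = new CommaListCast();
$line = $cast->toText(['php', ' laravel ']);
$tags = $cast->fromText('php, laravel,, gpt-3 ,');
```

This sketch does not handle tags that themselves contain commas; those would need escaping or a different delimiter.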

If withTitle is passed a string, then this is treated as an attribute name, which allows the factory to use a custom attribute/mutator through get{TitleName}Attribute and set{TitleName}Attribute.

Step 3: Text data is sent to GPT-3

Use a custom GPT-3 facade to complete the title and attributes. The factory should send everything to GPT-3 with all the appropriate settings.

Requires: #1

Our factory will also have a way of specifying GPT-3 completion settings. This will allow a developer to choose a more cost-effective engine like Curie if necessary.

public function withGpt3Options(array $options){}

Step 4: Transform GPT-3 Completion back to a model

This is going to be the most complicated step. GPT-3 doesn't offer perfectly reliable output, so our code will need to be resilient to malformed data.

[new model title text]
attr1: value1
attr2: value2

We know the first line will always be the title text if the factory has a title set through withTitle. So we can take that first line and either pass it through a TitleCast, through the Closure, or pass it to the new model with set{TitleName}Attribute().

For all the following lines, we know that anything before the : is the attribute name and anything after it is the attribute value. We'll still need to pass this through any casts and set it on the new model using set{Name}Attribute().
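A tolerant parser for a single completion could be sketched along these lines. `parseCompletionBlock` is a hypothetical name. The sketch assumes that lines without a colon, unknown attribute names, and repeated attributes are simply skipped, and that values are split on the first colon only so they can contain colons themselves. Casts and model setters are left out.

```php
<?php

/**
 * Parse one completion block into a title and raw attribute values.
 *
 * @param string   $completion Raw text returned for a single model.
 * @param string[] $allowed    Attribute names the factory expects.
 * @param bool     $hasTitle   Whether the first line is a title.
 * @return array{title: ?string, attributes: array<string, string>}
 */
function parseCompletionBlock(string $completion, array $allowed, bool $hasTitle = true): array
{
    // Split into trimmed lines and drop blank ones.
    $lines = array_map('trim', preg_split('/\R/', trim($completion)));
    $lines = array_values(array_filter($lines, fn (string $line): bool => $line !== ''));

    $title = null;
    if ($hasTitle && $lines !== []) {
        $title = array_shift($lines);
    }

    $attributes = [];
    foreach ($lines as $line) {
        // Split on the first colon only, so values may contain colons.
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue; // malformed line without a colon
        }
        $name = trim($parts[0]);
        if (!in_array($name, $allowed, true) || array_key_exists($name, $attributes)) {
            continue; // unknown or repeated attribute
        }
        $attributes[$name] = trim($parts[1]);
    }

    return ['title' => $title, 'attributes' => $attributes];
}

$parsed = parseCompletionBlock(
    "Dog Cafe\nmarket: Pet owners\nurl: https://example.com\njunk line\nprice: 10\n",
    ['market', 'url']
);
```

A caller could then check whether every expected attribute is present and discard or retry the completion if not.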

Creating Multiple Models

It should be possible to generate multiple models using a single request to GPT-3, although GPT-3 does tend to lose track and start generating junk. To keep things simple, we can start by generating one model at a time, but allow developers to set how many models are generated per request to GPT-3.

When the user wants to generate multiple models, we won't add the stop value of "\n\n" from withModelSeparator. We'll leave it up to the factory to set a reasonable max_tokens value for GPT-3.
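The difference between the single and multiple cases could be sketched as below. Both function names are hypothetical, and `$tokensPerModel` is an assumed per-model budget used only to illustrate how a factory might scale max_tokens; the right value would depend on the model being generated.

```php
<?php

/**
 * Build the completion settings that differ between generating one model
 * and generating several per request.
 */
function completionSettings(int $modelsPerRequest, string $separator = "\n\n", int $tokensPerModel = 64): array
{
    $settings = ['max_tokens' => $tokensPerModel * max(1, $modelsPerRequest)];
    if ($modelsPerRequest <= 1) {
        // One model: stop at the first separator.
        $settings['stop'] = $separator;
    }
    return $settings;
}

/** Split a multi-model completion into per-model blocks, capped at the requested count. */
function splitCompletion(string $completion, int $modelsPerRequest, string $separator = "\n\n"): array
{
    $blocks = array_map('trim', explode($separator, $completion));
    $blocks = array_values(array_filter($blocks, fn (string $block): bool => $block !== ''));
    return array_slice($blocks, 0, max(1, $modelsPerRequest));
}

$single = completionSettings(1);
$several = completionSettings(3);
$blocks = splitCompletion("A\nx: 1\n\nB\nx: 2\n\nC\nx: 3\n\nD", 3);
```

Without a stop value the last block may be cut off by max_tokens, so each block would still need to be validated before it becomes a model.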

Setting Options

A factory should be able to set options using a class member variable.

class MyModelFactory extends Factory {
	protected $options = [
		'gpt3' => ['temperature' => 0.85, 'max_tokens' => 256],
		'multiple_per_request' => true
	];
}

And we'll allow options to be modified when calling the factory with:

public function withOptions(array $options) {}

This will merge any new options into the default ones.
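One possible way to do the merge, shown here as a sketch rather than a decision, is PHP's built-in array_replace_recursive, which keeps nested defaults that the caller did not override. The wrapper name `mergeFactoryOptions` is hypothetical.

```php
<?php

/** Merge override options into the defaults, descending into nested arrays. */
function mergeFactoryOptions(array $defaults, array $overrides): array
{
    return array_replace_recursive($defaults, $overrides);
}

$defaults = [
    'gpt3' => ['temperature' => 0.85, 'max_tokens' => 256],
    'multiple_per_request' => true,
];

// Only max_tokens is overridden; temperature keeps its default.
$merged = mergeFactoryOptions($defaults, ['gpt3' => ['max_tokens' => 64]]);
```

A plain array_merge would replace the whole 'gpt3' array and lose the default temperature. Note that array_replace_recursive also merges list-style values index by index, which may not be what is wanted for an option that holds a list.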

Example Uses

https://gist.github.com/gregpriday/8f2a113773d61cf6f9053b7a6de18677

Compatibility

We'll support PHP 8.0+ and Laravel 8.0+.
