
Augment training batches with "on-the-fly" features #3171

Open
Riccorl opened this issue Apr 3, 2024 · 0 comments

Riccorl commented Apr 3, 2024

For my use case, I would like to augment the training data with features produced by the model itself. More specifically, my experiment is structured as follows:

  • Train the model for n steps, after which an evaluation pass is performed.
  • Before training continues, the training set (or the portion used before the next eval step) is passed through the model again.
  • The model's predictions are added to the training data before the next training iteration.
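The loop above can be sketched, stripped of any framework, in plain Python. The model, training step, and feature format below are toy placeholders for illustration only, not Composer APIs:

```python
# Toy placeholders: `model` and `train_step` stand in for the real
# framework-managed model and optimizer step.
def model(x, weight):
    return weight * x  # stand-in "prediction"

def train_step(weight, batch):
    # Dummy update: nudge the weight using the (possibly augmented) batch.
    return weight + 0.01 * sum(x for x, _feat in batch)

def run(dataset, n_steps_per_round, rounds):
    weight = 0.0
    # Each sample is (input, extra_feature); features start out empty.
    data = [(x, None) for x in dataset]
    for _ in range(rounds):
        for _ in range(n_steps_per_round):   # 1. train for n steps
            weight = train_step(weight, data)
        # 2.-3. re-run the model over the training set and attach its
        # predictions as on-the-fly features for the next round
        data = [(x, model(x, weight)) for x, _ in data]
    return weight, data

weight, data = run([1.0, 2.0], n_steps_per_round=3, rounds=2)
```

The hard part, as the rest of the issue explains, is step 2-3: getting the freshly computed features into the batches the dataloader actually yields.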

I implemented a Callback for the second step that runs at the end of the evaluations (Event.EVAL_AFTER_ALL), but I'm struggling to propagate the predictions back to the training dataloader. Things I have tried so far:

  • Add the prediction directly to the underlying dataset
  • Having a "shared" object (singleton for now) that stores the predictions
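For what it's worth, the first attempt can fail even in a single process when the epoch iterator snapshots the data up front. A minimal stdlib illustration, where `SnapshotLoader` is a hypothetical stand-in for a prefetching dataloader:

```python
# Hypothetical stand-in for a dataloader whose epoch iterator materializes
# the data when iteration starts (as a prefetching worker effectively does).
class SnapshotLoader:
    def __init__(self, dataset):
        self.dataset = dataset

    def __iter__(self):
        # Snapshot at "epoch start": later in-place edits to the dataset
        # are invisible to this iterator.
        return iter(list(self.dataset))

data = [0, 0, 0]
epoch = iter(SnapshotLoader(data))
first = next(epoch)
data[1] = 99                         # callback mutates the dataset mid-epoch
rest = list(epoch)                   # still the old values: [0, 0]
fresh = list(SnapshotLoader(data))   # a new epoch picks up the edit: [0, 99, 0]
```

Mutating the underlying dataset only takes effect once a fresh iterator is created, which matches issue (1) below.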

The issues are that (1) the training iterator is not "reloaded" until the end of the epoch, and (2) the subprocesses in which the singleton is updated are not the same ones in which the batches are built.
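To make issue (2) concrete, here is a stdlib-only repro (no Composer or torch involved): a plain module-level singleton updated after the worker process already exists never reaches that worker, whereas a `multiprocessing.Manager` proxy does:

```python
import multiprocessing as mp

PREDICTIONS = {}  # naive "singleton": an ordinary dict, copied per process

def worker(shared, ready, out):
    # Runs in a child process: waits until the parent has stored its
    # "predictions", then reports what it can see in each store.
    ready.wait()
    out.put(("plain", dict(PREDICTIONS)))
    out.put(("shared", dict(shared)))

ctx = mp.get_context("fork")  # POSIX-only; mimics forked dataloader workers
shared = ctx.Manager().dict()
ready = ctx.Event()
out = ctx.Queue()
p = ctx.Process(target=worker, args=(shared, ready, out))
p.start()
# The "callback" stores predictions after the worker already exists,
# mimicking workers spawned before EVAL_AFTER_ALL fires.
PREDICTIONS["sample_0"] = 1.0
shared["sample_0"] = 1.0
ready.set()
results = dict(out.get() for _ in range(2))
p.join()
print(results["plain"])   # the plain singleton update never reached the worker
print(results["shared"])  # the Manager proxy update is visible
```

This suggests that whatever store the callback writes into must be process-shared (a Manager proxy, shared memory, or a file), or the workers must be re-created after each augmentation pass.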

Can you provide some guidance?
