
Augment training batches with "on-the-fly" features #3171

Open
Riccorl opened this issue Apr 3, 2024 · 0 comments

Riccorl commented Apr 3, 2024

For my use case, I would like to augment the training data with features produced by the model itself. More specifically, my experiment is structured as follows:

  • Train the model for n steps, after which an evaluation pass is performed.
  • Before training continues, the training set (or the portion used before the next eval step) is passed through the model again.
  • The model's predictions are added to the training data before the next training iteration.
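The loop above can be sketched, stripped of any framework, in plain Python. The model, training step, and feature format below are toy placeholders for illustration only, not Composer APIs:

```python
# Toy placeholders: `model` and `train_step` stand in for the real
# framework-managed model and optimizer step.
def model(x, weight):
    return weight * x  # stand-in "prediction"

def train_step(weight, batch):
    # Dummy update: nudge the weight using the (possibly augmented) batch.
    return weight + 0.01 * sum(x for x, _feat in batch)

def run(dataset, n_steps_per_round, rounds):
    weight = 0.0
    # Each sample is (input, extra_feature); features start out empty.
    data = [(x, None) for x in dataset]
    for _ in range(rounds):
        for _ in range(n_steps_per_round):   # 1. train for n steps
            weight = train_step(weight, data)
        # 2.-3. re-run the model over the training set and attach its
        # predictions as on-the-fly features for the next round
        data = [(x, model(x, weight)) for x, _ in data]
    return weight, data

weight, data = run([1.0, 2.0], n_steps_per_round=3, rounds=2)
```

The hard part, as the rest of the issue explains, is step 2-3: getting the freshly computed features into the batches the dataloader actually yields.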

I implemented a Callback for the second step that runs at the end of the evaluations (Event.EVAL_AFTER_ALL), but I'm struggling to propagate the predictions back to the training dataloader. Things I have tried so far:

  • Add the prediction directly to the underlying dataset
  • Having a "shared" object (singleton for now) that stores the predictions
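For what it's worth, the first attempt can fail even in a single process when the epoch iterator snapshots the data up front. A minimal stdlib illustration, where `SnapshotLoader` is a hypothetical stand-in for a prefetching dataloader:

```python
# Hypothetical stand-in for a dataloader whose epoch iterator materializes
# the data when iteration starts (as a prefetching worker effectively does).
class SnapshotLoader:
    def __init__(self, dataset):
        self.dataset = dataset

    def __iter__(self):
        # Snapshot at "epoch start": later in-place edits to the dataset
        # are invisible to this iterator.
        return iter(list(self.dataset))

data = [0, 0, 0]
epoch = iter(SnapshotLoader(data))
first = next(epoch)
data[1] = 99                         # callback mutates the dataset mid-epoch
rest = list(epoch)                   # still the old values: [0, 0]
fresh = list(SnapshotLoader(data))   # a new epoch picks up the edit: [0, 99, 0]
```

Mutating the underlying dataset only takes effect once a fresh iterator is created, which matches issue (1) below.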

The issues are that (1) the training iterator is not "reloaded" until the end of the epoch, and (2) the subprocesses in which the singleton is updated are not the same ones in which the batches are built.
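To make issue (2) concrete, here is a stdlib-only repro (no Composer or torch involved): a plain module-level singleton updated after the worker process already exists never reaches that worker, whereas a `multiprocessing.Manager` proxy does:

```python
import multiprocessing as mp

PREDICTIONS = {}  # naive "singleton": an ordinary dict, copied per process

def worker(shared, ready, out):
    # Runs in a child process: waits until the parent has stored its
    # "predictions", then reports what it can see in each store.
    ready.wait()
    out.put(("plain", dict(PREDICTIONS)))
    out.put(("shared", dict(shared)))

ctx = mp.get_context("fork")  # POSIX-only; mimics forked dataloader workers
shared = ctx.Manager().dict()
ready = ctx.Event()
out = ctx.Queue()
p = ctx.Process(target=worker, args=(shared, ready, out))
p.start()
# The "callback" stores predictions after the worker already exists,
# mimicking workers spawned before EVAL_AFTER_ALL fires.
PREDICTIONS["sample_0"] = 1.0
shared["sample_0"] = 1.0
ready.set()
results = dict(out.get() for _ in range(2))
p.join()
print(results["plain"])   # the plain singleton update never reached the worker
print(results["shared"])  # the Manager proxy update is visible
```

This suggests that whatever store the callback writes into must be process-shared (a Manager proxy, shared memory, or a file), or the workers must be re-created after each augmentation pass.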

Can you provide some guidance?
