
Trainer.fit is missing the optimizers parameter #2503

Open
antoinebrl opened this issue Sep 1, 2023 · 3 comments
Comments

@antoinebrl
Contributor

Hello,
The .fit() method of the Trainer is missing the optimizers parameter, even though it is part of its documentation.

It would be nice to keep the optimizers and the schedulers together. Since the schedulers depend on the dataloaders, I would argue the .fit() method should take the optimizers as input too.
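For concreteness, here is roughly the usage I have in mind (just a sketch: the `optimizers` argument to `.fit()` is the missing piece this issue asks for, and `model` / `train_dataloader` are placeholders):

```python
import torch
from composer import Trainer

trainer = Trainer(model=model, max_duration='10ep')

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# The scheduler's horizon depends on the dataloader, which is only known at fit() time.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=len(train_dataloader) * 10
)

trainer.fit(
    train_dataloader=train_dataloader,
    optimizers=optimizer,   # proposed: keep the optimizers next to the schedulers
    schedulers=scheduler,
)
```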

@antoinebrl antoinebrl added the bug Something isn't working label Sep 1, 2023
@mvpatel2000
Contributor

This is a good suggestion! We'd love to get a community PR on this issue :) Otherwise, we will add it to our roadmap, though unfortunately it might not be done in the immediate future.

@mvpatel2000 mvpatel2000 self-assigned this Sep 1, 2023
@wlrd

wlrd commented Sep 5, 2023

@antoinebrl @mvpatel2000 Once we pass the optimizers into the .fit() method, what do we want to do with them? I am happy to take a look and help out on this if I can get some guidance on how to use the optimizers here.

@mvpatel2000
Contributor

@wlrd Sure! In short, you'll need to duplicate the optimizer setup we do in def __init__, starting here:

https://github.com/mosaicml/composer/blob/dev/composer/trainer/trainer.py#L988

I believe this just involves updating State to use the new optimizer.
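Something like this, roughly (not a real diff; the attribute names and the `ensure_tuple` helper are my reading of the internals):

```python
from composer.utils import ensure_tuple

# Sketch of the fit()-side mirror of the __init__ optimizer setup. Treat
# state.optimizers and ensure_tuple as assumptions, not a verified diff.
def fit(self, *, optimizers=None, **kwargs):
    if optimizers is not None:
        # Repoint State at the newly passed optimizer(s) so the training loop,
        # algorithms, and checkpointing all pick up the new optimizer.
        self.state.optimizers = ensure_tuple(optimizers)
    ...
```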

The next hard step is adding support for DeepSpeed and FSDP. You can see the DeepSpeed step here: https://github.com/mosaicml/composer/blob/dev/composer/trainer/trainer.py#L1004

For FSDP, it becomes much more complicated... I think you can maybe just recreate the optimizer and repoint it to the model. I'm happy to scope this part out in more detail later.

I would recommend a series of 3 PRs:

  1. Add optimizer support for the no-parallelism and DDP cases (just update State?) along with a unit test verifying it works. Raise a ValueError if DeepSpeed or FSDP is enabled and say it's not supported yet (see the sketch after this list).
  2. Add DeepSpeed support
  3. Add FSDP support
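
For PR 1, the guard plus a first unit test could look roughly like this (the `deepspeed_enabled` / `fsdp_enabled` flag names are my guesses at where those checks live, and the model / dataloader fixtures are placeholders):

```python
import torch
from composer import Trainer

# Inside Trainer.fit(), the PR 1 guard might be (flag names are assumptions):
#
#     if optimizers is not None and (self.state.deepspeed_enabled or self.state.fsdp_enabled):
#         raise ValueError('Passing optimizers to fit() is not yet supported with DeepSpeed or FSDP.')
#
# Unit test for the no-parallelism / DDP path (model and train_dataloader are
# placeholder fixtures, not real helpers from the test suite):
def test_fit_accepts_optimizers(model, train_dataloader):
    trainer = Trainer(model=model, max_duration='1ep')
    new_opt = torch.optim.SGD(model.parameters(), lr=0.01)
    trainer.fit(train_dataloader=train_dataloader, optimizers=new_opt)
    assert trainer.state.optimizers[0] is new_opt
```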

@hanlint hanlint added enhancement New (engineering) enhancements, such as features or API changes. and removed bug Something isn't working labels Jan 24, 2024