🔖 Feature description
Paper: https://arxiv.org/abs/2406.16793
TL;DR
Adam-mini should make it easier and faster to train models on home hardware: it cuts the optimizer's memory footprint by keeping a single second-moment value per parameter block instead of one per parameter. In theory, it shouldn't be overly complicated to implement, since the update rule is otherwise very similar to AdamW's.
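To illustrate the "very similar to AdamW" point, here is a minimal sketch of the core Adam-mini idea in NumPy: the first moment is tracked per parameter exactly as in Adam/AdamW, but the second moment is a single scalar per parameter block (the mean of the squared gradients over the block), which is where the memory saving comes from. The function name and signature are illustrative only, not Axolotl's or the reference implementation's API.

```python
import numpy as np

def adam_mini_step(param, grad, m, v_block, t,
                   lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-mini-style update for a single parameter block (sketch).

    Differs from Adam/AdamW in one place: `v_block` is ONE scalar shared
    by the whole block, updated from the mean squared gradient, instead of
    a full per-parameter second-moment tensor.
    """
    # Per-parameter first moment, exactly as in Adam/AdamW.
    m = beta1 * m + (1 - beta1) * grad
    # Single shared second moment for the whole block (the Adam-mini change).
    v_block = beta2 * v_block + (1 - beta2) * np.mean(grad ** 2)
    # Standard bias correction, as in Adam/AdamW.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v_block / (1 - beta2 ** t)
    # Update; a real AdamW variant would also apply decoupled weight decay here.
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v_block
```

A full implementation would choose the blocks to match the model's structure (e.g. per attention head), which the paper argues matters for quality; this sketch only shows why the state is so much smaller.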
✔️ Solution
Implement Adam-mini in Axolotl.
❓ Alternatives
Keep using AdamW.
📝 Additional Context
Adam-mini should probably be 'sort of' compatible with DeepSpeed right out of the box, which would greatly increase training speed and reduce the memory footprint.
Acknowledgements
My issue title is concise, descriptive, and in title casing.
I have searched the existing issues to make sure this feature has not been requested yet.
I have provided enough information for the maintainers to understand and evaluate this request.