This repository contains code for the MicroAdam paper.

IST-DASLab/MicroAdam

MicroAdam

This repository contains the code to reproduce the results for the paper MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence.

We provide code to reproduce the following experiments: GLUE/MNLI and Llama-2 7B on GSM-8k.

Installation

cd ~
git clone git@github.com:IST-DASLab/MicroAdam.git
cd ~/MicroAdam
source install.sh

Reproduce experiments for GLUE/MNLI

We provide one script per optimizer, named run_hf_glue_mnli_OPTIM.sh, where OPTIM is one of: microadam, adamw, galore, came, adamw8b.

cd ~/MicroAdam/huggingface_glue_mnli
# Uncomment the line for the optimizer you want to run (MicroAdam is enabled by default).
# bash run_hf_glue_mnli_adamw.sh
# bash run_hf_glue_mnli_adamw8b.sh
# bash run_hf_glue_mnli_came.sh
# bash run_hf_glue_mnli_galore.sh
bash run_hf_glue_mnli_microadam.sh
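
To run all five optimizers back to back instead of uncommenting one line at a time, a small loop over the script names can help. The sketch below is not part of the repository: the helper names `glue_script_name` and `run_all_glue` and the `DRY_RUN` variable are hypothetical, and it only assumes the run_hf_glue_mnli_OPTIM.sh naming pattern described above. By default it prints the commands rather than executing them.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): iterate over the optimizer
# names listed above and run the matching GLUE/MNLI script for each.
OPTIMIZERS="microadam adamw galore came adamw8b"

glue_script_name() {
    # Map an optimizer name to its script file name.
    echo "run_hf_glue_mnli_$1.sh"
}

run_all_glue() {
    # DRY_RUN defaults to 1: only print the commands.
    # Set DRY_RUN=0 to actually launch each script in sequence.
    for optim in $OPTIMIZERS; do
        script="$(glue_script_name "$optim")"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "bash $script"
        else
            bash "$script" || return 1   # stop at the first failing run
        fi
    done
}

run_all_glue
```

To launch the runs for real, source the functions from inside ~/MicroAdam/huggingface_glue_mnli and call `DRY_RUN=0 run_all_glue`.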

Reproduce experiments for Llama-2 7B on GSM-8k

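Run the experiments with the following commands, all from the ~/MicroAdam/llm-foundry/scripts/train directory. The AdamW-8bit and DecoupledAdamW runs reuse the MicroAdam YAML configuration and override the optimizer settings on the command line.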

Run MicroAdam

cd ~/MicroAdam/llm-foundry/scripts/train
bash run_llama2-7b_gsm8k_microadam.sh

Run AdamW-8bit

python3 train.py yamls/finetune/llama2-7b_microadam_gsm8k.yaml \
        task=gsm8k \
        optimizer.name=adamw8b \
        optimizer.defaults.lr=5e-5 \
        save_folder=./llama2_7b_gsm8k_adamw8b \
        seed=42

Run DecoupledAdamW

python3 train.py yamls/finetune/llama2-7b_microadam_gsm8k.yaml \
        task=gsm8k \
        optimizer.name=decoupled_adamw \
        optimizer.defaults.lr=5e-5 \
        save_folder=./llama2_7b_gsm8k_decoupled_adamw \
        seed=42
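
The two baseline commands above differ only in the optimizer name and the output folder, so they can be generated from one template. The following is a minimal sketch under that assumption: `baseline_cmd` is a hypothetical helper, not something the repository provides, and it only prints the command line so it can be inspected before running.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): build the train.py command
# for a baseline optimizer, mirroring the two commands shown above.
YAML="yamls/finetune/llama2-7b_microadam_gsm8k.yaml"

baseline_cmd() {
    # $1: optimizer name (adamw8b or decoupled_adamw)
    # $2: random seed, defaults to 42 as in the commands above
    optim="$1"
    seed="${2:-42}"
    echo "python3 train.py $YAML" \
         "task=gsm8k" \
         "optimizer.name=$optim" \
         "optimizer.defaults.lr=5e-5" \
         "save_folder=./llama2_7b_gsm8k_$optim" \
         "seed=$seed"
}

# Print the command for each baseline; nothing is executed here.
for optim in adamw8b decoupled_adamw; do
    baseline_cmd "$optim"
done
```

To execute a printed command, run it from ~/MicroAdam/llm-foundry/scripts/train, for example with `eval "$(baseline_cmd adamw8b)"`.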

Changes compared to the original llm-foundry repository:

Citing

If you find our work useful, please consider citing:

@misc{modoranu2024microadam,
      title={MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence}, 
      author={Ionut-Vlad Modoranu and Mher Safaryan and Grigory Malinovsky and Eldar Kurtic and Thomas Robert and Peter Richtarik and Dan Alistarh},
      year={2024},
      eprint={2405.15593},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
