DeepSpeed Zero3 save fixes #736

Closed

joey00072

Related Issue

#705

Fix

Follow accelerate's recommendation and enable stage3_gather_16bit_weights_on_model_save, so that the 16-bit weights sharded under ZeRO stage 3 are gathered when the model is saved.
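As a rough illustration of the option named above, the sketch below builds a minimal ZeRO stage 3 config and serializes it to JSON. It is not the contents of axolotl/deepspeed/zero3.json and not the change made in this PR. The helper name build_zero3_config and every key other than stage3_gather_16bit_weights_on_model_save are assumptions chosen for the example.

```python
import json


def build_zero3_config() -> dict:
    """Return a hypothetical, minimal DeepSpeed ZeRO stage 3 config."""
    return {
        "zero_optimization": {
            "stage": 3,
            # Gather the sharded 16-bit weights when the model is saved.
            "stage3_gather_16bit_weights_on_model_save": True,
        },
        # Assumed to mirror `bf16: true` in the test config below.
        "bf16": {"enabled": True},
        # "auto" lets the launcher integration fill these in (assumption).
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
    }


config = build_zero3_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON could be saved to a file and referenced from the deepspeed: key of the training config.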

Test

Config file

base_model: gpt2
base_model_config: gpt2
load_in_8bit: false
load_in_4bit: false
strict: false
push_dataset_to_hub:
datasets:
  - path: wikitext
    name: wikitext-2-v1
    type: completion
    train_on_split: test
dataset_prepared_path:
val_set_size: 0.01
adapter:
lora_model_dir:
sequence_len: 1024
max_packed_sequence_len:
lora_r:
lora_alpha:
lora_dropout:
lora_target_modules:
lora_target_linear:
lora_fan_in_fan_out:
wandb_project: axolotl
wandb_entity:
wandb_watch:
wandb_run_id: wikitext-test-1
wandb_log_model:
output_dir: ./wikitext-test-1
gradient_accumulation_steps: 16
micro_batch_size: 6
eval_batch_size:
num_epochs: 1
optimizer: paged_adamw_8bit
torchdistx_path:
lr_scheduler: linear
learning_rate: 0.0001
train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: true
gradient_checkpointing: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention: true
flash_attention:
gptq_groupsize:
gptq_model_v1:
warmup_steps: 10
eval_steps: 500
save_steps:
debug:
deepspeed: axolotl/deepspeed/zero3.json
weight_decay: 0.1
fsdp:
fsdp_config:
special_tokens:
  pad_token: "<|endoftext|>"

Run

accelerate launch -m axolotl.cli.train config.yaml

Continuation of #709
CC: @tokestermw @winglian

@winglian
Collaborator

@joey00072 have you tested this fix?

@winglian
Collaborator

#709 has been merged

@winglian winglian closed this Oct 19, 2023