Add layers_to_transform for lora_config #1118

Merged: 2 commits into axolotl-ai-cloud:main on Jan 16, 2024

Conversation
Conversation

xzuyn (Contributor) commented on Jan 14, 2024

PEFT gives the option to target modules, but it also gives the option to specify which layers the targets are from. So it's kind of like freezing, but for LoRA.

It takes either a single integer for one layer, or a list of integers for multiple layers.
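For reference, this maps directly onto PEFT's LoraConfig, which already exposes layers_to_transform. A minimal sketch, using the same module names as the examples below (the rank/alpha/dropout values are just the ones from those examples):

```python
# Minimal sketch of the underlying PEFT setting this PR exposes.
# layers_to_transform accepts a single int or a list of ints.
from peft import LoraConfig

target_modules = [
    "gate_proj", "down_proj", "up_proj",
    "q_proj", "v_proj", "k_proj", "o_proj",
]

# Apply LoRA only to layer 0 ...
single_layer = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.125,
    target_modules=target_modules,
    layers_to_transform=0,
    task_type="CAUSAL_LM",
)

# ... or only to layers 0 and 1.
two_layers = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.125,
    target_modules=target_modules,
    layers_to_transform=[0, 1],
    task_type="CAUSAL_LM",
)
```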

This is what to expect with princeton-nlp/Sheared-LLaMA-1.3B:

1 Layer

adapter: qlora
lora_model_dir:
lora_r: 64
lora_alpha: 64
lora_dropout: 0.125
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
layers_to_transform: 0
trainable params: 2,498,560 || all params: 1,347,921,920 || trainable%: 0.18536385252938092

2 Layers

adapter: qlora
lora_model_dir:
lora_r: 64
lora_alpha: 64
lora_dropout: 0.125
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
layers_to_transform: [0, 1]
trainable params: 4,997,120 || all params: 1,350,420,480 || trainable%: 0.37004178135687044

All Layers

adapter: qlora
lora_model_dir:
lora_r: 64
lora_alpha: 64
lora_dropout: 0.125
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
layers_to_transform:
trainable params: 59,965,440 || all params: 1,405,388,800 || trainable%: 4.266822106451966
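The trainable-parameter lines above are what PEFT itself prints. A quick sketch of how to reproduce the behaviour (not the exact numbers, since the counts above come from a 4-bit QLoRA load while this loads the model in full precision):

```python
# Sketch: check how layers_to_transform changes the trainable-parameter count.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("princeton-nlp/Sheared-LLaMA-1.3B")

config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.125,
    target_modules=["gate_proj", "down_proj", "up_proj",
                    "q_proj", "v_proj", "k_proj", "o_proj"],
    layers_to_transform=[0, 1],  # drop this line to adapt every layer
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()
# -> trainable params: ... || all params: ... || trainable%: ...
```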

The config documentation change in this PR:

@@ -674,7 +674,8 @@ lora_target_modules:
 # - gate_proj
 # - down_proj
 # - up_proj
-lora_target_linear: # If true, will target all linear layers
+lora_target_linear: # If true, will target all linear modules
+peft_layers_to_transform: # The layer indices to transform, otherwise, apply to all layers
A collaborator commented on the added peft_layers_to_transform line:

Setting the prefix for this to peft_ since we are eventually going to migrate the other lora_ options soon.

NanoCode012 (Collaborator) commented on Jan 15, 2024:

What do you think of using a peft_config dictionary which accepts the same keys as the PEFT config?

This would reduce the need for us to keep updating for each change.

The change would follow the same structure as cfg.model_config.
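A rough sketch of what that suggestion could look like. The helper name and cfg layout here are hypothetical, not axolotl's actual implementation: anything under a peft_config mapping would be forwarded verbatim to LoraConfig, so options like layers_to_transform or layers_pattern would need no dedicated axolotl field.

```python
# Hypothetical sketch of the suggested pass-through; build_lora_config and
# the cfg dict layout are illustrative, not axolotl's real code.
from peft import LoraConfig


def build_lora_config(cfg: dict) -> LoraConfig:
    kwargs = dict(
        r=cfg.get("lora_r", 8),
        lora_alpha=cfg.get("lora_alpha", 16),
        lora_dropout=cfg.get("lora_dropout", 0.0),
        target_modules=cfg.get("lora_target_modules"),
        task_type="CAUSAL_LM",
    )
    # Forward any extra PEFT keys verbatim, mirroring how cfg.model_config
    # is passed straight through to the transformers model config.
    kwargs.update(cfg.get("peft_config") or {})
    return LoraConfig(**kwargs)


lora_config = build_lora_config({
    "lora_r": 64,
    "lora_alpha": 64,
    "lora_target_modules": ["q_proj", "v_proj"],
    "peft_config": {"layers_to_transform": [0, 1]},
})
```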

@winglian merged commit 8487b97 into axolotl-ai-cloud:main on Jan 16, 2024, with 7 checks passed.