Add layers_to_transform for lora_config #1118

Merged: 2 commits into axolotl-ai-cloud:main on Jan 16, 2024

Conversation
Conversation

xzuyn (Contributor) commented on Jan 14, 2024

PEFT gives the option to target modules, but it also gives the option to specify which layers the targets are from. So it's kind of like freezing, but for LoRA.

It takes either a single integer for one layer, or a list of integers for multiple layers.
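For reference, this maps directly onto PEFT's LoraConfig, which already exposes layers_to_transform. A minimal sketch, using the same module names as the examples below (the rank/alpha/dropout values are just the ones from those examples):

```python
# Minimal sketch of the underlying PEFT setting this PR exposes.
# layers_to_transform accepts a single int or a list of ints.
from peft import LoraConfig

target_modules = [
    "gate_proj", "down_proj", "up_proj",
    "q_proj", "v_proj", "k_proj", "o_proj",
]

# Apply LoRA only to layer 0 ...
single_layer = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.125,
    target_modules=target_modules,
    layers_to_transform=0,
    task_type="CAUSAL_LM",
)

# ... or only to layers 0 and 1.
two_layers = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.125,
    target_modules=target_modules,
    layers_to_transform=[0, 1],
    task_type="CAUSAL_LM",
)
```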

This is what to expect with princeton-nlp/Sheared-LLaMA-1.3B:

1 Layer

adapter: qlora
lora_model_dir:
lora_r: 64
lora_alpha: 64
lora_dropout: 0.125
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
layers_to_transform: 0
trainable params: 2,498,560 || all params: 1,347,921,920 || trainable%: 0.18536385252938092

2 Layers

adapter: qlora
lora_model_dir:
lora_r: 64
lora_alpha: 64
lora_dropout: 0.125
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
layers_to_transform: [0, 1]
trainable params: 4,997,120 || all params: 1,350,420,480 || trainable%: 0.37004178135687044

All Layers

adapter: qlora
lora_model_dir:
lora_r: 64
lora_alpha: 64
lora_dropout: 0.125
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj
layers_to_transform:
trainable params: 59,965,440 || all params: 1,405,388,800 || trainable%: 4.266822106451966
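The trainable-parameter lines above are what PEFT itself prints. A quick sketch of how to reproduce the behaviour (not the exact numbers, since the counts above come from a 4-bit QLoRA load while this loads the model in full precision):

```python
# Sketch: check how layers_to_transform changes the trainable-parameter count.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("princeton-nlp/Sheared-LLaMA-1.3B")

config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.125,
    target_modules=["gate_proj", "down_proj", "up_proj",
                    "q_proj", "v_proj", "k_proj", "o_proj"],
    layers_to_transform=[0, 1],  # drop this line to adapt every layer
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()
# -> trainable params: ... || all params: ... || trainable%: ...
```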

The config documentation change in this PR:

@@ -674,7 +674,8 @@ lora_target_modules:
 # - gate_proj
 # - down_proj
 # - up_proj
-lora_target_linear: # If true, will target all linear layers
+lora_target_linear: # If true, will target all linear modules
+peft_layers_to_transform: # The layer indices to transform, otherwise, apply to all layers
A collaborator commented on the added peft_layers_to_transform line:

Setting the prefix for this to peft_ since we are eventually going to migrate the other lora_ options soon.

NanoCode012 (Collaborator) commented on Jan 15, 2024:

What do you think of using a peft_config dictionary which accepts the same keys as the PEFT config?

This would reduce the need for us to keep updating for each change.

The change would follow the same structure as cfg.model_config.
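A rough sketch of what that suggestion could look like. The helper name and cfg layout here are hypothetical, not axolotl's actual implementation: anything under a peft_config mapping would be forwarded verbatim to LoraConfig, so options like layers_to_transform or layers_pattern would need no dedicated axolotl field.

```python
# Hypothetical sketch of the suggested pass-through; build_lora_config and
# the cfg dict layout are illustrative, not axolotl's real code.
from peft import LoraConfig


def build_lora_config(cfg: dict) -> LoraConfig:
    kwargs = dict(
        r=cfg.get("lora_r", 8),
        lora_alpha=cfg.get("lora_alpha", 16),
        lora_dropout=cfg.get("lora_dropout", 0.0),
        target_modules=cfg.get("lora_target_modules"),
        task_type="CAUSAL_LM",
    )
    # Forward any extra PEFT keys verbatim, mirroring how cfg.model_config
    # is passed straight through to the transformers model config.
    kwargs.update(cfg.get("peft_config") or {})
    return LoraConfig(**kwargs)


lora_config = build_lora_config({
    "lora_r": 64,
    "lora_alpha": 64,
    "lora_target_modules": ["q_proj", "v_proj"],
    "peft_config": {"layers_to_transform": [0, 1]},
})
```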

@winglian merged commit 8487b97 into axolotl-ai-cloud:main on Jan 16, 2024, with 7 checks passed.