Add layers_to_transform for lora_config #1118

Merged 2 commits on Jan 16, 2024
3 changes: 2 additions & 1 deletion README.md
@@ -674,7 +674,8 @@ lora_target_modules:
# - gate_proj
# - down_proj
# - up_proj
-lora_target_linear: # If true, will target all linear layers
+lora_target_linear: # If true, will target all linear modules
+peft_layers_to_transform: # The layer indices to transform, otherwise, apply to all layers
Collaborator:
Setting the prefix for this to `peft_` since we are eventually going to migrate the other `lora_` options soon.

@NanoCode012 (Collaborator), Jan 15, 2024:
What do you think of using a `peft_config` dictionary which accepts the same keys as the PEFT config?

This would reduce the need for us to keep updating the config for each change.

The change would follow the same structure as `cfg.model_config`.
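A rough sketch of what that suggestion could look like (purely hypothetical; a `peft_config` passthrough is not part of this PR, and the keys below are just existing `peft.LoraConfig` arguments):

# Hypothetical sketch of the suggested passthrough; `peft_config` is not part of this PR.
from peft import LoraConfig

peft_config = {"r": 8, "lora_alpha": 16, "layers_to_transform": [0, 1]}  # same keys as peft.LoraConfig

lora_config = LoraConfig(
    target_modules=["q_proj", "v_proj"],  # still resolved by axolotl as it is today
    **peft_config,  # every other option is forwarded to PEFT unchanged
)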


# If you added new tokens to the tokenizer, you may need to save some LoRA modules because they need to know the new tokens.
# For LLaMA and Mistral, you need to save `embed_tokens` and `lm_head`. It may vary for other models.
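As a usage illustration for the documented option, here is a hypothetical config sketch in the `DictDefault` style used by the validation tests below; the import path and all values are assumptions, not taken from this PR:

# Hypothetical usage sketch; hyperparameter values are illustrative.
from axolotl.utils.dict import DictDefault  # assumed import path, as used in the tests

cfg = DictDefault(
    {
        "adapter": "lora",
        "lora_r": 8,
        "lora_alpha": 16,
        "lora_target_linear": True,  # target all linear modules...
        "peft_layers_to_transform": [0, 1],  # ...but only in decoder layers 0 and 1
    }
)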
5 changes: 5 additions & 0 deletions src/axolotl/utils/config.py
@@ -257,6 +257,11 @@ def validate_config(cfg):
if cfg.adapter == "lora" and (cfg.flash_attn_fuse_qkv or cfg.flash_attn_fuse_mlp):
raise ValueError("Fused modules are not supported with LoRA")

if cfg.adapter and cfg.peft_layers_to_transform and cfg.unfrozen_parameters:
raise ValueError(
"`unfrozen_parameters` used with `peft_layers_to_transform` can have unexpected behavior."
)

if cfg.relora_steps:
if cfg.adapter not in ("lora", "qlora"):
raise ValueError("cfg.adapter must be lora or qlora to use ReLoRA")
1 change: 1 addition & 0 deletions src/axolotl/utils/models.py
@@ -733,6 +733,7 @@ def load_lora(model, cfg, inference=False):
        r=cfg.lora_r,
        lora_alpha=cfg.lora_alpha,
        target_modules=lora_target_modules,
+        layers_to_transform=cfg.peft_layers_to_transform,
        lora_dropout=cfg.lora_dropout,
        fan_in_fan_out=cfg.lora_fan_in_fan_out,
        modules_to_save=cfg.lora_modules_to_save if cfg.lora_modules_to_save else None,
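For reference, a minimal standalone sketch of what this maps to on the PEFT side; `layers_to_transform` is an existing `peft.LoraConfig` parameter, and the hyperparameters and module names here are illustrative:

# Minimal sketch of layers_to_transform in peft.LoraConfig; values are illustrative.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    layers_to_transform=[0, 1],  # inject LoRA adapters only into decoder layers 0 and 1
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
# get_peft_model(base_model, lora_config) would then leave all other layers without adapters.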
15 changes: 15 additions & 0 deletions tests/test_validation.py
@@ -694,6 +694,21 @@ def test_warmup_step_no_conflict(self):

        validate_config(cfg)

+    def test_unfrozen_parameters_w_peft_layers_to_transform(self):
+        cfg = DictDefault(
+            {
+                "adapter": "lora",
+                "unfrozen_parameters": ["model.layers.2[0-9]+.block_sparse_moe.gate.*"],
+                "peft_layers_to_transform": [0, 1],
+            }
+        )
+
+        with pytest.raises(
+            ValueError,
+            match=r".*can have unexpected behavior.*",
+        ):
+            validate_config(cfg)


class ValidationCheckModelConfig(BaseValidation):
"""