Add finetuning doc #832

Merged May 20, 2024 (15 commits)
101 changes: 101 additions & 0 deletions docs/source/examples/finetuning.md
@@ -0,0 +1,101 @@
# Finetuning

## Full Parameters

Full-parameter finetuning updates all of the model's parameters.
Here is an example that finetunes a GPT-2 base model.

```sh
cd data && ./download.sh alpaca && cd -

./scripts/run_finetune.sh \
--model_name_or_path gpt2 \
--dataset_path data/alpaca/train_conversation \
--output_model_path output_models/finetuned_gpt2
```
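
Conceptually, full-parameter finetuning just means the optimizer update touches every weight in the model. A minimal torch-free sketch of that idea (toy parameters and a toy quadratic loss, not LMFlow code):

```python
# Toy illustration: full finetuning updates *all* parameters each step.
# The parameter list and loss here are hypothetical stand-ins, not LMFlow code.

def full_finetune_step(params, grads, lr=0.1):
    """Apply a plain gradient-descent update to every parameter."""
    return [p - lr * g for p, g in zip(params, grads)]

params = [1.0, -2.0, 0.5]          # every entry is trainable
grads = [2 * p for p in params]    # gradient of loss = sum(p^2)
params = full_finetune_step(params, grads)
print(params)  # every parameter moved: [0.8, -1.6, 0.4]
```

Methods like LoRA below differ precisely in that they restrict which parameters this update is applied to.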

```{admonition} Conversation Template
:class: tip

For conversation datasets, specify a conversation template for better performance by adding `--conversation_template` to the command.
```

````{dropdown} Llama-3-8B conversation dataset example
```sh
cd data && ./download.sh alpaca && cd -

./scripts/run_finetune.sh \
--model_name_or_path meta-llama/Meta-Llama-3-8B \
--dataset_path data/alpaca/train_conversation \
--conversation_template llama3 \
--output_model_path output_models/finetuned_llama3_8b
```
````
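
A conversation template renders the list of chat messages into the single string format the model was trained on. The following is a hypothetical, simplified Python sketch of a Llama-2-style renderer to illustrate the idea; it is not LMFlow's actual template code:

```python
# Illustrative only: a conversation template turns structured chat turns into
# one flat prompt string. Real templates (llama2, llama3, ...) differ in the
# exact markers; this sketch mimics Llama-2's [INST] convention.

def render_llama2_style(messages, system=""):
    """Wrap alternating user/assistant turns in Llama-2-style [INST] markers."""
    out = ""
    for i in range(0, len(messages), 2):
        user = messages[i]["content"]
        assistant = messages[i + 1]["content"]
        # The system prompt, if any, is folded into the first user turn.
        prefix = f"<<SYS>>\n{system}\n<</SYS>>\n\n" if system and i == 0 else ""
        out += f"<s>[INST] {prefix}{user} [/INST] {assistant} </s>"
    return out

msgs = [{"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"}]
print(render_llama2_style(msgs))  # → <s>[INST] Hi [/INST] Hello! </s>
```

Mismatching the template and the model (e.g. `llama2` markers on a Llama-3 checkpoint) is why `--conversation_template` matters for performance.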


## Layerwise Importance Sampled AdamW (LISA)

[LISA](https://arxiv.org/abs/2403.17919) is a memory-efficient finetuning algorithm that trades off memory against the number of randomly unfrozen layers. This script has currently only been tested on single GPUs. Please stay tuned for our latest updates!

```sh
cd data && ./download.sh alpaca && cd -

./scripts/run_finetune_with_lisa.sh \
--model_name_or_path meta-llama/Llama-2-7b-hf \
--dataset_path data/alpaca/train_conversation \
--output_model_path output_models/finetuned_llama2_7b \
--lisa_activated_layers 1 \
--lisa_interval_steps 20
```
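
Under the hood, LISA periodically re-samples which layers are trainable: every `--lisa_interval_steps` optimizer steps, a fresh random subset of `--lisa_activated_layers` layers is unfrozen and the rest are frozen, which bounds optimizer-state memory. An illustrative Python sketch of that schedule (not LMFlow's implementation):

```python
import random

# Illustrative sketch of LISA's layer-freezing schedule: every
# `interval_steps` steps, re-sample which `activated_layers` layers train.

def lisa_schedule(num_layers, activated_layers, interval_steps, total_steps, seed=0):
    rng = random.Random(seed)
    schedule = []
    active = []
    for step in range(total_steps):
        if step % interval_steps == 0:  # time to re-sample the trainable subset
            active = sorted(rng.sample(range(num_layers), activated_layers))
        schedule.append(active)
    return schedule

# 8-layer toy model, 1 active layer at a time, re-sampled every 3 steps
sched = lisa_schedule(num_layers=8, activated_layers=1, interval_steps=3, total_steps=6)
print(sched)  # e.g. [[6], [6], [6], [1], [1], [1]] — the layers depend on the seed
```

With only `activated_layers` layers holding AdamW state at any moment, peak memory stays close to that of a small partial finetune while, over many intervals, every layer still gets updated.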

````{dropdown} Llama-2-7B conversation dataset example
```sh
cd data && ./download.sh alpaca && cd -

./scripts/run_finetune_with_lisa.sh \
--model_name_or_path meta-llama/Llama-2-7b-hf \
--dataset_path data/alpaca/train_conversation \
--conversation_template llama2 \
--output_model_path output_models/finetuned_llama2_7b_lisa \
--lisa_activated_layers 1 \
--lisa_interval_steps 20
```
````


## Low-Rank Adaptation (LoRA)

LoRA is a parameter-efficient finetuning algorithm that trains only a small set of low-rank adapter weights, making it far more memory-efficient than full-parameter finetuning.

```sh
cd data && ./download.sh alpaca && cd -

./scripts/run_finetune_with_lora.sh \
--model_name_or_path facebook/galactica-1.3b \
--dataset_path data/alpaca/train_conversation \
--output_lora_path output_models/finetuned_galactica_lora
```
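
The savings come from keeping the pretrained weight frozen and learning only a low-rank update. A minimal NumPy sketch of the idea (illustrative, not LMFlow internals):

```python
import numpy as np

# LoRA in a nutshell: keep the pretrained weight W frozen and learn a
# low-rank update B @ A, so only r * (d + k) parameters are trained
# instead of d * k.
d, k, r = 1024, 1024, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))         # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01  # trainable, rank r
B = np.zeros((d, r))                    # trainable, zero-initialized

def lora_forward(x):
    # Frozen base path plus the low-rank adapter path
    return x @ W.T + x @ (B @ A).T

full_params = d * k
lora_params = r * (d + k)
print(f"trainable params: {lora_params} vs {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")
# → trainable params: 16384 vs 1048576 (1.6%)
```

Because `B` starts at zero, the adapter initially contributes nothing and training starts exactly from the pretrained model's behavior.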

````{admonition} Merge LoRA Weight
:class: tip

Merge the LoRA weights and the base model into a single model using:
```sh
./scripts/run_merge_lora.sh \
--model_name_or_path Qwen/Qwen1.5-1.8B \
--lora_model_path output_models/lora \
  --output_model_path output_models/lora_merged
```
````
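
Conceptually, merging folds the low-rank update into the base weight, so inference no longer needs a separate adapter path. A hedged NumPy sketch of the math (not the script's actual implementation; `alpha` is LoRA's usual scaling hyperparameter):

```python
import numpy as np

# Conceptual sketch of a LoRA merge: fold scale * (B @ A) into W once,
# then drop the adapter entirely.
rng = np.random.default_rng(1)
d, k, r, alpha = 16, 16, 4, 8
W = rng.standard_normal((d, k))
A = rng.standard_normal((r, k))
B = rng.standard_normal((d, r))
scale = alpha / r

W_merged = W + scale * (B @ A)  # one dense matrix, same shape as W

x = rng.standard_normal((2, k))
adapter_out = x @ W.T + scale * (x @ (B @ A).T)  # base + adapter paths
merged_out = x @ W_merged.T                      # single merged path
print(np.allclose(adapter_out, merged_out))  # → True
```

After merging, the output model loads and runs like any ordinary checkpoint, with no LoRA-aware code required.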

````{dropdown} Llama-2-7B conversation dataset example
```sh
cd data && ./download.sh alpaca && cd -

./scripts/run_finetune_with_lora.sh \
--model_name_or_path meta-llama/Llama-2-7b-hf \
--dataset_path data/alpaca/train_conversation \
--conversation_template llama2 \
  --output_model_path output_models/finetuned_llama2_7b_lora
```
````
8 changes: 7 additions & 1 deletion docs/source/examples/index.md
@@ -18,7 +18,13 @@ checkpoints

## Finetuning

For SFT, Refer to [examples](https://github.com/OptimalScale/LMFlow/blob/main/examples).
For SFT,

```{toctree}
:maxdepth: 3

finetuning
```


For alignment process,
2 changes: 1 addition & 1 deletion docs/source/examples/reward_modeling.md
@@ -35,7 +35,7 @@ We prepare the dataset used for supervised finetuning by adding a prefix to the
}
```

See [Finetuning (Full)](../../../README.md#finetuning-full), [Finetuning (LISA)](../../../README.md#finetuning-lisa), and [Finetuning (LoRA)](../../../README.md#finetuning-lora) for more details on the finetuning process.
See [Finetuning (Full)](./finetuning.md#full-parameters), [Finetuning (LISA)](./finetuning.md#layerwise-importance-sampled-adamw-lisa), and [Finetuning (LoRA)](./finetuning.md#low-rank-adaptation-lora) for more details on the finetuning process.

## Step 2 Reward Modeling
