Commit

simplify by removing duplicate base_model_config (#772)
winglian committed Oct 23, 2023
1 parent 44c9d01 commit 2d8def6
Showing 39 changed files with 10 additions and 42 deletions.
1 change: 0 additions & 1 deletion examples/cerebras/btlm-ft.yml
@@ -1,5 +1,4 @@
 base_model: cerebras/btlm-3b-8k-base
-base_model_config: cerebras/btlm-3b-8k-base
 model_type: AutoModelForCausalLM
 tokenizer_type: GPT2Tokenizer
 trust_remote_code: true
1 change: 0 additions & 1 deletion examples/cerebras/qlora.yml
@@ -1,5 +1,4 @@
 base_model: cerebras/Cerebras-GPT-1.3B
-base_model_config: cerebras/Cerebras-GPT-1.3B
 load_in_8bit: false
 load_in_4bit: true
 strict: false
1 change: 0 additions & 1 deletion examples/code-llama/13b/lora.yml
@@ -1,5 +1,4 @@
 base_model: codellama/CodeLlama-13b-hf
-base_model_config: codellama/CodeLlama-13b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: CodeLlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/code-llama/13b/qlora.yml
@@ -1,5 +1,4 @@
 base_model: codellama/CodeLlama-13b-hf
-base_model_config: codellama/CodeLlama-13b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: CodeLlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/code-llama/34b/lora.yml
@@ -1,5 +1,4 @@
 base_model: codellama/CodeLlama-34b-hf
-base_model_config: codellama/CodeLlama-34b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: CodeLlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/code-llama/34b/qlora.yml
@@ -1,5 +1,4 @@
 base_model: codellama/CodeLlama-34b-hf
-base_model_config: codellama/CodeLlama-34b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: CodeLlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/code-llama/7b/lora.yml
@@ -1,5 +1,4 @@
 base_model: codellama/CodeLlama-7b-hf
-base_model_config: codellama/CodeLlama-7b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: CodeLlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/code-llama/7b/qlora.yml
@@ -1,5 +1,4 @@
 base_model: codellama/CodeLlama-7b-hf
-base_model_config: codellama/CodeLlama-7b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: CodeLlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/falcon/config-7b-lora.yml
@@ -1,5 +1,4 @@
 base_model: tiiuae/falcon-7b
-base_model_config: tiiuae/falcon-7b
 trust_remote_code: true
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
1 change: 0 additions & 1 deletion examples/falcon/config-7b-qlora.yml
@@ -1,7 +1,6 @@
 # 1b: tiiuae/falcon-rw-1b
 # 40b: tiiuae/falcon-40b
 base_model: tiiuae/falcon-7b
-base_model_config: tiiuae/falcon-7b
 # required by falcon custom model code: https://huggingface.co/tiiuae/falcon-7b/tree/main
 trust_remote_code: true
 model_type: AutoModelForCausalLM
1 change: 0 additions & 1 deletion examples/falcon/config-7b.yml
@@ -1,5 +1,4 @@
 base_model: tiiuae/falcon-7b
-base_model_config: tiiuae/falcon-7b
 trust_remote_code: true
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
1 change: 0 additions & 1 deletion examples/gptj/qlora.yml
@@ -1,5 +1,4 @@
 base_model: EleutherAI/gpt-j-6b
-base_model_config: EleutherAI/gpt-j-6b
 load_in_8bit: false
 load_in_4bit: true
 strict: false
1 change: 0 additions & 1 deletion examples/jeopardy-bot/config.yml
@@ -1,5 +1,4 @@
 base_model: huggyllama/llama-7b
-base_model_config: huggyllama/llama-7b
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
 load_in_8bit: false
1 change: 0 additions & 1 deletion examples/llama-2/fft_optimized.yml
@@ -1,5 +1,4 @@
 base_model: NousResearch/Llama-2-7b-hf
-base_model_config: NousResearch/Llama-2-7b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/llama-2/gptq-lora.yml
@@ -1,5 +1,4 @@
 base_model: TheBloke/Llama-2-7B-GPTQ
-base_model_config: TheBloke/Llama-2-7B-GPTQ
 is_llama_derived_model: false
 gptq: true
 gptq_disable_exllama: true
1 change: 0 additions & 1 deletion examples/llama-2/lora.yml
@@ -1,5 +1,4 @@
 base_model: NousResearch/Llama-2-7b-hf
-base_model_config: NousResearch/Llama-2-7b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/llama-2/qlora.yml
@@ -1,5 +1,4 @@
 base_model: NousResearch/Llama-2-7b-hf
-base_model_config: NousResearch/Llama-2-7b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/llama-2/relora.yml
@@ -1,5 +1,4 @@
 base_model: NousResearch/Llama-2-7b-hf
-base_model_config: NousResearch/Llama-2-7b-hf
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
 is_llama_derived_model: true
1 change: 0 additions & 1 deletion examples/llama-2/tiny-llama.yml
@@ -1,5 +1,4 @@
 base_model: PY007/TinyLlama-1.1B-step-50K-105b
-base_model_config: PY007/TinyLlama-1.1B-step-50K-105b
 
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
1 change: 0 additions & 1 deletion examples/mistral/config.yml
@@ -1,5 +1,4 @@
 base_model: mistralai/Mistral-7B-v0.1
-base_model_config: mistralai/Mistral-7B-v0.1
 model_type: MistralForCausalLM
 tokenizer_type: LlamaTokenizer
 is_mistral_derived_model: true
1 change: 0 additions & 1 deletion examples/mistral/qlora.yml
@@ -1,5 +1,4 @@
 base_model: mistralai/Mistral-7B-v0.1
-base_model_config: mistralai/Mistral-7B-v0.1
 model_type: MistralForCausalLM
 tokenizer_type: LlamaTokenizer
 is_mistral_derived_model: true
1 change: 0 additions & 1 deletion examples/mpt-7b/config.yml
@@ -1,5 +1,4 @@
 base_model: mosaicml/mpt-7b
-base_model_config: mosaicml/mpt-7b
 tokenizer_type: AutoTokenizer
 trust_remote_code: true # required for mpt as their model class is not merged into transformers yet
 load_in_8bit: false
1 change: 0 additions & 1 deletion examples/openllama-3b/config.yml
@@ -1,5 +1,4 @@
 base_model: openlm-research/open_llama_3b_v2
-base_model_config: openlm-research/open_llama_3b_v2
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
 load_in_8bit: false
1 change: 0 additions & 1 deletion examples/openllama-3b/lora.yml
@@ -1,5 +1,4 @@
 base_model: openlm-research/open_llama_3b_v2
-base_model_config: openlm-research/open_llama_3b_v2
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
 load_in_8bit: true
1 change: 0 additions & 1 deletion examples/openllama-3b/qlora.yml
@@ -1,5 +1,4 @@
 base_model: openlm-research/open_llama_3b_v2
-base_model_config: openlm-research/open_llama_3b_v2
 model_type: LlamaForCausalLM
 tokenizer_type: LlamaTokenizer
 load_in_8bit: false
1 change: 0 additions & 1 deletion examples/phi/phi-ft.yml
@@ -1,5 +1,4 @@
 base_model: microsoft/phi-1_5
-base_model_config: microsoft/phi-1_5
 model_type: MixFormerSequentialForCausalLM
 tokenizer_type: AutoTokenizer
 is_llama_derived_model: false
1 change: 0 additions & 1 deletion examples/phi/phi-qlora.yml
@@ -1,5 +1,4 @@
 base_model: microsoft/phi-1_5
-base_model_config: microsoft/phi-1_5
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
 is_llama_derived_model: false
1 change: 0 additions & 1 deletion examples/pythia-12b/config.yml
@@ -1,5 +1,4 @@
 base_model: EleutherAI/pythia-12b-deduped
-base_model_config: EleutherAI/pythia-12b-deduped
 base_model_ignore_patterns: pytorch* # prefer safetensors
 model_type: GPTNeoXForCausalLM
 tokenizer_type: AutoTokenizer
1 change: 0 additions & 1 deletion examples/pythia/lora.yml
@@ -1,5 +1,4 @@
 base_model: EleutherAI/pythia-1.4b-deduped
-base_model_config: EleutherAI/pythia-1.4b-deduped
 load_in_8bit: true
 datasets:
   - path: teknium/GPT4-LLM-Cleaned
1 change: 0 additions & 1 deletion examples/redpajama/config-3b.yml
@@ -1,5 +1,4 @@
 base_model: togethercomputer/RedPajama-INCITE-Chat-3B-v1
-base_model_config: togethercomputer/RedPajama-INCITE-Chat-3B-v1
 model_type: GPTNeoXForCausalLM
 tokenizer_type: AutoTokenizer
 trust_remote_code:
1 change: 0 additions & 1 deletion examples/replit-3b/config-lora.yml
@@ -1,5 +1,4 @@
 base_model: replit/replit-code-v1-3b
-base_model_config: replit/replit-code-v1-3b
 trust_remote_code: true
 load_in_8bit: false
 datasets:
1 change: 0 additions & 1 deletion examples/xgen-7b/xgen-7b-8k-qlora.yml
@@ -1,7 +1,6 @@
 # An example finetuning Saleforce's XGen-7b model with 8k context using qlora
 # on Tim Dettmer's Guanaco dataset.
 base_model: Salesforce/xgen-7b-8k-base
-base_model_config: Salesforce/xgen-7b-8k-base
 trust_remote_code: true
 model_type: AutoModelForCausalLM
 tokenizer_type: AutoTokenizer
3 changes: 3 additions & 0 deletions src/axolotl/utils/config.py
@@ -79,6 +79,9 @@ def normalize_config(cfg):
 
     cfg.dataset_processes = cfg.dataset_processes or os.cpu_count()
 
+    if not cfg.base_model_config:
+        cfg.base_model_config = cfg.base_model
+
     model_config = load_model_config(cfg)
     cfg.model_config_type = model_config.model_type
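The heart of the commit is this fallback in `normalize_config`: when a config omits `base_model_config`, it now inherits the value of `base_model`, which is why every example YAML above can drop the duplicated key. A minimal standalone sketch of that behavior (the helper name `ensure_base_model_config` and the use of `SimpleNamespace` in place of axolotl's `DictDefault` config object are illustrative assumptions):

```python
# Sketch of the fallback added to normalize_config: an unset (None/empty)
# base_model_config is filled in from base_model. SimpleNamespace is a
# stand-in for axolotl's attribute-style config object.
from types import SimpleNamespace


def ensure_base_model_config(cfg):
    # Only fill in the value when it is missing or falsy; an explicit
    # base_model_config (e.g. pointing at a different config repo) is kept.
    if not getattr(cfg, "base_model_config", None):
        cfg.base_model_config = cfg.base_model
    return cfg


cfg = SimpleNamespace(
    base_model="cerebras/btlm-3b-8k-base",
    base_model_config=None,
)
ensure_base_model_config(cfg)
print(cfg.base_model_config)  # cerebras/btlm-3b-8k-base
```

Note the guard is `if not ...` rather than a key-existence check, so configs can still set `base_model_config` explicitly when the model weights and the model config live in different repositories.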
1 change: 0 additions & 1 deletion tests/e2e/test_fused_llama.py
@@ -31,7 +31,6 @@ def test_fft_packing(self):
         cfg = DictDefault(
             {
                 "base_model": "JackFram/llama-68m",
-                "base_model_config": "JackFram/llama-68m",
                 "flash_attention": True,
                 "flash_attn_fuse_qkv": True,
                 "flash_attn_fuse_mlp": True,
3 changes: 0 additions & 3 deletions tests/e2e/test_lora_llama.py
@@ -29,7 +29,6 @@ def test_lora(self):
         cfg = DictDefault(
             {
                 "base_model": "JackFram/llama-68m",
-                "base_model_config": "JackFram/llama-68m",
                 "tokenizer_type": "LlamaTokenizer",
                 "sequence_len": 1024,
                 "load_in_8bit": True,
@@ -72,7 +71,6 @@ def test_lora_packing(self):
         cfg = DictDefault(
             {
                 "base_model": "JackFram/llama-68m",
-                "base_model_config": "JackFram/llama-68m",
                 "tokenizer_type": "LlamaTokenizer",
                 "sequence_len": 1024,
                 "sample_packing": True,
@@ -117,7 +115,6 @@ def test_lora_gptq(self):
         cfg = DictDefault(
             {
                 "base_model": "TheBlokeAI/jackfram_llama-68m-GPTQ",
-                "base_model_config": "TheBlokeAI/jackfram_llama-68m-GPTQ",
                 "model_type": "AutoModelForCausalLM",
                 "tokenizer_type": "LlamaTokenizer",
                 "sequence_len": 1024,
2 changes: 0 additions & 2 deletions tests/e2e/test_mistral.py
@@ -31,7 +31,6 @@ def test_lora(self):
         cfg = DictDefault(
             {
                 "base_model": "openaccess-ai-collective/tiny-mistral",
-                "base_model_config": "openaccess-ai-collective/tiny-mistral",
                 "flash_attention": True,
                 "sequence_len": 1024,
                 "load_in_8bit": True,
@@ -77,7 +76,6 @@ def test_ft(self):
         cfg = DictDefault(
             {
                 "base_model": "openaccess-ai-collective/tiny-mistral",
-                "base_model_config": "openaccess-ai-collective/tiny-mistral",
                 "flash_attention": True,
                 "sequence_len": 1024,
                 "val_set_size": 0.1,
2 changes: 0 additions & 2 deletions tests/e2e/test_mistral_samplepack.py
@@ -31,7 +31,6 @@ def test_lora_packing(self):
         cfg = DictDefault(
             {
                 "base_model": "openaccess-ai-collective/tiny-mistral",
-                "base_model_config": "openaccess-ai-collective/tiny-mistral",
                 "flash_attention": True,
                 "sample_packing": True,
                 "sequence_len": 1024,
@@ -78,7 +77,6 @@ def test_ft_packing(self):
         cfg = DictDefault(
             {
                 "base_model": "openaccess-ai-collective/tiny-mistral",
-                "base_model_config": "openaccess-ai-collective/tiny-mistral",
                 "flash_attention": True,
                 "sample_packing": True,
                 "sequence_len": 1024,
2 changes: 0 additions & 2 deletions tests/e2e/test_phi.py
@@ -27,7 +27,6 @@ def test_ft(self):
         cfg = DictDefault(
             {
                 "base_model": "microsoft/phi-1_5",
-                "base_model_config": "microsoft/phi-1_5",
                 "trust_remote_code": True,
                 "model_type": "MixFormerSequentialForCausalLM",
                 "tokenizer_type": "AutoTokenizer",
@@ -71,7 +70,6 @@ def test_ft_packed(self):
         cfg = DictDefault(
             {
                 "base_model": "microsoft/phi-1_5",
-                "base_model_config": "microsoft/phi-1_5",
                 "trust_remote_code": True,
                 "model_type": "MixFormerSequentialForCausalLM",
                 "tokenizer_type": "AutoTokenizer",
7 changes: 7 additions & 0 deletions tests/test_normalize_config.py
@@ -37,3 +37,10 @@ def test_lr_as_float(self):
         normalize_config(cfg)
 
         assert cfg.learning_rate == 0.00005
+
+    def test_base_model_config_set_when_empty(self):
+        cfg = self._get_base_cfg()
+        del cfg.base_model_config
+        normalize_config(cfg)
+
+        assert cfg.base_model_config == cfg.base_model