
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16 #844

Closed
6 of 8 tasks
griff4692 opened this issue Nov 10, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@griff4692

Please check that this issue hasn't been reported before.

  • I searched previous Bug Reports and didn't find any similar reports.

Expected Behavior

I fine-tuned Mistral with axolotl using bf16 precision.

I want to generate from this fine-tuned model: /path-to-my-fined-tuned-checkpoint/checkpoint-500

Generation should run without a dtype error.

Current behaviour

There is a dtype mismatch:

  File "/home/ga2530/axolotl-bhc/scripts/sent_inference_utils.py", line 570, in run_prompt
    generated = model.generate(
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/transformers/generation/utils.py", line 1652, in generate
    return self.sample(
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/transformers/generation/utils.py", line 2734, in sample
    outputs = self(
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py", line 1045, in forward
    outputs = self.model(
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py", line 932, in forward
    layer_outputs = decoder_layer(
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py", line 621, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py", line 342, in forward
    query_states = self.q_proj(hidden_states)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/ga2530/miniconda3/envs/ax/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16

Steps to reproduce

    # Imports assumed by this snippet; the axolotl module paths are a best
    # guess for this commit and may differ by version.
    import os
    from pathlib import Path

    import torch
    import transformers
    from transformers import GenerationConfig

    from axolotl.cli import load_cfg
    from axolotl.common.cli import TrainerCliArgs, load_model_and_tokenizer

    config = Path(os.path.expanduser('path-to-my-config.yml'))
    parsed_cfg = load_cfg(config)
    parsed_cfg.sample_packing = False
    # My fine-tuned checkpoint
    parsed_cfg.base_model_config = '/path-to-my-fined-tuned-checkpoint'
    parsed_cfg.base_model = '/path-to-my-fined-tuned-checkpoint/checkpoint-500'
    parser = transformers.HfArgumentParser((TrainerCliArgs))
    parsed_cli_args, _ = parser.parse_args_into_dataclasses(
        return_remaining_strings=True
    )
    parsed_cli_args.inference = True
    model, tokenizer = load_model_and_tokenizer(cfg=parsed_cfg, cli_args=parsed_cli_args)

    prompt = "DEBUG"
    batch = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)

    model.eval()
    with torch.no_grad():
        generation_config = GenerationConfig(
            repetition_penalty=1.1,
            max_new_tokens=1024,
            temperature=0.9,
            top_p=0.95,
            top_k=40,
            bos_token_id=tokenizer.bos_token_id,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
            do_sample=True,
            use_cache=True,
            return_dict_in_generate=True,
            output_attentions=False,
            output_hidden_states=False,
            output_scores=False,
        )
        generated = model.generate(
            inputs=batch["input_ids"].to(parsed_cfg.device),
            generation_config=generation_config,
        )

Config yaml

base_model: mistralai/Mistral-7B-Instruct-v0.1
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: /nlp/projects/summarization/bhc_data_cleanup/prompt_sent_frost_instruct.jsonl
    type: summarizetldr

dataset_prepared_path:
val_set_size: 0.005
output_dir: /nlp/projects/summarization/bhc_data_cleanup/mistral_weights/sent_frost_instruct

sequence_len: 8192
sample_packing: false
pad_to_sequence_len: true

wandb_project: mistral
wandb_entity: griffinadams
wandb_watch:
wandb_run_id: sent_frost_instruct
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 1
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.000005

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
eval_steps: 100
eval_table_size:
eval_table_max_new_tokens: 128
save_steps: 100
save_strategy: steps
debug:
deepspeed: /home/ga2530/axolotl/deepspeed/zero2.json
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"

Possible solution

I tried wrapping generation in with torch.cuda.amp.autocast(), but that did not work.
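For reference, the attempt looked roughly like the sketch below; the exact autocast arguments weren't recorded, so this is one plausible form using the same generate call as in the repro above.

    # Sketch of the attempted workaround; the autocast dtype is an assumption.
    with torch.no_grad(), torch.cuda.amp.autocast(dtype=torch.bfloat16):
        generated = model.generate(
            inputs=batch["input_ids"].to(parsed_cfg.device),
            generation_config=generation_config,
        )
    # The same RuntimeError was still raised from F.linear.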

Which Operating Systems are you using?

  • Linux
  • macOS
  • Windows

Python Version

3.9

axolotl branch-commit

main/f544ab2bed513bef269e6887d35c8aa12a852473

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this bug has not been reported yet.
  • I am using the latest version of axolotl.
  • I have provided enough information for the maintainers to reproduce and diagnose the issue.
@griff4692 griff4692 added the bug Something isn't working label Nov 10, 2023
@griff4692
Author

I'm able to resolve the issue by casting the model to bf16:

model = model.to(torch.bfloat16)

but I'm not sure if this is the best way to do it in this codebase.

@winglian
Collaborator

@griff4692 My guess is that the issue is in the DeepSpeed JSON configuration used during training.
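For context, the bf16-related parts of a DeepSpeed ZeRO-2 config typically look like the sketch below. The actual zero2.json used for this run isn't shown here, so this is illustrative only, but it is the section worth checking against the bf16: true setting in the axolotl yaml.

    {
      "bf16": {
        "enabled": "auto"
      },
      "zero_optimization": {
        "stage": 2
      }
    }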

@mathiasesn

mathiasesn commented Nov 22, 2023

This is also a problem for me with NO deepspeed.json configuration used. A simple fix would be:

if cfg.bf16:
    model = model.to(torch.bfloat16)

For example here and here.

@timothylimyl
Contributor

Hi,

Tested on 13/12/23; the same issue still appears (tested with Mistral):

RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16

The error is raised in the linear layer, torch.nn.Linear:

    def forward(self, input: Tensor) -> Tensor:
        return F.linear(input, self.weight, self.bias)

Basically, there's a mismatch here: the self.weight dtype is bf16 while the input dtype is torch.float32.

I think the fix needs to be done here:
https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/src/axolotl/utils/models.py#L174

We can add a cast to the appropriate dtype here based on the model config. Let me know what you think; I can make a PR.
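Something along these lines, as a hypothetical sketch rather than the actual models.py code (the helper name is a placeholder; cfg.bf16 and cfg.fp16 are the flags already present in the config yaml above):

    import torch

    def cast_model_to_cfg_dtype(model, cfg):
        # Hypothetical helper: after loading, cast the model to the dtype
        # implied by the axolotl config so inference inputs and weights match.
        if cfg.bf16:
            return model.to(torch.bfloat16)
        if cfg.fp16:
            return model.to(torch.float16)
        return model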

@NanoCode012
Collaborator

Closed thanks to @taziksh
