
How to convert an AutoModelForCausalLM object to a dspy model object? #1018

Closed
pawanGithub10 opened this issue May 13, 2024 · 7 comments

@pawanGithub10

import dspy

llm = dspy.HFModel(model='model')

This method takes a string as input for the model. If I have a quantized model object of the AutoModelForCausalLM class, how can I convert that model object to a dspy object?

Direct assignment gives an error at inference time:

llm = model  # previously created as an AutoModelForCausalLM object

llm("Testing testing, is anyone out there?")

Error after code line 4:

File /opt/conda/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py:623, in LlamaModel.forward(self, input_ids, attention_mask, position_ids, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict)
    621             raise ValueError("You cannot specify both decoder_input_ids and decoder_inputs_embeds at the same time")
    622         elif input_ids is not None:
--> 623             batch_size, seq_length = input_ids.shape
    624         elif inputs_embeds is not None:
    625             batch_size, seq_length, _ = inputs_embeds.shape

AttributeError: 'str' object has no attribute 'shape'
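
For context, the raw transformers model expects tokenized tensors rather than a plain string, which is why assigning the model object directly to llm fails. A minimal sketch of the underlying call (the model path here is a placeholder):

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "model"  # placeholder local path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# the model's forward/generate expects input_ids tensors, not a str
inputs = tokenizer("Testing testing, is anyone out there?", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))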

@Anindyadeep
Contributor

I see, but internally the HF module uses the AutoModel module to instantiate the weights. Can you explain why you need to pass an already loaded model to dspy instead of giving the weight path?

@pawanGithub10
Author

> I see, but internally the HF module uses the AutoModel module to instantiate the weights. Can you explain why you need to pass an already loaded model to dspy instead of giving the weight path?

Thanks for the reply. The reason is that I have a 4-bit quantized model and I want to use it directly. I tried saving it to Hugging Face first so that I could load it from a weight path, but then I got an error saying that Hugging Face does not support saving a 4-bit quantized model.
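
For reference, this is roughly the kind of 4-bit loading involved; the path and quantization settings below are placeholders rather than the exact notebook code:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# placeholder 4-bit quantization settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
)

# placeholder local path to the base weights
model = AutoModelForCausalLM.from_pretrained(
    "/tmp/models/llama2/7b",
    quantization_config=bnb_config,
)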

@Anindyadeep
Contributor

Can you please share the full code for the loading process and your approach? I would appreciate it.

@pawanGithub10
Author

pawanGithub10 commented May 14, 2024

dspy_4bitquantized_llama2_error.zip
I have attached the Jupyter notebook. In this notebook, when I convert the quantized model it searches for config.json, because I am passing the AutoModel variable. Please suggest a workaround or an API call that lets me use the quantized model.

@Anindyadeep
Contributor

Anindyadeep commented May 20, 2024

Hey @pawanGithub10, I have started raising a PR based on the issue you faced. Here is what some of the model-loading cases would look like:

from dsp.modules.hf_new import HFLocalModel
from transformers import AutoTokenizer, BitsAndBytesConfig 
from transformers import AutoModelForCausalLM

model_path = "../models/llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(
    model_path,
)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"


def case1():
    # case 1: pass a model path and let HFLocalModel apply 4-bit quantization
    model = HFLocalModel(
        model=model_path,
        tokenizer=tokenizer,
        load_in_4bit=True,
        bnb_config=BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype="float16",
            bnb_4bit_use_double_quant=False
        )
    )

    response = model("hello", do_sample=True)
    print(response)


def case2():
    # case 2: load and quantize the model with transformers first,
    # then pass the already loaded model object to HFLocalModel
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        quantization_config=BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype="float16",
            bnb_4bit_use_double_quant=False
        )
    )

    model_ = HFLocalModel(
        model=model,
        tokenizer=tokenizer,
    )
    response = model_("hello", do_sample=True)
    print(response)

if __name__ == "__main__":
    case1()
    print("---------------------------")
    case2()

Additionally, PEFT models are now supported as well, along with multi-GPU support. The problem is that I can only test up to the PEFT part; I cannot test the multi-GPU support, since I have no access to a multi-GPU setup.
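
For context, loading a PEFT adapter with the peft library typically looks like the sketch below; the adapter path is a placeholder, and passing the resulting model object to HFLocalModel as in case2 is an assumption about the new interface:

from peft import PeftModel
from transformers import AutoModelForCausalLM

# load the base weights, then attach the adapter ("adapter_path" is a placeholder)
base = AutoModelForCausalLM.from_pretrained(model_path)
peft_model = PeftModel.from_pretrained(base, "adapter_path")

# assumption: an already loaded PEFT model could be passed like the model in case2
model_ = HFLocalModel(model=peft_model, tokenizer=tokenizer)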

@pawanGithub10
Author

pawanGithub10 commented May 22, 2024

@Anindyadeep thanks a lot for the detailed help, but I think I had missed the details in the documentation.

Init signature:
dspy.HFModel(
    model: str,
    checkpoint: Optional[str] = None,
    is_client: bool = False,
    hf_device_map: Literal['auto', 'balanced', 'balanced_low_0', 'sequential'] = 'auto',
    token: Optional[str] = None,
    model_kwargs: Optional[dict] = {},
)
Docstring: Abstract class for language models.
Init docstring:

Args:
    model (str): HF model identifier to load and use
    checkpoint (str, optional): load specific checkpoints of the model. Defaults to None.
    is_client (bool, optional): whether to access models via client. Defaults to False.
    hf_device_map (str, optional): HF config strategy to load the model.
        Recommended to use "auto", which will help loading large models using accelerate. Defaults to "auto".
    model_kwargs (dict, optional): additional kwargs to pass to the model constructor. Defaults to empty dict.
File: /opt/conda/lib/python3.11/site-packages/dsp/modules/hf.py
Type: ABCMeta
Subclasses: HFClientTGI, HFClientVLLM, Together, Anyscale, ChatModuleClient, HFClientSGLang

So after reading this, I made the following changes and the code works:

import torch
import dspy

# bnb_config is the BitsAndBytesConfig created earlier for 4-bit quantization
model_specific_param = {"torch_dtype": torch.float16, "quantization_config": bnb_config}
model_name = '/tmp/models/llama2/7b'
llm = dspy.HFModel(model=model_name, model_kwargs=model_specific_param)
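
With that in place, the quantized model can be set as the default LM for dspy modules, e.g. (a small usage sketch):

# use the quantized model as dspy's default LM
dspy.settings.configure(lm=llm)

# quick sanity check
print(llm("Testing testing, is anyone out there?"))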

@pawanGithub10
Author

As per the previous comment, I think the issue can be closed.
