AutoModel abstraction fails for pre-training initialization #11953

Closed
g-karthik opened this issue May 31, 2021 · 3 comments

@g-karthik

Environment info

  • transformers version: 4.5.1
  • Python version: 3.6
  • PyTorch version: 1.4+
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help

@patrickvonplaten, @LysandreJik

Information

Model I am using: GPT-2

The problem arises when using:

  • [x] my own modified scripts (details below)

To reproduce

Steps to reproduce the behavior:

from transformers import AutoConfig, AutoModelForCausalLM, GPT2LMHeadModel
config = AutoConfig.from_pretrained("gpt2", return_dict=True, gradient_checkpointing=False)

model_class = GPT2LMHeadModel
model = model_class(config)  # WORKS FINE

model_class = AutoModelForCausalLM
model = model_class(config)  # FAILS, stack trace below

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() takes 1 positional argument but 2 were given

Expected behavior

Both cases should work. The second case (AutoModelForCausalLM) should resolve internally to the class used in the first (GPT2LMHeadModel).

@LysandreJik
Member

Hello! We recommend reading the docs on the AutoModel classes. I have linked the from_config method, which is the one to use in this case.
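For readers following along, a minimal sketch of the from_config route might look like the following. It assumes transformers and PyTorch are installed, and it swaps the issue's AutoConfig.from_pretrained("gpt2") for a deliberately tiny GPT2Config (the layer, head, and embedding sizes are illustrative assumptions, not values from the issue) so that nothing has to be downloaded.

```python
from transformers import AutoModelForCausalLM, GPT2Config, GPT2LMHeadModel

# Assumption: a tiny GPT-2 config with illustrative sizes so the model builds
# quickly and offline. The issue itself uses AutoConfig.from_pretrained("gpt2").
config = GPT2Config(n_layer=2, n_head=2, n_embd=64)

# from_config builds a randomly initialized model from the config alone;
# no pretrained weights are loaded, which suits pre-training from scratch.
model = AutoModelForCausalLM.from_config(config)

# The auto class picks the concrete model class based on the config type.
print(type(model).__name__)
```

The auto classes are meant to be used through their factory methods (from_config, from_pretrained) rather than instantiated directly, which is why calling AutoModelForCausalLM(config) raises an error.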

@LysandreJik
Member

However, it is indeed unexpected for you to receive this error message. The message should be more explicit; investigating now.

@LysandreJik
Member

Opened #11956 for a more explicit error, and opening your use case for discussion.
