
Stop using model-defined truncation in perplexity calculation #333

Merged
merged 3 commits into main from hn-perplexity-cutoff on Nov 1, 2022

Conversation

mathemakitten (Contributor)

If the user doesn't explicitly pass in max_length, we shouldn't truncate the inputs to perplexity at all. model.config.max_length is unreliable since the attribute is named differently across models (and is sometimes not defined at all).

I'd also be in support of removing the truncation option entirely, but this seems like a good compromise for retaining the existing functionality.

Closes #332.
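For illustration, a minimal sketch of the behavior described above, assuming a Hugging Face tokenizer. The function name `tokenize_for_perplexity` and its signature are hypothetical, not the metric's actual API:

```python
from transformers import AutoTokenizer

def tokenize_for_perplexity(texts, tokenizer, max_length=None):
    # Hypothetical helper: truncate only if the caller explicitly passed
    # max_length. The old behavior fell back to model.config.max_length,
    # which is named inconsistently across configs and sometimes missing.
    return tokenizer(
        texts,
        add_special_tokens=False,
        padding=True,  # assumes tokenizer.pad_token is set
        truncation=max_length is not None,  # no model-defined fallback
        max_length=max_length,  # None means "do not truncate"
        return_tensors="pt",
    )

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
batch = tokenize_for_perplexity(["some long input text"], tokenizer)  # untruncated
```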

HuggingFaceDocBuilderDev commented Oct 31, 2022

The documentation is not available anymore as the PR was closed or merged.

lvwerra (Member) left a comment


Looks good, thanks for fixing!

@mathemakitten mathemakitten merged commit 9f0f888 into main Nov 1, 2022
@mathemakitten mathemakitten deleted the hn-perplexity-cutoff branch November 1, 2022 14:41
NimaBoscarino pushed a commit to NimaBoscarino/evaluate that referenced this pull request Nov 9, 2022
…gface#333)

* Stop using model-defined truncation

* Formatting

* If start token and also max length defined
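The third commit's subject points at one edge case: when a start (BOS) token will be prepended and max_length is also set, one position has to be reserved so the final sequence still fits the cap. A hedged sketch of that logic, with illustrative variable names:

```python
# Illustrative only: leave room for the <BOS> token that will be prepended,
# so tokenized text + start token still respects the requested max_length.
if add_start_token and max_length is not None:
    max_tokenized_len = max_length - 1
else:
    max_tokenized_len = max_length
```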
Successfully merging this pull request may close these issues:
Bug in computing perplexity about max_length (#332)