[Fix] QA Pipeline fails on SQuAD with seq_len=128 #889

Merged
3 commits merged into main from fix/damian/doc_stride
Jan 27, 2023

dbogunowicz
Contributor

Fix for: https://app.asana.com/0/1201735099598270/1203822912826533/f

This PR addresses an issue where the default value for the doc_stride argument was set too high.
According to the documentation for the transformers library, doc_stride must be smaller than max_seq_length minus the length of the truncated question and the number of special tokens (sequence_added_tokens):

doc_stride < max_seq_length - len(truncated_question) - sequence_added_tokens

Specifically, for a max_seq_length of 128 and assuming no special tokens, doc_stride must be less than 128 minus the length of the truncated question. This PR reduces the default value of doc_stride to align with this guideline.
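
For reviewers who want to sanity-check a candidate value, the inequality above can be sketched as a small helper. This is only an illustration: the function names, the 64-token question length, and the count of 3 added special tokens are assumptions made for the example, not values taken from this PR or from the transformers code.

```python
def doc_stride_upper_bound(max_seq_length: int,
                           truncated_question_len: int,
                           sequence_added_tokens: int = 0) -> int:
    """Exclusive upper bound on doc_stride, per the inequality in this PR.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return max_seq_length - truncated_question_len - sequence_added_tokens


def is_valid_doc_stride(doc_stride: int,
                        max_seq_length: int,
                        truncated_question_len: int,
                        sequence_added_tokens: int = 0) -> bool:
    """Return True if doc_stride is positive and strictly below the bound."""
    bound = doc_stride_upper_bound(
        max_seq_length, truncated_question_len, sequence_added_tokens
    )
    return 0 < doc_stride < bound


# Illustrative numbers only: a 64-token question and 3 special tokens.
bound = doc_stride_upper_bound(128, 64, 3)
print(bound)                                   # 128 - 64 - 3
print(is_valid_doc_stride(128, 128, 64, 3))    # stride as large as max_seq_length
print(is_valid_doc_stride(32, 128, 64, 3))     # a smaller stride
```

With these assumed numbers, a doc_stride equal to max_seq_length fails the check, while a smaller value such as 32 passes. The longer the question, the lower the bound, which is why a default that works at a larger max_seq_length can break at 128.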

dbogunowicz merged commit 6ba4f12 into main on Jan 27, 2023
dbogunowicz deleted the fix/damian/doc_stride branch on January 27, 2023, 15:59