Remove spurious warning #1258

Merged
merged 2 commits into from
Jun 6, 2024

Conversation

dakinggg
Collaborator

@dakinggg dakinggg commented Jun 6, 2024

This warning isn't helpful, and it is confusing in the finetuning case. Only the text dataloader accepts an eos/bos token id, which it uses for per-sequence attention masking on pretokenized, concatenated sequences. Essentially all tokenizers have an eos/bos token, and if you use MPT with attn_uses_sequence_id (to enable per-sequence masking) without an eos/bos token specified, you will get a separate error anyway. The finetuning dataloader does not accept an eos/bos token id at all.
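For readers unfamiliar with per-sequence masking, here is a minimal sketch of why the text dataloader needs an eos (or bos) token id: the separator is what lets it tell where one packed sequence ends and the next begins. This is an illustration only, not the code in this repository. The function names `sequence_ids_from_eos` and `per_sequence_causal_mask` are hypothetical, and the sketch assumes eos is the separator and that each eos token belongs to the sequence it terminates.

```python
from itertools import accumulate
from typing import List


def sequence_ids_from_eos(input_ids: List[int], eos_token_id: int) -> List[int]:
    """Label each token with the index of the packed sequence it belongs to.

    Hypothetical helper for illustration; assumes EOS separates sequences.
    """
    # Running count of EOS tokens seen up to and including each position.
    eos_seen = list(accumulate(int(tok == eos_token_id) for tok in input_ids))
    # Shift right by one so an EOS token stays with the sequence it ends.
    return ([0] + eos_seen)[:len(input_ids)]


def per_sequence_causal_mask(sequence_ids: List[int]) -> List[List[bool]]:
    """Causal mask in which a token only attends within its own sequence."""
    n = len(sequence_ids)
    return [
        [j <= i and sequence_ids[i] == sequence_ids[j] for j in range(n)]
        for i in range(n)
    ]


# Three sequences packed into one row, with 0 standing in for the EOS id.
packed = [5, 6, 0, 7, 0, 8, 9]
ids = sequence_ids_from_eos(packed, eos_token_id=0)
print(ids)  # [0, 0, 0, 1, 1, 2, 2]

mask = per_sequence_causal_mask(ids)
print(mask[3])  # token 7 starts sequence 1, so it attends only to itself
```

Without a separator token id there is no way to build `ids`, which is why a missing eos/bos only matters when per-sequence masking is actually requested.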

@dakinggg dakinggg marked this pull request as ready for review June 6, 2024 06:01
@dakinggg dakinggg requested a review from a team as a code owner June 6, 2024 06:01
@dakinggg dakinggg enabled auto-merge (squash) June 6, 2024 06:03
@dakinggg dakinggg merged commit 3966f0e into mosaicml:main Jun 6, 2024
9 checks passed
KuuCi pushed a commit that referenced this pull request Jun 7, 2024
(cherry picked from commit 3966f0e)
KuuCi pushed a commit that referenced this pull request Jun 7, 2024
KuuCi pushed a commit that referenced this pull request Jun 7, 2024
@dakinggg dakinggg deleted the spurious-warning branch June 22, 2024 20:46

2 participants