Run text model inputs one-by-one through model to avoid shape mismatch errors #773

Merged
merged 3 commits into from
May 29, 2024

Conversation

@loostrum loostrum commented May 29, 2024

The movie review model runner used in the RISE text tutorial differed from the one used in LIME text, the tests, and the dashboard. The LIME/dashboard/tests version first tokenizes all inputs, then runs the model once on the whole batch. The RISE version runs sentences through the model one by one. The latter is slower, but avoids the shape mismatch errors we've been seeing with special characters. The faster version could work if all inputs are guaranteed to have the same shape, but ensuring that can be complicated, so I suggest we go with the slower, but correct, version.
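To illustrate the trade-off, here is a minimal sketch of the two runner styles. It is not the project's actual runner: `toy_tokenize`, `toy_model`, `run_batched`, and `run_one_by_one` are hypothetical stand-ins, and the toy model simply rejects any batch whose rows differ in length. In this sketch the mismatch comes from sentences with different token counts; how special characters trigger it in the real runner is not reproduced here.

```python
from typing import List, Sequence


def toy_tokenize(sentence: str) -> List[int]:
    """Hypothetical tokenizer: one integer id per whitespace-separated token."""
    return [len(token) for token in sentence.split()]


def toy_model(batch: Sequence[Sequence[int]]) -> List[float]:
    """Hypothetical model: requires a rectangular batch, returns one score per row."""
    lengths = {len(row) for row in batch}
    if len(lengths) > 1:
        raise ValueError(f"shape mismatch: row lengths {sorted(lengths)}")
    return [sum(row) / max(len(row), 1) for row in batch]


def run_batched(sentences: Sequence[str]) -> List[float]:
    """Tokenize everything first, then call the model once (fast, but fragile)."""
    return toy_model([toy_tokenize(s) for s in sentences])


def run_one_by_one(sentences: Sequence[str]) -> List[float]:
    """Call the model once per sentence (slower, but every batch has a single row)."""
    return [toy_model([toy_tokenize(s)])[0] for s in sentences]


if __name__ == "__main__":
    sentences = ["a great movie", "bad"]
    print(run_one_by_one(sentences))  # works: each call sees a single row
    try:
        run_batched(sentences)
    except ValueError as err:
        print(f"batched run failed: {err}")
```

Making the batched version safe would require padding or truncating every tokenized input to a common length before the single model call, which is the extra complexity this PR avoids.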

Fixes #531
Fixes #771
Fixes #751

@loostrum loostrum requested review from cwmeijer and elboyran and removed request for elboyran and cwmeijer May 29, 2024 09:29
@elboyran elboyran left a comment

Good solution for now. I will also merge to use it for my branch.

@elboyran elboyran merged commit 209a0ed into main May 29, 2024
17 checks passed
@cwmeijer cwmeijer deleted the fix-lime-text-special-chars branch May 30, 2024 11:21
2 participants