feat: Add chunking function for sequence tagger training on sentences exceeding token limit #3520

MattGPT-ai · 2024-08-03T02:28:52Z

Adds a Sentence chunking function so that a SequenceTagger can be trained on sentences that exceed the token limit.
Adds tests for this function.
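
To illustrate the general idea, here is a minimal sketch of fixed-size chunking over a token list with per-token labels. This is not the code added in this PR and does not use Flair's API: `chunk_tokens` and its parameters are hypothetical names chosen for the example, and it counts plain tokens, whereas a transformer's limit usually applies to subword tokens.

```python
from typing import List, Sequence, Tuple


def chunk_tokens(
    tokens: Sequence[str],
    labels: Sequence[str],
    max_tokens: int,
) -> List[Tuple[List[str], List[str]]]:
    """Split parallel token/label sequences into consecutive chunks.

    Hypothetical helper for illustration only. Each chunk holds at most
    `max_tokens` tokens together with the labels for those tokens.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    chunks = []
    # Walk the sequence in steps of max_tokens and slice both lists in parallel.
    for start in range(0, len(tokens), max_tokens):
        end = start + max_tokens
        chunks.append((list(tokens[start:end]), list(labels[start:end])))
    return chunks


tokens = ["Then", "George", "Washington", "left", "town"]
labels = ["O", "B-PER", "I-PER", "O", "O"]
chunks = chunk_tokens(tokens, labels, max_tokens=2)
for chunk_tokens_, chunk_labels in chunks:
    print(chunk_tokens_, chunk_labels)
```

With `max_tokens=2` the entity "George Washington" is split across two chunks, which shows why a naive fixed window is only a starting point: a real chunking function has to decide how to treat labeled spans that cross a chunk boundary.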


MattGPT-ai · 2024-08-09T17:54:51Z

Looks like 100% of my tests passed, but the GitHub UI still says my checks failed.

alanakbik · 2024-08-09T18:38:48Z

We are getting a "System.IO.IOException: No space left on device" error in the unit tests, which seem to be taking up too much disk space. I tried removing some of the dataset downloads from the tests in #3526, but it seems that's not enough to prevent this from happening.

MattGPT-ai · 2024-08-09T20:53:05Z

> We are getting a System.IO.IOException: No space left on device error for the unit tests as they seem to be taking up too much space. I tried removing some of the dataset downloads in the tests in #3526, but it seems it's not enough to prevent this from happening.

Is it possible to download only a portion of each dataset, say 100 samples or whatever is sufficient for unit testing?

MattGPT-ai force-pushed the GH-3519/add-sentence-chunking-method branch 5 times, most recently from de81c1f to 0b23ef6 · August 3, 2024 18:26

MattGPT-ai added 2 commits August 9, 2024 10:35

683938f · feat: add chunking function to allow sequence tagger training on sentences exceeding the token limit, including tests

7cf4d0f · --amend

MattGPT-ai force-pushed the GH-3519/add-sentence-chunking-method branch from b523769 to 7cf4d0f · August 9, 2024 17:35
