[MLOps 1.5] Expand the built-ins: NLP #865

dbogunowicz · 2023-01-16T15:05:07Z

Feature description

Added built-in functions: `mean_score`, `percent_zero_labels` (token classification); `answer_length`, `answer_score`, `answer_found` (question answering); `sequence_length` (general NLP function).
Added exhaustive unit tests.
Cleaned up the import structure to avoid circular dependencies.

Coverage:

As specified in the PRD:

nlp_sequence_length --> covered by "sequence_length"
nlp_percent_unk_tokens --> not covered; see comments below
nlp_answer_found_qa - 1 if yes, 0 if no --> covered by "answer_found"
nlp_answer_length_qa - integer length --> covered by "answer_length"
nlp_score_qa - 0-1 float representing confidence in answer --> covered by "answer_score"
nlp_percent_label_0_token_classification - 0-1 float --> covered by "percent_zero_labels"
nlp_mean_score_token_classification - 0-1 float --> covered by "mean_score"
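
The mapping above can be sketched in plain Python. This is only an illustration of the value each metric reports (integer, 0/1 flag, or 0-1 float), not the code merged in this PR: the argument names, the input shapes (an attention mask, an answer string, flat lists of labels and scores), and the choice to measure answer length in characters are all assumptions.

```python
from typing import Optional, Sequence


def sequence_length(attention_mask: Sequence[int]) -> int:
    """Count real (non-padding) tokens; assumes a 0/1 attention mask."""
    return sum(1 for m in attention_mask if m == 1)


def answer_found(answer: Optional[str]) -> int:
    """1 if the QA pipeline produced a non-empty answer, else 0."""
    return 1 if answer else 0


def answer_length(answer: Optional[str]) -> int:
    """Integer length of the answer; characters are an assumed unit."""
    return len(answer) if answer else 0


def answer_score(score: float) -> float:
    """Confidence in the answer, passed through as a float."""
    return float(score)


def percent_zero_labels(labels: Sequence[int]) -> float:
    """Fraction (0-1) of predicted token labels equal to 0."""
    if not labels:
        return 0.0  # assumed convention for an empty prediction
    return sum(1 for label in labels if label == 0) / len(labels)


def mean_score(scores: Sequence[float]) -> float:
    """Mean (0-1) of the per-token confidence scores."""
    if not scores:
        return 0.0  # assumed convention for an empty prediction
    return sum(scores) / len(scores)
```

For example, under these assumptions `percent_zero_labels([0, 0, 1, 2])` is `0.5` and `answer_found("")` is `0`.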

A few comments:

We still need a naming framework that avoids ambiguity and duplicate function names across integrations. Let's discuss it.
Computing the percentage of unknown tokens is heavily constrained for now. The information is present during tokenization but is not available inside engine_inputs, so we currently cannot compute this value quickly in a simple, monadic (single-argument) function.
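
To make the constraint concrete, here is a minimal sketch of what such a metric would need. It assumes that both the token ids and the tokenizer's unknown-token id can be passed in explicitly; the function name, both parameters, and the example ids are hypothetical and not part of this PR. The second argument is the tokenization-time information that a single-argument built-in has no way to receive.

```python
from typing import Sequence


def percent_unk_tokens(token_ids: Sequence[int], unk_token_id: int) -> float:
    """Fraction (0-1) of tokens equal to the unknown-token id.

    Hypothetical helper: unk_token_id comes from the tokenizer, so the
    function cannot be expressed with one argument.
    """
    if not token_ids:
        return 0.0  # assumed convention for an empty sequence
    unknown = sum(1 for token in token_ids if token == unk_token_id)
    return unknown / len(token_ids)


# Illustrative ids only; two of the four tokens match the unknown id.
ratio = percent_unk_tokens([7, 3, 42, 3], unk_token_id=3)
```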

src/deepsparse/loggers/metric_functions/natural_language_processing/built_ins.py

...psparse/loggers/metric_functions/natural_language_processing/question_answering/built_ins.py

...parse/loggers/metric_functions/natural_language_processing/token_classification/built_ins.py

bfineran

LGTM for now, but we absolutely need to get something in place to handle item-wise functions.

KSGulin

LGTM!

tests/deepsparse/loggers/metric_functions/natural_language_processing/built_ins.py

bogunowicz@arrival.com and others added 9 commits January 11, 2023 17:21

initial commit

625ecc1

Merge branch 'main' of https://github.com/neuralmagic/deepsparse

98871bd

WIP

1db5bc4

Merge remote-tracking branch 'origin/main' into feature/damian/nlp_funcs

a246d9c

tests passing

f78dc9e

Delete proposal.md

6b26bdd

tests green

3fb247c

Merge branch 'feature/damian/nlp_funcs' of https://github.com/neuralm…

18cbbac

…agic/deepsparse into feature/damian/nlp_funcs

Merge branch 'main' into feature/damian/nlp_funcs

7dcf48b

dbogunowicz marked this pull request as ready for review January 17, 2023 08:23

dbogunowicz requested a review from bfineran January 17, 2023 08:23

ready for reviews

82ebc71

bfineran requested changes Jan 18, 2023

dbogunowicz and others added 2 commits January 18, 2023 18:43

Merge branch 'main' into feature/damian/nlp_funcs

d67b2c0

corrections

6028f1f

bfineran approved these changes Jan 19, 2023

KSGulin approved these changes Jan 20, 2023

tests/deepsparse/loggers/metric_functions/natural_language_processing/built_ins.py (review thread resolved)

dbogunowicz merged commit e4a05c0 into main Jan 20, 2023

dbogunowicz deleted the feature/damian/nlp_funcs branch January 20, 2023 12:03
