Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for two input columns for TextClassificationEvaluator #205

Merged

Conversation

fxmarty
Copy link
Contributor

@fxmarty fxmarty commented Jul 26, 2022

This PR:

  • Refactors the docstrings to avoid duplicates in the Evaluator subclasses
  • Put all arguments in the compute() signature for subclasses of Evaluator, so as to be able to call the method without keyword arguments (had the issue with issue classification since we redefined the default input_column="image")
  • Adds support for two inputs columns (as in glue/mnli) for TextClassificationEvaluator

@lvwerra I added a DatasetColumnTextClassification because of the input expected by the transformers TextClassificationPipeline: https://github.com/huggingface/transformers/blob/d0acc9537829e7d067edbb791473bbceb2ecf056/src/transformers/pipelines/text_classification.py#L109-L111 . I did not manage to use DatasetColumn for it, could we do better than what I did? It's a bit hacky here. Probably memory is not so much sensitive for text compared to images, so we could not use these wrappers altogether?

This stills misses a bit of documentation.

Partially closes #196

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jul 27, 2022

The documentation is not available anymore as the PR was closed or merged.

Copy link
Contributor

@ola13 ola13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Some comments inline.

For the future, I'd split such a PR in two - one for docstrings only and one for changes in the logic.

src/evaluate/evaluator/text_classification.py Outdated Show resolved Hide resolved
src/evaluate/evaluator/text_classification.py Outdated Show resolved Hide resolved
src/evaluate/evaluator/question_answering.py Outdated Show resolved Hide resolved
src/evaluate/evaluator/utils.py Outdated Show resolved Hide resolved
src/evaluate/evaluator/utils.py Outdated Show resolved Hide resolved
src/evaluate/evaluator/utils.py Outdated Show resolved Hide resolved
src/evaluate/evaluator/question_answering.py Outdated Show resolved Hide resolved
src/evaluate/evaluator/text_classification.py Outdated Show resolved Hide resolved
src/evaluate/evaluator/utils.py Outdated Show resolved Hide resolved
@lvwerra
Copy link
Member

lvwerra commented Jul 28, 2022

Looks great! Thanks for adding this, I left a a few minor comments. I agree with @ola13 that splitting such a PR in two would make it much easier to review (for next time) :)

@fxmarty
Copy link
Contributor Author

fxmarty commented Aug 1, 2022

Hello, thank you for your feedback! I will be careful next to break PRs in unitary changes.

@fxmarty fxmarty force-pushed the support-several-cols-text-classification branch from e878bcb to da602ef Compare August 1, 2022 07:31
@fxmarty
Copy link
Contributor Author

fxmarty commented Aug 1, 2022

@lvwerra @ola13 Feel free to re-review! Sorry for the lot of code change for a minimal new feature, will break it up in the future.

@fxmarty fxmarty requested review from ola13 and lvwerra August 1, 2022 08:12
Copy link
Member

@lvwerra lvwerra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🚀

src/evaluate/utils/doc.py Outdated Show resolved Hide resolved
@fxmarty fxmarty force-pushed the support-several-cols-text-classification branch from e38d8a4 to f3f5100 Compare August 8, 2022 13:12
Copy link
Contributor

@ola13 ola13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@lvwerra lvwerra merged commit ecffa33 into huggingface:main Aug 15, 2022
mathemakitten pushed a commit that referenced this pull request Aug 15, 2022
* added support for two columns

* style

* add doc utils

* style

* style

* feedbacks

* feedbacks

* Update src/evaluate/evaluator/text_classification.py

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* column check

* remove duplicate code

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
mathemakitten pushed a commit that referenced this pull request Sep 23, 2022
* added support for two columns

* style

* add doc utils

* style

* style

* feedbacks

* feedbacks

* Update src/evaluate/evaluator/text_classification.py

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* column check

* remove duplicate code

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support multiple input columns for TextClassificationEvaluator, regression subtasks
4 participants