Automatically choose dataset split if none provided #232

Merged: 25 commits merged into main from hn-dataset-as-string on Aug 26, 2022

Conversation

mathemakitten (Contributor):

Previously, evaluator.compute(..., data='imdb', ...) would fail because the data was returned as an object of type datasets.DatasetDict. This PR automatically detects a split if none is given (i.e. when the user passes the dataset in by name and expects the Evaluator to load it, instead of preloading it themselves).

Closes #226.
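For context, a rough sketch of the behaviour being added, assuming a helper built on datasets.get_dataset_split_names (the commit history below mentions switching to it); the helper name and preference order here are illustrative, not the merged code:

from typing import Optional

from datasets import get_dataset_split_names, load_dataset


def load_data_with_default_split(data: str, split: Optional[str] = None):
    """Load a dataset by name and pick a sensible split if none is given."""
    if split is None:
        available = get_dataset_split_names(data)
        # Prefer evaluation-style splits over the train split (order is an assumption).
        for candidate in ("test", "validation", "train"):
            if candidate in available:
                split = candidate
                break
        else:
            raise ValueError(f"No usable split found among {available}")
    return load_dataset(data, split=split)


# e.g. load_data_with_default_split("imdb") returns the "test" split as a
# datasets.Dataset instead of a DatasetDict, so downstream column access works.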

HuggingFaceDocBuilderDev commented Aug 4, 2022:

The documentation is not available anymore as the PR was closed or merged.

@lvwerra (Member) left a comment:

This PR is in pretty good shape already!

I wonder if we should set up a dedicated load_dataset_from_string method that is called before prepare_data if a string is provided. Note that the question-answering and token classification evaluators have overridden compute and prepare_data, so they would also need to be adapted.

Finally, it would be great to add a few tests to make sure this works and doesn't break in the future :)

mathemakitten (Contributor, Author):

Hi @ola13, I actually put the original call to the parent class back in and returned the Dataset object instead, since I noticed that both the QA and token classification evaluators use columns from the dataset object in .compute() later. Let me know what you think!

@ola13 (Contributor) left a comment:

Looks good to me, although I'd wait for @lvwerra to weigh in before merging :)

@lvwerra (Member) left a comment:

Thanks @mathemakitten for working on this! This looks good in general; I just have one concern about the return value of prepare_data. It feels a bit odd to return data although we don't always use it. I suggest decoupling the loading from the preparing a bit:

data = self.load_data(data, ...)
metric_inputs, pipeline_inputs = self.prepare_inputs(data, ...)

That way each method is more specialized and we can avoid the x, y, _ = and _, _, z = statements.

I think this will also make overriding prepare_inputs easier, as one can always assume it receives a dataset. What do you think?
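For illustration, a minimal sketch of that decoupled flow; the class, method bodies, and column defaults below are illustrative, not the evaluator's actual implementation:

from datasets import Dataset, load_dataset


class EvaluatorSketch:
    def load_data(self, data, split=None):
        # Accept either a dataset name or an already-loaded Dataset.
        if isinstance(data, str):
            return load_dataset(data, split=split or "test")
        if isinstance(data, Dataset):
            return data
        raise TypeError("data must be a dataset name (str) or a datasets.Dataset")

    def prepare_inputs(self, data, input_column="text", label_column="label"):
        # Always receives a Dataset here, so column access is safe.
        metric_inputs = {"references": data[label_column]}
        pipeline_inputs = data[input_column]
        return metric_inputs, pipeline_inputs

    def compute(self, data, split=None, **kwargs):
        data = self.load_data(data, split=split)
        metric_inputs, pipeline_inputs = self.prepare_inputs(data)
        ...  # run the pipeline, then compute the metric on metric_inputs

Subclasses such as the QA or token-classification evaluators would then only override prepare_inputs and could always assume they receive a Dataset.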

mathemakitten force-pushed the hn-dataset-as-string branch 2 times, most recently from 110f2a3 to 7bfa6d0 (August 15, 2022 18:56)
@lvwerra (Member) left a comment:
Awesome, generally looks good to me! Mostly left comments regarding the tests. What's left to do:

  • refactor tests
  • rebase
  • make sure CI runs

Then we can merge :)

@lhoestq (Member) left a comment:

Cool! A few comments:

e = evaluator("token-classification")

# Test passing in dataset object
data = load_dataset("conll2003", split="validation[:2]")
(Member):

If you are using load_dataset in your CI, could you set the HF_UPDATE_DOWNLOAD_COUNTS=false environment variable in your CI? This way it won't count towards the download stats on the Hub.

@lvwerra (Member), Aug 19, 2022:

We have this in the conftest already:

import pytest

@pytest.fixture(autouse=True)
def set_update_download_counts_to_false(monkeypatch):
    # don't take tests into account when counting downloads
    monkeypatch.setattr("evaluate.config.HF_UPDATE_DOWNLOAD_COUNTS", False)

Do you think we need an extra one in the CI?

Edit: This should actually be datasets.config and not evaluate.config, right?

mathemakitten (Contributor, Author):

Looking at the other code in config.py just slightly above this, I think evaluate.config is correct?

(Member):

You have to set both ideally ;)

mathemakitten (Contributor, Author):

Done :)
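For reference, a conftest fixture that patches both flags might look like this; it assumes both evaluate.config and datasets.config expose HF_UPDATE_DOWNLOAD_COUNTS, as discussed above:

import pytest


@pytest.fixture(autouse=True)
def set_update_download_counts_to_false(monkeypatch):
    # Don't count test runs towards the download stats on the Hub.
    monkeypatch.setattr("evaluate.config.HF_UPDATE_DOWNLOAD_COUNTS", False)
    monkeypatch.setattr("datasets.config.HF_UPDATE_DOWNLOAD_COUNTS", False)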

@lvwerra (Member) left a comment:

Thanks for moving the tests! A few things we could add to the tests:

  • Set up a few small dummy datasets with different splits (only train, train/test, train/test/valid) and check that the right one is selected by the get_dataset_split method (a sketch of this kind of test follows this list).
  • In addition to just checking that there is no error in test_data_loading, we could also check that the data we get is actually the right one. I would make the test and train samples differ and then check that e.g. the first element is the one we expect.
  • If you don't mind, could you move the datasets from evaluate-ci to evaluate? There's nothing in that org except the links to the module orgs and the datasets with the images for the docs, so we don't have too many evaluate orgs.
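As a sketch of the first point, a parametrized test over a split-preference helper could look like this; choose_split and its preference order are assumptions, not necessarily the merged get_dataset_split logic:

import pytest


def choose_split(available_splits):
    # Hypothetical helper: prefer evaluation-style splits over the train split.
    for candidate in ("test", "validation", "train"):
        if candidate in available_splits:
            return candidate
    raise ValueError(f"No usable split in {available_splits}")


@pytest.mark.parametrize(
    "available_splits, expected",
    [
        (["train"], "train"),
        (["train", "test"], "test"),
        (["train", "test", "validation"], "test"),
    ],
)
def test_choose_split(available_splits, expected):
    assert choose_split(available_splits) == expected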

mathemakitten (Contributor, Author) commented Aug 22, 2022:

Getting the strangest error in CI here: TestQuestionAnsweringEvaluator.test_model_init is failing in pytest with AssertionError: 33.333333333333336 != 0, but running the test suite in debug mode clearly shows the correct test results, and self.data shows the correct two examples. It almost feels like an environment/configuration error... digging deeper.

Edit: resolved by #272.

[Screenshot: Screen Shot 2022-08-22 at 3:29:55 PM]

mathemakitten force-pushed the hn-dataset-as-string branch 3 times, most recently from 7ea0958 to cfa98fc (August 24, 2022 17:22)
@lvwerra (Member) left a comment:

Thanks for pushing this over the finish line! Looks great, just two small comments.

mathemakitten merged commit 6239112 into main on Aug 26, 2022.
mathemakitten deleted the hn-dataset-as-string branch on August 26, 2022 at 18:37.
mathemakitten added a commit that referenced this pull request Sep 23, 2022
* Handle when data is passed as a string without a split selected

* Cleanup

* Add data_split kwarg for Evaluator.compute()

* Change to use get_dataset_split_names instead of pinging the datasets server

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Cleanup

* Docstring and cleanup

* Incorporate changes into QA evaluator

* Move SQuAD checking logic to prepare_data()

* Token classification cases run

* Remove debugging code

* Refactor to move data object outside

* Add tests + misc cleanup

* Preferred order of splits

* Move dev split before train split

* Decouple data loading and preparation

* Formatting

* Remove data_split and cleanup tests

* Fix docs

* Cleanup and refactor tests

* Datasets config attribute

* Update tests to only use first example from dataset and refactor checks for load_data

Co-authored-by: mathemakitten <helen@huggingface.co>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
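With the change merged, passing the dataset name directly to compute should work end to end. A hedged usage sketch follows; the model name and the argument names other than data are recalled from the evaluate docs and should be treated as assumptions:

from evaluate import evaluator

task_evaluator = evaluator("text-classification")
results = task_evaluator.compute(
    model_or_pipeline="distilbert-base-uncased-finetuned-sst-2-english",
    data="imdb",  # dataset passed as a string; a split is chosen automatically
    metric="accuracy",
)
print(results)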
Linked issue: Issue with evaluator when the dataset is passed as a string (#226).
5 participants.