Add Wilcoxon's signed rank test #237

Merged
merged 3 commits into from
Aug 11, 2022

douwekiela
Contributor

Figured it'd be good to add a few more comparisons

@douwekiela douwekiela requested a review from lvwerra August 9, 2022 06:27

```python
import evaluate

wilcoxon = evaluate.load("wilcoxon")
results = wilcoxon.compute(predictions1=[-7, 123, 43, 4, 5], predictions2=[1337, -9, 1, 2, 3])
```
Contributor Author

Hrmm should probably make some of these floats to make clear that those are allowed too

HuggingFaceDocBuilderDev commented Aug 9, 2022

The documentation is not available anymore as the PR was closed or merged.

Member

@lvwerra lvwerra left a comment

Hi @douwekiela, thanks for the super clean PR. Left a comment regarding the feature types related to your own comment.

Comment on lines 64 to 69
```python
features=datasets.Features(
    {
        "predictions1": datasets.Value("int64"),
        "predictions2": datasets.Value("int64"),
    }
),
```
Member

If you want it to work for both int and float, you can pass a list of `datasets.Features` and it then automatically detects which one works. You can have a look at BLEU. Alternatively, I think floats would probably work in both cases anyway, no?

@douwekiela douwekiela merged commit 3cd38e2 into main Aug 11, 2022
@douwekiela douwekiela deleted the wilcoxon branch August 11, 2022 13:01
mathemakitten pushed a commit that referenced this pull request Aug 15, 2022
Add Wilcoxon's signed rank test for comparing model predictions, e.g. for testing whether the difference in BLEU score between two models is significant.
mathemakitten pushed a commit that referenced this pull request Sep 23, 2022
Add Wilcoxon's signed rank test for comparing model predictions, e.g. for testing whether the difference in BLEU score between two models is significant.
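
For readers unfamiliar with the test, here is a rough pure-Python sketch of what a Wilcoxon signed-rank comparison computes on paired scores. It is not the implementation added in this PR: the function names are hypothetical, zero differences are simply dropped, and the two-sided p-value is obtained by enumerating every sign pattern, which is only practical for small samples.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def wilcoxon_signed_rank(scores_a, scores_b):
    """Return (statistic, exact two-sided p-value) for paired scores."""
    # Paired differences; pairs with no difference carry no sign and are dropped.
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    if not diffs:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    statistic = min(w_plus, w_minus)
    total = sum(ranks)
    # Under the null hypothesis every difference is equally likely to be
    # positive or negative, so count the sign patterns at least as extreme.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= statistic:
            extreme += 1
    return statistic, extreme / 2 ** len(ranks)


# The example inputs from the PR description.
statistic, p_value = wilcoxon_signed_rank([-7, 123, 43, 4, 5], [1337, -9, 1, 2, 3])
print(statistic, p_value)  # 5.0 0.5625
```

With only five pairs, and one very large difference opposing four smaller ones, the p-value is far from any usual significance threshold. Library implementations may handle ties and zero differences differently or use a normal approximation, so their p-values need not match this enumeration.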