attempting to let meteor handle multiple references per prediction #164

sashavor · 2022-06-28T15:56:47Z

I'm not sure I'm doing the word_tokenize() correctly in line 27 -- @lvwerra can you please help?

HuggingFaceDocBuilderDev · 2022-06-28T16:02:03Z

The documentation is not available anymore as the PR was closed or merged.

metrics/meteor/meteor.py

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

sashavor · 2022-06-28T19:30:49Z

Thank you @lvwerra !
One thing I can't seem to replicate is the example from NLTK:

    >>> reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures', 'that', 'the', 'military', 'will', 'forever', 'heed', 'Party', 'commands']
    >>> reference2 = ['It', 'is', 'the', 'guiding', 'principle', 'which', 'guarantees', 'the', 'military', 'forces', 'always', 'being', 'under', 'the', 'command', 'of', 'the', 'Party']
    >>> reference3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the', 'army', 'always', 'to', 'heed', 'the', 'directions', 'of', 'the', 'party']

    >>> round(meteor_score([reference1, reference2, reference3], hypothesis1),4)
    0.7398

When I do:

>>> hypothesis1 = ['It is a guide to action which ensures that the military always obeys the commands of the party']
>>> references = [['It is a guide to action that ensures that the military will forever heed Party commands', 'It is the guiding principle which guarantees the military forces always being under the command of the Party', 'It is the practical guide for the army always to heed the directions of the party']]
>>> results = meteor.compute(predictions=hypothesis1, references=references)
>>> results
{'meteor': 0.6944444444444445}

The alpha, beta and gamma are all the same as NLTK.

Do you have any ideas? Do you think it's the tokenizer?

lvwerra · 2022-06-29T08:07:42Z

If you execute the first NLTK example yourself, do you get the same results? Or are they also different?

sashavor · 2022-06-29T15:12:00Z

Nope 😕

>>> hypothesis1 = ['It is a guide to action which ensures that the military always obeys the commands of the party']
>>> reference1 = ['It is a guide to action that ensures that the military will forever heed Party commands']
>>> results = meteor.compute(predictions=hypothesis1, references=reference1)
>>> results
{'meteor': 0.6944444444444445}

Whereas the result given by NLTK is 0.7398

This is weird, right, since we're using it under the hood?

lvwerra · 2022-06-29T15:39:32Z

I just checked the NLTK example. It seems to be a version issue on their side: they switched the expected input from sentences to word tokens between 3.6 and 3.7, which is where the change in score must have happened:

With nltk==3.6.0 (note that we need to join the tokens into sentences):

hypothesis1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which', 'ensures', 'that', 'the', 'military', 'always', 'obeys', 'the', 'commands', 'of', 'the', 'party']

reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures', 'that', 'the', 'military', 'will', 'forever', 'heed', 'Party', 'commands']
reference2 = ['It', 'is', 'the', 'guiding', 'principle', 'which', 'guarantees', 'the', 'military', 'forces', 'always', 'being', 'under', 'the', 'command', 'of', 'the', 'Party']
reference3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the', 'army', 'always', 'to', 'heed', 'the', 'directions', 'of', 'the', 'party']

round(meteor_score([" ".join(reference1), " ".join(reference2), " ".join(reference3)], " ".join(hypothesis1)),4)

>>> 0.7298

With nltk==3.7.0:

hypothesis1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which', 'ensures', 'that', 'the', 'military', 'always', 'obeys', 'the', 'commands', 'of', 'the', 'party']

reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures', 'that', 'the', 'military', 'will', 'forever', 'heed', 'Party', 'commands']
reference2 = ['It', 'is', 'the', 'guiding', 'principle', 'which', 'guarantees', 'the', 'military', 'forces', 'always', 'being', 'under', 'the', 'command', 'of', 'the', 'Party']
reference3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the', 'army', 'always', 'to', 'heed', 'the', 'directions', 'of', 'the', 'party']

round(meteor_score([reference1, reference2, reference3], hypothesis1),4)

>>> 0.6944

So I would not worry about it in this PR but maybe open an issue on NLTK.
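
A minimal sketch of the input difference described above, assuming the switch happens at 3.7 as this thread suggests (the exact release is not confirmed here). `parse_version` and `prepare_meteor_input` are hypothetical helpers, not part of NLTK or this PR, and `str.split` stands in for a real tokenizer such as `word_tokenize`:

```python
def parse_version(version: str) -> tuple:
    """Turn '3.6.0' into (3, 6, 0) for simple comparisons (numeric parts only)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def prepare_meteor_input(text, nltk_version: str):
    """Return `text` in the form the given NLTK version is assumed to expect.

    Assumption taken from the thread: before 3.7 the METEOR functions take
    whole sentences (str); from 3.7 on they take lists of word tokens.
    """
    wants_tokens = parse_version(nltk_version) >= (3, 7)
    if wants_tokens:
        # str.split is only a stand-in for a real tokenizer.
        return text.split() if isinstance(text, str) else list(text)
    # Older API: join tokens back into a single sentence if needed.
    return text if isinstance(text, str) else " ".join(text)


sentence = "It is a guide to action"
print(prepare_meteor_input(sentence, "3.7"))            # list of tokens
print(prepare_meteor_input(sentence.split(), "3.6.0"))  # single string
```

Normalizing the input shape in one place like this keeps the scoring call itself identical across versions, which makes score differences easier to attribute to the library rather than to the caller.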

sashavor · 2022-06-30T19:35:28Z

Ok great! I'll open a PR with NLTK then 😄

lvwerra

Thanks a lot for working on this! A few minor comments. In addition, I think we should also document this in the README, right?

metrics/meteor/meteor.py

lvwerra · 2022-07-01T13:27:14Z

metrics/meteor/meteor.py

-                meteor_score.single_meteor_score(ref, pred, alpha=alpha, beta=beta, gamma=gamma)
-                for ref, pred in zip(references, predictions)
-            ]
+            if any(isinstance(el, list) for el in references):


Same as above. Maybe you can also just check once at the beginning:

multiple_refs = isinstance(references[0], list)
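
A rough sketch of how that single up-front check could drive the scoring loop. This is not the code merged in this PR: `score_pair` is a hypothetical stand-in (plain unigram recall) for `meteor_score.single_meteor_score`, `compute_scores` is an illustrative name, and taking the maximum over references is meant to mirror how NLTK's `meteor_score` treats multiple references.

```python
def score_pair(reference: str, prediction: str) -> float:
    """Hypothetical stand-in scorer: fraction of reference words found in the prediction."""
    ref_tokens = reference.lower().split()
    pred_tokens = set(prediction.lower().split())
    if not ref_tokens:
        return 0.0
    return sum(tok in pred_tokens for tok in ref_tokens) / len(ref_tokens)


def compute_scores(predictions, references):
    """Score each prediction against one reference or a list of references."""
    # Decide once, from the first element, which input shape we were given.
    multiple_refs = isinstance(references[0], list)
    if multiple_refs:
        # Keep the best score over all references for each prediction.
        return [
            max(score_pair(ref, pred) for ref in refs)
            for refs, pred in zip(references, predictions)
        ]
    return [score_pair(ref, pred) for ref, pred in zip(references, predictions)]


predictions = ["the cat sat"]
print(compute_scores(predictions, ["the cat sat down"]))            # one reference per prediction
print(compute_scores(predictions, [["a dog ran", "the cat sat"]]))  # several references per prediction
```

Checking only `references[0]` assumes the input is homogeneous (all strings or all lists); mixed input would need the per-element check from the diff above.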

I made the change (I think) -- is this what you had in mind?

Also updated the README!

adding comment about NLTK version

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

checking first element

Updating README to reflect changes

lvwerra

Just a minor comment about reusing multiple_refs. LGTM 🚀

metrics/meteor/meteor.py

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

attempting to let meteor handle multiple references per prediction

90595b1

sashavor requested a review from lvwerra June 28, 2022 15:56

running make

987995c

lvwerra reviewed Jun 28, 2022

Update metrics/meteor/meteor.py

47f8f91

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

sashavor marked this pull request as ready for review June 30, 2022 19:35

lvwerra reviewed Jul 1, 2022

sashavor requested a review from lvwerra July 4, 2022 17:14

Sasha Luccioni and others added 6 commits July 4, 2022 14:07

Update meteor.py

d408c83

adding comment about NLTK version

Update metrics/meteor/meteor.py

f2cf3c1

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

Update meteor.py

a56935f

checking first element

Update README.md

7d7555b

Updating README to reflect changes

running make

e9e795c

Merge branch 'main' into meteor-modification

c4f7462

lvwerra approved these changes Jul 6, 2022

Sasha Luccioni and others added 3 commits July 6, 2022 12:02

Update metrics/meteor/meteor.py

f2d7dad

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

Update metrics/meteor/meteor.py

1690a59

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

Update metrics/meteor/meteor.py

87ffe68

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

sashavor merged commit f62ad7c into main Jul 6, 2022

sashavor deleted the meteor-modification branch July 6, 2022 16:13

lvwerra mentioned this pull request Jul 7, 2022

Rouge and Meteor for multiple references #118

Closed
