
Update comet #443

Merged: 2 commits into huggingface:main on Jun 24, 2023

Conversation

ricardorei (Contributor)

The COMET metric was recently updated to v2.0, and the predict interface now returns a single object instead of two floats.

This pull request adds a simple check of the installed package version and adjusts the behaviour accordingly.

We also updated the metric README, which was slightly outdated. We recently released improved metrics that we developed for the WMT22 Metrics shared task. For versions above 2.0 the default model is wmt22-comet-da instead of wmt20-comet-da. This new model performs better across language pairs and domains while being more interpretable. You can check the blogpost here.
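The version-gated behaviour described above could be sketched roughly as follows. This is an illustrative sketch, not the actual diff of this pull request: the helper names (`comet_major_version`, `extract_scores`) are invented for illustration, and the assumed package name is `unbabel-comet`.

```python
# Sketch of branching on the installed COMET version (illustrative only;
# helper names are not from this PR's actual diff).
from importlib.metadata import version, PackageNotFoundError


def comet_major_version(default="1.1.3"):
    """Return the major version of the installed unbabel-comet package."""
    try:
        ver = version("unbabel-comet")
    except PackageNotFoundError:
        ver = default  # fall back to a pre-2.0 version string
    return int(ver.split(".")[0])


def extract_scores(prediction, major):
    """Normalize model output across COMET versions (assumed shapes).

    v2.x: predict() returns a single object exposing per-segment scores
          and a system-level score.
    v1.x: predict() returns two values (segment scores, system score).
    """
    if major >= 2:
        return prediction.scores, prediction.system_score
    seg_scores, sys_score = prediction
    return seg_scores, sys_score
```

A caller would then compute once and unpack with `extract_scores(output, comet_major_version())`, keeping the rest of the metric code version-agnostic.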

@manueldeprada (Contributor)

I am not a maintainer, but I have just tested this, and it works as advertised.

Thanks, @ricardorei, I hope this gets merged soon!

@BramVanroy (Contributor)

@lvwerra Friendly ping. Can this be merged (and maybe even with a pip upgrade)? Thanks!

@joao-alves97 (commented May 3, 2023)

This is not working as expected. I ran the example:

import evaluate  # assuming the Hugging Face evaluate library

comet_metric = evaluate.load("comet")
source = ["Dem Feuer konnte Einhalt geboten werden", "Schulen und Kindergärten wurden eröffnet."]
hypothesis = ["The fire could be stopped", "Schools and kindergartens were open"]
reference = ["They were able to control the fire.", "Schools and kindergartens opened"]
comet_score = comet_metric.compute(predictions=hypothesis, references=reference, sources=source)

I expected to see a COMET score, but the output was:

>>> comet_score 
{'mean_score': 'system_score', 'scores': 'scores'}

This output clearly differs from what I expected. @ricardorei can you help? Thanks

@BramVanroy (Contributor)

> This is not working as expected. I ran the example [...] I expected to see a COMET score, but the output was {'mean_score': 'system_score', 'scores': 'scores'}. This output clearly differs from what I expected. @ricardorei can you help? Thanks

That is the expected output: mean_score is the COMET score (the mean of all per-sentence scores).
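The relationship between the two keys can be sketched as below. The score values are made-up illustrative numbers, not real model output; only the dictionary shape (`mean_score`, `scores`) comes from the thread.

```python
# Minimal sketch of consuming the dict returned by comet_metric.compute().
# The numbers are invented for illustration, not real COMET output.
from statistics import mean

comet_score = {
    "mean_score": 0.8155,      # corpus-level COMET score
    "scores": [0.839, 0.792],  # one score per (source, hypothesis, reference) triple
}

corpus_score = comet_score["mean_score"]  # use this as "the" COMET score
per_sentence = comet_score["scores"]      # per-segment scores for analysis

# mean_score is just the mean of the per-sentence scores.
assert abs(corpus_score - mean(per_sentence)) < 1e-3
```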

@ricardorei (Contributor, Author)

@BramVanroy @joao-alves97 which comet version are you using? I was not able to replicate what you are referring to.

I tested unbabel-comet==1.1.3, 2.0.0, and 2.0.1.

@BramVanroy (Contributor)

@ricardorei I think what @joao-alves97 is saying is that the output is a dictionary with the keys mean_score and scores, but they expected the output to be a float. I clarified that that is not how it works, and that mean_score is the aggregated corpus-level score they should use.

@lvwerra lvwerra merged commit af3c305 into huggingface:main Jun 24, 2023