
Rouge score not backward compatible as recall and precision are no longer returned #260

Closed
AndreaSottana opened this issue Aug 18, 2022 · 3 comments


AndreaSottana commented Aug 18, 2022

Hello

I have seen that in PR #158 you removed recall and precision from the ROUGE score calculation, which now only returns the F1 score.
May I ask why this decision was made, and why there doesn't seem to be an option to keep recall and precision in the returned output?

This is also a breaking change, in the sense that code written for evaluate==0.1.2 will no longer work with evaluate==0.2.2.
Shouldn't a backward-incompatible change require a major version bump according to https://semver.org ?
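To make the breakage concrete, here is a minimal sketch; the pre-0.2 output shape shown in the comments is an assumption based on the aggregate objects from the rouge_score package that the old metric returned:

```python
import evaluate

rouge = evaluate.load("rouge")
results = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["the cat sat on the mat"],
)

# evaluate==0.2.2 returns a plain dict of F1 floats only, e.g.
# {'rouge1': 1.0, 'rouge2': 1.0, 'rougeL': 1.0, 'rougeLsum': 1.0}
print(results["rouge1"])

# Under evaluate==0.1.2 each entry was (as far as I recall) an aggregate
# rouge_score object, so code like the following used to work and now fails:
# print(results["rouge1"].mid.precision)
# print(results["rouge1"].mid.recall)
```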

Thanks for the clarification

lvwerra (Member) commented Aug 18, 2022

Hi @AndreaSottana

Yes, this was a breaking change - we had planned to make it before the initial release but it slipped through. There are a number of advantages to moving from the RougeScore object that was previously returned to a pure Python dict. If you find recall and precision useful, we could add an option (e.g. detailed=True) to the compute call to return those as well.
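Purely as a hypothetical sketch of that idea (no such detailed option exists in evaluate today; the nested keys and the values below are illustrative assumptions only):

```python
# Hypothetical output shape for a detailed=True option -- not an existing
# evaluate API; the nested keys and numbers are only an illustration.
detailed_results = {
    "rouge1": {"precision": 0.91, "recall": 0.84, "fmeasure": 0.87},
    "rouge2": {"precision": 0.72, "recall": 0.66, "fmeasure": 0.69},
}
print(detailed_results["rouge1"]["recall"])
```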

We haven't had a full major release yet, so there may be some breaking changes here and there, but none are planned for the core of the metrics, and we really want to avoid them.

Sorry for the inconvenience!

AndreaSottana (Author) commented:

Thanks @lvwerra for your quick reply.

I definitely agree that a pure Python dictionary is much better; however, I believe it would be possible to include recall and precision in a Python dict without reverting to the old RougeScore object.
Most summarization papers seem to report ROUGE scores based on F1, but some also use recall (for example, for content selection), so I believe it would be valuable for researchers to have an option to see recall and precision (perhaps still in a pure Python dict).
I'm happy to use the older version now that I've realised the issue, but if there is more demand for this detailed=True feature, it may be worth considering for the future.

Thanks again
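As a practical workaround in the meantime, the underlying rouge_score package (which evaluate's ROUGE metric wraps) can be called directly to obtain per-example precision and recall; a minimal sketch, assuming rouge_score is installed:

```python
# Workaround sketch: use the rouge_score package directly to get
# precision/recall/F1, independent of what evaluate returns.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

# score(target, prediction) returns a dict mapping each ROUGE variant
# to a Score namedtuple with precision, recall and fmeasure fields.
scores = scorer.score(
    "the cat sat on the mat",        # reference
    "a cat was sitting on the mat",  # prediction
)

for name, score in scores.items():
    print(name, score.precision, score.recall, score.fmeasure)
```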

lvwerra closed this as completed Dec 6, 2022
hanane-djeddal commented:

I was wondering whether this thread was taken into consideration, because with evaluate the ROUGE metric still only reports a single score per variant and not precision/recall/F-score.

Thanks!
