CharCut: another character-based MT evaluation metric #290

Merged
merged 18 commits into from
Dec 8, 2022

BramVanroy
Contributor

Similar to CharacTER, CharCut is a character-based MT evaluation metric. It was first proposed in "CHARCUT: Human-Targeted Character-Based MT Evaluation with Loose Differences".

This implementation uses the repackaged version of the original code for usability reasons.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 11, 2022

The documentation is not available anymore as the PR was closed or merged.

@BramVanroy
Contributor Author

@lvwerra I fixed the doctest, and also updated the underlying charcut library so that we do not get annoying outputs printed to stdout anymore.

Member

@lvwerra lvwerra left a comment

Just a few minor things, then it's good to go 🚀

metrics/charcut_mt/.gitattributes (outdated, resolved)
metrics/charcut_mt/requirements.txt (outdated, resolved)
setup.py (outdated, resolved)
Bram Vanroy and others added 4 commits December 6, 2022 12:31
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
@lvwerra lvwerra mentioned this pull request Dec 6, 2022
@lvwerra
Member

lvwerra commented Dec 7, 2022

Could you also merge main into your branch again? The new CI is merged :)

@BramVanroy
Contributor Author

@lvwerra I looked into converting this to a multi-reference format, but I am not sure how to handle it. CharCut is calculated at the document level, so I do not think it is feasible to add multiple references here.
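
To illustrate the document-level point, here is a minimal sketch. It is not the CharCut algorithm: `char_edit_cost` is a hypothetical stand-in built on `difflib.SequenceMatcher`, and the function names are made up for this example. It only shows how a score pooled over the whole document differs from an average of per-segment scores, which is why picking a "best" reference per segment is not straightforward.

```python
from difflib import SequenceMatcher


def char_edit_cost(hyp: str, ref: str) -> int:
    """Stand-in cost (NOT CharCut): characters on either side left unmatched."""
    matcher = SequenceMatcher(None, hyp, ref, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return (len(hyp) - matched) + (len(ref) - matched)


def document_level_score(hyps: list[str], refs: list[str]) -> float:
    """Pool costs and lengths over all segments, then divide once."""
    total_cost = sum(char_edit_cost(h, r) for h, r in zip(hyps, refs))
    total_len = sum(len(h) + len(r) for h, r in zip(hyps, refs))
    return total_cost / total_len if total_len else 0.0


def mean_segment_score(hyps: list[str], refs: list[str]) -> float:
    """Score each segment on its own, then average the per-segment scores."""
    scores = [document_level_score([h], [r]) for h, r in zip(hyps, refs)]
    return sum(scores) / len(scores) if scores else 0.0


hyps = ["ab", "abcdefgh"]
refs = ["xy", "abcdefgh"]
print(document_level_score(hyps, refs))  # short bad segment weighs little
print(mean_segment_score(hyps, refs))    # every segment weighs the same
```

With a pooled ratio, swapping in a different reference for one segment changes both the numerator and the denominator of the final score, so a per-segment "take the best reference" rule does not carry over directly.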

@lvwerra
Member

lvwerra commented Dec 8, 2022

Ok, then let's leave it as is.

Member

@lvwerra lvwerra left a comment

Looks good, you could also delete the tests here.

@BramVanroy
Contributor Author

Done!

@lvwerra lvwerra merged commit 83129c0 into huggingface:main Dec 8, 2022
@BramVanroy BramVanroy deleted the charcut branch December 8, 2022 15:49
3 participants