Add a fingerprint for each EvaluationModule #206

Merged on Jul 29, 2022 (4 commits)

Conversation

mathemakitten
Contributor

To support #126, we need to fingerprint whichever EvaluationModule (metric, measurement, etc.) is in use, so that results can be reproduced later.

This extracts the already-computed hash from each EvaluationModule and exposes it on the evaluation_cls as module._fingerprint, similar to ds._fingerprint in datasets.

Test via

import evaluate

module = evaluate.load("lvwerra/element_count", module_type="measurement")
print(f"Module fingerprint: {module._fingerprint}")

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 26, 2022

The documentation is not available anymore as the PR was closed or merged.

@lvwerra lvwerra requested a review from lhoestq July 28, 2022 09:31
Member

@lvwerra lvwerra left a comment

LGTM! Also interested to hear what @lhoestq thinks. We plan to use this to cache evaluator computations.

Please hold off on merging; I want to do a minor release first.
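One way the caching idea mentioned above could look, as a rough sketch rather than the evaluator's actual design: results are stored under a key that combines the module's identifier with an identifier for the input data, so a cached result is reused only when both match. `ResultCache` and the string identifiers below are hypothetical and not part of evaluate.

```python
from typing import Callable, Dict, Tuple


class ResultCache:
    """Hypothetical in-memory cache keyed by (module id, data id)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], dict] = {}
        self.misses = 0

    def get_or_compute(
        self, module_id: str, data_id: str, compute: Callable[[], dict]
    ) -> dict:
        key = (module_id, data_id)
        if key not in self._store:
            # Only run the (possibly expensive) evaluation on a cache miss.
            self.misses += 1
            self._store[key] = compute()
        return self._store[key]


cache = ResultCache()
# Placeholder ids; in practice these would be the module and data identifiers.
first = cache.get_or_compute("module-abc", "data-123", lambda: {"accuracy": 0.5})
second = cache.get_or_compute("module-abc", "data-123", lambda: {"accuracy": 0.0})
print(first == second, cache.misses)
```

The second call never runs its `compute` callable because the key is already present, which is the behavior a stable module identifier is meant to enable.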

Member

@lhoestq lhoestq left a comment

In datasets, we use fingerprint to identify data, while we use hash to identify dataset scripts. For example, a DatasetBuilder has a hash that identifies the code of the dataset script it is going to run.

Not a strong opinion, but I think you can also name it hash for consistency.
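To make the distinction concrete, here is a minimal, library-independent sketch. It is not how datasets or evaluate compute these values; `script_hash` and `data_fingerprint` are hypothetical helpers built on `hashlib`, meant only to show that one value tracks code and the other tracks data.

```python
import hashlib
import json


def script_hash(source_code: str) -> str:
    """Hypothetical helper: identify a script by hashing its source text."""
    return hashlib.sha256(source_code.encode("utf-8")).hexdigest()


def data_fingerprint(rows: list) -> str:
    """Hypothetical helper: identify data by hashing a canonical serialization."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


script_v1 = "def compute(x):\n    return len(x)\n"
script_v2 = "def compute(x):\n    return len(set(x))\n"
rows_a = [{"text": "a"}, {"text": "b"}]
rows_b = [{"text": "a"}, {"text": "c"}]

# Changing the code changes the hash; changing the data changes the fingerprint.
print(script_hash(script_v1) != script_hash(script_v2))
print(data_fingerprint(rows_a) != data_fingerprint(rows_b))
```

Under this reading, an EvaluationModule is closer to a script than to a dataset, which is consistent with the suggestion to call its identifier a hash.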

@lvwerra
Member

lvwerra commented Jul 28, 2022

Sounds good to me - also have no strong opinion :)

@mathemakitten
Contributor Author

Renamed, thanks for the clarification on hash vs. fingerprint!

@lvwerra lvwerra merged commit 9a10e58 into main Jul 29, 2022
@lvwerra lvwerra deleted the hn-fingerprint-evalmodule branch July 29, 2022 09:30
mathemakitten added a commit that referenced this pull request Aug 3, 2022
* Add fingerprint for Hub modules

* Rename evaluation module fingerprint to _hash

* fix typo

Co-authored-by: helen <helen@huggingface.co>
4 participants