Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include eval metrics on metadata card #1020

Open
5 tasks done
ittailup opened this issue Dec 29, 2023 · 1 comment
Open
5 tasks done

Include eval metrics on metadata card #1020

ittailup opened this issue Dec 29, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@ittailup
Copy link
Contributor

ittailup commented Dec 29, 2023

⚠️ Please check that this feature request hasn't been suggested before.

  • I searched previous Ideas in Discussions didn't find any similar feature requests.
  • I searched previous Issues didn't find any similar feature requests.

🔖 Feature description

We already have base_model, tags datasets added automatically as metadata on the model card. A few more things are supported:

image

✔️ Solution

We should include eval metrics results as per the example:

# Optional. Add this if you want to encode your eval results in a structured way.
model-index:
- name: {model_id}
  results:
  - task:
      type: {task_type}             # Required. Example: automatic-speech-recognition
      name: {task_name}             # Optional. Example: Speech Recognition
    dataset:
      type: {dataset_type}          # Required. Example: common_voice. Use dataset id from https://hf.co/datasets
      name: {dataset_name}          # Required. A pretty name for the dataset. Example: Common Voice (French)
      config: {dataset_config}      # Optional. The name of the dataset configuration used in `load_dataset()`. Example: fr in `load_dataset("common_voice", "fr")`. See the `datasets` docs for more info: https://huggingface.co/docs/datasets/package_reference/loading_methods#datasets.load_dataset.name
      split: {dataset_split}        # Optional. Example: test
      revision: {dataset_revision}  # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb
      args:
        {arg_0}: {value_0}          # Optional. Additional arguments to `load_dataset()`. Example for wikipedia: language: en
        {arg_1}: {value_1}          # Optional. Example for wikipedia: date: 20220301
    metrics:
      - type: {metric_type}         # Required. Example: wer. Use metric id from https://hf.co/metrics
        value: {metric_value}       # Required. Example: 20.90
        name: {metric_name}         # Optional. Example: Test WER
        config: {metric_config}     # Optional. The name of the metric configuration used in `load_metric()`. Example: bleurt-large-512 in `load_metric("bleurt", "bleurt-large-512")`. See the `datasets` docs for more info: https://huggingface.co/docs/datasets/v2.1.0/en/loading#load-configurations
        args:
          {arg_0}: {value_0}        # Optional. The arguments passed during `Metric.compute()`. Example for `bleu`: max_order: 4
        verifyToken: {verify_token} # Optional. If present, this is a signature that can be used to prove that evaluation was generated by Hugging Face (vs. self-reported).
    source:                         # Optional. The source for this result.
      name: {source_name}           # Optional. The name of the source. Example: Open LLM Leaderboard.
      url: {source_url}             # Required if source is provided. A link to the source. Example: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard.
---

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this feature has not been requested yet.
  • I have provided enough information for the maintainers to understand and evaluate this request.
@ittailup ittailup added the enhancement New feature or request label Dec 29, 2023
@lingchensanwen
Copy link

I'm wondering if there's any updates about this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants