🔖 Feature description
We already have base_model, tags, and datasets added automatically as metadata on the model card. A few more things are supported:
✔️ Solution
We should include eval metrics results, as in the following example:
# Optional. Add this if you want to encode your eval results in a structured way.
model-index:
- name: {model_id}
  results:
  - task:
      type: {task_type} # Required. Example: automatic-speech-recognition
      name: {task_name} # Optional. Example: Speech Recognition
    dataset:
      type: {dataset_type} # Required. Example: common_voice. Use dataset id from https://hf.co/datasets
      name: {dataset_name} # Required. A pretty name for the dataset. Example: Common Voice (French)
      config: {dataset_config} # Optional. The name of the dataset configuration used in `load_dataset()`. Example: fr in `load_dataset("common_voice", "fr")`. See the `datasets` docs for more info: https://huggingface.co/docs/datasets/package_reference/loading_methods#datasets.load_dataset.name
      split: {dataset_split} # Optional. Example: test
      revision: {dataset_revision} # Optional. Example: 5503434ddd753f426f4b38109466949a1217c2bb
      args:
        {arg_0}: {value_0} # Optional. Additional arguments to `load_dataset()`. Example for wikipedia: language: en
        {arg_1}: {value_1} # Optional. Example for wikipedia: date: 20220301
    metrics:
      - type: {metric_type} # Required. Example: wer. Use metric id from https://hf.co/metrics
        value: {metric_value} # Required. Example: 20.90
        name: {metric_name} # Optional. Example: Test WER
        config: {metric_config} # Optional. The name of the metric configuration used in `load_metric()`. Example: bleurt-large-512 in `load_metric("bleurt", "bleurt-large-512")`. See the `datasets` docs for more info: https://huggingface.co/docs/datasets/v2.1.0/en/loading#load-configurations
        args:
          {arg_0}: {value_0} # Optional. The arguments passed during `Metric.compute()`. Example for `bleu`: max_order: 4
        verifyToken: {verify_token} # Optional. If present, this is a signature that can be used to prove that evaluation was generated by Hugging Face (vs. self-reported).
    source: # Optional. The source for this result.
      name: {source_name} # Optional. The name of the source. Example: Open LLM Leaderboard.
      url: {source_url} # Required if source is provided. A link to the source. Example: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard.
---
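For reference, one possible way to generate such a block is with `huggingface_hub`'s `ModelCardData` / `EvalResult` helpers rather than templating the YAML by hand. The sketch below is only an illustration of the mapping; the model name, dataset, and metric values are placeholders, not part of this request:

```python
from huggingface_hub import EvalResult, ModelCardData

# Placeholder values; in practice these would come from the training run.
card_data = ModelCardData(
    model_name="my-finetuned-model",
    base_model="facebook/wav2vec2-base",   # already emitted automatically today
    tags=["generated_from_trainer"],
    datasets=["common_voice"],
    eval_results=[
        EvalResult(
            task_type="automatic-speech-recognition",  # -> task.type
            dataset_type="common_voice",               # -> dataset.type
            dataset_name="Common Voice (French)",      # -> dataset.name
            dataset_config="fr",
            dataset_split="test",
            metric_type="wer",                         # -> metrics[].type
            metric_value=20.90,
            metric_name="Test WER",
        )
    ],
)

# Serializes to the metadata YAML shown above, model-index block included.
print(card_data.to_yaml())
```

From there, the generated metadata could be written into the card's front matter (or pushed with `huggingface_hub.metadata_update`) alongside the fields that are already emitted.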
❓ Alternatives
No response
📝 Additional Context
No response
Acknowledgements
My issue title is concise, descriptive, and in title casing.
I have searched the existing issues to make sure this feature has not been requested yet.
I have provided enough information for the maintainers to understand and evaluate this request.