UX improvements for the new benchmark pipeline #12215
Dumping some thoughts about the traceability of using SHA IDs and the naming issues on artifacts and benchmarks.

Backgrounds

E2E Test Artifacts

IREE Benchmark Suites

There are two types of benchmarks: execution and compilation. To run the benchmarks, ...

Issues and Solutions

Traceability in E2E Test Artifacts

Since the build rules and output artifacts only have the SHA256 ID in their names, ... Using the SHA256 ID as the cmake target and file names has the advantage of avoiding ...

Solutions

To make them trackable, first ... The second step is to annotate the names on the build rules and output artifacts.

Searchability and Traceability in IREE Benchmark Suites

The same problem also exists in the IREE benchmark suites, as the benchmark ID ... In addition, we decided not to dump all execution benchmark flagfiles into the ...

However, the format of the config is not really readable, because the serializer uses ID references between entries:

"iree_e2e_model_run_configs:<execution_benchmark_id_1>": {
"module_generation_config": <id to its module generation config>
"run_flags": [...]
},
"iree_e2e_model_run_configs:<execution_benchmark_id_2>": {
"module_generation_config": <id to its module generation config>
"run_flags": [...]
},
"iree_module_generation_configs:<compilation_benchmark_id_3>": {
"imported_model": <id to its imported model>
"compile_flags": [...]
},
"iree_module_generation_configs:<compilation_benchmark_id_4>": {
"imported_model": <id to its imported model>
"compile_flags": [...]
},
"iree_imported_models:<imported_model_id>": {
"model": <id to its model>
},
"models:<model_id>": {
"name": <model name>,
...
},
...

The format requires people to search back and forth with IDs to gather all the information about a single benchmark. In contrast, a nested format would look like:

{
"e2e_model_run_configs": {
"<execution_benchmark_id>": {
"run_flags": [...],
"tags": [...],
"module_generation_config": {
"compile_flags": [...],
"tags": [...],
"imported_model": {
"import_flags": [...],
"model": {
"name": <model_name>,
"tags": [...],
}
}
},
}
},
...
}

Solutions

We can consider dumping all compilation and run flags into a single file with benchmark names, tags, and IDs, so it is easier to use text editors to search benchmarks by keywords. We might also need to re-evaluate the decision not to dump all run flags; 300~400 flag files might not be that problematic.
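For illustration only, one entry in such a combined dump could look roughly like the sketch below. The field names and layout are hypothetical, not the current serializer output; placeholders follow the same `[...]`/`<...>` convention as the examples above.

{
  "benchmarks": {
    "<execution_benchmark_id>": {
      "name": <human-readable benchmark name>,
      "tags": [...],
      "compile_flags": [...],
      "run_flags": [...]
    },
    ...
  }
}

With everything inlined under one ID, a plain-text search for a model name or tag would land directly on the benchmark entry instead of requiring ID-chasing across sections.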
The new improvements are great, especially the docs (https://github.com/openxla/iree/blob/main/docs/developers/developing_iree/benchmark_suites.md#3-fetch-the-benchmark-artifacts) and summaries (e.g. https://github.com/openxla/iree/actions/runs/4832269569#summary-13105536036).

One thing I'd like is a way to see how large the files are and filter to only download "small" programs. When I run ...

This was working well for me before (266MB): ...

I'm sure there's a way to do that with the gcloud CLI, so consider this a selfish request for an alternate command to copy/paste, a script to run, or some directory structure changes :p
Yeah, it's unhealthy to upload 50GB of artifacts to the GCS for every presubmit commit. It's a new problem after we added a new ...

A trick to only download small files can be:

gcloud storage ls -l "gs://iree-github-actions-presubmit-artifacts/4927200698/1/e2e-test-artifacts/*.mlir*" | sort -h

then pipe it to some bash tools that could give you a list of smaller files. I'll update the doc if I find something suitable (or feel free to update with your commands).

(And I also realized we are now uploading ...) Another way could be putting ...
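As a rough, untested sketch of what that piping could look like (the 50 MB threshold is arbitrary, and it assumes `gcloud storage ls -l` prints the object size in bytes as the first column of each line, with the URL as the last column):

# Keep only objects smaller than ~50 MB and print their URLs.
gcloud storage ls -l "gs://iree-github-actions-presubmit-artifacts/4927200698/1/e2e-test-artifacts/*" \
  | awk '$1 ~ /^[0-9]+$/ && $1 < 50*1024*1024 {print $NF}'

# The resulting list could then be fed to `gcloud storage cp`, e.g.
#   ... | xargs -I{} gcloud storage cp {} ./e2e-test-artifacts/
# (one copy per file, slow but simple).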
The major improvements are considered done.
Since we moved to using artificial IDs in the new benchmark pipeline, it naturally isn't very friendly for users to figure out the details of a benchmark from those meaningless IDs.
To improve the UX, some information can be added to the benchmark config files, and some tools can help users retrieve the benchmark information.
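For example (purely illustrative: it assumes a combined dump like the one sketched earlier were written to a hypothetical benchmark_config.json, which is not a real artifact), a jq one-liner could already answer "which benchmarks involve MobileNet?":

# Hypothetical: benchmark_config.json and its layout are the sketch from above.
jq '.benchmarks | to_entries[] | select(.value.name | test("MobileNet"))' benchmark_config.json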
P0
- run_benchmarks_on_*.py should support filter + run_config to run benchmarks locally with the new benchmark suite

P1

- run_benchmarks_on_*.py (Use benchmark name from benchmark config object #12723)

P2