Split benchmark suite build target into default and long-running #13129
Conversation
-        imported_model=iree_definitions.ImportedModel.from_model(model))
-    for model in models
+        imported_model=iree_definitions.ImportedModel.from_model(model),
+        tags=tags) for model in models
Hmmm but that means each model only belongs to exactly one preset
The presets can be a list, so each benchmark can still belong to multiple presets, if needed
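As a minimal sketch of how a list-valued preset field keeps a benchmark in multiple suites at once (the `Benchmark` shape and `presets` field here are illustrative stand-ins, not the actual `iree_definitions` API):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Benchmark:
    """Illustrative stand-in for a generated benchmark config."""
    name: str
    presets: List[str] = field(default_factory=list)


def select(benchmarks: List[Benchmark], preset: str) -> List[Benchmark]:
    """Keep the benchmarks whose preset list contains `preset`."""
    return [b for b in benchmarks if preset in b.presets]


benchmarks = [
    Benchmark("mobilenet", presets=["default"]),
    # A benchmark can belong to several presets at the same time.
    Benchmark("bert-large", presets=["default", "long-running"]),
    Benchmark("gpt2-xl", presets=["long-running"]),
]

default_suite = select(benchmarks, "default")
long_suite = select(benchmarks, "long-running")
```

With this shape, "bert-large" lands in both suites, so splitting the build targets doesn't force each model into exactly one bucket.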
Uuuuuh are we uploading 100GB per run on postsubmit also? Because then we're going to have to start looking at storage costs, not just latency.
Also mentioned/discussed a bit here: #12215 (comment)
@@ -88,15 +88,15 @@ def generate(
 ) -> Tuple[List[iree_definitions.ModuleGenerationConfig],
            List[iree_definitions.E2EModelRunConfig]]:
   """Generates IREE compile and run configs."""
-  # The `vulkan-nvidia` tag is required to put them into the Vulkan NVIDIA
+  # The `vulkan-nvidia`` tag is required to put them into the Vulkan NVIDIA
Suggested change:
-  # The `vulkan-nvidia`` tag is required to put them into the Vulkan NVIDIA
+  # The `vulkan-nvidia` tag is required to put them into the Vulkan NVIDIA

Revert extra backtick?
 COMPILE_STATS = "compile-stats"

+# Tag for long-running benchmarks.
+LONG_RUNNING = "long-running"
Especially when speaking about compilation of the models, is "long running" really the right name? That to me implies execution time. It seems like "large" would be simpler, shorter, and more accurate
Talked to @mariecwhite, I think we can rename this to `large`. I can do that in a follow-up change.
(reminder that unfoldable_constant exists and there are other mechanisms we can use to strip/compress weights - we shouldn't really be using real weights on presubmits unless we're 100% confident there's a material difference)
We can also not build the large benchmarks in postsubmit (or not upload them), if the cost is a concern.
Split `iree-benchmark-suites` into `iree-benchmark-suites` and `iree-benchmark-suites-long`. The former doesn't include the `long-running` benchmarks and the latter only includes them.

The long-running benchmarks take > 100GB to build and waste a lot of GCS space (we upload them) and time in presubmit.

This change still builds all targets in `build_e2e_test_artifacts`. Follow-up changes will build only `iree-benchmark-suites` by default in presubmit.
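The split described above can be sketched as a tag-based partition of the generated configs into the two build targets (the config dicts and `partition_by_tag` helper here are hypothetical illustrations, not the actual generator code):

```python
from typing import Dict, List, Tuple

# Tag introduced by this change (to be renamed to `large` in a follow-up).
LONG_RUNNING = "long-running"


def partition_by_tag(configs: List[Dict], tag: str) -> Tuple[List[Dict], List[Dict]]:
    """Split configs into (without_tag, with_tag) lists."""
    default, tagged = [], []
    for cfg in configs:
        (tagged if tag in cfg["tags"] else default).append(cfg)
    return default, tagged


configs = [
    {"name": "resnet50", "tags": []},
    {"name": "t5-xxl", "tags": [LONG_RUNNING]},
]

# `iree-benchmark-suites` would build `default_suite`;
# `iree-benchmark-suites-long` would build `long_suite`.
default_suite, long_suite = partition_by_tag(configs, LONG_RUNNING)
```

This keeps presubmit builds limited to the default partition while postsubmit can still opt into the long target explicitly.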