
Add Pytorch support to perf.iree.dev #12537

Merged 1 commit into iree-org:main on Mar 20, 2023

Conversation

mariecwhite (Contributor) commented Mar 7, 2023

Adds ClipTextModel and Unet2d to the benchmarking suite.

benchmarks: x86_64, cuda

github-actions bot commented Mar 7, 2023

@mariecwhite mariecwhite marked this pull request as ready for review March 7, 2023 23:26
Comment on lines +16 to +22
# `ClipTextModel` encodes text into an embedding.
#
# Used in Stable Diffusion to convert a text prompt into an embedding for input to the `Unet2d` model.
#
# Converted from https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel
Member:

Is this model pretrained / does it have real weights?

mariecwhite (Author):

All models are pretrained and pulled from HuggingFace.

Comment on lines +26 to +31
source_url="https://storage.googleapis.com/iree-model-artifacts/pytorch/torch_models_20230307.103_1678163233/SD_CLIP_TEXT_MODEL_SEQLEN64/linalg.mlir",
entry_function="forward",
input_types=["1x77xi64", "1x77xi64"])
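As an aside, the input_types strings above follow IREE's shape-and-element-type convention: dimensions separated by "x", with the element type last. A minimal parser sketch (a hypothetical helper for illustration, not part of this PR):

```python
def parse_input_type(type_str: str):
    """Parse an IREE-style type string like '1x77xi64' into (shape, dtype).

    Hypothetical helper, not part of this PR.
    """
    parts = type_str.split("x")
    # Every component except the last is a dimension; the last is the
    # element type (e.g. "i64", "f32").
    shape = tuple(int(dim) for dim in parts[:-1])
    dtype = parts[-1]
    return shape, dtype


print(parse_input_type("1x77xi64"))  # → ((1, 77), 'i64')
```

A helper like this is handy for generating random inputs of the right shape/dtype when driving iree-run-module.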
Member:

Did we ever add correctness testing to our benchmark suites? If so, I'd expect some sort of "expected outputs" file (for iree-run-module, or iree-run-trace?) to also be hosted in the model artifacts bucket here. (We should be testing correctness for everything we benchmark, but we don't need to benchmark everything that we test for correctness)

mariecwhite (Author):

+1 for correctness testing. The model implementations in iree-samples save the input and output arrays in .npy format. I believe there is an open item to add correctness testing here. @pzread
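To make the intent concrete, a correctness gate could be a tolerance comparison between the saved expected outputs and a run's actual outputs. A plain-Python sketch (in practice the .npy arrays mentioned above would be loaded with numpy and compared with allclose; the helper name and tolerances here are illustrative):

```python
import math


def outputs_match(expected, actual, rel_tol=1e-4, abs_tol=1e-5):
    """Element-wise tolerance comparison of two flat float sequences.

    Sketch only: real benchmark outputs would be numpy arrays loaded
    from the saved .npy files.
    """
    if len(expected) != len(actual):
        return False
    return all(math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
               for e, a in zip(expected, actual))


print(outputs_match([1.0, 2.0], [1.0, 2.00001]))  # → True
print(outputs_match([1.0], [2.0]))                # → False
```

Tolerances would need tuning per model and per backend, since fp32 vs fp16 execution and fast-math compilation both shift the acceptable error.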

Comment on lines 3868 to +3871
${PACKAGE_NAME}_iree-imported-model-d4a10c6d3e8a11d808baf398822ea8b61be07673517ff9be30fbe199b7fdd960
${PACKAGE_NAME}_iree-imported-model-a122dabcac56c201a4c98d3474265f15adba14bff88353f421b1a11cadcdea1f
${PACKAGE_NAME}_model-9a9515c7-cb68-4c34-b1d2-0e8c0a3620b8
${PACKAGE_NAME}_model-340553d1-e6fe-41b6-b2c7-687c74ccec56
Member:

Why are these names different?

pzread (Contributor) commented Mar 8, 2023:

These are already-imported MLIR files, so there is no intermediate iree-imported-model-* target; we don't run an importer on the PyTorch models every time we build the benchmark suite.

As for why we use the MLIR files directly:

@mariecwhite and I discussed this earlier. We think it's fine to use the imported MLIR directly for the PyTorch benchmarks because:

  1. The script to import PyTorch models requires lots of extra dependencies. We probably don't want to ask people to install them when building the IREE benchmark suite.
  2. MLIR compatibility is the main reason we import TF/TFLite models in every build, but since we have no integration with torch-mlir in the IREE repo, there is no benefit to importing PyTorch models to MLIR in every build.
  3. I remember Marie mentioning that the imported MLIR for PyTorch models is relatively stable, so it will only need to be regenerated once in a while when it breaks.

This is a mid-term solution until we can move the benchmark suite out of the IREE build system; after that we can take on the extra dependencies and run torch-mlir in every build.

Contributor:

Similar to the comments on #12017 (comment): if we're using artifacts in an unstable format, they need to be stored with version information, and the process for regenerating them needs to be clearly documented in a discoverable way.

mariecwhite (Author):

Version info is encoded in the GCS bucket hosting the MLIR files: https://pantheon.corp.google.com/storage/browser/iree-model-artifacts/pytorch/torch_models_20230307.103_1678163233. 20230307.103 is the version of torch-mlir used. Inside the GCS bucket is a version_info.txt file that stores the output of pip list. It may be difficult to reproduce the MLIR files exactly from scratch on a different machine, since the packages may no longer be hosted. If we want exact reproducibility, we'll need to create images with the pip environment saved.

In terms of encoding this version info in the dashboards and/or database, I'll defer to @pzread.
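For what it's worth, the version information in that bucket path is machine-readable. A small sketch of pulling the torch-mlir version and timestamp back out of a directory name (the helper is hypothetical, but the name format matches the bucket path above):

```python
import re


def parse_artifact_dir(dir_name: str):
    """Extract (torch_mlir_version, unix_timestamp) from a directory name
    like 'torch_models_20230307.103_1678163233'. Hypothetical helper.
    """
    m = re.fullmatch(r"torch_models_([\d.]+)_(\d+)", dir_name)
    if m is None:
        raise ValueError(f"unrecognized artifact directory: {dir_name}")
    return m.group(1), int(m.group(2))


print(parse_artifact_dir("torch_models_20230307.103_1678163233"))
# → ('20230307.103', 1678163233)
```

Something like this could let the dashboard or database record the torch-mlir version automatically instead of relying on humans reading the bucket path.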

Comment on lines 80 to 85
MODEL_BERT_LARGE_TF_FP32_SEQLEN384 = "8871f602-571c-4eb8-b94d-554cc8ceec5a"
MODEL_CLIP_TEXT_SEQLEN64_FP32_TORCH = "9a9515c7-cb68-4c34-b1d2-0e8c0a3620b8"
MODEL_UNET_2D_FP32_TORCH = "340553d1-e6fe-41b6-b2c7-687c74ccec56"
Member:

Should we group these by framework?

# Models
#   TF
MODEL_FOO = ...
#   TFLite
MODEL_BAR = ...
#   PyTorch
MODEL_BAZ = ...

# Model input data
...

mariecwhite (Author):

Grouped
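The grouped layout, using the IDs shown in this PR, looks roughly like the sketch below (section-comment style follows the reviewer's proposal; the TFLite section is omitted since no TFLite ID appears in this diff):

```python
# Models
#   TF
MODEL_BERT_LARGE_TF_FP32_SEQLEN384 = "8871f602-571c-4eb8-b94d-554cc8ceec5a"
#   PyTorch
MODEL_CLIP_TEXT_SEQLEN64_FP32_TORCH = "9a9515c7-cb68-4c34-b1d2-0e8c0a3620b8"
MODEL_UNET_2D_FP32_TORCH = "340553d1-e6fe-41b6-b2c7-687c74ccec56"
```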

Comment on lines +32 to +34
torch_models.MODEL_CLIP_TEXT_SEQLEN64_FP32_TORCH,
# Disabled due to https://github.com/openxla/iree/issues/11447.
#torch_models.MODEL_UNET_2D_FP32_TORCH,
Member:

Are we tracking any "small" PyTorch models? It would be nice to have coverage for a variety of model architectures

mariecwhite (Author):

We'll be adding more PyTorch models in the coming weeks. I'll keep this in mind and make sure we add small torch models (probably a small EfficientNet, since we have a TF version of it).


# Implementations of the models listed below can be found in `https://github.com/iree-org/iree-samples/tree/main/iree-torch/importer`.
# We import the PyTorch models offline and make the .mlir files available here for benchmarking.
# If the MLIR artifacts need to be updated, please run [update_torch_models.sh](https://github.com/iree-org/iree-samples/blob/main/iree-torch/importer/update_torch_models.sh).
Member:

The script at https://github.com/iree-org/iree-samples/blob/main/iree-torch/importer/update_torch_models.sh could use more documentation:

  • Sample command line showing expected usage (including any CLI args or env vars)
  • Prerequisites (Linux only? pip install or build some packages from source first? auth for gcloud)
  • What the script does (how long it takes, how much disk/compute it needs, etc.)

mariecwhite (Author):

Created a PR with updates: iree-org/iree-experimental#111

I didn't include how long it takes or how much disk/compute it needs, since that will change as we add more models.
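A documentation header of the kind requested above could look like the sketch below. This is hypothetical: the actual prerequisites and behavior of update_torch_models.sh may differ, and the details shown are illustrative placeholders, not the script's real interface.

```shell
# Hypothetical usage/help header for an artifact-update script.
usage() {
  cat <<'EOF'
Usage: update_torch_models.sh

Prerequisites (illustrative):
  - Linux host with Python 3 and pip
  - gcloud auth with write access to the iree-model-artifacts bucket

What it does (illustrative):
  - Imports the PyTorch models to MLIR via torch-mlir
  - Uploads the resulting .mlir files and a version_info.txt to GCS
EOF
}

usage
```

Printing the header on `-h`/`--help` (and before destructive steps) makes expected usage discoverable without reading the script body.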

@ScottTodd ScottTodd added integrations Relating to high-level frontend integrations infrastructure/benchmark Relating to benchmarking infrastructure labels Mar 7, 2023
Adds ClipTextModel and Unet2d to the benchmarking suite.

benchmarks: x86_64, cuda
# Converted from https://huggingface.co/docs/diffusers/api/models#diffusers.UNet2DConditionModel
MODEL_UNET_2D_FP32_TORCH = common_definitions.Model(
id=unique_ids.MODEL_UNET_2D_FP32_TORCH,
name="Unet2dPT",
pzread (Contributor) commented Mar 20, 2023:

Will there be int8 or fp16 versions of these models? If so, maybe we can append _fp32 as with the other models. It's better to make sure model names are unique (even though they are no longer the primary keys of benchmarks).

mariecwhite (Author):

This model already has FP32 in its name. Let's settle on a naming convention and I'll update the model names in a separate PR.
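One possible convention check, sketched under the assumption (from the discussion above) that each name should carry exactly one data-type token so int8/fp16/fp32 variants stay unique. The helper name is hypothetical and not part of this PR:

```python
# Data-type tokens a model name might carry (per the naming discussion).
DTYPE_TOKENS = ("fp32", "fp16", "int8")


def has_dtype_token(model_name: str) -> bool:
    """Return True if the name carries exactly one data-type token.

    Hypothetical check; the actual naming convention is still being decided.
    """
    return sum(tok in model_name.lower() for tok in DTYPE_TOKENS) == 1


print(has_dtype_token("Unet2dPT"))       # → False (no dtype token)
print(has_dtype_token("Unet2dPT_fp32"))  # → True
```

A check like this could run in a unit test over the model registry so naming drift is caught at review time.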

GMNGeoffrey (Contributor) left a comment:

Thanks Marie!

@@ -31,12 +32,12 @@ def generate(
) -> Tuple[List[iree_definitions.ModuleGenerationConfig],
List[iree_definitions.E2EModelRunConfig]]:
"""Generates IREE compile and run configs."""

models = model_groups.LARGE + [torch_models.MODEL_UNET_2D_FP32_TORCH]
Contributor:

Should we define this as a group instead? @pzread

Contributor:

Oh, I see: in the RISC-V benchmarks, for instance, we define a constant with the relevant models. Let's do that here too (just slightly more obvious than having it inline, IMO):

https://github.com/openxla/iree/blob/main/build_tools/python/benchmark_suites/iree/riscv_benchmarks.py#L17

Contributor:

Yeah, the small and large model groups turned out not to be a good design for handling different model sets on different architectures. I'm still thinking about a better organization.

mariecwhite (Author):

In practice, the model groups are tied heavily to the backend and to what is supported in the frontend dialects, so it might make sense to group based on backend.
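A backend-keyed grouping of the kind suggested above could look like the sketch below. The model names and backend keys are placeholders standing in for the constants defined in torch_models/tf_models, not the suite's actual organization:

```python
# Hypothetical backend-keyed model groups (names are placeholders).
BENCHMARK_MODELS_BY_BACKEND = {
    "x86_64": ["ClipTextSeqLen64PT", "BertLargeTF"],
    "cuda": ["ClipTextSeqLen64PT", "Unet2dPT"],
}


def models_for_backend(backend: str):
    """Return the models benchmarked on a backend (empty if unknown)."""
    return BENCHMARK_MODELS_BY_BACKEND.get(backend, [])


print(models_for_backend("cuda"))  # → ['ClipTextSeqLen64PT', 'Unet2dPT']
```

Keying on backend makes "what runs where" explicit, at the cost of repeating model references across backends that share a set.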

@mariecwhite mariecwhite merged commit bb95333 into iree-org:main Mar 20, 2023
pzread pushed a commit that referenced this pull request Mar 21, 2023
#12693 was merged after #12690 and #12537 without regenerating the CMake files (the presubmit passed before the other two PRs merged). Regenerated the CMake files.
qedawkins pushed a commit to qedawkins/iree that referenced this pull request Apr 2, 2023
Adds ClipTextModel and Unet2d to the benchmarking suite.

benchmarks: x86_64, cuda
qedawkins pushed a commit to qedawkins/iree that referenced this pull request Apr 2, 2023
iree-org#12693 was merged after iree-org#12690 and iree-org#12537 without regenerating the CMake files (the presubmit passed before the other two PRs merged). Regenerated the CMake files.
@jpienaar jpienaar mentioned this pull request Apr 3, 2023
jpienaar pushed a commit that referenced this pull request May 1, 2023
Adds ClipTextModel and Unet2d to the benchmarking suite.

benchmarks: x86_64, cuda
jpienaar pushed a commit that referenced this pull request May 1, 2023
#12693 was merged after #12690 and #12537 without regenerating the CMake files (the presubmit passed before the other two PRs merged). Regenerated the CMake files.
NatashaKnk pushed a commit to NatashaKnk/iree that referenced this pull request Jul 6, 2023
Adds ClipTextModel and Unet2d to the benchmarking suite.

benchmarks: x86_64, cuda
NatashaKnk pushed a commit to NatashaKnk/iree that referenced this pull request Jul 6, 2023
iree-org#12693 was merged after iree-org#12690 and iree-org#12537 without regenerating the CMake files (the presubmit passed before the other two PRs merged). Regenerated the CMake files.