Dump run and compile flags into benchmark JSON config #12397

pzread · 2023-02-27T19:30:45Z

Currently, run and compilation flags are generated at run time from the metadata in the benchmark config. This is not ideal, since the serialized config should be the source of truth for reproducing compile and run benchmarks. Relying on code to generate flags at run time means the code version can affect the benchmark configuration.

If the flags are not serialized, it is also hard to get them for manual investigation without calling the generation functions from the benchmark framework.

This change includes 3 commits:

1. Serialize the full compilation flags into the ModuleGenerationConfig.composite_flags field.
2. Serialize the full run flags into the E2EModelRunConfig.composite_flags field.
3. Serialize the composite id, so the composite objects can be searched for in the JSON config by their id (e.g. obtained from the IREE perf dashboard).

Below is a sample of the benchmark config. To get the run and compile flags for a benchmark:

1. Search for iree_e2e_model_run_configs:<benchmark id from perf dashboard> to get the run flags and the generation config id.
2. Search for iree_module_generation_configs:<generation config id> to get the compile flags and the imported model id.
3. The compiled module can be found at ${ARTIFACTS_DIR}/*_module_<generation config id>/module.vmfb.
4. The imported MLIR can be found at ${ARTIFACTS_DIR}/*_<imported model id>.mlir.

The process is still tedious, but it should be possible to automate it with only jq and shell scripts.

...
    "iree_e2e_model_run_configs:fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0": {
      "composite_id": "fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0",
      "module_generation_config": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "module_execution_config": "13fc65a9-e5dc-4cbb-9c09-25b0b08f4c03",
      "target_device_spec": "9a4804f1-b1b9-46cd-b251-7f16a655f782",
      "input_data": "8d4a034e-944d-4725-8402-d6f6e61be93c",
      "composite_flags": [
        "--function=main",
        "--input=1x257x257x3xf32=0",
        "--device_allocator=caching",
        "--device=local-sync"
      ]
    },
    "iree_module_generation_configs:87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6": {
      "composite_id": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "imported_model": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "compile_config": "e7e18b0f-c72d-4f1c-89b1-5afee70df6e9",
      "composite_flags": [
        "--iree-hal-target-backends=llvm-cpu",
        "--iree-input-type=tosa",
        "--iree-llvm-target-triple=x86_64-unknown-linux-gnu",
        "--iree-llvm-target-cpu=cascadelake"
      ]
    },
    "iree_imported_models:05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d": {
      "composite_id": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "model": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "import_config": "16280d67-7ce0-4807-ab4b-0cb3c771d206"
    },
    "models:c36c63b0-220a-4d78-8ade-c45ce47d89d3": {
      "id": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "name": "DeepLabV3_fp32",
      "tags": [
        "fp32"
      ],
      "source_type": "EXPORTED_TFLITE",
      "source_url": "https://storage.googleapis.com/iree-model-artifacts/deeplabv3.tflite",
      "entry_function": "main",
      "input_types": [
        "1x257x257x3xf32"
      ]
    }
...
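The lookup steps above can also be sketched in Python. This is a minimal sketch, not part of the PR: it assumes `keyed_objs` is the JSON object that directly contains the `object_type:object_id` keys shown in the sample (how that object is nested inside the full config file is not shown here), and the names `BenchmarkFlags`, `lookup_benchmark_flags`, and `artifact_globs` are hypothetical.

```python
from typing import Any, Dict, List, NamedTuple, Tuple


class BenchmarkFlags(NamedTuple):
    """Flags and ids collected for one benchmark (hypothetical helper type)."""
    run_flags: List[str]
    compile_flags: List[str]
    generation_config_id: str
    imported_model_id: str


def lookup_benchmark_flags(keyed_objs: Dict[str, Any],
                           benchmark_id: str) -> BenchmarkFlags:
    """Follows run config -> module generation config, as in the steps above.

    Assumes keyed_objs maps "object_type:object_id" keys to objects, as in
    the sample config.
    """
    run_config = keyed_objs[f"iree_e2e_model_run_configs:{benchmark_id}"]
    gen_config_id = run_config["module_generation_config"]
    gen_config = keyed_objs[f"iree_module_generation_configs:{gen_config_id}"]
    return BenchmarkFlags(
        run_flags=list(run_config["composite_flags"]),
        compile_flags=list(gen_config["composite_flags"]),
        generation_config_id=gen_config_id,
        imported_model_id=gen_config["imported_model"],
    )


def artifact_globs(flags: BenchmarkFlags, artifacts_dir: str) -> Tuple[str, str]:
    """Returns glob patterns for the compiled module and the imported MLIR."""
    module_glob = (
        f"{artifacts_dir}/*_module_{flags.generation_config_id}/module.vmfb")
    mlir_glob = f"{artifacts_dir}/*_{flags.imported_model_id}.mlir"
    return module_glob, mlir_glob
```

The returned patterns still contain the `*` wildcard from the paths above, so they would need to be expanded (for example with `glob.glob`) against a real artifacts directory.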

An alternative way to achieve the same goal is to dump the run and compilation flags into files as test artifacts, as the legacy benchmark suite does. The downside is that the number of execution benchmarks (335 right now) may grow quickly across combinations of model, compile, and runtime flags, so this approach would generate and upload many small flag files to GCS.

Each commit should be reviewable independently, as if it were a separate PR.

Based on #12388

benchmarks: x86_64, cuda

pzread · 2023-02-27T19:42:41Z

build_tools/python/e2e_test_framework/definitions/iree_definitions.py

@@ -207,3 +224,64 @@ def composite_id(self):
        self.module_execution_config.id, self.target_device_spec.id,
        self.input_data.id
    ])
+


Moved from build_tools/python/e2e_test_artifacts/cmake_generator/iree_rule_generator.py with small fixes (dialect_type: str -> dialect_type: MLIRDialectType, --iree-hal-cuda-llvm-target-arch=sm_80 -> --iree-hal-cuda-llvm-target-arch={arch_info.microarchitecture})

While moving this part, I noticed that some compile flags, such as the specialized flags for RISCV, should be moved into their CompileConfig.extra_flags. I will refactor that later.

pzread · 2023-02-27T19:46:29Z

build_tools/python/e2e_test_framework/definitions/iree_definitions.py

+            module_execution_config=module_execution_config,
+            gpu_id=E2E_MODEL_RUN_CONFIG_GPU_ID_PLACEHOLDER))
+
+


Moved from run_module_utils, merging build_run_flags_for_model and build_run_flags_for_execution_config into one function.
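For readers without the diff at hand, the idea of one merged run-flag builder that keeps a GPU id placeholder can be sketched as follows. This is only an illustration: `build_run_flags`, `materialize_run_flags`, and the placeholder text `${GPU_ID}` are assumptions (the actual value of E2E_MODEL_RUN_CONFIG_GPU_ID_PLACEHOLDER is not shown in this thread), and the `--input=<type>=0` form is copied from the sample config above.

```python
from typing import List, Sequence

# Assumed placeholder text; the real value of
# E2E_MODEL_RUN_CONFIG_GPU_ID_PLACEHOLDER is not shown in this thread.
GPU_ID_PLACEHOLDER = "${GPU_ID}"


def build_run_flags(entry_function: str, input_types: Sequence[str],
                    execution_flags: Sequence[str]) -> List[str]:
    """Builds model flags followed by execution flags.

    Any GPU id placeholder inside execution_flags is kept as-is, so the
    result can be serialized before the actual device is known.
    """
    flags = [f"--function={entry_function}"]
    flags += [f"--input={input_type}=0" for input_type in input_types]
    flags += list(execution_flags)
    return flags


def materialize_run_flags(flags: Sequence[str], gpu_id: str) -> List[str]:
    """Replaces the placeholder with a concrete GPU id right before running."""
    return [flag.replace(GPU_ID_PLACEHOLDER, gpu_id) for flag in flags]
```

Keeping the placeholder in the serialized flags and substituting it only at run time is what lets the same JSON config be reused on machines with different device numbering.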

GMNGeoffrey · 2023-02-27T20:23:13Z

build_tools/python/e2e_test_framework/definitions/iree_definitions.py

+  # unmaterialized placeholders. Allows the compile flags to be persisted and
+  # decouple from the generation code. Also serves as useful information in the
+  # serialized JSON.
+  composite_flags: List[str]


I'm confused by the "composite" aspect of this. The flags can only be for one tool and AFAICT they're all compiler flags. If they were for two separate tools, I'd want them separate anyway

It's the full flag list for iree-compile (including args for the driver, target architecture, ...). Maybe full_flags is a better field name?

Well why not "compile_flags"? But also, I think it would make more sense for these to be part of the compile_config, no?

That's because some of the compile flags come from the model (actually only one: --iree-input-type=).

compile_flags SGTM

Ah, ok that was the thing I was missing. It's a flag for the compiler but it's determined by the model. Makes sense, thanks
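The point settled above, that the full iree-compile flag list combines flags from the compile config with one flag determined by the model, can be illustrated with a small sketch. The classes and the `build_compile_flags` function below are hypothetical stand-ins, not the definitions in iree_definitions.py; only the flag names and values are taken from the sample config.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SketchCompileConfig:
    """Hypothetical stand-in for the target-related part of a compile config."""
    target_backend: str
    target_triple: str
    target_cpu: str
    extra_flags: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class SketchImportedModel:
    """Hypothetical stand-in for an imported model; carries its MLIR dialect."""
    dialect_type: str


def build_compile_flags(compile_config: SketchCompileConfig,
                        imported_model: SketchImportedModel) -> List[str]:
    """Combines compile-config flags with the model-determined input type."""
    flags = [
        f"--iree-hal-target-backends={compile_config.target_backend}",
        # The only flag that depends on the model rather than the target.
        f"--iree-input-type={imported_model.dialect_type}",
        f"--iree-llvm-target-triple={compile_config.target_triple}",
        f"--iree-llvm-target-cpu={compile_config.target_cpu}",
    ]
    return flags + list(compile_config.extra_flags)
```

Because the input type comes from the model, the full list cannot live on the compile config alone, which is why it is stored on the module generation config that pairs the two.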

GMNGeoffrey

I'm not sure where in the code to comment, but rather than concatenating name and config id, would it make more sense to make the key the config id in an object containing multiple configs? So reworking your example

{
  // ...
  "iree_e2e_model_run_configs": {
    "fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0": {
      "composite_id": "fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0",
      "module_generation_config": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "module_execution_config": "13fc65a9-e5dc-4cbb-9c09-25b0b08f4c03",
      "target_device_spec": "9a4804f1-b1b9-46cd-b251-7f16a655f782",
      "input_data": "8d4a034e-944d-4725-8402-d6f6e61be93c",
      "composite_flags": [
        "--function=main",
        "--input=1x257x257x3xf32=0",
        "--device_allocator=caching",
        "--device=local-sync"
      ]
    }
    // ...
  },
  "iree_module_generation_configs": {
    "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6": {
      "composite_id": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "imported_model": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "compile_config": "e7e18b0f-c72d-4f1c-89b1-5afee70df6e9",
      "composite_flags": [
        "--iree-hal-target-backends=llvm-cpu",
        "--iree-input-type=tosa",
        "--iree-llvm-target-triple=x86_64-unknown-linux-gnu",
        "--iree-llvm-target-cpu=cascadelake"
      ]
    }
    // ...
  },
  "iree_imported_models": {
    "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d": {
      "composite_id": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "model": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "import_config": "16280d67-7ce0-4807-ab4b-0cb3c771d206"
    }
  },
  "models": {
    "c36c63b0-220a-4d78-8ade-c45ce47d89d3": {
      "id": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "name": "DeepLabV3_fp32",
      "tags": [
        "fp32"
      ],
      "source_type": "EXPORTED_TFLITE",
      "source_url": "https://storage.googleapis.com/iree-model-artifacts/deeplabv3.tflite",
      "entry_function": "main",
      "input_types": [
        "1x257x257x3xf32"
      ]
    }
  }
  // ...
}

pzread · 2023-02-27T20:33:04Z

The format comes from the serializer. It serializes each object and keys it by object_type:object_id. I did consider grouping them as you suggest. That can be done later, though it would break backward compatibility (which should be fine as long as it doesn't happen often).
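Until the serializer changes, the grouped layout suggested above could be derived from the flat one as a post-processing step. The sketch below is an assumption-laden illustration: `group_by_type` is a hypothetical helper, and it assumes every key has the form `object_type:object_id` with the first colon acting as the separator.

```python
from typing import Any, Dict


def group_by_type(keyed_objs: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Regroups {"type:id": obj} into {"type": {"id": obj}}."""
    grouped: Dict[str, Dict[str, Any]] = {}
    for key, obj in keyed_objs.items():
        # Split on the first colon only, in case an id contains a colon.
        obj_type, separator, obj_id = key.partition(":")
        if not separator:
            raise ValueError(f"Key is not of the form type:id: {key!r}")
        grouped.setdefault(obj_type, {})[obj_id] = obj
    return grouped
```

Doing this on the reader side keeps the serialized format unchanged, so it avoids the backward-compatibility concern raised in the reply.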

github-actions · 2023-02-27T21:57:27Z

Abbreviated Benchmark Summary

@ commit 1cce9a3308e1c51fc87cd2252b84358f397fe503 (vs. base e2cfc9976b87e9d634fd5ebeb6a7ee347fed11cc)

Improved Latencies 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileNetV2_fp32 [fp32,imagenet] (exported_tflite) [experimental-flags,fuse-padding][1-thread,full-inference,default-flags] with IREE-LLVM-CPU @ GCP-c2-standard-16 (CPU-x86_64-CascadeLake) | 10.679 (vs. 11.813, 9.60%↓) | 10.707 | 0.128 |
| MobileNetV1_fp32 [fp32,imagenet] (exported_tflite) [experimental-flags,fuse-padding][4-thread,full-inference,default-flags] with IREE-LLVM-CPU @ GCP-c2-standard-16 (CPU-x86_64-CascadeLake) | 7.500 (vs. 8.254, 9.13%↓) | 7.504 | 0.045 |
| MobileNetV2_fp32 [fp32,imagenet] (exported_tflite) [experimental-flags,fuse-padding][full-inference,default-flags] with IREE-LLVM-CPU-Sync @ GCP-c2-standard-16 (CPU-x86_64-CascadeLake) | 10.663 (vs. 11.714, 8.97%↓) | 10.656 | 0.091 |

[Top 3 out of 26 results shown]

No improved or regressed compilation metrics 🏖️


iree-github-actions-bot · 2023-02-27T23:06:13Z

Abbreviated Android Benchmark Summary

@ commit 9e6c560c8ca48ab24e620c80eaa7c8ed7d3bbb4d (vs. base 96b61ba6fe20e6cf9a647cd4019612940fc879ec)

Improved Latencies 🎉

| Benchmark Name | Average Latency (ms) | Median Latency (ms) | Latency Standard Deviation (ms) |
| --- | --- | --- | --- |
| MobileBertSquad [fp16] (TFLite) full-inference,experimental-flags with IREE-Vulkan @ Pixel-6-Pro (GPU-Mali-G78) | 119.024 (vs. 129.855, 8.34%↓) | 114.587 | 10.218 |

pzread force-pushed the bench-jsonflags branch from e63e402 to 9a86f3e Compare February 27, 2023 19:44

pzread marked this pull request as ready for review February 27, 2023 19:52

pzread requested review from GMNGeoffrey, antiagainst and ScottTodd as code owners February 27, 2023 19:52

pzread mentioned this pull request Feb 27, 2023

Fix e2e test artifacts to use composite id #12388

Merged

pzread added the "(deprecated) buildkite:benchmark-android" label (Deprecated. Please use benchmarks:android-*) Feb 27, 2023

GMNGeoffrey reviewed Feb 27, 2023

pzread force-pushed the bench-jsonflags branch from 9a86f3e to 7ffce50 Compare February 27, 2023 20:40

Che-Yu Wu added 3 commits February 27, 2023 20:48

Store compilation flag dump in ModuleGenerationConfig.

3dbf356

Store run flag dump in E2EModelRunConfig.

9c3e9be

Make composite id a field so the value can be reused and serialized

9e6c560

pzread force-pushed the bench-jsonflags branch from 7ffce50 to 9e6c560 Compare February 27, 2023 20:52

GMNGeoffrey approved these changes Feb 27, 2023

pzread merged commit 0246bbf into iree-org:main Feb 27, 2023

pzread mentioned this pull request Feb 27, 2023

UX improvements for the new benchmark pipeline #12215

Closed

19 tasks

qedawkins pushed a commit to qedawkins/iree that referenced this pull request Apr 2, 2023

Dump run and compile flags into benchmark JSON config (iree-org#12397)

28c0f4d

jpienaar pushed a commit that referenced this pull request May 1, 2023

Dump run and compile flags into benchmark JSON config (#12397)

1c666c3

rengolin pushed a commit to plaidml/iree that referenced this pull request May 2, 2023

Dump run and compile flags into benchmark JSON config (iree-org#12397)

fcbd5c1

NatashaKnk pushed a commit to NatashaKnk/iree that referenced this pull request Jul 6, 2023

Dump run and compile flags into benchmark JSON config (iree-org#12397)

6942e8a
