Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dump run and compile flags into benchmark JSON config #12397

Merged
merged 3 commits into from
Feb 27, 2023

Conversation

pzread
Copy link
Contributor

@pzread pzread commented Feb 27, 2023

Currently run and compilation flags are generated in run time with the metadata in the benchmark config. This is not ideal since the serialized config should be the source of truth to reproduce compile and run benchmarks. Relying on the code to generate flags in run time means the code version can affect the benchmark configuration.

It's also hard to get those flags for manual investigation without calling the generation functions from the benchmark framework, if the flags are not serialized.

This change includes 3 commits:

  1. Serialize the full compilation flags into the ModuleGenerationConfig.composite_flags field
  2. Serialize the full run flags into the E2EModelRunConfig.composite_flags field
  3. Serialize the composite id, so the composite objects can be searchable in the JSON config with their id (e.g. obtained from the IREE perf dashboard).

Below is a sample of benchmark config. To get the run and compile flags for a benchmark, the steps are:

  1. Search for iree_e2e_model_run_configs:<benchmark id from perf dashboard> to get the run flags and generation config id.
  2. Search iree_module_generation_config:<generation config id> to get the compile flags and imported model id.
  3. Compiled module can be found at the path ${ARTIFACTS_DIR}/*_module_<generation config id>/module.vmfb.
  4. Imported MLIR can be found at the path ${ARTIFACTS_DIR}/*_<imported model id>.mlir.

The process is still tedious, but should be able to automate with only jq and shell scripts.

...
  "iree_e2e_model_run_configs:fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0": {
      "composite_id": "fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0",
      "module_generation_config": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "module_execution_config": "13fc65a9-e5dc-4cbb-9c09-25b0b08f4c03",
      "target_device_spec": "9a4804f1-b1b9-46cd-b251-7f16a655f782",
      "input_data": "8d4a034e-944d-4725-8402-d6f6e61be93c",
      "composite_flags": [
        "--function=main",
        "--input=1x257x257x3xf32=0",
        "--device_allocator=caching",
        "--device=local-sync"
      ]
    },
    "iree_module_generation_configs:87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6": {
      "composite_id": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "imported_model": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "compile_config": "e7e18b0f-c72d-4f1c-89b1-5afee70df6e9",
      "composite_flags": [
        "--iree-hal-target-backends=llvm-cpu",
        "--iree-input-type=tosa",
        "--iree-llvm-target-triple=x86_64-unknown-linux-gnu",
        "--iree-llvm-target-cpu=cascadelake"
      ]
    },
    "iree_imported_models:05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d": {
      "composite_id": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "model": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "import_config": "16280d67-7ce0-4807-ab4b-0cb3c771d206"
    },
    "models:c36c63b0-220a-4d78-8ade-c45ce47d89d3": {
      "id": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "name": "DeepLabV3_fp32",
      "tags": [
        "fp32"
      ],
      "source_type": "EXPORTED_TFLITE",
      "source_url": "https://storage.googleapis.com/iree-model-artifacts/deeplabv3.tflite",
      "entry_function": "main",
      "input_types": [
        "1x257x257x3xf32"
      ]
    }
...

There is another attempt to achieve the same goal by dumping run and compilation flag files as test artifacts (as the legacy benchmark suite). The downside is that the number of execution benchmarks might grow fast (335 right now) for different combinations of model/compile/runtime flags. It will generate and upload lots of small flag files to GCS.

Each commit should be review-able independently as separate PR.

Based on #12388

benchmarks: x86_64, cuda

@@ -207,3 +224,64 @@ def composite_id(self):
self.module_execution_config.id, self.target_device_spec.id,
self.input_data.id
])

Copy link
Contributor Author

@pzread pzread Feb 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from build_tools/python/e2e_test_artifacts/cmake_generator/iree_rule_generator.py with small fixes (dialect_type: str -> dialect_type: MLIRDialectType, --iree-hal-cuda-llvm-target-arch=sm_80 -> --iree-hal-cuda-llvm-target-arch={arch_info.microarchitecture})

While moving this part, I think some compile flags should be moved into their CompileConfig.extra_flags, like those specialized flags for RISCV. Will do a refactor later.

module_execution_config=module_execution_config,
gpu_id=E2E_MODEL_RUN_CONFIG_GPU_ID_PLACEHOLDER))


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from run_module_utils and merge build_run_flags_for_model, build_run_flags_for_execution_config into one.

@pzread pzread marked this pull request as ready for review February 27, 2023 19:52
@pzread pzread added the (deprecated) buildkite:benchmark-android Deprecated. Please use benchmarks:android-* label Feb 27, 2023
# unmaterialized placeholders. Allows the compile flags to be persisted and
# decouple from the generation code. Also serves as useful information in the
# serialized JSON.
composite_flags: List[str]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by the "composite" aspect of this. The flags can only be for one tool and AFAICT they're all compiler flags. If they were for two separate tools, I'd want them separate anyway

Copy link
Contributor Author

@pzread pzread Feb 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the full flag list for iree-compile (including args for driver, target architecture, ..). Maybe full_flags is a better field name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well why not "compile_flags"? But also, I think it would make more sense for these to be part of the compile_config, no?

Copy link
Contributor Author

@pzread pzread Feb 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because part of the compile flags come from the model (actually only one --iree-input-type=)

compile_flags SGTM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok that was the thing I was missing. It's a flag for the compiler but it's determined by the model. Makes sense, thanks

Copy link
Contributor

@GMNGeoffrey GMNGeoffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure where in the code to comment, but rather than concatenating name and config id, would it make more sense to make the key the config id in an object containing multiple configs? So reworking your example

{
  // ...
  "iree_e2e_model_run_configs": {
    "fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0": {
      "composite_id": "fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0",
      "module_generation_config": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "module_execution_config": "13fc65a9-e5dc-4cbb-9c09-25b0b08f4c03",
      "target_device_spec": "9a4804f1-b1b9-46cd-b251-7f16a655f782",
      "input_data": "8d4a034e-944d-4725-8402-d6f6e61be93c",
      "composite_flags": [
        "--function=main",
        "--input=1x257x257x3xf32=0",
        "--device_allocator=caching",
        "--device=local-sync"
      ]
    }
    // ...
  },
  "iree_module_generation_configs": {
    "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6": {
      "composite_id": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "imported_model": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "compile_config": "e7e18b0f-c72d-4f1c-89b1-5afee70df6e9",
      "composite_flags": [
        "--iree-hal-target-backends=llvm-cpu",
        "--iree-input-type=tosa",
        "--iree-llvm-target-triple=x86_64-unknown-linux-gnu",
        "--iree-llvm-target-cpu=cascadelake"
      ]
    }
    // ...
  },
  "iree_imported_models": {
    "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d": {
      "composite_id": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "model": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "import_config": "16280d67-7ce0-4807-ab4b-0cb3c771d206"
    }
  },
  "models": {
    "c36c63b0-220a-4d78-8ade-c45ce47d89d3": {
      "id": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "name": "DeepLabV3_fp32",
      "tags": [
        "fp32"
      ],
      "source_type": "EXPORTED_TFLITE",
      "source_url": "https://storage.googleapis.com/iree-model-artifacts/deeplabv3.tflite",
      "entry_function": "main",
      "input_types": [
        "1x257x257x3xf32"
      ]
    }
  }
  // ...
}

@pzread
Copy link
Contributor Author

pzread commented Feb 27, 2023

I'm not sure where in the code to comment, but rather than concatenating name and config id, would it make more sense to make the key the config id in an object containing multiple configs? So reworking your example

{
  // ...
  "iree_e2e_model_run_configs": {
    "fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0": {
      "composite_id": "fcc2eb7748902acc86b82e71de537c9f38bd0baccb9ff8da2688a806278116a0",
      "module_generation_config": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "module_execution_config": "13fc65a9-e5dc-4cbb-9c09-25b0b08f4c03",
      "target_device_spec": "9a4804f1-b1b9-46cd-b251-7f16a655f782",
      "input_data": "8d4a034e-944d-4725-8402-d6f6e61be93c",
      "composite_flags": [
        "--function=main",
        "--input=1x257x257x3xf32=0",
        "--device_allocator=caching",
        "--device=local-sync"
      ]
    }
    // ...
  },
  "iree_module_generation_configs": {
    "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6": {
      "composite_id": "87aead729018ce5f114501cecefb6315086eb2a21ae1b30984b1794f619871c6",
      "imported_model": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "compile_config": "e7e18b0f-c72d-4f1c-89b1-5afee70df6e9",
      "composite_flags": [
        "--iree-hal-target-backends=llvm-cpu",
        "--iree-input-type=tosa",
        "--iree-llvm-target-triple=x86_64-unknown-linux-gnu",
        "--iree-llvm-target-cpu=cascadelake"
      ]
    }
    // ...
  },
  "iree_imported_models": {
    "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d": {
      "composite_id": "05c50f54ffea1fce722d07588e7de026ce10324eccc5d83d1eac2c5a9f5d639d",
      "model": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "import_config": "16280d67-7ce0-4807-ab4b-0cb3c771d206"
    }
  },
  "models": {
    "c36c63b0-220a-4d78-8ade-c45ce47d89d3": {
      "id": "c36c63b0-220a-4d78-8ade-c45ce47d89d3",
      "name": "DeepLabV3_fp32",
      "tags": [
        "fp32"
      ],
      "source_type": "EXPORTED_TFLITE",
      "source_url": "https://storage.googleapis.com/iree-model-artifacts/deeplabv3.tflite",
      "entry_function": "main",
      "input_types": [
        "1x257x257x3xf32"
      ]
    }
  }
  // ...
}

The format comes from the serializer. It serializes and keyed the object by object_type:object_id. I actually thought about grouping them as you mentioned. It can be done later (but also breaks the backward compatibility, should be fine if it doesn't happen often)

@iree-github-actions-bot
Copy link
Contributor

Abbreviated Android Benchmark Summary

@ commit 9e6c560c8ca48ab24e620c80eaa7c8ed7d3bbb4d (vs. base 96b61ba6fe20e6cf9a647cd4019612940fc879ec)

Improved Latencies 🎉

Benchmark Name Average Latency (ms) Median Latency (ms) Latency Standard Deviation (ms)
MobileBertSquad [fp16] (TFLite) full-inference,experimental-flags with IREE-Vulkan @ Pixel-6-Pro (GPU-Mali-G78) 119.024 (vs. 129.855, 8.34%↓) 114.587 10.218

For more information:

@pzread pzread merged commit 0246bbf into iree-org:main Feb 27, 2023
qedawkins pushed a commit to qedawkins/iree that referenced this pull request Apr 2, 2023
rengolin pushed a commit to plaidml/iree that referenced this pull request May 2, 2023
NatashaKnk pushed a commit to NatashaKnk/iree that referenced this pull request Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
(deprecated) buildkite:benchmark-android Deprecated. Please use benchmarks:android-*
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants