NVTX: name threads, CUDA devices and CUDA streams #13603
@cheshire can you take a look? I'm not very familiar with the nsys profiler, and I cannot tell whether this improves the internal profiling experience.
Imported from GitHub PR openxla/xla#13603

This aims to improve the profiling experience. These names are shown in the Nsight Systems UI.

Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3)
Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18)

This also provides a missing link between replica IDs in the HLO and the physical devices in the profile.

Copybara import of the project:

-- 12a02b67bd9db8b3f69ba1e0d00c7881f767f037 by Olli Lupton <olupton@nvidia.com>:
NVTX: name threads, CUDA devices and CUDA streams

-- bdf8dbf7700cbe7ce72070c25ce3d21e2dfeb54f by Olli Lupton <olupton@nvidia.com>:
Add missing header

-- 98a80a40add79f108cb89987724c35f82cd727e4 by Olli Lupton <olupton@nvidia.com>:
add stubs

Merging this change closes #13603

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#13603 from olupton:name-devices-streams-and-threads 98a80a40add79f108cb89987724c35f82cd727e4
PiperOrigin-RevId: 643001157
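As a rough illustration of the kind of naming this change describes, the sketch below wraps the NVTX C naming calls (`nvtxNameCudaDeviceA`, `nvtxNameCudaStreamA`, `nvtxNameOsThreadA`) behind small helpers. It is not the code from the PR: the helper names, the `EXAMPLE_WITH_NVTX` macro, the label format in `DeviceLabel`, and the fallback log are all assumptions made for this example. Without the macro, the helpers only record the names, so the sketch builds without CUDA or NVTX installed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// EXAMPLE_WITH_NVTX is a hypothetical macro for this sketch, not one used by the PR.
#if defined(EXAMPLE_WITH_NVTX)
#include <cstdint>
#include <sys/syscall.h>  // SYS_gettid (Linux only)
#include <unistd.h>       // syscall
#include <cuda_runtime_api.h>
#include <nvtx3/nvToolsExt.h>
#include <nvtx3/nvToolsExtCudaRt.h>
#endif

// Fallback log so the sketch can be built and inspected without CUDA or NVTX.
std::vector<std::string>& NameLog() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical label format tying a device ordinal to a replica ID.
std::string DeviceLabel(int ordinal, int replica_id) {
  assert(ordinal >= 0 && replica_id >= 0);
  return "GPU " + std::to_string(ordinal) + " (replica " +
         std::to_string(replica_id) + ")";
}

// Names a CUDA device by ordinal so profilers can show the label.
void NameDevice(int ordinal, const std::string& name) {
#if defined(EXAMPLE_WITH_NVTX)
  nvtxNameCudaDeviceA(ordinal, name.c_str());
#else
  (void)ordinal;
  NameLog().push_back("device:" + name);
#endif
}

// `stream` is an opaque handle; with NVTX enabled it must be a cudaStream_t.
void NameStream(void* stream, const std::string& name) {
#if defined(EXAMPLE_WITH_NVTX)
  nvtxNameCudaStreamA(static_cast<cudaStream_t>(stream), name.c_str());
#else
  (void)stream;
  NameLog().push_back("stream:" + name);
#endif
}

// Names the calling OS thread (the NVTX path uses the Linux thread ID).
void NameCurrentThread(const std::string& name) {
#if defined(EXAMPLE_WITH_NVTX)
  nvtxNameOsThreadA(static_cast<uint32_t>(syscall(SYS_gettid)), name.c_str());
#else
  NameLog().push_back("thread:" + name);
#endif
}
```

In the fallback build, calling `NameDevice(0, DeviceLabel(0, 3))` appends `device:GPU 0 (replica 3)` to `NameLog()`. Keeping the NVTX include and calls behind a single compile-time switch is one way to keep non-CUDA builds free of the NVTX dependency.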
Imported from GitHub PR openxla/xla#13603

This aims to improve the profiling experience. These names are shown in the Nsight Systems UI.

Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3)
Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18)

This also provides a missing link between replica IDs in the HLO and the physical devices in the profile.

Copybara import of the project:

-- 12a02b67bd9db8b3f69ba1e0d00c7881f767f037 by Olli Lupton <olupton@nvidia.com>:
NVTX: name threads, CUDA devices and CUDA streams

-- bdf8dbf7700cbe7ce72070c25ce3d21e2dfeb54f by Olli Lupton <olupton@nvidia.com>:
Add missing header

-- 98a80a40add79f108cb89987724c35f82cd727e4 by Olli Lupton <olupton@nvidia.com>:
add stubs

Merging this change closes #13603

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#13603 from olupton:name-devices-streams-and-threads 98a80a40add79f108cb89987724c35f82cd727e4
PiperOrigin-RevId: 643290582
Thanks, this seems very useful. I'm concerned about portability and the interactions with the thread interface in Env.
Imported from GitHub PR openxla/xla#13603

This aims to improve the profiling experience. These names are shown in the Nsight Systems UI.

Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3)
Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18)

This also provides a missing link between replica IDs in the HLO and the physical devices in the profile.

Copybara import of the project:

-- 5b3121c58db8aa1b6529f0aeb8573be8bf2cde80 by Olli Lupton <olupton@nvidia.com>:
NVTX: name threads, CUDA devices and CUDA streams

-- d973674de6218fcee88473d85bb43ba345652fdf by Olli Lupton <olupton@nvidia.com>:
Address review comments

-- 918cf3e7b87150e9d666b218bbd9aca0cae606a4 by Olli Lupton <olupton@nvidia.com>:
Alternative for @jbaiocchi

-- 1d1978437e64c0dac97e97ea4320a6dcb3945296 by Olli Lupton <olupton@nvidia.com>:
Address more review comments

Merging this change closes #13603

PiperOrigin-RevId: 644901234
@@ -116,6 +116,7 @@ cc_library(
"@tsl//tsl/platform:status",
"@tsl//tsl/platform:statusor",
"@tsl//tsl/profiler/lib:connected_traceme",
"@tsl//tsl/profiler/lib:nvtx_utils",
It seems that adding this dependency unconditionally here causes a linker error if a target links in both nvtx_utils and nvtx_utils_libtpu. I could avoid the problem by adding the dependency inside an if_cuda guard and guarding the include and the new code, but I'm not sure this is the solution you would choose yourself.
We will revert this for now.
Imported from GitHub PR openxla/xla#13603

This aims to improve the profiling experience. These names are shown in the Nsight Systems UI.

Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf0...

PiperOrigin-RevId: 644915138

Imported from GitHub PR openxla/xla#13603

This aims to improve the profiling experience. These names are shown in the Nsight Systems UI.

Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf0...

PiperOrigin-RevId: 644957493
This PR was rolled back in 6cd3399!
Second attempt at openxla#13603, which was rolled back.

This aims to improve the profiling experience. These names are shown in the Nsight Systems UI.

Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3)
Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18)

This also provides a missing link between replica IDs in the HLO and the physical devices in the profile.
Imported from GitHub PR openxla/xla#14092

See openxla/xla#13603, which landed and got rolled back. f75962e80d387f32dc9055cd1fff9029d97f0026 attempts to fix the issue described in openxla/xla#13603 (comment).

Copybara import of the project:

-- c2f947687ecc1ce8844ba7d0b258b5fd1f3b8afd by Olli Lupton <olupton@nvidia.com>:
NVTX: name threads, CUDA devices and CUDA streams

Second attempt at openxla/xla#13603, which was rolled back.

This aims to improve the profiling experience. These names are shown in the Nsight Systems UI.

Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3)
Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18)

This also provides a missing link between replica IDs in the HLO and the physical devices in the profile.

-- ac4af75b2f934a1d4fe06d07519b891fbaa7f88a by Olli Lupton <olupton@nvidia.com>:
Work around nvtx_utils_libtpu error

-- 2b3407bea90c486fd15cfffba80ba2391b1a4e5c by Olli Lupton <olupton@nvidia.com>:
Set visibility

-- a79d09f9a77c12968459770faf4bd7d0cf5db27a by Olli Lupton <olupton@nvidia.com>:
add missing ifdef

-- 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 by Olli Lupton <olupton@nvidia.com>:
Move device/thread naming into separate function

Merging this change closes #14092

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14092 from olupton:name-devices-streams-and-threads-v2 7aa0800429fbbf4033f8a4da54d4114d1bd4d228
PiperOrigin-RevId: 649062057
Imported from GitHub PR #14092 See #13603, which landed and got rolled back. f75962e attempts to fix the issue described in #13603 (comment). Copybara import of the project: -- c2f9476 by Olli Lupton <olupton@nvidia.com>: NVTX: name threads, CUDA devices and CUDA streams Second attempt at #13603, which was rolled back. This aims to improve the profiling experience. These names are shown in the Nsight Systems UI. Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c) Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3) Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18) This also provides a missing link between replica IDs in the HLO and the physical devices in the profile. -- ac4af75 by Olli Lupton <olupton@nvidia.com>: Work around nvtx_utils_libtpu error -- 2b3407b by Olli Lupton <olupton@nvidia.com>: Set visibility -- a79d09f by Olli Lupton <olupton@nvidia.com>: add missing ifdef -- 7aa0800 by Olli Lupton <olupton@nvidia.com>: Move device/thread naming into separate function Merging this change closes #14092 FUTURE_COPYBARA_INTEGRATE_REVIEW=#14092 from olupton:name-devices-streams-and-threads-v2 7aa0800 PiperOrigin-RevId: 649062057
Imported from GitHub PR #14092 See #13603, which landed and got rolled back. f75962e attempts to fix the issue described in #13603 (comment). Copybara import of the project: -- c2f9476 by Olli Lupton <olupton@nvidia.com>: NVTX: name threads, CUDA devices and CUDA streams Second attempt at #13603, which was rolled back. This aims to improve the profiling experience. These names are shown in the Nsight Systems UI. Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c) Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3) Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18) This also provides a missing link between replica IDs in the HLO and the physical devices in the profile. -- ac4af75 by Olli Lupton <olupton@nvidia.com>: Work around nvtx_utils_libtpu error -- 2b3407b by Olli Lupton <olupton@nvidia.com>: Set visibility -- a79d09f by Olli Lupton <olupton@nvidia.com>: add missing ifdef -- 7aa0800 by Olli Lupton <olupton@nvidia.com>: Move device/thread naming into separate function Merging this change closes #14092 FUTURE_COPYBARA_INTEGRATE_REVIEW=#14092 from olupton:name-devices-streams-and-threads-v2 7aa0800 PiperOrigin-RevId: 649062057
Imported from GitHub PR openxla/xla#14092 See openxla/xla#13603, which landed and got rolled back. f75962e80d387f32dc9055cd1fff9029d97f0026 attempts to fix the issue described in openxla/xla#13603 (comment). Copybara import of the project: -- c2f947687ecc1ce8844ba7d0b258b5fd1f3b8afd by Olli Lupton <olupton@nvidia.com>: NVTX: name threads, CUDA devices and CUDA streams Second attempt at openxla/xla#13603, which was rolled back. This aims to improve the profiling experience. These names are shown in the Nsight Systems UI. Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c) Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3) Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18) This also provides a missing link between replica IDs in the HLO and the physical devices in the profile. -- ac4af75b2f934a1d4fe06d07519b891fbaa7f88a by Olli Lupton <olupton@nvidia.com>: Work around nvtx_utils_libtpu error -- 2b3407bea90c486fd15cfffba80ba2391b1a4e5c by Olli Lupton <olupton@nvidia.com>: Set visibility -- a79d09f9a77c12968459770faf4bd7d0cf5db27a by Olli Lupton <olupton@nvidia.com>: add missing ifdef -- 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 by Olli Lupton <olupton@nvidia.com>: Move device/thread naming into separate function Merging this change closes #14092 FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14092 from olupton:name-devices-streams-and-threads-v2 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 PiperOrigin-RevId: 649062057
Imported from GitHub PR openxla/xla#14092 See openxla/xla#13603, which landed and got rolled back. f75962e80d387f32dc9055cd1fff9029d97f0026 attempts to fix the issue described in openxla/xla#13603 (comment). Copybara import of the project: -- c2f947687ecc1ce8844ba7d0b258b5fd1f3b8afd by Olli Lupton <olupton@nvidia.com>: NVTX: name threads, CUDA devices and CUDA streams Second attempt at openxla/xla#13603, which was rolled back. This aims to improve the profiling experience. These names are shown in the Nsight Systems UI. Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c) Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3) Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18) This also provides a missing link between replica IDs in the HLO and the physical devices in the profile. -- ac4af75b2f934a1d4fe06d07519b891fbaa7f88a by Olli Lupton <olupton@nvidia.com>: Work around nvtx_utils_libtpu error -- 2b3407bea90c486fd15cfffba80ba2391b1a4e5c by Olli Lupton <olupton@nvidia.com>: Set visibility -- a79d09f9a77c12968459770faf4bd7d0cf5db27a by Olli Lupton <olupton@nvidia.com>: add missing ifdef -- 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 by Olli Lupton <olupton@nvidia.com>: Move device/thread naming into separate function Merging this change closes #14092 FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14092 from olupton:name-devices-streams-and-threads-v2 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 PiperOrigin-RevId: 649062057
Imported from GitHub PR #14092 See #13603, which landed and got rolled back. f75962e attempts to fix the issue described in #13603 (comment). Copybara import of the project: -- c2f9476 by Olli Lupton <olupton@nvidia.com>: NVTX: name threads, CUDA devices and CUDA streams Second attempt at #13603, which was rolled back. This aims to improve the profiling experience. These names are shown in the Nsight Systems UI. Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c) Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3) Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18) This also provides a missing link between replica IDs in the HLO and the physical devices in the profile. -- ac4af75 by Olli Lupton <olupton@nvidia.com>: Work around nvtx_utils_libtpu error -- 2b3407b by Olli Lupton <olupton@nvidia.com>: Set visibility -- a79d09f by Olli Lupton <olupton@nvidia.com>: add missing ifdef -- 7aa0800 by Olli Lupton <olupton@nvidia.com>: Move device/thread naming into separate function Merging this change closes #14092 FUTURE_COPYBARA_INTEGRATE_REVIEW=#14092 from olupton:name-devices-streams-and-threads-v2 7aa0800 PiperOrigin-RevId: 649062057
Imported from GitHub PR openxla/xla#14092 See openxla/xla#13603, which landed and got rolled back. f75962e80d387f32dc9055cd1fff9029d97f0026 attempts to fix the issue described in openxla/xla#13603 (comment). Copybara import of the project: -- c2f947687ecc1ce8844ba7d0b258b5fd1f3b8afd by Olli Lupton <olupton@nvidia.com>: NVTX: name threads, CUDA devices and CUDA streams Second attempt at openxla/xla#13603, which was rolled back. This aims to improve the profiling experience. These names are shown in the Nsight Systems UI. Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c) Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3) Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18) This also provides a missing link between replica IDs in the HLO and the physical devices in the profile. -- ac4af75b2f934a1d4fe06d07519b891fbaa7f88a by Olli Lupton <olupton@nvidia.com>: Work around nvtx_utils_libtpu error -- 2b3407bea90c486fd15cfffba80ba2391b1a4e5c by Olli Lupton <olupton@nvidia.com>: Set visibility -- a79d09f9a77c12968459770faf4bd7d0cf5db27a by Olli Lupton <olupton@nvidia.com>: add missing ifdef -- 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 by Olli Lupton <olupton@nvidia.com>: Move device/thread naming into separate function Merging this change closes #14092 FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14092 from olupton:name-devices-streams-and-threads-v2 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 PiperOrigin-RevId: 649062057
Imported from GitHub PR openxla/xla#14092 See openxla/xla#13603, which landed and got rolled back. f75962e80d387f32dc9055cd1fff9029d97f0026 attempts to fix the issue described in openxla/xla#13603 (comment). Copybara import of the project: -- c2f947687ecc1ce8844ba7d0b258b5fd1f3b8afd by Olli Lupton <olupton@nvidia.com>: NVTX: name threads, CUDA devices and CUDA streams Second attempt at openxla/xla#13603, which was rolled back. This aims to improve the profiling experience. These names are shown in the Nsight Systems UI. Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c) Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3) Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18) This also provides a missing link between replica IDs in the HLO and the physical devices in the profile. -- ac4af75b2f934a1d4fe06d07519b891fbaa7f88a by Olli Lupton <olupton@nvidia.com>: Work around nvtx_utils_libtpu error -- 2b3407bea90c486fd15cfffba80ba2391b1a4e5c by Olli Lupton <olupton@nvidia.com>: Set visibility -- a79d09f9a77c12968459770faf4bd7d0cf5db27a by Olli Lupton <olupton@nvidia.com>: add missing ifdef -- 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 by Olli Lupton <olupton@nvidia.com>: Move device/thread naming into separate function Merging this change closes #14092 FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#14092 from olupton:name-devices-streams-and-threads-v2 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 PiperOrigin-RevId: 649062057
Imported from GitHub PR openxla/xla#14092

See openxla/xla#13603, which landed and got rolled back. f75962e80d387f32dc9055cd1fff9029d97f0026 attempts to fix the issue described in openxla/xla#13603 (comment).

Copybara import of the project:

-- c2f947687ecc1ce8844ba7d0b258b5fd1f3b8afd by Olli Lupton <olupton@nvidia.com>:

NVTX: name threads, CUDA devices and CUDA streams

Second attempt at openxla/xla#13603, which was rolled back.

This aims to improve the profiling experience. These names are shown in the Nsight Systems UI.

Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3)
Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18)

This also provides a missing link between replica IDs in the HLO and the physical devices in the profile.

-- ac4af75b2f934a1d4fe06d07519b891fbaa7f88a by Olli Lupton <olupton@nvidia.com>:

Work around nvtx_utils_libtpu error

-- 2b3407bea90c486fd15cfffba80ba2391b1a4e5c by Olli Lupton <olupton@nvidia.com>:

Set visibility

-- a79d09f9a77c12968459770faf4bd7d0cf5db27a by Olli Lupton <olupton@nvidia.com>:

add missing ifdef

-- 7aa0800429fbbf4033f8a4da54d4114d1bd4d228 by Olli Lupton <olupton@nvidia.com>:

Move device/thread naming into separate function

Merging this change closes #14092

PiperOrigin-RevId: 649377094
This PR uses NVTX to name threads, CUDA devices and CUDA streams, with the aim of improving the profiling experience. These names are shown in the Nsight Systems UI.
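Device names: ![Screenshot 2024-06-10 at 14 52 37](https://github.com/openxla/xla/assets/6459623/d889d37e-ca2e-4f5e-b5bd-240bbb625b4c)
Stream names: ![Screenshot 2024-06-10 at 14 53 25](https://github.com/openxla/xla/assets/6459623/4bfc4ffa-8fdf-4b93-b23e-95bf056799f3)
Thread names: ![Screenshot 2024-06-10 at 14 54 04](https://github.com/openxla/xla/assets/6459623/8852ca9e-f2f4-4a45-8334-a18f8ab5ce18)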
This also provides a missing link between replica IDs in the HLO and the physical devices in the profile.
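For readers unfamiliar with NVTX resource naming, here is a minimal sketch of what attaching such names can look like. It is not the code from this PR. The NVTX entry points `nvtxNameOsThreadA`, `nvtxNameCudaDeviceA` and `nvtxNameCudaStreamA` are part of the NVTX C API, but whether XLA calls these exact functions is an assumption. The helper names, the label formats and the `SKETCH_USE_NVTX` macro are hypothetical and exist only for illustration. The NVTX calls are guarded by that macro so the label helpers still compile on a machine without the CUDA toolkit.

```cpp
#include <cstdint>
#include <string>

#ifdef SKETCH_USE_NVTX  // hypothetical macro: define it when NVTX and CUDA headers are available
#include <cuda_runtime_api.h>
#include <nvtx3/nvToolsExt.h>
#include <nvtx3/nvToolsExtCudaRt.h>
#include <sys/syscall.h>
#include <unistd.h>
#endif

// Hypothetical label format that ties an HLO replica ID to a physical device.
std::string DeviceLabel(int device_ordinal, int replica_id) {
  return "GPU " + std::to_string(device_ordinal) + " (replica " +
         std::to_string(replica_id) + ")";
}

// Hypothetical label format for a stream, e.g. "compute stream, GPU 1".
std::string StreamLabel(const std::string& role, int device_ordinal) {
  return role + " stream, GPU " + std::to_string(device_ordinal);
}

#ifdef SKETCH_USE_NVTX
// Names the calling OS thread (Linux thread ID obtained via gettid).
void NameCurrentThread(const std::string& name) {
  nvtxNameOsThreadA(static_cast<uint32_t>(syscall(SYS_gettid)), name.c_str());
}

// Names a CUDA device (by runtime ordinal) and one of its streams.
void NameDeviceAndStream(int device_ordinal, int replica_id,
                         cudaStream_t stream, const std::string& role) {
  nvtxNameCudaDeviceA(device_ordinal,
                      DeviceLabel(device_ordinal, replica_id).c_str());
  nvtxNameCudaStreamA(stream, StreamLabel(role, device_ordinal).c_str());
}
#endif
```

With `SKETCH_USE_NVTX` defined, calling `NameCurrentThread` from each worker thread and `NameDeviceAndStream` once per device and stream during initialization should be enough for a profiler that consumes NVTX naming to pick the labels up. Putting the replica ID into the device label is one way to expose the replica-to-device link mentioned above.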