Fix fp16 (`--half`) support for `TritonRemoteModel` model type #10787

fabito · 2023-01-18T02:15:20Z

Fixes: #10786

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Enables FP16 (half-precision, `--half`) inference for models served through Triton Inference Server (`TritonRemoteModel`).

📊 Key Changes

Allow FP16 (half-precision) inference on Triton backends.
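The diff itself is not shown on this page, so the following is only a sketch of the kind of check such a change typically involves: a requested `--half` flag is honored only when the selected backend is known to accept FP16 inputs. The function name `resolve_half`, the backend labels, and the contents of the capability set are all hypothetical and are not taken from the YOLOv5 source.

```python
# Hypothetical sketch, NOT the actual YOLOv5 code.
# Assumption: each backend is identified by a short string label, and the set
# below lists the backends assumed (for illustration) to accept FP16 inputs.
FP16_CAPABLE_BACKENDS = {"pytorch", "torchscript", "onnx", "tensorrt", "triton"}


def resolve_half(requested_half: bool, backend: str) -> bool:
    """Return True only if FP16 was requested and the backend can honor it."""
    return requested_half and backend in FP16_CAPABLE_BACKENDS


# With "triton" in the capability set, a --half request is no longer dropped.
print(resolve_half(True, "triton"))
# A backend outside the set silently falls back to full precision.
print(resolve_half(True, "some-other-backend"))
```

Under this design, leaving a backend out of the capability set does not raise an error; the run simply continues in FP32, which is why a missing entry can go unnoticed.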

🎯 Purpose & Impact

🚀 Purpose: To enable more efficient memory usage and potentially faster inference times when using the Triton Inference Server by supporting half-precision (FP16).
⏱ Impact: Users deploying models with Triton can take advantage of reduced memory footprint and increased performance due to FP16 computation. This can be particularly beneficial for GPUs with Tensor Cores that are optimized for FP16 operations.
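The memory claim can be checked with simple arithmetic: an FP16 value occupies 2 bytes and an FP32 value occupies 4, so the same tensor needs half the space. A minimal sketch using only the Python standard library, assuming a 1x3x640x640 input batch (a shape chosen here for illustration, not taken from the PR); `tensor_bytes` and `to_fp16` are hypothetical helpers:

```python
import math
import struct

# Element sizes as reported by the struct module:
# "e" is IEEE 754 half precision, "f" is single precision.
FP16_BYTES = struct.calcsize("<e")
FP32_BYTES = struct.calcsize("<f")


def tensor_bytes(shape, bytes_per_element):
    """Size in bytes of a dense tensor with the given shape and element size."""
    return math.prod(shape) * bytes_per_element


def to_fp16(value):
    """Round-trip a Python float through FP16 to expose the precision loss."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Assumed input shape (batch, channels, height, width), for illustration only.
shape = (1, 3, 640, 640)
fp32_size = tensor_bytes(shape, FP32_BYTES)
fp16_size = tensor_bytes(shape, FP16_BYTES)

print(fp32_size, fp16_size)  # the FP16 tensor is half the FP32 size
print(to_fp16(0.1))          # close to 0.1, but not exactly equal
```

The second print shows the trade-off: FP16 keeps roughly three significant decimal digits, so the memory saving comes with reduced numeric precision.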


github-actions

👋 Hello @fabito, thank you for submitting a YOLOv5 🚀 PR! To allow your work to be integrated as seamlessly as possible, we advise you to:

✅ Verify your PR is up-to-date with the `ultralytics/yolov5` `master` branch. If your PR is behind, you can update your code by clicking the 'Update branch' button or by running `git pull` and `git merge master` locally.
✅ Verify all YOLOv5 Continuous Integration (CI) checks are passing.
✅ Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." — Bruce Lee

Fix fp16 (`--half`) support for `TritonRemoteModel` model type (ultralytics#10787)

…lytics#10787)

* Fix fp16 (--half) support for TritonRemoteModel
* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>

fabito and others added 2 commits January 18, 2023 15:00

Fix fp16 (--half) support for TritonRemoteModel

83a2a01

[pre-commit.ci] auto fixes from pre-commit.com hooks

0984f6f

for more information, see https://pre-commit.ci

github-actions bot reviewed Jan 18, 2023


fabito and others added 2 commits March 29, 2023 21:22

Merge branch 'master' into bugfix/triton-half-precision

c9904f3

Merge branch 'master' into bugfix/triton-half-precision

11e6ef1

glenn-jocher merged commit ec2b853 into ultralytics:master May 16, 2023

NagatoYuki0943 added a commit to NagatoYuki0943/yolov5-ultralytics that referenced this pull request May 17, 2023

Merge pull request #40 from ultralytics/master

2df34f5

