Fix fp16 (--half) support for TritonRemoteModel model type #10787

Merged: 4 commits, May 16, 2023

@fabito fabito commented Jan 18, 2023

Fixes: #10786

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Enables half-precision (FP16) inference, via --half, for the TritonRemoteModel model type.

📊 Key Changes

  • Allow FP16 (half-precision) inference on Triton backends.

🎯 Purpose & Impact

  • 🚀 Purpose: Support half-precision (FP16) inference with the Triton Inference Server, which uses less memory and can make inference faster.
  • Impact: Users deploying models with Triton can reduce the memory footprint and may see higher throughput with FP16 computation, particularly on GPUs with Tensor Cores, which are optimized for FP16 operations.
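
The memory claim can be checked with simple arithmetic. The sketch below is an illustration only, not code from this PR: it uses Python's standard `struct` module to compare the payload size of one input tensor in FP32 and FP16, and to show the rounding that half precision introduces. The input shape (one 3-channel 640x640 image) and the helper names `tensor_bytes` and `to_fp16_and_back` are assumptions made for the example.

```python
import struct

# Bytes per element for IEEE 754 single ('f') and half ('e') precision.
FP32_BYTES = struct.calcsize("<f")
FP16_BYTES = struct.calcsize("<e")


def tensor_bytes(shape, bytes_per_element):
    """Return the raw payload size in bytes of a dense tensor with this shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


def to_fp16_and_back(value):
    """Round a Python float to half precision and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Assumed example input: one 3-channel 640x640 image (not taken from the PR).
shape = (1, 3, 640, 640)
fp32_size = tensor_bytes(shape, FP32_BYTES)
fp16_size = tensor_bytes(shape, FP16_BYTES)
print(f"FP32 payload: {fp32_size} bytes, FP16 payload: {fp16_size} bytes")

# Half precision has an 11-bit significand, so most decimal values are rounded.
print(f"0.1 stored as FP16: {to_fp16_and_back(0.1)}")
```

The FP16 payload is half the size of the FP32 one, which is where the reduced memory footprint and network transfer come from. The cost is precision, so whether FP16 is acceptable depends on the model and should be validated against FP32 results.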

@github-actions github-actions bot left a comment

👋 Hello @fabito, thank you for submitting a YOLOv5 🚀 PR! To allow your work to be integrated as seamlessly as possible, we advise you to:

  • ✅ Verify your PR is up-to-date with ultralytics/yolov5 master branch. If your PR is behind you can update your code by clicking the 'Update branch' button or by running git pull and git merge master locally.
  • ✅ Verify all YOLOv5 Continuous Integration (CI) checks are passing.
  • ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." — Bruce Lee

@glenn-jocher glenn-jocher merged commit ec2b853 into ultralytics:master May 16, 2023
NagatoYuki0943 added a commit to NagatoYuki0943/yolov5-ultralytics that referenced this pull request May 17, 2023
Fix fp16 (`--half`) support for `TritonRemoteModel` model type (ultralytics#10787)
bandakopi pushed a commit to irajcode/yolov5 that referenced this pull request Jul 20, 2023
…lytics#10787)

* Fix fp16 (--half) support for TritonRemoteModel

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
pleb631 pushed a commit to pleb631/yolov5 that referenced this pull request Jan 6, 2024

Successfully merging this pull request may close these issues.

detect.py --half flag not working for TritonRemoteModel
2 participants