
INT 8 quantization is not implemented when selected to export TensorFlow Lite #748

Open
franciscocostela opened this issue Jun 28, 2024 · 4 comments
Assignees
Labels
bug Something isn't working

Comments

@franciscocostela

Search before asking

  • I have searched the HUB issues and found no similar bug report.

HUB Component

Export

Bug

I trained an object detection model using both YOLOv5n and YOLOv8n. In the Deploy tab, I selected TensorFlow Lite - Advanced and enabled 'INT8 Quantization'. I then clicked Export and downloaded the model once the Download button became available.
[Screenshot: export dialog with INT8 quantization selected]
However, when I inspect the exported file, it looks like the quantization was never applied. This happens for both YOLOv5n and YOLOv8n.
[Screenshot: inspection of the exported model]
Is there anything I am not doing correctly, or is this really a bug?
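For reference, the HUB export described above should correspond to something like the following local export with the `ultralytics` package (a sketch; `best.pt` as the name of the checkpoint downloaded from HUB is an assumption):

```shell
# Hypothetical local equivalent of the HUB "TensorFlow Lite - Advanced" export
# with INT8 quantization enabled (assumes the ultralytics package is installed
# and best.pt is the trained checkpoint):
yolo export model=best.pt format=tflite int8=True
```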

Environment

No response

Minimal Reproducible Example

No response

Additional

No response

@franciscocostela franciscocostela added the bug Something isn't working label Jun 28, 2024

👋 Hello @franciscocostela, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more:

  • Quickstart. Start training and deploying YOLO models with HUB in seconds.
  • Datasets: Preparing and Uploading. Learn how to prepare and upload your datasets to HUB in YOLO format.
  • Projects: Creating and Managing. Group your models into projects for improved organization.
  • Models: Training and Exporting. Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
  • Integrations. Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
  • Ultralytics HUB App. Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
    • iOS. Learn about YOLO CoreML models accelerated on Apple's Neural Engine on iPhones and iPads.
    • Android. Explore TFLite acceleration on mobile devices.
  • Inference API. Understand how to use the Inference API for running your trained models in the cloud to generate predictions.

If this is a 🐛 Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.

If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.

We try to respond to all issues as promptly as possible. Thank you for your patience!

@ultralytics ultralytics deleted a comment from pderrenger Jul 1, 2024
@sergiuwaxmann
Member

@franciscocostela Hello!
Can you check whether the exported model is about 4x smaller than the FP32 one?

@franciscocostela
Author

Hi Sergiu,

Yes - it is about 4x smaller. These are the file sizes:
Original FP32 - 11.6 MB
FP16 - 5.8 MB
INT8 - 3.0 MB
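The 4x rule of thumb above can be sketched as a quick sanity check (a hypothetical helper, not part of any Ultralytics tooling: INT8 stores 1 byte per weight versus 4 bytes for FP32, so a fully quantized model should be roughly a quarter of the FP32 size, plus some overhead):

```python
def size_ratio_suggests_int8(fp32_size: float, export_size: float) -> bool:
    """Heuristic: an INT8 export should be roughly 1/4 the FP32 size.

    Weights dominate model size, and INT8 uses 1 byte per weight vs 4
    bytes for FP32, so we accept compression ratios between ~3x and ~5x.
    Any consistent unit (bytes, MB) works.
    """
    ratio = fp32_size / export_size
    return 3.0 <= ratio <= 5.0

# Sizes reported in this thread, in MB:
print(size_ratio_suggests_int8(11.6, 3.0))  # INT8 export: ~3.9x
print(size_ratio_suggests_int8(11.6, 5.8))  # FP16 export: ~2x, not INT8
```

Passing this check only shows the weights were compressed; it says nothing about which tensors actually carry INT8 dtypes.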

I am trying to run the TFLite file through a conversion pipeline to deploy it onto a camera, but it fails with an error message saying the file is not quantized. When I inspect it with Netron, I see that the quantization bias is FLOAT32. INT8 is used in some of the convolution layers, but not all of them (see screenshot). This seems to be what triggers the error in the conversion pipeline.

@ultralytics ultralytics deleted a comment from pderrenger Jul 2, 2024
@sergiuwaxmann
Member

@franciscocostela Based on the file size, the quantization is applied.

@sergiuwaxmann sergiuwaxmann self-assigned this Jul 2, 2024