Object Detection MLModel for iOS with output configuration of confidence scores & coordinates for the bounding box. #535

Closed
ajurav1 opened this issue Jul 28, 2020 · 12 comments
Labels: question (Further information is requested), Stale

Comments


ajurav1 commented Jul 28, 2020

I have exported the mlmodel from export.py; the exported model has a type of Neural Network and its output configuration is:
[<VNCoreMLFeatureValueObservation: 0x281d80240> 4702BA0E-857D-4CE6-88C1-4E47186E751F requestRevision=1 confidence=1.000000 "2308" - "MultiArray : Float32 1 x 3 x 20 x 20 x 85 array" (1.000000), <VNCoreMLFeatureValueObservation: 0x281d802d0> AE0A0580-7DA2-4991-98BB-CD26EE257C7A requestRevision=1 confidence=1.000000 "2327" - "MultiArray : Float32 1 x 3 x 40 x 40 x 85 array" (1.000000), <VNCoreMLFeatureValueObservation: 0x281d80330> 0253FD2B-10B0-4047-A001-624D1864D27C requestRevision=1 confidence=1.000000 "2346" - "MultiArray : Float32 1 x 3 x 80 x 80 x 85 array" (1.000000)]

I was looking for an output of type VNRecognizedObjectObservation from YOLOv5 instead of VNCoreMLFeatureValueObservation.

So, my question is: what information does this VNCoreMLFeatureValueObservation MultiArray hold (is it something like a UIImage or CGRect, or something different?), and how can I convert this multidimensional array into a useful set of data where I can actually see confidence scores & coordinates?
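
For reference, a minimal Swift sketch of pulling the raw tensor out of one of these observations (the handler name is illustrative, and the shape comment assumes the stock 80-class COCO model):

```swift
import Vision
import CoreML

// Illustrative completion handling for a VNCoreMLRequest against this model.
func handleResults(of request: VNRequest) {
    guard let observations = request.results as? [VNCoreMLFeatureValueObservation] else { return }
    for observation in observations {
        guard let array = observation.featureValue.multiArrayValue else { continue }
        // e.g. shape [1, 3, 80, 80, 85]: batch, anchors per head, grid y, grid x,
        // then 85 = 4 box values + 1 objectness score + 80 class scores (COCO).
        print(observation.featureName, array.shape)
    }
}
```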

ajurav1 added the question label Jul 28, 2020

github-actions bot commented Jul 28, 2020

Hello @ajurav1, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook (Open In Colab), Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

ajurav1 changed the title from "Object Detection MLModel for iOS of type Pipeline instead Neural Network" to "Object Detection MLModel for iOS with output configuration of confidence scores & coordinates for the bounding box." Jul 28, 2020

dlawrences commented Jul 28, 2020

Hi @ajurav1

Those tensors store network predictions that are not decoded. There are a few recommendations I made in #343 on decoding these, and there is even NumPy sample code that does this for the ONNX model.

There's also guidance here: https://docs.ultralytics.com/yolov5/tutorials/model_export

Take a look in there and let me know if you require further support.
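
For illustration, a rough Swift sketch of that decode step for a single output head, assuming the stock 80-class COCO model at 640x640 input and the default YOLOv5 anchors and strides; every name here is illustrative rather than part of the export script:

```swift
import Foundation
import CoreGraphics
import CoreML

// Illustrative decoded-box type; coordinates are in input-image pixels.
struct Detection {
    let rect: CGRect
    let confidence: Float
    let classIndex: Int
}

func sigmoid(_ x: Float) -> Float { 1 / (1 + exp(-x)) }

// Decodes one head of shape [1, 3, ny, nx, 85].
func decodeHead(_ output: MLMultiArray,
                anchors: [(w: Float, h: Float)],   // the 3 anchor pairs for this head
                stride: Float,
                confidenceThreshold: Float = 0.25) -> [Detection] {
    let ny = output.shape[2].intValue
    let nx = output.shape[3].intValue
    let numClasses = output.shape[4].intValue - 5
    var detections: [Detection] = []

    for a in 0..<3 {
        for gy in 0..<ny {
            for gx in 0..<nx {
                func value(_ c: Int) -> Float {
                    output[[0, a, gy, gx, c] as [NSNumber]].floatValue
                }
                let objectness = sigmoid(value(4))
                guard objectness > confidenceThreshold else { continue }

                // YOLOv5 box decode: xy = (sigmoid * 2 - 0.5 + grid) * stride,
                //                    wh = (sigmoid * 2)^2 * anchor
                let x = (sigmoid(value(0)) * 2 - 0.5 + Float(gx)) * stride
                let y = (sigmoid(value(1)) * 2 - 0.5 + Float(gy)) * stride
                let w = pow(sigmoid(value(2)) * 2, 2) * anchors[a].w
                let h = pow(sigmoid(value(3)) * 2, 2) * anchors[a].h

                // Pick the best class and combine its score with objectness.
                var bestScore: Float = 0
                var bestClass = 0
                for c in 0..<numClasses {
                    let score = sigmoid(value(5 + c)) * objectness
                    if score > bestScore { bestScore = score; bestClass = c }
                }
                guard bestScore > confidenceThreshold else { continue }

                detections.append(Detection(
                    rect: CGRect(x: CGFloat(x - w / 2), y: CGFloat(y - h / 2),
                                 width: CGFloat(w), height: CGFloat(h)),
                    confidence: bestScore,
                    classIndex: bestClass))
            }
        }
    }
    return detections
}
```

With the default configuration, the 80x80, 40x40 and 20x20 heads would use strides 8, 16 and 32 with anchors (10,13),(16,30),(33,23); (30,61),(62,45),(59,119); and (116,90),(156,198),(373,326) respectively, and the three decoded lists are concatenated before NMS.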

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


ShreshthSaxena commented Sep 4, 2020

Hi @dlawrences,
I'm in the same boat, trying to figure out how to convert these raw predictions to bbox coordinates. Is there a Swift implementation available to interpret these outputs, or some way I can add the post-processing to a CoreML pipeline and get the final outputs in Swift?

@dlawrences

@ShreshthSaxena, there is a lot of useful information you will be able to use on this blog: https://machinethink.net/blog/

I recommend acquiring the book as well.


maidmehic commented Nov 27, 2020

Hi @ajurav1 and @ShreshthSaxena, have you managed to convert the VNCoreMLFeatureValueObservation MultiArrays into something useful on the Swift side?

@kir486680

@dlawrences In the article, the author says that the "Ultralytics YOLOv5 model has a Core ML version but it requires changes before you can use it with Vision." Did you manage to make it work with Vision?


dlawrences commented Feb 1, 2021

@kir486680 the CoreML model, as exported from this repository, exposes the final feature maps as its outputs, so the predictions are not decoded, nor is any NMS applied.

I have been able to create the required steps locally, yes, and it works directly with Vision.
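
For illustration, a minimal greedy NMS sketch over boxes decoded as in the earlier sketch (it reuses that hypothetical Detection struct; the IoU threshold is a typical default, not a value confirmed in this thread):

```swift
import CoreGraphics

// Intersection-over-union of two boxes in the same pixel coordinate space.
func iou(_ a: CGRect, _ b: CGRect) -> Float {
    let inter = a.intersection(b)
    guard !inter.isNull else { return 0 }
    let interArea = inter.width * inter.height
    let unionArea = a.width * a.height + b.width * b.height - interArea
    return unionArea > 0 ? Float(interArea / unionArea) : 0
}

// Greedy per-class NMS: keep the highest-confidence box, drop boxes that overlap it.
func nonMaxSuppression(_ detections: [Detection], iouThreshold: Float = 0.45) -> [Detection] {
    var kept: [Detection] = []
    for candidate in detections.sorted(by: { $0.confidence > $1.confidence }) {
        let suppressed = kept.contains {
            $0.classIndex == candidate.classIndex && iou($0.rect, candidate.rect) > iouThreshold
        }
        if !suppressed { kept.append(candidate) }
    }
    return kept
}
```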

@kir486680

@dlawrences could you please share at least some clues? I implemented NMS but I do not know what to do with the output from the model. I've seen some implementations here. Do you have something similar?


wmcnally commented Aug 3, 2021

@kir486680 @dlawrences any update on this?

@Workshed

If you come across this issue, there's a script here for creating a CoreML model which outputs the expected values: https://github.com/Workshed/yolov5-coreml-export
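
Once the exported model embeds the decode and NMS stages as a pipeline (which is what a script along these lines aims to produce), Vision returns VNRecognizedObjectObservation directly. A minimal consumption sketch with illustrative names:

```swift
import Vision
import CoreML
import CoreVideo

// Runs a pipeline model (decode + NMS embedded) on a pixel buffer and prints
// the recognized objects. `detect` and its parameters are illustrative names.
func detect(in pixelBuffer: CVPixelBuffer, with model: VNCoreMLModel) throws {
    let request = VNCoreMLRequest(model: model) { request, _ in
        guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
        for object in results {
            // boundingBox is normalized (0...1) with a lower-left origin.
            let label = object.labels.first
            print(label?.identifier ?? "?", label?.confidence ?? 0, object.boundingBox)
        }
    }
    request.imageCropAndScaleOption = .scaleFill
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try handler.perform([request])
}
```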

@glenn-jocher

@Workshed thanks for sharing your script for creating a CoreML model that outputs the expected values with YOLOv5. This will be helpful for others who are looking to work with YOLOv5 in CoreML. Keep up the great work!
