
[Feature] Alternative runtime for AI model #787

Closed
e-fominov opened this issue May 4, 2023 · 3 comments


@e-fominov
Contributor

Is your feature request related to a problem? Please describe.
The current AI implementation uses YOLOv2 with the Darknet runtime, which is (was) great for development and experiments, but there are now many better ways to run AI inference. For Nvidia GPUs it is TensorRT. For Intel CPUs, Intel GPUs and accelerators it is OpenVINO. Mobile devices and Google TPUs work great with TensorFlow Lite. Tiny single-board computers with Arm processors are fast with NCNN. Many new chips ship with dedicated AI accelerators and their own libraries. Users may want to decide which runtime they use.
I have seen many discussions on Discord since 2019 and here in issues like #21 and #96 where people tried to improve the situation or at least narrow down the problem, but I can't tell whether this is actually on the roadmap.

Describe the solution you'd like
We can't simply train a new model with the existing data, because the data is not public (a small set of videos without any annotations is published in #310, which is not enough to train a model). But we can convert the existing model into a format compatible with modern NN inference libraries. In my experience, the most straightforward way is to first convert the Darknet format into ONNX and then convert ONNX into anything else. Here is an example of how this can be done, but that example does not work out of the box and requires some coding.
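As a very rough sketch of that export step, assuming the YOLOv2 graph has been re-implemented in PyTorch and the Darknet weights loaded into it (the `DarknetYoloV2` class and `load_darknet_weights` helper below are hypothetical placeholders, not existing tools):

```python
import torch

# DarknetYoloV2 and load_darknet_weights are hypothetical placeholders --
# parsing the cfg/weights files is the "requires some coding" part mentioned above.
model = DarknetYoloV2(cfg_path="yolov2.cfg")      # hypothetical PyTorch module
load_darknet_weights(model, "yolov2.weights")     # hypothetical weight loader
model.eval()

# YOLOv2 commonly uses a 416x416 input; the actual model may differ.
dummy = torch.zeros(1, 3, 416, 416)
torch.onnx.export(
    model,
    dummy,
    "model.onnx",
    input_names=["image"],
    output_names=["detections"],
    opset_version=11,
)
```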
Once we have converted models, we can make the current ML code detect the model file format from its extension and use the right runtime for it.
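For illustration, that dispatch could look roughly like this (the wrapper class names are made up, not the actual Obico code; each wrapper would hide its runtime's setup behind a common detection interface):

```python
import os

def load_detector(model_path):
    """Pick an inference backend from the model file extension (sketch only)."""
    ext = os.path.splitext(model_path)[1].lower()
    if ext == ".onnx":
        return OnnxDetector(model_path)      # hypothetical onnxruntime wrapper
    if ext == ".weights":
        return DarknetDetector(model_path)   # hypothetical wrapper around the current Darknet code
    if ext == ".tflite":
        return TfLiteDetector(model_path)    # hypothetical TensorFlow Lite wrapper
    raise ValueError(f"Unsupported model format: {ext}")
```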
I am expecting a 2x-10x performance improvement once this work is done. It should reduce the load on (and the cost of) the current cloud servers and will allow running better/more complex models in the future, making the Obico product even better.
This will also allow us to drop the precompiled Darknet binaries from the repository and extend the set of compatible processors/architectures (like Apple M1 in #434).

Describe alternatives you've considered
A better way would be to train new/better models. Even if the training data can't be published, publishing the testing/validation data would let people use their own training data and contribute better models if they manage to train any.

Additional context
Before submitting this issue, I tried running the ONNX-converted model with ONNX Runtime. On my desktop PC with an Intel i7-6700K CPU, the current code takes 1.1 s to run inference on a single image, while ONNX Runtime does the same work with the same model and the same input image in 0.078 s (14x) on the CPU and 0.013 s on the GPU. This was a quick-and-dirty benchmark, but it shows that there is something to improve.
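For reference, a minimal version of such a timing check with onnxruntime could look like this (the model path and 1x3x416x416 input shape are assumptions about the converted model; swap the provider for `CUDAExecutionProvider` to test on a GPU):

```python
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name
image = np.random.rand(1, 3, 416, 416).astype(np.float32)

sess.run(None, {input_name: image})              # warm-up run
start = time.perf_counter()
sess.run(None, {input_name: image})
print(f"single inference: {time.perf_counter() - start:.3f} s")
```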

So the question is: is there a place for this change? Maybe it's already on the roadmap or implemented somewhere?

@kennethjiang
Contributor

Sorry just saw this issue and your PR. Let's continue our conversation here.

We will be happy to provide the full training set if you can sign an NDA with us. We are just not authorized to make most of the data public.

@jfrilot

jfrilot commented May 19, 2023

A TensorFlow Lite version would be a great addition. Google Edge TPU devices are widely available now (except the USB version) and inexpensive. The Home Assistant community is already using Google Coral devices for Frigate on Raspberry Pis and home servers.

@kennethjiang
Contributor

PR merged
