
Error compiling YOLOv8 to HEF #111

Open · Will-UEA opened this issue Jul 18, 2024 · 8 comments

Comments

@Will-UEA

I'm trying to compile a custom model trained with YOLOv8s so I can use it on the Raspberry Pi 5. But when it gets to "Starting Layer Noise Analysis," it throws an error. Any idea what could be wrong? I've tried searching but couldn't find anything specific.

hailo_model_optimization.acceleras.utils.acceleras_exceptions.SubprocessTracebackFailure: Subprocess failed with traceback
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 73, in child_wrapper
    func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/flows/optimization_flow.py", line 347, in step3
    self.finalize_optimization()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/flows/optimization_flow.py", line 405, in finalize_optimization
    self.noise_analysis()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/flows/optimization_flow.py", line 585, in noise_analysis
    algo.run()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/optimization_algorithm.py", line 50, in run
    return super().run()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/algorithm_base.py", line 151, in run
    self.run_int()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/hailo_layer_noise_analysis.py", line 83, in run_int
    self.analyze_full_quant_net()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/hailo_layer_noise_analysis.py", line 197, in analyze_full_quant_net
    lat_model.predict_on_batch(inputs)
  File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2603, in predict_on_batch
    outputs = self.predict_function(iterator)
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filez7ousod7.py", line 15, in tf__predict_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
  File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2155, in step_function
    outputs = model.distribute_strategy.run(run_step, args=(data,))
  File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2143, in run_step
    outputs = model.predict_step(data)
  File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2111, in predict_step
    return self(x, training=False)
  File "/usr/local/lib/python3.10/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_file6wtrfjh0.py", line 188, in tf__call
    ag__.for_stmt(ag__.converted_call(ag__.ld(self).model.flow.toposort, (), None, fscope), None, loop_body_5, get_state_9, set_state_9, (), {'iterate_names': 'lname'})
  File "/tmp/__autograph_generated_file6wtrfjh0.py", line 167, in loop_body_5
    ag__.if_stmt(ag__.not_(continue__1), if_body_3, else_body_3, get_state_8, set_state_8, (), 0)
  File "/tmp/__autograph_generated_file6wtrfjh0.py", line 94, in if_body_3
    n_ancestors = ag__.converted_call(ag__.ld(self)._native_model.flow.ancestors, (ag__.ld(lname),), None, fscope)
  File "/tmp/__autograph_generated_fileh91llgie.py", line 12, in tf__ancestors
    retval_ = ag__.converted_call(ag__.ld(nx).ancestors, (ag__.ld(self), ag__.ld(source)), None, fscope)
TypeError: in user code:

    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2169, in predict_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2155, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2143, in run_step  **
        outputs = model.predict_step(data)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2111, in predict_step
        return self(x, training=False)
    File "/usr/local/lib/python3.10/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_file6wtrfjh0.py", line 188, in tf__call
        ag__.for_stmt(ag__.converted_call(ag__.ld(self).model.flow.toposort, (), None, fscope), None, loop_body_5, get_state_9, set_state_9, (), {'iterate_names': 'lname'})
    File "/tmp/__autograph_generated_file6wtrfjh0.py", line 167, in loop_body_5
        ag__.if_stmt(ag__.not_(continue__1), if_body_3, else_body_3, get_state_8, set_state_8, (), 0)
    File "/tmp/__autograph_generated_file6wtrfjh0.py", line 94, in if_body_3
        n_ancestors = ag__.converted_call(ag__.ld(self)._native_model.flow.ancestors, (ag__.ld(lname),), None, fscope)
    File "/tmp/__autograph_generated_fileh91llgie.py", line 12, in tf__ancestors
        retval_ = ag__.converted_call(ag__.ld(nx).ancestors, (ag__.ld(self), ag__.ld(source)), None, fscope)

    TypeError: Exception encountered when calling layer 'lat_model' (type LATModel).

    in user code:

        File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/lat_utils/lat_model.py", line 340, in call  *
            n_ancestors = self._native_model.flow.ancestors(lname)
        File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/acceleras/model/hailo_model/model_flow.py", line 31, in ancestors  *
            return nx.ancestors(self, source)

        TypeError: outer_factory.<locals>.inner_factory.<locals>.tf__func() missing 1 required keyword-only argument: '__wrapper'

    Call arguments received by layer 'lat_model' (type LATModel):
      • inputs=tf.Tensor(shape=(8, 640, 640, 3), dtype=float32)

@omerwer

omerwer commented Jul 18, 2024

Hi @Will-UEA,
It's difficult to know what the issue is without examining the model and the command you ran.
First, try to run the optimization process with an optimization level of 0 (you can disable the GPU by adding CUDA_VISIBLE_DEVICES=999 before the command).
In either case, if you can, please open a ticket in our ticketing system on the Hailo website with the relevant info + files (the ONNX you used, for example), or contact me via email at omerw@hailo.ai with the relevant info.

Regards,
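
(For anyone following along: a minimal sketch of that suggestion using the DFC Python API, hailo_sdk_client. The HAR name, the calibration file, and the exact model-script syntax are assumptions and may vary between DFC versions.)

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "999"  # hide the GPU, as suggested above

import numpy as np
from hailo_sdk_client import ClientRunner

# Load the parsed network (hypothetical HAR produced by the parsing step).
runner = ClientRunner(har="yolov8s.har")

# Drop the optimization level to 0 via a model-script command (assumed syntax).
runner.load_model_script("model_optimization_flavor(optimization_level=0)\n")

calib = np.load("calib_set.npy")  # hypothetical pre-built calibration array, shape (N, 640, 640, 3)
runner.optimize(calib)
runner.save_har("yolov8s_quantized.har")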

@nadaved1

nadaved1 commented Jul 18, 2024 via email

@Will-UEA
Author

> Hi @Will-UEA, It's difficult to know what the issue is without examining the model and the command you ran. First, try to run the optimization process with an optimization level of 0 (you can disable the GPU by adding CUDA_VISIBLE_DEVICES=999 before the command). In either case, if you can, please open a ticket in our ticketing system on the Hailo website with the relevant info + files (the ONNX you used, for example), or contact me via email at omerw@hailo.ai with the relevant info.
>
> Regards,

I'll try to do that as soon as I get back from work. Once I do, I'll come back here.

@Will-UEA
Author

> Please consult on the community forum

Should I start a thread there?

@Will-UEA
Author

The CLI command I was using was:
hailomz compile yolov8s --ckpt yolov8s.onnx --hw-arch hailo8l --calib-path /home/hailo_model_zoo/Retreino/images --classes 3 --performance
This command was giving me an error in the section I mentioned earlier.
I tried using the Python code available in the DFC and managed to get past the optimization (I didn't encounter the same error).
However, when compiling with the Python code, I got an error (I forgot to capture it) and will try again when I get home.
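
(For reference, the DFC Python flow mentioned above generally looks roughly like the sketch below, based on the public DFC tutorial notebooks; the file names here are assumptions and the exact API may differ between DFC versions.)

import numpy as np
from hailo_sdk_client import ClientRunner

# Parse the ONNX into a Hailo network description.
runner = ClientRunner(hw_arch="hailo8l")
runner.translate_onnx_model("yolov8s.onnx", "yolov8s")

# Calibration images, preprocessed the same way the network expects
# (hypothetical file, roughly shape (N, 640, 640, 3)).
calib = np.load("calib_set.npy")
runner.optimize(calib)

# Compile to a HEF and write it to disk.
hef = runner.compile()
with open("yolov8s.hef", "wb") as f:
    f.write(hef)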

@Armtronix2021

Armtronix2021 commented Jul 20, 2024

For the past two or three days I have been trying to compile a custom model to HEF, and I am facing a similar issue.
I have followed the procedure described in the links below:
(https://github.com/hailo-ai/hailo_model_zoo/tree/833ae6175c06dbd6c3fc8faeb23659c9efaa2dbe/training/yolov8)
(https://github.com/hailo-ai/hailo-rpi5-examples/blob/main/doc/retraining-example.md#using-yolov8-retraining-docker)

I used Docker to do the training on my dataset. The commands I ran inside the Docker container are below.
For training:
yolo detect train data=/home/abc/Image_Processing_Code/Image_Processing_MF_Form_Hailo/MF-object-detection-4/data.yaml model=yolov8s.pt name=MF_yolov8s_n epochs=300 batch=16
For Export to ONNX:
yolo export model=/workspace/ultralytics/runs/detect/MF_yolov8s_n/weights/best.pt imgsz=640 format=onnx opset=11
Copying the model from the Docker container to the host system:

cp runs/detect/MF_yolov8s_n/weights/best.onnx /home/abc/Image_Processing_Code/Image_Processing_MF_Form_Hailo/yolov8s.onnx

Please note I renamed it to yolov8s.onnx, as I read this somewhere on GitHub; see these issues:
(#85)
(#94)

I exited the Docker container and then ran the following command:
hailomz compile yolov8s --ckpt /home/abc/Image_Processing_Code/Image_Processing_MF_Form_Hailo/yolov8s.onnx --calib-path /home/abc/Image_Processing_Code/Image_Processing_MF_Form_Hailo/MF-object-detection-4/test/images --hw-arch hailo8l --classes 2 --performance

Once I do this, I get the following error:

hailo_model_optimization.acceleras.utils.acceleras_exceptions.NegativeSlopeExponentNonFixable: Quantization failed in layer yolov8s/conv42 due to unsupported required slope. Desired shift is 14.0, but op has only 8 data bits. This error raises when the data or weight range are not balanced. Mostly happens when using random calibration-set/weights, the calibration-set is not normalized properly or batch-normalization was not used during training.

I have tried using the model directly in .pt format on my system (not on the Pi 5); it works without any issue. I am just a beginner, so I am not sure what I am doing incorrectly.
Anyone who can point me in the right direction would be a great help.
Attaching a copy of the error:
Complete error.txt
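
(As an aside, the error text above points at the calibration set. Below is a small sketch of building one from real training images rather than random data, assuming plain 640x640 RGB frames, which must match your actual preprocessing, and a hypothetical output file calib_set.npy.)

import glob
import numpy as np
from PIL import Image

# Use a few dozen real images from the dataset, not random data.
paths = sorted(glob.glob("MF-object-detection-4/test/images/*.jpg"))[:64]
imgs = [np.array(Image.open(p).convert("RGB").resize((640, 640))) for p in paths]

calib = np.stack(imgs).astype(np.float32)  # shape (N, 640, 640, 3)
np.save("calib_set.npy", calib)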

@nadaved1

nadaved1 commented Jul 20, 2024 via email

@Will-UEA
Author

Good morning, nadaved

I ran the optimization process again with the parameter you mentioned added. Here are the results I got:

yoloteste/output_layer2 SNR: 4.574 dB

yoloteste/output_layer1 SNR: -37.9 dB

Do these values look correct?
