
Error compiling YOLOv8 to HEF #111

Open · Will-UEA opened this issue Jul 18, 2024 · 8 comments

Comments

@Will-UEA

I'm trying to compile a custom model trained with YOLOv8s so I can use it on the Raspberry Pi 5. But when it gets to "Starting Layer Noise Analysis," it throws an error. Any idea what could be wrong? I've tried searching but couldn't find anything specific.

hailo_model_optimization.acceleras.utils.acceleras_exceptions.SubprocessTracebackFailure: Subprocess failed with traceback
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 73, in child_wrapper
    func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/flows/optimization_flow.py", line 347, in step3
    self.finalize_optimization()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/flows/optimization_flow.py", line 405, in finalize_optimization
    self.noise_analysis()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/flows/optimization_flow.py", line 585, in noise_analysis
    algo.run()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/optimization_algorithm.py", line 50, in run
    return super().run()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/algorithm_base.py", line 151, in run
    self.run_int()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/hailo_layer_noise_analysis.py", line 83, in run_int
    self.analyze_full_quant_net()
  File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/hailo_layer_noise_analysis.py", line 197, in analyze_full_quant_net
    lat_model.predict_on_batch(inputs)
  File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2603, in predict_on_batch
    outputs = self.predict_function(iterator)
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filez7ousod7.py", line 15, in tf__predict_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
  File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2155, in step_function
    outputs = model.distribute_strategy.run(run_step, args=(data,))
  File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2143, in run_step
    outputs = model.predict_step(data)
  File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2111, in predict_step
    return self(x, training=False)
  File "/usr/local/lib/python3.10/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_file6wtrfjh0.py", line 188, in tf__call
    ag__.for_stmt(ag__.converted_call(ag__.ld(self).model.flow.toposort, (), None, fscope), None, loop_body_5, get_state_9, set_state_9, (), {'iterate_names': 'lname'})
  File "/tmp/__autograph_generated_file6wtrfjh0.py", line 167, in loop_body_5
    ag__.if_stmt(ag__.not_(continue__1), if_body_3, else_body_3, get_state_8, set_state_8, (), 0)
  File "/tmp/__autograph_generated_file6wtrfjh0.py", line 94, in if_body_3
    n_ancestors = ag__.converted_call(ag__.ld(self)._native_model.flow.ancestors, (ag__.ld(lname),), None, fscope)
  File "/tmp/__autograph_generated_fileh91llgie.py", line 12, in tf__ancestors
    retval_ = ag__.converted_call(ag__.ld(nx).ancestors, (ag__.ld(self), ag__.ld(source)), None, fscope)
TypeError: in user code:

    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2169, in predict_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2155, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2143, in run_step  **
        outputs = model.predict_step(data)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 2111, in predict_step
        return self(x, training=False)
    File "/usr/local/lib/python3.10/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_file6wtrfjh0.py", line 188, in tf__call
        ag__.for_stmt(ag__.converted_call(ag__.ld(self).model.flow.toposort, (), None, fscope), None, loop_body_5, get_state_9, set_state_9, (), {'iterate_names': 'lname'})
    File "/tmp/__autograph_generated_file6wtrfjh0.py", line 167, in loop_body_5
        ag__.if_stmt(ag__.not_(continue__1), if_body_3, else_body_3, get_state_8, set_state_8, (), 0)
    File "/tmp/__autograph_generated_file6wtrfjh0.py", line 94, in if_body_3
        n_ancestors = ag__.converted_call(ag__.ld(self)._native_model.flow.ancestors, (ag__.ld(lname),), None, fscope)
    File "/tmp/__autograph_generated_fileh91llgie.py", line 12, in tf__ancestors
        retval_ = ag__.converted_call(ag__.ld(nx).ancestors, (ag__.ld(self), ag__.ld(source)), None, fscope)

    TypeError: Exception encountered when calling layer 'lat_model' (type LATModel).

    in user code:

        File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/algorithms/lat_utils/lat_model.py", line 340, in call  *
            n_ancestors = self._native_model.flow.ancestors(lname)
        File "/usr/local/lib/python3.10/dist-packages/hailo_model_optimization/acceleras/model/hailo_model/model_flow.py", line 31, in ancestors  *
            return nx.ancestors(self, source)

        TypeError: outer_factory.<locals>.inner_factory.<locals>.tf__func() missing 1 required keyword-only argument: '__wrapper'

    Call arguments received by layer 'lat_model' (type LATModel):
      • inputs=tf.Tensor(shape=(8, 640, 640, 3), dtype=float32)

@omerwer

omerwer commented Jul 18, 2024

Hi @Will-UEA,
It's difficult to know what the issue is without examining the model and the command you ran.
First, try to run the optimization process with an optimization level of 0 (you can disable the GPU by adding CUDA_VISIBLE_DEVICES=999 before the command).
In either case, if you can, please open a ticket in our ticketing system on the Hailo website with the relevant info + files (the ONNX you used, for example), or contact me via email at omerw@hailo.ai with the relevant info.

Regards,
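
(For anyone following along: a minimal sketch of that suggestion using the DFC Python API, hailo_sdk_client. The HAR name, the calibration file, and the exact model-script syntax are assumptions and may vary between DFC versions.)

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "999"  # hide the GPU, as suggested above

import numpy as np
from hailo_sdk_client import ClientRunner

# Load the parsed network (hypothetical HAR produced by the parsing step).
runner = ClientRunner(har="yolov8s.har")

# Drop the optimization level to 0 via a model-script command (assumed syntax).
runner.load_model_script("model_optimization_flavor(optimization_level=0)\n")

calib = np.load("calib_set.npy")  # hypothetical pre-built calibration array, shape (N, 640, 640, 3)
runner.optimize(calib)
runner.save_har("yolov8s_quantized.har")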

@nadaved1

nadaved1 commented Jul 18, 2024 via email

@Will-UEA
Author

> Hi @Will-UEA, It's difficult to know what the issue is without examining the model and the command you ran. First, try to run the optimization process with an optimization level of 0 (you can disable the GPU by adding CUDA_VISIBLE_DEVICES=999 before the command). In either case, if you can, please open a ticket in our ticketing system on the Hailo website with the relevant info + files (the ONNX you used, for example), or contact me via email at omerw@hailo.ai with the relevant info.
>
> Regards,

I'll try to do that as soon as I get back from work. Once I do, I'll come back here.

@Will-UEA
Author

> Please consult on the community forum

Should I start a thread there?

@Will-UEA
Author

The CLI command I was using was:
hailomz compile yolov8s --ckpt yolov8s.onnx --hw-arch hailo8l --calib-path /home/hailo_model_zoo/Retreino/images --classes 3 --performance
This command was giving me an error in the section I mentioned earlier.
I tried using the Python code available in the DFC and managed to get past the optimization (I didn't encounter the same error).
However, when compiling with the Python code, I got an error (I forgot to capture it) and will try again when I get home.
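
(For reference, the DFC Python flow mentioned above generally looks roughly like the sketch below, based on the public DFC tutorial notebooks; the file names here are assumptions and the exact API may differ between DFC versions.)

import numpy as np
from hailo_sdk_client import ClientRunner

# Parse the ONNX into a Hailo network description.
runner = ClientRunner(hw_arch="hailo8l")
runner.translate_onnx_model("yolov8s.onnx", "yolov8s")

# Calibration images, preprocessed the same way the network expects
# (hypothetical file, roughly shape (N, 640, 640, 3)).
calib = np.load("calib_set.npy")
runner.optimize(calib)

# Compile to a HEF and write it to disk.
hef = runner.compile()
with open("yolov8s.hef", "wb") as f:
    f.write(hef)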

@Armtronix2021

Armtronix2021 commented Jul 20, 2024

For the past two or three days I have been trying to compile a custom model to HEF, and I am facing a similar issue.
I have followed the procedure described in the links below:
(https://github.com/hailo-ai/hailo_model_zoo/tree/833ae6175c06dbd6c3fc8faeb23659c9efaa2dbe/training/yolov8)
(https://github.com/hailo-ai/hailo-rpi5-examples/blob/main/doc/retraining-example.md#using-yolov8-retraining-docker)

I used Docker to do the training on my dataset. The commands I ran inside the Docker container are below.
For training:
yolo detect train data=/home/abc/Image_Processing_Code/Image_Processing_MF_Form_Hailo/MF-object-detection-4/data.yaml model=yolov8s.pt name=MF_yolov8s_n epochs=300 batch=16
For Export to ONNX:
yolo export model=/workspace/ultralytics/runs/detect/MF_yolov8s_n/weights/best.pt imgsz=640 format=onnx opset=11
Copying the model from the Docker container to the host system:

cp runs/detect/MF_yolov8s_n/weights/best.onnx /home/abc/Image_Processing_Code/Image_Processing_MF_Form_Hailo/yolov8s.onnx

Please note I renamed it to yolov8s.onnx, as I read this somewhere on GitHub; see these issues:
(#85)
(#94)

I exited the Docker container and then ran the following command:
hailomz compile yolov8s --ckpt /home/abc/Image_Processing_Code/Image_Processing_MF_Form_Hailo/yolov8s.onnx --calib-path /home/abc/Image_Processing_Code/Image_Processing_MF_Form_Hailo/MF-object-detection-4/test/images --hw-arch hailo8l --classes 2 --performance

Once I do this, I get the following error:

hailo_model_optimization.acceleras.utils.acceleras_exceptions.NegativeSlopeExponentNonFixable: Quantization failed in layer yolov8s/conv42 due to unsupported required slope. Desired shift is 14.0, but op has only 8 data bits. This error raises when the data or weight range are not balanced. Mostly happens when using random calibration-set/weights, the calibration-set is not normalized properly or batch-normalization was not used during training.

I have tried using the model directly in .pt format on my system (not on the Pi 5); it works without any issue. I am just a beginner, so I am not sure what I am doing incorrectly.
Anyone who can point me in the right direction would be a great help.
Attaching a copy of the error:
Complete error.txt
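
(As an aside, the error text above points at the calibration set. Below is a small sketch of building one from real training images rather than random data, assuming plain 640x640 RGB frames, which must match your actual preprocessing, and a hypothetical output file calib_set.npy.)

import glob
import numpy as np
from PIL import Image

# Use a few dozen real images from the dataset, not random data.
paths = sorted(glob.glob("MF-object-detection-4/test/images/*.jpg"))[:64]
imgs = [np.array(Image.open(p).convert("RGB").resize((640, 640))) for p in paths]

calib = np.stack(imgs).astype(np.float32)  # shape (N, 640, 640, 3)
np.save("calib_set.npy", calib)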

@nadaved1

nadaved1 commented Jul 20, 2024 via email

@Will-UEA
Author

Good morning, nadaved

I ran the optimization process again with the parameter you mentioned added. Here are the results I got:

yoloteste/output_layer2 SNR: 4.574 dB

yoloteste/output_layer1 SNR: -37.9 dB

Do these values look correct?
