
yolov5-lite #3168

Closed
debapriyamaji opened this issue May 14, 2021 · 21 comments
Labels
enhancement New feature or request Stale

Comments

@debapriyamaji

debapriyamaji commented May 14, 2021

🚀 Feature

Yolov5 lite models: Making yolov5 more embedded friendly

Motivation

In line with the efficientdet-lite models, which are more embedded-friendly than efficientdet, is there any similar plan for yolov5-lite models?

Pitch

The following layers in yolov5 are not embedded-friendly:

  • the slice layer at the beginning;
  • the SiLU activation function.

If we make suitable changes to these layers, the models can be deployed much more efficiently on embedded devices.

I have trained some models with such changes to the above-mentioned layers and got accuracy within 2% of the original model.

If interested, I would love to share those results.
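To make the two pain points concrete, here is a small NumPy sketch (illustrative only, not the actual trained configuration) of the slice operation and of the SiLU/ReLU difference:

```python
import numpy as np

# Illustrative sketch (not the official implementation) of the two
# embedded-unfriendly ops named above and their cheaper substitutes.

def silu(x):
    # SiLU (x * sigmoid(x)): needs an exp per element, which many
    # embedded NPUs/toolchains do not support or quantize poorly
    return x / (1.0 + np.exp(-x))

def relu(x):
    # ReLU: a single comparison per element, universally supported
    return np.maximum(x, 0.0)

# The initial slice (Focus) layer: four strided slices concatenated on channels
x = np.random.rand(1, 3, 8, 8).astype(np.float32)  # (N, C, H, W)
sliced = np.concatenate(
    [x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]],
    axis=1)
assert sliced.shape == (1, 12, 4, 4)  # 4x channels, half resolution
# An ordinary stride-2 convolution with 12 output channels produces the same
# output shape directly from the 3-channel input, avoiding the slicing.
```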

Additional context

Efficientdet-lite models as compared to efficientdet.

@debapriyamaji debapriyamaji added the enhancement New feature or request label May 14, 2021
@github-actions
Contributor

github-actions bot commented May 14, 2021

👋 Hello @debapriyamaji, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

@glenn-jocher
Member

@debapriyamaji sounds interesting, can you share some quantitative results before and after the changes? We have an activations study here that compares SiLU against some alternatives: https://wandb.ai/glenn-jocher/activations?workspace=user-glenn-jocher

I also recently created some documentation on the Focus() layer here that might interest you in #3181

@debapriyamaji
Author

debapriyamaji commented May 17, 2021

@glenn-jocher Thanks for the reply and the insights. My ReLU results were in line with yours.

Following are the changes that I have made step by step for yolov5s:

| Model config | PyTorch mAP/AP50 | GFLOPs | Comment |
| --- | --- | --- | --- |
| Official model | 36.7/55.4 | 17.0 | |
| Retrained official model | 36.9/55.6 | 17.0 | |
| SiLU replaced by ReLU | 34.9/53.7 | 17.0 | |
| SiLU replaced by ReLU + slice replaced by conv (would like to call it yolov5-lite) | 35.0/54.4 | 17.1 | Replaced the initial slice layer with a conv (No=12, Ni=3, K=3, S=2). |

Let me know what you think.

@wudashuo
Contributor

I've done exactly the same thing and moved the model to devices months ago, following somebody's blog. The results I tested on devices seem okay, but there is still an accuracy loss compared to yolov5s, plus some additional loss during migration.
Since many devices don't support SiLU and Focus, I suppose many people have made these changes to migrate the model to embedded devices, but there is still a lot to improve; simply replacing SiLU with ReLU and Focus with Conv is just a compromise. I've been working on this for months, trying to find a model under 15 GFLOPs with more than 35 mAP. If you have some ideas, please let me know.
By the way, I don't think it should be called yolov5-lite: its parameters and GFLOPs are larger than yolov5s, so it's not lighter than yolov5s.

@glenn-jocher
Member

@debapriyamaji @wudashuo one other point is that the Focus() module can be implemented with no slicing by allowing it to use the Contract() module. This provides no benefit on PyTorch/CUDA but may help other deployment targets.

yolov5/models/common.py

Lines 163 to 187 in b7cd1f5

class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Focus, self).__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act)
        # self.contract = Contract(gain=2)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
        # return self.conv(self.contract(x))


class Contract(nn.Module):
    # Contract width-height into channels, i.e. x(1,64,80,80) to x(1,256,40,40)
    def __init__(self, gain=2):
        super().__init__()
        self.gain = gain

    def forward(self, x):
        N, C, H, W = x.size()  # assert (H / s == 0) and (W / s == 0), 'Indivisible gain'
        s = self.gain
        x = x.view(N, C, H // s, s, W // s, s)  # x(1,64,40,2,40,2)
        x = x.permute(0, 3, 5, 1, 2, 4).contiguous()  # x(1,2,2,64,40,40)
        return x.view(N, C * s * s, H // s, W // s)  # x(1,256,40,40)
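The slice-based forward and the commented-out Contract path produce the same data up to a channel permutation, which the following learned Conv absorbs during training. A quick NumPy check of that equivalence (a sketch, independent of the actual yolov5 code):

```python
import numpy as np

N, C, H, W = 1, 3, 4, 4
x = np.arange(N * C * H * W, dtype=np.float32).reshape(N, C, H, W)

# Focus-style slicing: four pixel phases stacked on the channel axis
focus = np.concatenate(
    [x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]],
    axis=1)

# Contract-style view/permute/view with gain s=2
s = 2
c = x.reshape(N, C, H // s, s, W // s, s)  # split H and W into (H/2, 2), (W/2, 2)
c = c.transpose(0, 3, 5, 1, 2, 4)          # move the two phase axes ahead of C
contract = c.reshape(N, C * s * s, H // s, W // s)

assert focus.shape == contract.shape == (N, 4 * C, H // s, W // s)
# Same elements, different channel ordering -- a learned conv absorbs the permutation
assert np.array_equal(np.sort(focus, axis=None), np.sort(contract, axis=None))
```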

Also I'd argue that SiLU is well supported on many backends. We use it in our YOLOv5 CoreML models in our iOS App, where YOLOv5s runs in <18ms on iPhone 11/12. See https://apps.apple.com/app/id1452689527

@debapriyamaji
Author

debapriyamaji commented May 19, 2021

Hi,
@wudashuo You can achieve your goal of 15 GFLOPs and 35 mAP by running yolov5s6 at a resolution of 576x576. Running inference with the pretrained checkpoint at an input resolution of 576x576, I got an mAP of 37.7 for 14.15 GFLOPs.

Regarding yolov5-lite, the GFLOPs number does indeed look higher. However, these numbers count the convolutions only. If you account for the sigmoid and multiplication operations needed for SiLU, yolov5 will be higher than yolov5-lite. Likewise, efficientnet-lite's complexity is almost the same as efficientnet's; the main difference is in porting these models to embedded devices.
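One way to make that accounting explicit is to add an elementwise activation cost to the conv-only GFLOPs. A sketch follows; the element count and per-activation op costs are placeholders, not measured values:

```python
def total_gflops(conv_gflops, act_elems, ops_per_activation):
    # conv GFLOPs (as reported in the table) plus elementwise activation cost;
    # SiLU needs several ops per element (sigmoid + multiply), ReLU roughly one
    return conv_gflops + act_elems * ops_per_activation / 1e9

# Hypothetical 60M activation elements per inference (placeholder, not measured)
silu_total = total_gflops(17.0, 60e6, 4)  # SiLU model, ~4 ops/element assumed
relu_total = total_gflops(17.1, 60e6, 1)  # "lite" model, ~1 op/element assumed
```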

@debapriyamaji
Author

debapriyamaji commented May 22, 2021

@glenn-jocher Thanks for the insight regarding Focus and SiLU layer. I have some follow-up questions:

  • Is there any FPS benchmarking data available for the different yolov5 models on Apple devices?
  • Did you run these models in 8-bit mode?
  • How did you conclude that SiLU is well supported on Apple devices? Did you run models with both ReLU and SiLU and observe no difference in performance?

Thanks in advance.

@glenn-jocher
Member

@debapriyamaji see iOS iDetection Speed Table #1276 for YOLOv5 benchmarks on iPhone models. Quantization seems to have no effect on ANE throughput. SiLU tests work perfectly well in iDetection.

@debapriyamaji
Author

@glenn-jocher Thanks for pointing to the FPS table. Is there a similar table for accuracy that shows the drop caused by quantization?
I am trying to benchmark the ReLU model against the SiLU model after quantization in tflite and will share the results once I am done. Since ReLU is more quantization-friendly than SiLU, I am expecting a similar trend here as well. Thanks.

@glenn-jocher
Member

@debapriyamaji there is no drop in accuracy when moving from FP32 to FP16 inference.

@debapriyamaji
Author

debapriyamaji commented May 24, 2021

@glenn-jocher Sorry for not mentioning it in the previous post. I meant the drop in accuracy due to FP8, since the CoreML models are exported as FP8.

One more point I wanted to ask about: the A14 processor is ~11.0 TOPS. Yolov5s @ 320x192 is ~2.55 GOPS. If it runs at 14.3 ms, or ~70 FPS, the GOPS utilization is (70 * 2.55) GOPS = 178.5 GOPS. That's only 1.6% of the total compute power, so the efficiency is quite low, right?
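The utilization arithmetic can be checked directly (all figures as quoted in this comment):

```python
ane_tops = 11.0     # A14 compute, trillions of ops/s (as quoted)
model_gops = 2.55   # YOLOv5s @ 320x192, giga-ops per inference (as quoted)
fps = 1000 / 14.3   # 14.3 ms latency -> ~70 FPS

used_gops = model_gops * fps                # ~178 GOPS sustained
utilization = used_gops / (ane_tops * 1e3)  # convert TOPS to GOPS
print(f"{utilization:.1%}")                 # prints 1.6%
```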

Thanks.

@developer0hye
Contributor

@glenn-jocher
Wow, FReLU is so powerful... Do you have a plan to use FReLU instead of the ReLU layer in a new yolov5?

@glenn-jocher
Member

glenn-jocher commented Jun 10, 2021

@developer0hye we have an activations study on YOLOv5s here:
https://wandb.ai/glenn-jocher/activations
(screenshot: W&B activations study comparison, 2021-06-10)

Yes, FReLU performed very well, but you have to be careful interpreting these results as it's introducing additional convolutions into the model, which will especially help small models, but will cause faster overfitting in large models. Also memory usage increased from 12G to 18G when moving from SiLU to FReLU (though training speed was unaffected).

All in all, it requires a lot more study into applying it to larger models, and perhaps applying it only to certain areas of the model rather than the entire model (we could use feedback from the FReLU authors here); we just don't have the time or manpower.

EDIT1: I think memory usage could be reduced by not applying it as much to the earlier layers. In general, the P0, P1, P2 layers in the backbone (the first 1/3 of the backbone) all have very small strides with high memory usage and inference times.

@glenn-jocher
Member

@AyushExel can I borrow you to pass a feature request to the W&B team? We really need some semi-transparency on these legend overlays. Maybe 20-30% transparency would allow you to see the data under the legend, which would be great in my above screenshot.

@AyushExel
Contributor

@glenn-jocher I've passed it on :)

@glenn-jocher
Member

@AyushExel awesome thanks :) !

@Alex-afka

Alex-afka commented Jun 10, 2021

@glenn-jocher
Did you replace all the activation functions with frelu?
How does frelu work in yolov5l?

@glenn-jocher
Member

@Alex-afka see #2891 for details

@github-actions
Contributor

github-actions bot commented Jul 11, 2021

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Access additional YOLOv5 🚀 resources:

Access additional Ultralytics ⚡ resources:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@PrashantDixit0

@glenn-jocher and @debapriyamaji, are these YOLOv5-lite pretrained models available for research or testing purposes?

@glenn-jocher
Member

@PrashantDixit0 Yes, the YOLOv5s, YOLOv5m, YOLOv5l, YOLOv5x models are available with pretrained weights, for various tasks like object detection, instance segmentation, and more. You can find more information on using these models for research and testing purposes in the Ultralytics documentation at https://docs.ultralytics.com/yolov5/.
