
Fix MixConv2d() remove shortcut + apply depthwise #5410

Merged: 1 commit merged into master from fix/mixconv on Oct 30, 2021

Conversation

glenn-jocher (Member) commented on Oct 30, 2021

MixConv2d fixes:

  1. Apply depthwise (grouped) convolutions as described in the MixConv paper
  2. Remove the shortcut connection (it caused errors and is not part of the paper); a minimal sketch of the updated layer follows the validation results below

Validation test:

import torch

from utils.torch_utils import profile
from models.experimental import MixConv2d
from models.common import Conv

m1 = MixConv2d(128, 256, (3, 5), 1)  # updated MixConv2d: mixed 3x3 and 5x5 grouped kernels
m2 = Conv(128, 256, 3, 1)            # standard Conv baseline with a single 3x3 kernel
results = profile(input=torch.randn(16, 128, 80, 80), ops=[m1, m2], n=3)  # profile both ops over 3 runs

YOLOv5 πŸš€ v6.0-39-g3d9a368 torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)

      Params      GFLOPs  GPU_mem (GB)  forward (ms) backward (ms)                   input                  output
        4864      0.9961         0.684         4.922         12.76       (16, 128, 80, 80)       (16, 256, 80, 80)
      295424        60.5         0.990         9.727         8.917       (16, 128, 80, 80)       (16, 256, 80, 80)
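
For reference, here is a minimal sketch of the kind of layer these fixes describe: the output channels are split across the listed kernel sizes, each branch is a grouped convolution with groups = gcd(c1, branch channels) (depthwise when the two are equal), the activation is SiLU, and there is no shortcut. The class name MixConv2dSketch and the exact code are illustrative assumptions for this summary, not a copy of models/experimental.py.

import math

import torch
import torch.nn as nn


class MixConv2dSketch(nn.Module):
    # Illustrative MixConv-style layer (hypothetical, not the repository implementation).
    def __init__(self, c1, c2, k=(1, 3), s=1):
        super().__init__()
        n = len(k)  # number of kernel sizes
        # distribute the c2 output channels as evenly as possible over the n kernel sizes
        idx = torch.linspace(0, n - 1e-6, c2).floor()
        c_ = [int((idx == g).sum()) for g in range(n)]
        self.m = nn.ModuleList(
            nn.Conv2d(c1, c, ks, s, ks // 2, groups=math.gcd(c1, c), bias=False)
            for ks, c in zip(k, c_))
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU()  # Swish-1 activation, replacing LeakyReLU

    def forward(self, x):
        # concatenate the per-kernel outputs along the channel dimension; no residual shortcut
        return self.act(self.bn(torch.cat([m(x) for m in self.m], 1)))

With c1 = 128, c2 = 256 and k = (3, 5) as in the test above, each branch gets 128 channels and gcd(128, 128) = 128, so both branches are depthwise. That is consistent with the first profile row (4,864 params, m1 = MixConv2d) being far smaller than the second (295,424 params, m2 = the standard Conv).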

πŸ› οΈ PR Summary

Made with ❀️ by Ultralytics Actions

🌟 Summary

Enhanced neural network layers with SiLU activation and optimized MixConv2d implementation.

πŸ“Š Key Changes

  • Replaced the LeakyReLU activation with SiLU (the Swish-1 activation function) in certain network layers for potentially improved performance.
  • Refined the MixConv2d layer with a cleaner convolution strategy: grouped convolutions use the greatest common divisor (GCD) of the input and per-branch output channels, and output channels are distributed across the different kernel sizes (see the short demonstration after this list).
  • Kept the new SiLU activation on the inplace pattern, preserving the memory efficiency of the previously used activations.
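
As a quick, stand-alone demonstration of the channel-distribution and GCD points above (the exact repository code may differ), the bucketing can be written as:

import math

import torch

c1, c2, k = 128, 256, (3, 5)  # dims from the validation test in this PR
idx = torch.linspace(0, len(k) - 1e-6, c2).floor()   # kernel-size bucket for each output channel
c_ = [int((idx == g).sum()) for g in range(len(k))]  # output channels per kernel size
groups = [math.gcd(c1, c) for c in c_]               # grouped-conv groups per branch
print(c_, groups)  # [128, 128] [128, 128] -> every branch is a depthwise grouped convolution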

🎯 Purpose & Impact

  • Enhanced Model Performance: The switch to SiLU activation may lead to improved training results as SiLU often outperforms traditional activations like LeakyReLU in neural networks.
  • Optimized Convolutional Layers: the improved MixConv2d is more computationally efficient and distributes channels across the different convolutional kernel sizes more precisely.
  • Consistent Memory Efficiency: Maintaining inplace activation ensures the memory footprint of models is kept low, which is beneficial for users with limited computing resources.

The changes can result in more accurate models that are efficient in both operation and resource utilization, benefiting a wide range of users, from researchers to industry professionals implementing YOLOv5 in their systems.
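
For context on the activation mentioned above: SiLU (Swish-1) is defined as x * sigmoid(x), and PyTorch's nn.SiLU accepts the same inplace flag as LeakyReLU, which is how the inplace pattern is preserved. A small illustrative check:

import torch
import torch.nn as nn

x = torch.randn(4, 8)
silu = nn.SiLU()  # nn.SiLU(inplace=True) mirrors the inplace pattern previously used with LeakyReLU
assert torch.allclose(silu(x), x * torch.sigmoid(x))  # SiLU(x) = x * sigmoid(x)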

@glenn-jocher linked an issue on Oct 30, 2021 that may be closed by this pull request
@glenn-jocher changed the title to Fix MixConv2d() remove shortcut + apply depthwise on Oct 30, 2021
@glenn-jocher self-assigned this on Oct 30, 2021
@glenn-jocher merged commit 5d4258f into master on Oct 30, 2021
@glenn-jocher deleted the fix/mixconv branch on October 30, 2021 at 11:38
BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this pull request Aug 26, 2022
Development

Successfully merging this pull request may close these issues.

About the use of MixConv2d