
Rerange the blocks of Focus Layer into row major to be compatible with tensorflow SpaceToDepth #413

Closed
ausk opened this issue Jul 15, 2020 · 9 comments
Labels
enhancement New feature or request Stale

Comments


ausk commented Jul 15, 2020

🚀 Feature

Modify the Focus layer to use row-major block order, making it compatible with tf.space_to_depth.

Just change the block order:
from: torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
to:   torch.cat([x[..., ::2, ::2], x[..., ::2, 1::2], x[..., 1::2, ::2], x[..., 1::2, 1::2]], 1)

Motivation

In models/common.py, the Focus layer is defined in PyTorch as follows:

class Focus(nn.Module):
    # Focus wh information into c-space
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):  # ch_in, ch_out, kernel, stride, padding, groups
        super(Focus, self).__init__()
        self.conv = Conv(c1 * 4, c2, k, s, p, g, act)

    def forward(self, x):  # x(b,c,w,h) -> y(b,4c,w/2,h/2)
        # original 
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
        # suggestion (row-major, matches tf.space_to_depth):
        # return self.conv(torch.cat([x[..., ::2, ::2], x[..., ::2, 1::2], x[..., 1::2, ::2], x[..., 1::2, 1::2]], 1))
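The effect of the two orderings can be checked without torch. Below is a minimal pure-Python sketch (nested lists standing in for the single channel of a 1x1x2x2 tensor; `slice2d` is a hypothetical helper, not from the repo) of the strided slicing and the two concatenation orders:

```python
# Pure-Python sketch of the Focus slicing: no torch/tf needed.
# slice2d(x, r0, c0) mimics x[r0::2, c0::2] on a 2-D nested list.

def slice2d(x, r0, c0):
    return [row[c0::2] for row in x[r0::2]]

x = [[0, 1],
     [2, 3]]  # single channel of the 1x1x2x2 example below

# Original Focus order (column-major walk of the 2x2 block):
original = [slice2d(x, 0, 0), slice2d(x, 1, 0), slice2d(x, 0, 1), slice2d(x, 1, 1)]
# Suggested row-major order, matching tf.space_to_depth:
suggested = [slice2d(x, 0, 0), slice2d(x, 0, 1), slice2d(x, 1, 0), slice2d(x, 1, 1)]

print(original)   # [[[0]], [[2]], [[1]], [[3]]]
print(suggested)  # [[[0]], [[1]], [[2]], [[3]]]
```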

And @bonlime posted a brief answer to What's the Focus layer? #207:

check TResNet paper. p2. They call it SpaceToDepth

In the TResNet paper, §2.1: "We wanted to create a fast, seamless stem layer, with as little information loss as possible, and let the simple well designed residual blocks do all the actual processing work. The stem sole functionality should be to downscale the input resolution to match the rest of the architecture, e.g., by a factor of 4. We met these goals by using a dedicated SpaceToDepth transformation layer [32], that rearranges blocks of spatial data into depth. The SpaceToDepth transformation layer is followed by simple 1x1 convolution to match the number of wanted channels."

That is to say, the Focus layer quickly downscales the input resolution by rearranging blocks of spatial data into depth, then adjusts the number of feature channels, typically with a 1x1 conv.
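As a shape-only sketch (pure Python; the 640x640 input and the output channel count are illustrative assumptions, not values from the repo), the stem halves the spatial resolution via space-to-depth and the 1x1 conv only remaps channels:

```python
def focus_output_shape(n, c, h, w, c_out):
    """Shape flow of a Focus/SpaceToDepth stem with block size 2:
    (n, c, h, w) -> space-to-depth -> (n, 4c, h//2, w//2)
                 -> 1x1 conv       -> (n, c_out, h//2, w//2)."""
    s2d = (n, 4 * c, h // 2, w // 2)
    conv = (n, c_out, h // 2, w // 2)
    return s2d, conv

print(focus_output_shape(1, 3, 640, 640, 64))
# ((1, 12, 320, 320), (1, 64, 320, 320))
```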


And there is an op, SpaceToDepth (tf.space_to_depth in TF1, tf.nn.space_to_depth in TF2), that rearranges blocks of spatial data into depth.

The Focus layer uses torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1).

Then we compare:

(0) input

[[[[0 1]
   [2 3]]]]

(1) by Focus torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)

[[[[0]]
  [[2]]
  [[1]]
  [[3]]]]

(2) by tensorflow

[[[[0]]
  [[1]]
  [[2]]
  [[3]]]]

(3) by modified Focus torch.cat([x[..., ::2, ::2], x[..., ::2, 1::2], x[..., 1::2, ::2], x[..., 1::2, 1::2]], 1)

[[[[0]]
  [[1]]
  [[2]]
  [[3]]]]

So, by just modifying the order of the blocks, we can make it compatible with the TensorFlow SpaceToDepth op.
This will make the model easier to port to TensorFlow.

@ausk ausk added the enhancement New feature or request label Jul 15, 2020
@github-actions
Contributor

github-actions bot commented Jul 15, 2020

Hello @ausk, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook Open In Colab, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@ausk ausk changed the title Modify Focus Layer in row major to be compatible with tensorflow SpaceToDepth Rerange the blocks of Focus Layer into row major to be compatible with tensorflow SpaceToDepth Jul 15, 2020
@glenn-jocher
Member

@ausk modifying the Focus() module will invalidate all YOLOv5 pretrained models, so I would highly advise against it.

@ausk
Author

ausk commented Jul 16, 2020

@glenn-jocher Modifying the Focus() module would bring the benefit of improved portability, because many frameworks/libraries, such as TensorFlow, store data in row-major order. ONNX/TensorRT also support space-to-depth.

Yes, it will hurt the accuracy of current pretrained models. But when training from scratch, I still recommend the modification. It's a tradeoff.

@glenn-jocher
Member

Sure. I volunteer you to retrain all of the pretrained models to their current accuracy with your proposed architecture changes then. Once this is done please submit a PR and we are all set :)

@ausk
Author

ausk commented Jul 25, 2020

Thank you for your work, anyway.

I realized that the space2depth (slice and concat ops) of Focus is the 0th layer of the model, so at inference time we can remove it and keep just the conv; the input then becomes NCHW (nb, 12, nh, nw). Finally, I translated the small model (v2) into Keras (TensorFlow) with NHWC (1, nh, nw, nc) input, and inference succeeded.
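A sketch of that preprocessing idea, in pure Python with nested lists standing in for NHWC tensors (the helper name is mine, not from the repo): apply a row-major 2x2 space-to-depth to the input outside the graph, so the exported model can start directly at the conv:

```python
def space_to_depth_nhwc(x):
    """Row-major 2x2 space-to-depth on a nested-list NHWC tensor:
    (n, h, w, c) -> (n, h//2, w//2, 4c), same block order as
    tf.space_to_depth with block_size=2."""
    out = []
    for img in x:
        h, w = len(img), len(img[0])
        out.append([
            # each output pixel stacks the 4 block pixels' channels,
            # walking the 2x2 block in row-major order
            [img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]
             for j in range(0, w, 2)]
            for i in range(0, h, 2)
        ])
    return out

print(space_to_depth_nhwc([[[[0], [1]], [[2], [3]]]]))
# [[[[0, 1, 2, 3]]]]
```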

I'll just close this, as you rejected it.

@ausk ausk closed this as completed Jul 25, 2020
@glenn-jocher
Member

@ausk ok sounds good! But no I didn't reject the idea. If you can retrain the 4 models with your changes to >= performance and submit a PR then we are good to go.

@glenn-jocher
Member

glenn-jocher commented Jan 10, 2021

@ausk better late than never, I've reopened this issue and will examine this option more closely to better align PyTorch and TF YOLOv5 versions to possibly improve TFLite export (google-coral/edgetpu#272).

EDIT: I don't see a problem here, seems like a simple change that brings exportability benefits. I'll try my best to include this update in the next release that includes fully retrained models (i.e. 4.1 or 5.0 possibly).

@glenn-jocher glenn-jocher reopened this Jan 10, 2021
@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@glenn-jocher
Member

TODO removed following release v6.0 architecture updates.
