
Training Steps Mismatch in the paper and the code in ImageNet Experiments #24

Chaimmoon opened this issue May 3, 2020 · 38 comments

@Chaimmoon

Hi,

In the ImageNet experiments, the paper says it should be trained for 800 epochs:

image

However, in the code, it says it should be trained for 80 epochs:

image

So there is a big difference…

Besides, I tried to re-implement it in PyTorch, and the accuracy is 7~8 points behind your method. The network architecture and number of parameters are the same as in your Darknet results…

Best,
Mu

@WongKinYiu
Owner

WongKinYiu commented May 3, 2020

@Chaimmoon

Thank you for pointing out the typo.
It should be 800,000, which is the same as in the cfg.

I have only implemented CSPDenseNet and CSPDarknet in PyTorch.
The following are the results of (CSP)DenseNet-{121, 169, 201, 264} with PyTorch:
image
My PyTorch implementations of Darknet53 and CSPDarknet53 get 76.3/92.9 and 76.9/93.3 top-1/top-5 accuracy with 224x224 input resolution, respectively.

You should make sure the BN layers and activation functions are the same as in the provided cfg file.
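
For readers reproducing this in PyTorch, here is a minimal sketch (my own illustration, not code from this repo) of what a darknet-style [convolutional] block with batch_normalize=1 and activation=leaky usually maps to; note that darknet's leaky activation uses a negative slope of 0.1, not PyTorch's default of 0.01.

    import torch.nn as nn

    def conv_bn_leaky(in_ch, out_ch, kernel_size, stride=1):
        # [convolutional] with batch_normalize=1 and activation=leaky:
        # conv without bias, then BatchNorm2d, then LeakyReLU with slope 0.1
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size, stride,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.1, inplace=True),
        )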

@WongKinYiu
Owner

@Chaimmoon

This is my PyTorch implementation of CSPDarknet:
darknet.py.txt

I borrowed some functions from mmdetection and mmcv.
The main difference between CSPDarknet and CSPResNe(X)t is that CSPDarknet uses a darknet_layer while CSPResNe(X)t uses a resne(x)t_layer:

            x = down_layer(x)                # downsample the feature map entering the stage
            x1, x2 = x.chunk(2, dim=1)       # cross-stage partial split: two halves of the channels
            x2 = darknet_layer(x2)           # only the second half passes through the darknet blocks
            x = torch.cat([x1, x2], 1)       # merge the untouched half with the transformed half
            x = tran_layer(x)                # transition layer fuses the concatenated features
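
For convenience, a self-contained sketch of the same cross-stage-partial pattern; down_layer, darknet_layer, and tran_layer below are placeholders standing in for the corresponding modules in darknet.py.txt, so treat this as an illustration rather than the actual implementation.

    import torch
    import torch.nn as nn

    class CSPStage(nn.Module):
        # One CSP stage: downsample, split the channels in two, run only the second
        # half through the heavy blocks, then concatenate and apply a transition conv.
        def __init__(self, in_ch, out_ch, make_blocks):
            super().__init__()
            self.down_layer = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(out_ch), nn.LeakyReLU(0.1, inplace=True))
            self.darknet_layer = make_blocks(out_ch // 2)   # e.g. a stack of residual blocks
            self.tran_layer = nn.Sequential(
                nn.Conv2d(out_ch, out_ch, 1, bias=False),
                nn.BatchNorm2d(out_ch), nn.LeakyReLU(0.1, inplace=True))

        def forward(self, x):
            x = self.down_layer(x)
            x1, x2 = x.chunk(2, dim=1)
            x2 = self.darknet_layer(x2)
            return self.tran_layer(torch.cat([x1, x2], dim=1))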

@Chaimmoon
Author

@Chaimmoon

Thank you for pointing out the typo.
It should be 800,000, which is the same as in the cfg.

I have only implemented CSPDenseNet and CSPDarknet in PyTorch.
The following are the results of (CSP)DenseNet-{121, 169, 201, 264} with PyTorch:
image
My PyTorch implementations of Darknet53 and CSPDarknet53 get 76.3/92.9 and 76.9/93.3 top-1/top-5 accuracy with 224x224 input resolution, respectively.

You should make sure the BN layers and activation functions are the same as in the provided cfg file.

@WongKinYiu

Thanks for your reply!

I implemented ResNet10, ResNet50, and ResNeXt50. The results are not quite as good as your paper reports... (Besides, can you provide the cfg file for ResNet10_CSP? The architectures of ResNet10 and ResNet50 are quite different.)

As for the BN, it should be torch.nn.BatchNorm2d, and the activation function should be torch.nn.LeakyReLU, right?

Can you provide your PyTorch code? Thanks!

Best,
Mu

@WongKinYiu
Owner

@Chaimmoon

My PyTorch code is posted in #24 (comment).

I am sorry that I cannot release my lightweight models due to some issues.
You can try to follow the rule of ResNet50 -> CSPResNet50 to modify ResNet10 into CSPResNet10.

@nyj-ocean

@WongKinYiu
Thanks for your work!
I have a question about [sam] layers.

In AlexeyAB/darknet#3708 (comment), the SAM module consists of one [convolutional] layer and one [sam] layer, like the following:
[image]

while in AlexeyAB/darknet#5355 (comment), the SAM module consists of two [convolutional] layers and one [sam] layer, not one [convolutional] layer, like the following:

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=logistic

[sam]
from=-2

What's more, in AlexeyAB/darknet#5355 (comment) the [convolutional] layer in front of the [sam] layer has pad=1, while in AlexeyAB/darknet#3708 (comment) the [convolutional] layer in front of the [sam] layer does not have pad=1.

I want to know which [sam] layer is correct.

@WongKinYiu
Owner

@nyj-ocean Hello,

  1. From "In SAM in yolo v4 use sigmoid or mish?" AlexeyAB/darknet#5355 (comment):
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=logistic

[sam]
from=-2

which is the SAM module.
image

  2. "About [sam] layer." AlexeyAB/darknet#3708 (comment) shows the usage of the [sam] layer:
    image

  3. pad=1 and pad=0 are the same when the convolutional filter size is 1x1.
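
To make point 1 concrete, here is a rough PyTorch sketch of how I read that cfg fragment (my own illustration, not the darknet implementation): a 3x3 conv with mish produces the features, a 1x1 conv with a logistic activation produces the attention map, and [sam] from=-2 multiplies the two elementwise. nn.Mish requires a recent PyTorch version.

    import torch.nn as nn

    class SAMBlock(nn.Module):
        def __init__(self, channels=512):
            super().__init__()
            # first [convolutional]: 3x3, batch_normalize=1, activation=mish
            self.feat = nn.Sequential(
                nn.Conv2d(channels, channels, 3, stride=1, padding=1, bias=False),
                nn.BatchNorm2d(channels), nn.Mish())
            # second [convolutional]: 1x1, batch_normalize=1, activation=logistic
            self.attn = nn.Sequential(
                nn.Conv2d(channels, channels, 1, bias=False),
                nn.BatchNorm2d(channels), nn.Sigmoid())

        def forward(self, x):
            f = self.feat(x)
            a = self.attn(f)
            return f * a        # [sam] from=-2: multiply with the output two layers back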

@nyj-ocean

@WongKinYiu
Thanks for your reply.
I want to add the SAM module to YOLOv3.
Can you help me check whether the following cfg is right?

SAM-to-yolov3.cfg.txt

@WongKinYiu
Owner

@nyj-ocean

The last [sam] block seems to be at a different layer compared with the 1st and 2nd [sam] blocks in your cfg file.

In my previous experiments, I used the [sam] layer as in:
SAM-to-yolov3.cfg.txt

@nyj-ocean

@WongKinYiu
Thanks for your help!
I noticed that the yolov4 paper mentions a modified SAM block.
Is the SAM block in your provided SAM-to-yolov3.cfg.txt #24 (comment) the same as the modified SAM block mentioned in yolov4?

@WongKinYiu
Owner

WongKinYiu commented May 5, 2020

Yes, it is the same.
The comparison with/without SAM is posted in the 1st table of the README in this repo.

@nyj-ocean

@WongKinYiu
thanks for your help!!!

@Chaimmoon
Author

Chaimmoon commented May 8, 2020

@WongKinYiu

Hi, I have checked the network structure and number of parameters in my CSPResNet/CSPResNeXt PyTorch implementation, which are the same as what you reported in your GitHub README file, including nn.BatchNorm2d, nn.LeakyReLU, training epochs, batch size, and learning rate schedule. I also had a close look at your Darknet PyTorch implementation. However, the accuracy is still below yours...

My Results:

  • CSPResNet50: Prec@1 75.772, Prec@5 92.716 (paper results: 76.6% / 93.3%)
  • CSPResNeXt50: Prec@1 76.328, Prec@5 93.058 (paper results: 77.9% / 94.0%)

Thanks!

@WongKinYiu
Owner

WongKinYiu commented May 8, 2020

@Chaimmoon

I am not sure whether it is important or not; I just follow https://pjreddie.com/darknet/imagenet/.

I think getting a little bit lower accuracy is normal, since darknet uses 256x256 for validation, and I guess your PyTorch code uses 224x224 instead.
My CSPDarknet53 PyTorch (224x224) implementation also gets 0.6% lower top-1 accuracy than the Darknet (256x256) implementation.

Could you share your code for CSPResNet / CSPResNeXt? I would like to upload the implementation and results to the pytorch branch if that is OK.

@nyj-ocean

@WongKinYiu
I'm sorry to bother you again.

I notice that the modified SAM in the yolov4 paper references the CBAM paper.

However, I also find that the ThunderNet paper designs a SAM as well.

So I want to know:

  1. Is the SAM in the CBAM paper the same as the SAM in the ThunderNet paper?

  2. In the yolov4 paper, the modified SAM references the CBAM paper.
    But in "About [sam] layer." AlexeyAB/darknet#3708 (comment), LukeAI said the [sam] layer is for ThunderNet.
    Are the two statements in conflict? Which one is correct?

@WongKinYiu
Owner

@nyj-ocean

There are many kinds of channel attention modules (CAM) and spatial attention modules (SAM) in the literature. For example, SENet and SKNet proposed different kinds of CAM, and CBAM and ThunderNet proposed different kinds of SAM. In general, we cite the first paper, the most similar paper, or both in related work. So the answers to your questions are:

  1. Is the SAM in the CBAM paper the same as the SAM in the ThunderNet paper?

No, they are different.

  2. In the yolov4 paper, the modified SAM references the CBAM paper.
    But in "About [sam] layer." AlexeyAB/darknet#3708 (comment), LukeAI said the [sam] layer is for ThunderNet.
    Are the two statements in conflict? Which one is correct?

CBAM is the first paper that proposed SAM, so we cite it in the yolov4 paper. ThunderNet proposed the SAM module most similar to ours, so we cite it in the CSPNet paper.
SAM in CBAM:
image
SAM in ThunderNet:
image
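
As a rough illustration of the difference (my own simplified sketches based on how I read the two papers, not code from either repo): CBAM's SAM pools across channels and applies a large conv, producing one weight per spatial position, while ThunderNet's SAM uses a 1x1 conv plus sigmoid on a guidance feature map to re-weight the features.

    import torch
    import torch.nn as nn

    class CBAMSpatialAttention(nn.Module):
        # CBAM-style SAM: channel-wise avg/max pooling -> 7x7 conv -> sigmoid,
        # giving a w x h x 1 attention map broadcast over all channels.
        def __init__(self, kernel_size=7):
            super().__init__()
            self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

        def forward(self, x):
            avg = x.mean(dim=1, keepdim=True)
            mx = x.max(dim=1, keepdim=True).values
            return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

    class ThunderNetStyleSAM(nn.Module):
        # ThunderNet-style SAM: a 1x1 conv + BN + sigmoid on a guidance map
        # re-weights the input features elementwise.
        def __init__(self, channels):
            super().__init__()
            self.conv = nn.Sequential(nn.Conv2d(channels, channels, 1, bias=False),
                                      nn.BatchNorm2d(channels))

        def forward(self, x, guide):
            return x * torch.sigmoid(self.conv(guide))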

@nyj-ocean

nyj-ocean commented May 9, 2020

@WongKinYiu
Thanks for your reply.
The yolov4 paper modifies SAM from spatial-wise attention to point-wise attention.
So the SAM module before the modification in yolov4 (that is, the spatial-wise attention) is similar to the SAM module in the CBAM paper?

@WongKinYiu
Owner

Yes, all the different kinds of SAM modules produce spatial attention.

@nyj-ocean

@WongKinYiu
thanks a lot

@Chaimmoon
Author

@Chaimmoon

I am not sure whether it is important or not; I just follow https://pjreddie.com/darknet/imagenet/.

I think getting a little bit lower accuracy is normal, since darknet uses 256x256 for validation, and I guess your PyTorch code uses 224x224 instead.
My CSPDarknet53 PyTorch (224x224) implementation also gets 0.6% lower top-1 accuracy than the Darknet (256x256) implementation.

Could you share your code for CSPResNet / CSPResNeXt? I would like to upload the implementation and results to the pytorch branch if that is OK.

Hi @WongKinYiu

Thanks for your reply! I think that during training and testing, the Darknet framework keeps the image size at 256x256. However, for common PyTorch training, the training size is 224x224 and the test size is 256x256. Is my understanding right?

@WongKinYiu
Owner

WongKinYiu commented May 11, 2020

@Chaimmoon

It depends on your code.
The most common testing protocol in PyTorch is single-crop (224x224): https://pytorch.org/docs/stable/torchvision/models.html
Other common testing protocols nowadays are 10-crop (224x224, 5 crops x flip), 5-crop (224x224, center + 4 corners), and full (256x256).
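
For reference, this is roughly what those protocols look like with torchvision transforms (standard ImageNet normalization values are shown for illustration; whether they match the darknet cfg is a separate question):

    import torchvision.transforms as T

    normalize = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

    # single-crop: resize the short side to 256, then take a 224x224 center crop
    single_crop = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor(), normalize])

    # "full" 256x256 evaluation, closer to darknet's validation resolution
    full_256 = T.Compose([T.Resize((256, 256)), T.ToTensor(), normalize])

    # 10-crop: 4 corners + center and their horizontal flips (10 tensors per image)
    ten_crop = T.Compose([T.Resize(256), T.TenCrop(224),
                          T.Lambda(lambda crops: [normalize(T.ToTensor()(c)) for c in crops])])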

@nyj-ocean

@WongKinYiu
I'm sorry to bother you again.
I want to produce the picture of the yolov3 anchors, like the following, but I don't know how to do it.
Can you tell me how to produce this picture of the anchors?
[screenshot]

@WongKinYiu
Owner

@nyj-ocean

I do not know either; I always use the anchors which YOLO9000 calculated.

@AlexeyAB
Collaborator

You can calculate new anchors by using this command:
./darknet detector calc_anchors coco.data -num_of_clusters 9 -width 512 -height 512 -show

image
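
Conceptually, calc_anchors runs k-means over the labeled box widths and heights scaled to the network input size. A rough NumPy sketch of that idea, purely for illustration and not the darknet source, might look like this:

    import numpy as np

    def iou_wh(boxes, centers):
        # IoU between (w, h) pairs, assuming all boxes share the same top-left corner
        w = np.minimum(boxes[:, None, 0], centers[None, :, 0])
        h = np.minimum(boxes[:, None, 1], centers[None, :, 1])
        inter = w * h
        union = boxes[:, 0:1] * boxes[:, 1:2] + (centers[:, 0] * centers[:, 1])[None, :] - inter
        return inter / union

    def kmeans_anchors(boxes_wh, k=9, iters=100, seed=0):
        # boxes_wh: (N, 2) array of box widths/heights already scaled to the network size
        boxes_wh = np.asarray(boxes_wh, dtype=np.float64)
        rng = np.random.default_rng(seed)
        centers = boxes_wh[rng.choice(len(boxes_wh), size=k, replace=False)]
        for _ in range(iters):
            assign = np.argmax(iou_wh(boxes_wh, centers), axis=1)  # closest center by IoU
            for j in range(k):
                if np.any(assign == j):
                    centers[j] = boxes_wh[assign == j].mean(axis=0)
        return centers[np.argsort(centers.prod(axis=1))]           # smallest to largest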

@nyj-ocean

nyj-ocean commented May 14, 2020

@WongKinYiu
thanks for your reply

@AlexeyAB
Thank you so much!!
It helps me a lot!
If the background color of cloud.png were white, it would be better for me.
How can I change the background color of cloud.png from black to white?

@AlexeyAB
Collaborator

@nyj-ocean

@AlexeyAB
great!
thanks a lot

@nyj-ocean

@AlexeyAB
Sorry to bother you again.
I use the following command to generate my cloud.png on my own dataset:
./darknet detector calc_anchors my-own-dataset.data -num_of_clusters 9 -width 608 -height 608 -show
The following figure is my cloud.png:
[image]

I find that there are many black spare parts in my own cloud.png.
However, there are almost no black spare parts in the cloud.png of the COCO dataset. The anchors almost fill the whole cloud.png of the COCO dataset (see #24 (comment)).

  • Is there any problem with my own cloud.png?
    Or is there any problem with the anchors that I generated on my own dataset?

  • How can I eliminate the black spare parts in my own cloud.png?

@WongKinYiu
Owner

I guess the images in your dataset are from videos.

@AlexeyAB
Collaborator

What are the black spare parts?
There is no problem.

@nyj-ocean

@AlexeyAB
The black spare parts are like the following:

[image]

There are many black spare parts in my own cloud.png.
However, there are almost no black spare parts in the cloud.png of the COCO dataset (see #24 (comment)).

  • Why are there many black spare parts in my own cloud.png?
    Is it normal?

  • I want to eliminate these black spare parts in my own cloud.png.
    How can I eliminate these black spare parts?

@nyj-ocean

nyj-ocean commented May 21, 2020

@WongKinYiu
The images in my dataset are not taken from videos.

@AlexeyAB
Collaborator

Why are there many black spare parts in my own cloud.png ?

Because your objects are small relative to the image size. This is normal.

Maybe you should just use a higher network resolution for anchor calculation, training, and detection to get good results.

https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

Only if you are an expert in neural detection networks: recalculate anchors for your dataset for the width and height from the cfg-file: darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416, then set the same 9 anchors in each of the 3 [yolo]-layers in your cfg-file. But you should change the indexes of the anchors in masks= for each [yolo]-layer, so that the 1st [yolo]-layer has anchors larger than 60x60, the 2nd larger than 30x30, and the 3rd the remaining ones. Also you should change the filters=(classes + 5)*<number of mask> before each [yolo]-layer. If many of the calculated anchors do not fit under the appropriate layers, then just try using all the default anchors.
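
To spell out the bookkeeping in that quote, here is a small, purely illustrative Python helper (not part of darknet) showing which mask indexes go to which [yolo] layer and how the filters= value before each [yolo] layer is computed; the anchors listed are the default yolov3 COCO anchors.

    def yolo_layer_settings(num_classes, anchors_per_layer=3):
        # darknet cfg convention: anchors are listed smallest to largest, and the
        # first (lowest-resolution) [yolo] layer takes the largest anchors
        masks = {
            "1st [yolo] layer (largest anchors)": [6, 7, 8],
            "2nd [yolo] layer": [3, 4, 5],
            "3rd [yolo] layer (smallest anchors)": [0, 1, 2],
        }
        filters = (num_classes + 5) * anchors_per_layer  # filters= before each [yolo] layer
        return masks, filters

    default_anchors = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
                       (59, 119), (116, 90), (156, 198), (373, 326)]
    masks, filters = yolo_layer_settings(num_classes=80)
    print(masks)
    print(filters)   # 255 for 80 classes, i.e. (80 + 5) * 3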

@nyj-ocean

@AlexeyAB
Thank you so much

@nyj-ocean

@AlexeyAB
Sorry to bother you again.
./darknet detector calc_anchors coco.data -num_of_clusters 9 -width 512 -height 512 -show
will create cloud.png.
If it could create cloud.eps instead, that would be better for me.
How can I convert cloud.png from PNG to EPS?
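
(One possible workaround, assuming Pillow is installed, is to convert the PNG afterwards; note that this embeds the raster image in an EPS file rather than producing a vector drawing.)

    from PIL import Image

    img = Image.open("cloud.png").convert("RGB")  # EPS writing needs L/RGB/CMYK mode
    img.save("cloud.eps")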

@AnhPC03

AnhPC03 commented Sep 8, 2020

@WongKinYiu

Hi, I have checked the network structure and number of parameters in my CSPResNet/CSPResNeXt PyTorch implementation, which are the same as what you reported in your GitHub README file, including nn.BatchNorm2d, nn.LeakyReLU, training epochs, batch size, and learning rate schedule. I also had a close look at your Darknet PyTorch implementation. However, the accuracy is still below yours...

My Results:

  • CSPResNet50: Prec@1 75.772, Prec@5 92.716 (paper results: 76.6% / 93.3%)
  • CSPResNeXt50: Prec@1 76.328, Prec@5 93.058 (paper results: 77.9% / 94.0%)

Thanks!

@Chaimmoon Could you share your code of CSPResNet50 with me?
Thank you.

@nyj-ocean

@WongKinYiu

I'm sorry to bother you again.

I have another question about the SAM module.

The yolov4 paper modifies SAM from spatial-wise attention to point-wise attention.

  1. I cannot fully understand how yolov4 modifies SAM from spatial-wise attention to point-wise attention.
    Does it mean that yolov4 changes SAM from max-pooling and average-pooling to convolution layers?

  2. What is point-wise attention?
    Is point-wise attention equal to a convolution layer?

@WongKinYiu
Owner

channel-wise: each channel has one attention value, so the attention map is 1x1xc.
spatial-wise: each spatial position has one attention value, so the attention map is wxhx1.
point-wise: each feature point has one attention value, so the attention map is wxhxc.
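
To make the shapes concrete, a tiny sketch (random tensors purely for illustration) of the three attention granularities:

    import torch

    n, c, h, w = 1, 512, 19, 19
    x = torch.randn(n, c, h, w)

    channel_attn = torch.sigmoid(torch.randn(n, c, 1, 1))  # channel-wise: 1x1xc
    spatial_attn = torch.sigmoid(torch.randn(n, 1, h, w))  # spatial-wise: wxhx1
    point_attn   = torch.sigmoid(torch.randn(n, c, h, w))  # point-wise:   wxhxc

    # each attention map broadcast-multiplies onto the feature map
    y_channel = x * channel_attn
    y_spatial = x * spatial_attn
    y_point   = x * point_attn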

@nyj-ocean

@WongKinYiu

Thanks for your reply.

What I understand about yolov4 modifying SAM from spatial-wise attention to point-wise attention is that yolov4 uses a 1x1 convolution layer to replace the max-pool, avg-pool, and 7x7 convolution layer, just like the following:

[image 1]

  1. Is my understanding correct?

  2. If my understanding is correct, can you tell me why yolov4 modifies SAM from spatial-wise attention to point-wise attention?
    What are the benefits of making this modification?
    Is it to reduce inference time? AlexeyAB/darknet#3708 (comment)

These questions are very troubling to me. I look forward to your answers. Thanks a lot!
