Does CSP model need to increase training times #26

Closed
zsgj-Xxx opened this issue May 14, 2020 · 8 comments

Comments

@zsgj-Xxx

Because of limited GPU resources, I only trained the model for one epoch (epoch = 1). Compared with the plain ResNeXt model, the result of the CSPResNeXt model after one epoch is not satisfactory. Is it because of the residual links used that the model needs more training time to learn?

@WongKinYiu
Owner

Hello,

I have not compared the convergence speed of models with and without CSP.
However, all of my experiments follow the same settings as https://pjreddie.com/darknet/imagenet/,
so the number of training epochs is exactly the same.

@zsgj-Xxx
Author

Thank you very much for your reply,

I want to run some small tests with CSP.
I tried to reproduce it in PyTorch, but the results were worse, and I haven't found the problem yet.
How should I modify ResNeXt to apply the CSP method?

@WongKinYiu
Owner

The topologies of ResNet, ResNeXt, and Darknet are almost the same.
See #24 (comment) for reference.

@zsgj-Xxx
Author

Thank you for your work,

So I just need to replace darknet_layer with resne(x)t_layer to get the result I need? :heart_eyes:

@zsgj-Xxx
Author

image

In addition, in this figure, is ① (after max pooling) the CSP part? It looks to me as if the feature map shown there is not split but copied.

@WongKinYiu
Owner

Yes.

I think there will be a convolutional layer after ①. More details: #18
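
For readers trying to port this to PyTorch, here is a minimal sketch of one possible reading of the exchange above: the stage input is copied to two branches rather than split by channel, and each copy passes through its own 1x1 convolution. This is an assumption-laden illustration, not the repository's implementation. The names `conv_bn_act`, `ResidualBlock`, and `CSPStage`, the channel widths, the activation, and the block layout are all hypothetical, and `ResidualBlock` is only a stand-in for a real ResNe(X)t or Darknet block.

```python
import torch
import torch.nn as nn


def conv_bn_act(in_ch, out_ch, kernel_size=1, groups=1):
    """Conv -> BatchNorm -> LeakyReLU (hypothetical helper, stride 1, 'same' padding)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2,
                  groups=groups, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1),
    )


class ResidualBlock(nn.Module):
    """Stand-in for a ResNe(X)t / Darknet residual block (assumed layout)."""

    def __init__(self, ch, groups=1):
        super().__init__()
        self.body = nn.Sequential(
            conv_bn_act(ch, ch, 1),
            conv_bn_act(ch, ch, 3, groups=groups),  # grouped 3x3, ResNeXt-style
            conv_bn_act(ch, ch, 1),
        )

    def forward(self, x):
        return x + self.body(x)


class CSPStage(nn.Module):
    """One CSP stage where the input is copied, not channel-split (assumption)."""

    def __init__(self, in_ch, out_ch, num_blocks, groups=1):
        super().__init__()
        half = out_ch // 2
        # Both 1x1 convolutions read the SAME input tensor.
        self.bypass_conv = conv_bn_act(in_ch, half, 1)  # branch that skips the blocks
        self.main_conv = conv_bn_act(in_ch, half, 1)    # branch that feeds the blocks
        self.blocks = nn.Sequential(
            *[ResidualBlock(half, groups) for _ in range(num_blocks)]
        )
        self.post_blocks = conv_bn_act(half, half, 1)
        # Transition applied after the two branches are concatenated.
        self.transition = conv_bn_act(2 * half, out_ch, 1)

    def forward(self, x):
        bypass = self.bypass_conv(x)
        main = self.post_blocks(self.blocks(self.main_conv(x)))
        return self.transition(torch.cat([main, bypass], dim=1))
```

With this layout, swapping `ResidualBlock` for a different block type changes only the main branch, while the bypass branch and the transition stay the same.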

@zsgj-Xxx
Author

image
I'm sorry, I've read the paper and the cfg file over and over again, but I still don't understand it.

14 * 14 * 1024 -> are the two 7 * 7 * 1024 branches both trained as well?

It looks like
image

@WongKinYiu
Owner

The 14x14 feature map belongs to the partial transition layer of the previous stage.
