-
Notifications
You must be signed in to change notification settings - Fork 19.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implemented resnet18 and resnet34 #16363
Conversation
Adding the model summaries here for info: Resnet18:
Resnet34:
It turns out that they do not match PyTorch's numbers which is something I do not understand. This appears to be due to the fact that there are bias in TF's convs, and not in PyTorch's ones, and also due to how PyTorch counts BN's params. However, the last dimension before the dense layer matches, and the size (WH) of the feature maps matches as well. |
So 2 things w.r.t. to the comparison with PyTorch:
Side note: the default momentum values for the batch norm in Keras and PyTorch are not the same: 0.9 for PyTorch and 0.99 in Keras. This, coupled with the use of bias in TF will mean that the training will be different between the 2 frameworks. I think it would be nice to implement the possibility to change the batch norm momentum to fit PyTorch's one, I am going to open a new issue and a new PR about this. |
Thanks for the PR. Could u make the sure the weights for imagenet also available? Also please make sure to run the evaluation with imagenet eval set, and report the acc number in the PR. |
@qlzh727 should I train the models also for the no bias case? Also, could you point me to the script that were used to train the bigger models? I couldn't find them but maybe didn't look well enough |
@qlzh727 I was looking for an official script to train a classification model on imagenet, and stumbled upon this: https://github.com/tensorflow/models There is a typical example allowing to train classification models, but I also noticed that there is already an implementation of ResNet without the bias and with the basic blocks here. I don't think the weights are available, but now my question is more: should we re-implement it here given it's already present in this other repo? Basically, is there a difference in concern between keras applications and tensorflow models? |
I just noticed that one additional difference with the PyTorch implementation (in both keras applications and tensorflow models) is the initialization strategy for the convolution weights.
|
We currently don't have any script for retrain the model. Keras application was used for fine tuning and we usually reuse weights/checkpoints from original paper (if it was published). |
tensorflow-models is more focused on end to end solutions, and if that's already available in tf-models, we probably can skip it here in keras.application (given that you can't get any existing weigths). |
Well the original paper did train both resnet 18 and 34, but not sure in which framework or even whether the weights are available. Another solution would be to translate the ones from PyTorch, potentially forcing the bias to 0 for the original implementations with bias. Wdyt? EDITOne last thing is that if we do not include the resnet 18 and 34 here, it might still be nice to have a pointer to tensorflow/models, in order for people looking for an implementation to find it easily (this is not the case rn, see keras-team/keras-applications#151) |
@qlzh727 Indeed since I am porting from PyTorch I needed to use their preprocessing. Here are my tentative answers:
If however, you have at your disposal the caffe weights for both models (and by any chance the script to port them), I can definitely do the porting, and checks. |
Any progress on this? This would be really great to have! |
@zaccharieramzi Can you please resolve conflicts? Thank you! |
@gbaned should be done |
Sorry for the long wait, since end user could easily miss the preprocess API with pytorch format, how about we include the preprocess as part of the model, and control it via a |
Due to the fact that the model requires a different preprocessing for inputs in the inputs between the ResNet18/34 and the other ResNets, we would probably need to re-train these weights. Let's migrate this to a PR on keras-cv. Please send a pull request to KerasCV, and place the model in the models package: https://github.com/keras-team/keras-cv/tree/master/keras_cv/models from there, we can retrain the models |
@LukeWood sure, opening this PR keras-team/keras-cv#805 |
Thank you. Let's move to the discussion to the KerasCV PR. |
Just mentioning for those following the conversation that the corresponding PR in keras-cv has been merged. |
This should solve this issue : keras-team/keras-applications#151
Which has duplicates here:
ResNet [18, 34]
to keras.applications #15269I don't know how to test this, this is why I am making it a draft PR.
I haven't implemented the V2, to make this easy to review, and I haven't trained the networks to get the weights.
Note: this is a reopening of #16358, which I messed up with wrong emails in the commits.