
Keras fine tuned VGG16 with good accuracy gives many wrong predictions #11590

Closed
Wazaki-Ou opened this issue Nov 6, 2018 · 18 comments

Comments

@Wazaki-Ou

I have fine-tuned VGG16 to classify my own dataset (4 classes). The training seemed to go well: accuracy was high on the training and validation sets during training and on the test set afterwards, and both model.evaluate() and the confusion matrix show really good results. But once I tried to predict on my dataset using the model, it was a nightmare. I am using the following code:

import cv2
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input

# `frame`, `image_size` and `FLOW1_model` are defined earlier in the script
im = cv2.resize(frame, (image_size, image_size), interpolation=cv2.INTER_AREA)
# convert the image pixels to a numpy array
framer = img_to_array(im)
# add a batch dimension: (1, height, width, channels)
image = framer.reshape((1, framer.shape[0], framer.shape[1], framer.shape[2]))
# prepare the image for the VGG model
image = preprocess_input(image)
label = FLOW1_model.predict_classes(image, verbose=0)

I divided the number of correct predictions by the total number of images to estimate the accuracy, and the result is around 30% for one class and 50% for another (on test data), and the model tends to classify images as the 4th class a lot. I even tried with training data and the accuracy is still low. I don't understand why there is such a big difference between the two performance measurements. Am I not supposed to trust the accuracy, loss, confusion matrix and model.evaluate()? If so, how can I monitor the learning of my model, and how can I make sure it is learning?
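Roughly, the manual check looks like this (a sketch only; the test_data directory layout and paths are illustrative assumptions, while image_size and FLOW1_model are the same objects as in the snippet above):

import os
import cv2
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input

correct = 0
total = 0
# one sub-directory per class, e.g. test_data/class_0, test_data/class_1, ...
for class_index, class_dir in enumerate(sorted(os.listdir('test_data'))):
    class_path = os.path.join('test_data', class_dir)
    for fname in os.listdir(class_path):
        frame = cv2.imread(os.path.join(class_path, fname))
        im = cv2.resize(frame, (image_size, image_size), interpolation=cv2.INTER_AREA)
        image = img_to_array(im).reshape((1, image_size, image_size, 3))
        image = preprocess_input(image)
        label = FLOW1_model.predict_classes(image, verbose=0)
        correct += int(label[0] == class_index)
        total += 1
print('manual accuracy:', correct / total)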
For reference, this is how I compiled VGG16 (after loading it without the top, adding my own classifier, and then freezing everything except the top 2 blocks):

from keras.optimizers import SGD

# compile the model (method 2)
sgd = SGD(lr=0.00001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])

I would really appreciate any help. I am no longer sure whether there is a bug somewhere in Keras' VGG16 or whether I am not using predict correctly. It is just weird that the predictions are this wrong after all the good results the model seemed to give.

@Wazaki-Ou Wazaki-Ou changed the title Keras fine tuned VGG16 gives wrong predictions Keras fine tuned VGG16 with good accuracy gives many wrong predictions Nov 6, 2018
@gabrieldemarmiesse
Contributor

We can't really assume that it's a Keras bug unless you know what is going on, or you can provide a minimal example with a very small network that reproduces the bug. Getting different performance at training and testing time is very common in ML.

@gabrieldemarmiesse gabrieldemarmiesse added the type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited. label Nov 6, 2018
@Wazaki-Ou
Author

Thank you for the answer @gabrieldemarmiesse. I understand that it is common to see a difference in performance due to various factors. However, even if we assume that the model overfitted the training dataset, isn't it weird that model.predict() gives a very low accuracy even on the training set?
I will see if I can share a notebook and the dataset soon. I didn't want to copy/paste the script here since it would be too long.

@gabrieldemarmiesse
Contributor

Do you have batchnorm in your model?

@Wazaki-Ou
Author

Wazaki-Ou commented Nov 11, 2018

This is the code I used. I will try to update the notebook and add the code with predict_classes(). The dataset is not loaded since I don't have it ready on my laptop; if you need it, I will upload it later too.
https://colab.research.google.com/drive/1DG-nc2OrNj4za7Gs-eCSWhhj9PX2uAtT
As for batchnorm, I have heard about it, but since it was never mentioned in the tutorials I followed, I assumed it was better not to use it. I am also not very familiar with it, so I was worried it would create issues. Could that be the problem?

@Wazaki-Ou
Author

Wazaki-Ou commented Nov 13, 2018

Here is a link to the dataset: https://drive.google.com/drive/folders/1lXQQhsIw4XnSIkPnBcuva4myY8raHLba?usp=sharing
I would really appreciate it if someone could give some feedback on why this is happening. @gabrieldemarmiesse Am I missing a step in the fine-tuning?

@adityapatadia

This happens even when I train an Inception model. Downgrading Keras to 2.1.6 fixes it.

@CA4GitHub

@adityapatadia Downgrading to Keras 2.1.6 and running the same code significantly changed the results?

@adityapatadia

Yes. Everything is back to normal. The latest Keras has issues.

@gabrieldemarmiesse
Contributor

I'll tag this as a bug since many people have reported the same behaviour. It would be nice to know exactly which commit introduced it. Can someone write a test that exposes this behaviour in a single run?

@gabrieldemarmiesse gabrieldemarmiesse added type:bug/performance and removed type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited. labels Feb 2, 2019
@adityapatadia

@GabrielBianconi since you are on it, please also check issue #9965. It was closed long ago, but the bug is still haunting a lot of users. The example given in #9214 shows that batchnorm is doing the wrong math.

@gabrieldemarmiesse
Contributor

This "issue" with batchnorm isn't really an issue. It's just something to be aware about. trainable and training are two different things. They have the same behaviour as the eval mode and the no_grad mode in pytorch. They allow users maximum flexibility. This is why we can't fuse them. Users have to be aware of this when they do fine tuning. We can't hide everything in a black box.

@adityapatadia

adityapatadia commented Feb 3, 2019

I understand that, but as far as I know it is not possible to set the training flag to False after a model is loaded and we are fine-tuning it. It would be great if you could give an example of how to set training to False for the layers where we set trainable=False. Specifically for this example: https://keras.io/applications/#fine-tune-inceptionv3-on-a-new-set-of-classes
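The pattern I am looking for would be something like the sketch below (assuming a Keras version where a model can be called with a training argument; num_classes and the head layers mirror the linked example and are illustrative; whether this is actually supported is exactly my question):

from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense, GlobalAveragePooling2D, Input
from keras.models import Model

num_classes = 200  # illustrative

inputs = Input(shape=(299, 299, 3))
base_model = InceptionV3(weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the convolutional base

# the step in question: force the base (and its BatchNorm layers) into
# inference mode while the new classification head is being trained
x = base_model(inputs, training=False)
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(num_classes, activation='softmax')(x)
model = Model(inputs=inputs, outputs=predictions)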

@Wazaki-Ou
Author

> Yes. Everything is back to normal. The latest Keras has issues.

Did you train the model again using that Keras version, or just use predict_classes()?
Thanks for the links too! I will check your recommendations and see if they solve the issue.

@Wazaki-Ou
Author

> This happens even when I train an Inception model. Downgrading Keras to 2.1.6 fixes it.

@adityapatadia Can you please tell me which version of TensorFlow you used? I tried again with that specific Keras version and I still have the same issue. When I try to predict classes, I get an accuracy of 0 on one class, while for the other 3 classes the accuracy is between 60% and 90%. Training/validation shows very good results... so I'm getting really confused here. Or is it a VGG16 issue?

@adityapatadia

I used TF 1.12. Maybe it is a VGG16 issue. I was facing the same issue with Inception, and it was resolved after the downgrade.

@adityapatadia

> > Yes. Everything is back to normal. The latest Keras has issues.
>
> Did you train the model again using that Keras version, or just use predict_classes()?
> Thanks for the links too! I will check your recommendations and see if they solve the issue.

I trained a new model.

@RomainCendre

This "issue" with batchnorm isn't really an issue. It's just something to be aware about. trainable and training are two different things. They have the same behaviour as the eval mode and the no_grad mode in pytorch. They allow users maximum flexibility. This is why we can't fuse them. Users have to be aware of this when they do fine tuning. We can't hide everything in a black box.

I understand the difficulty of hiding some mechanisms...
At the very least, I think it would be great to have a proper example of how to do fine-tuning on a network that uses BatchNorm layers.

@gabrieldemarmiesse
Contributor

> At the very least, I think it would be great to have a proper example of how to do fine-tuning on a network that uses BatchNorm layers.

PR welcome to fix the current example in the docs :)
