
Training my own model #1

jeffz0 opened this issue Oct 13, 2018 · 7 comments
jeffz0 commented Oct 13, 2018

Hi. I've been trying to train my own models for VQA-X, but when I generate explanations from models I trained with train.py, the VQA answers and explanations are very poor.

I train for 50k iterations and the printed results during training look great, but when I run generate_explanations.py on the validation set, I get only 1/1459 VQA answers correct and the explanations look off. In fact, even on the training set I get only 15/29459 correct. With your pretrained model, however, I get 1073/1459 correct on the validation set using generate_explanations.py.

Is there a step I'm missing between training my own model with train.py and generating explanations with generate_explanations.py? Training for more iterations (going from 30k to 50k) doesn't seem to help.

Thanks

Seth-Park (Owner) commented Oct 15, 2018

I believe you might be using the wrong adict.json and vdict.json files. Are you training the VQA-X model from the pretrained VQA model? If so, you need to make sure that the adict.json in PJ-X-VQA/model remains as is and only swap out the exp_vdict.json. If you are training everything from scratch (vqa AND vqa-x), you should not use the adict.json and vdict.json files in the repo, but use the newly generated ones.
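Since a vocabulary mismatch between training and inference silently produces garbage predictions, a quick consistency check can rule it out. A minimal sketch, assuming the dictionaries are plain token-to-index JSON maps (the file names come from this thread; the helper names and example paths are hypothetical):

```python
import json

def load_dict(path):
    """Load a token-to-index vocabulary dictionary from a JSON file."""
    with open(path) as f:
        return json.load(f)

def dicts_match(train_path, infer_path):
    """Return True iff the training-time and inference-time vocabularies
    map every token to the same index."""
    return load_dict(train_path) == load_dict(infer_path)

# Example (hypothetical paths): compare the adict.json the pretrained VQA
# model was built with against the one the inference script will read.
# if not dicts_match("PJ-X-VQA/model/adict.json", "snapshots/adict.json"):
#     raise SystemExit("answer dictionaries differ; predictions will be garbage")
```

The same check applies to vdict.json and exp_vdict.json; any file that differs between the two stages points at the bug described above.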

jeffz0 (Author) commented Oct 15, 2018

Hi Seth, thanks for the response! I am using the pretrained VQA model, so I did not generate adict.json or vdict.json files (as specified in the instructions). What is the vocab.json file you are referring to, and how do I swap it out? Did you mean vdict.json?

Seth-Park (Owner) commented

Sorry, I meant the exp_vdict.json file instead of vocab.json.
If the model predicts the correct answers during training, then I assume the problem is not caused by loading the wrong pretrained VQA model.
There must be an error in loading the final VQA-X model during validation, then.
Did you make sure the contents of the model directory are correct?
You might also want to check that you are providing the right paths to the model and caffemodel as flags to generate_explanations.py.
Do you mind sharing the contents of your model directory and the exact command you use for generate_explanations.py?

jeffz0 (Author) commented Oct 15, 2018

Oh ok, so I should create a new exp_vdict.json even when I'm training from the pretrained VQA model?

The command I'm running is: python generate_explanation.py --ques_file ../VQA-X/Questions/v2_OpenEnded_mscoco_val2014_questions.json --ann_file ../VQA-X/Annotations/v2_mscoco_val2014_annotations.json --exp_file ../VQA-X/Annotations/val_exp_anno.json --gpu 1 --out_dir ../VQA-X/results --folder ../model/ --model_path ../model/release_iter_40000.caffemodel --save_att_map

My model folder looks like this:
(screenshot of model directory)
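One quick thing to rule out before digging into the model itself is whether every file path passed to generate_explanation.py actually resolves. A minimal sketch, using the paths copied from the command above (they are relative, so this must be run from the same directory as the script; adjust to your own layout):

```python
import os

# Flag -> path, copied from the generate_explanation.py invocation above.
paths = {
    "--ques_file": "../VQA-X/Questions/v2_OpenEnded_mscoco_val2014_questions.json",
    "--ann_file": "../VQA-X/Annotations/v2_mscoco_val2014_annotations.json",
    "--exp_file": "../VQA-X/Annotations/val_exp_anno.json",
    "--model_path": "../model/release_iter_40000.caffemodel",
}

# Collect any flags whose file does not exist on disk.
missing = [flag for flag, path in paths.items() if not os.path.isfile(path)]
if missing:
    print("Missing files for flags:", ", ".join(sorted(missing)))
```

If --model_path shows up as missing, the script may be silently falling back to an uninitialized or wrong network, which would match the near-zero accuracy described above.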

jeffz0 (Author) commented Oct 18, 2018

Hey Seth, so I recreated my exp_vdict.json when training from the pretrained VQA model, but I'm still getting the same issues. The models from my training end up in "./snapshots/VQA-X/release". These are the caffemodels I should be loading, right?

Seth-Park (Owner) commented

Correct.
Let me take a closer look at the codebase and try to reproduce the problem from scratch on another machine.

nattari commented Nov 29, 2022

Hey! I have been facing a similar problem. I am not using the pretrained model but training from scratch on my own data. I see improvement in the explanation loss and accuracy, but the VQA loss doesn't seem to decrease. What could be going wrong? If you have any insights, they would be really helpful.
Thanks!
