Additional Files #2
Hi, thank you for your comment! These files are provided by the dataset. You can find them inside the zip folder linked in the CounTX README.md. I have pasted the link here for convenience: link to data. When you download and extract the zip folder, you will see the relevant files. Please see the image pasted below for reference. Note that their file names match the ones in the command you have included in your message. Please feel free to ask more questions as they come up, and I will do my best to help in any way I can.
I see. In that case, what if I want to test how well the model counts on a different set of images, like my own dataset? Do I have to manually create these files? What do I have to prepare?
Also, even after extracting, I don't see those files. Could you upload them to the repo, by chance?
Here is a link to the zip folder (containing all the files above) that I used: FSC-147 Link. I think it would be easiest to manually create the necessary dataset files. I will reply with clearer instructions on how to do so soon. |
@jwahnn, could you please email me directly at niki.amini-naieni@eng.ox.ac.uk about this if you want further help? Let me know if you were able to download the necessary files, and then we can go from there.
Hi, I was waiting for the clearer instructions that you mentioned earlier. I will send a list of questions that I have via email if that is the preferred method. Let me know :) |
RE: "I was waiting for the clearer instructions that you mentioned earlier": This is the command to get the results from the paper for the test set:
The easiest method to deploy the model on your own dataset is the following:
where each entry is a dictionary with the image name as the key (e.g., "1050.jpg") and another dictionary containing the dot annotations (the dictionary with entry "points" in the example above) as the value. Please see the file
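The original format example did not come through in this thread, so here is a minimal sketch of what an annotation file in this style might look like; the image name and point coordinates are made up for illustration:

```python
import json

# Minimal sketch of an FSC-147-style annotation file: each key is an
# image name, and each value is a dictionary holding the "points" list
# of (x, y) dot annotations, one dot per object instance.
annotations = {
    "1050.jpg": {
        "points": [[102.5, 88.0], [210.0, 143.5], [305.2, 97.8]],
    },
}

with open("my_anno_file.json", "w") as f:
    json.dump(annotations, f, indent=2)
```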
But a simple class name will also do. The JSON file should have the following format:
where each entry is a dictionary with the image name as the key (e.g., "2.jpg," "3.jpg") and another dictionary containing the data split and text description as the value. Please see the file
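As a rough sketch of this format (the field names here are my assumption, not copied from the repo, so check them against the actual FSC-147-D.json):

```python
import json

# Sketch of a description file in this style: each key is an image name,
# and each value gives the data split and the text description used as
# the counting query. Field names and values are illustrative.
descriptions = {
    "2.jpg": {"data_split": "train", "text_description": "the hot air balloons"},
    "3.jpg": {"data_split": "train", "text_description": "the sea shells"},
}

with open("my_descriptions.json", "w") as f:
    json.dump(descriptions, f, indent=2)
```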
where the keys are the data splits and the values are the lists of image names in the corresponding data split.
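For illustration, a minimal sketch of a data split file in this style (image names are made up):

```python
import json

# Sketch of a Train_Test_Val-style split file: keys are the data splits,
# values are the lists of image names in the corresponding split.
splits = {
    "train": ["2.jpg", "3.jpg"],
    "val": ["1050.jpg"],
    "test": ["7.jpg"],
}

with open("my_split_file.json", "w") as f:
    json.dump(splits, f, indent=2)
```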
One caveat: I have not tested running inference on any dataset other than the ones in the paper, so while I think these instructions are generally correct, you might need to do some further work with the code to get inference running without errors on your dataset. The inference code should, however, run smoothly and exactly reproduce the results of the paper for FSC-147. RE: "I will send a list of questions that I have via email if that is the preferred method. Let me know :)"
This is amazing! Thanks for sharing all this; I really appreciate it. Over the next few days, I will check whether this works smoothly without issues. In the meantime, I have three more questions:
No problem! Yes, please let me know any issues as they come up, and I will try to help. RE: "Would it be fine for me to run 'test.py' instead of 'test_reproduce_paper.py' like you did?":
Hi, I will respond in a bit. I have to work on ECCV submissions. Stay tuned. Also, thanks for sharing the code. It really helps with diagnosing the issue! |
Hi, I followed the prompts above to test my dataset, but encountered the following error. May I ask what the cause might be?
What is the size of your input image? It seems that the error is that your image is too large. Did you follow the preprocessing step in the code? Do you have some code for me to look at? |
Thanks for your reply! I modified the size of my image according to step 1, and the code runs normally now!
That is good. Could you share the main file you are running the network with and the example image that gives you the error? I will run it on my end.
Indeed, the input size of the image was incorrect. After adjusting the size again, the code now runs normally. I shared my code in the comments above.
Awesome! Let me know if you have more questions. |
Do all the images look similar to the one you have shown? Do they all have the same text description? |
I think the easiest approach to try first would be to add your images to FSC-147, and then use the provided training code to train on the mixed dataset composed of the FSC-147 images and the boxes images. "the boxes" is already a text description in FSC-147-D (see the image below). To do this, just add your images to the folder containing the FSC-147 images and then modify the FSC-147 dataset files accordingly (see above directions for how to deploy the model on your own dataset for reference). If that does not provide better performance, you could try finetuning the pretrained CounTX model on the new dataset following the same procedure that was used in the paper for training on only FSC-147. I have other ideas, but I think the above is probably enough to get started.
Another test to try is to change the text description to "the boxes" instead of "the box." |
Great, I look forward to seeing your updates! |
Yes, you could try this to see if it changes performance, but what you have already done may be better. When you add your data, I would use only either "the box" or "the boxes" instead of both (i.e., either change the text description for your dataset to "the boxes" or just change the existing text description in FSC-147-D to "the box"). Again, what you have already done may be better since now the model is specialized in counting boxes. How many images total was your model trained on?
I have changed the text description for my dataset to "the boxes".
FSC-147 has 6135 images, so it would take a lot more time to train. Are you using early stopping (i.e., checking the performance on the validation set for each epoch and using the model that achieves the lowest errors)? Are you using any data augmentation to artificially increase the size of the training set? Is there any chance you could increase the number of images used for training (maybe even by checking whether images in FSC-147 with the corresponding label "the boxes" could be added to your training set)? You could still try adding your images to FSC-147 (this worked well in the case of CARPK) and train on the mixed dataset, but it might not help much.
Also, if you examine the images the model fails on (since your dataset is small, this should be easier), you might get some insights into how to improve the performance. |
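As a quick sketch of that kind of failure analysis (my own illustration, not code from the repo), you can rank images by absolute counting error and review the worst cases first:

```python
# `results` maps image name -> (predicted_count, ground_truth_count);
# the names and numbers below are illustrative only.
def worst_failures(results, top_k=5):
    """Return the image names with the largest absolute counting error."""
    errors = {name: abs(pred - gt) for name, (pred, gt) in results.items()}
    return sorted(errors, key=errors.get, reverse=True)[:top_k]

results = {"a.jpg": (10, 12), "b.jpg": (5, 5), "c.jpg": (7, 1)}
print(worst_failures(results, top_k=2))  # the two hardest images
```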
I found a dataset with images of cardboard boxes: https://app.roboflow.com/ds/ZCwOYJLruw?key=SjV4l9bmj5. If these boxes look like the ones you have in your dataset, you can convert the bounding box labels to dot annotations and train on a much larger dataset of boxes. |
Finally, since your dataset is small, it might be better to keep the CounTX image and text encoders frozen and just finetune the feature interaction model (smaller set of parameters) on your specific dataset. |
At this point, I would just train from scratch on more data. When you train from scratch, do you freeze the text encoder and finetune the image encoder as was done in the paper? For reference, you can transform any detection dataset for boxes to one compatible with CounTX by taking the box centers as the dot annotations. The same logic is true for instance segmentation datasets. |
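The box-center conversion mentioned above can be sketched in a few lines (assuming boxes in `[x_min, y_min, x_max, y_max]` format, which is common for detection datasets but should be checked against your labels):

```python
def boxes_to_dots(boxes):
    """Convert [x_min, y_min, x_max, y_max] bounding boxes to (cx, cy)
    dot annotations by taking each box center."""
    return [[(x0 + x1) / 2.0, (y0 + y1) / 2.0] for x0, y0, x1, y1 in boxes]
```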
Yes, when I train from scratch, I freeze the text encoder and finetune the image encoder as in the paper.
Yes, so at this point, I would just increase the size of the training set and train from scratch. Have you tried training on a mix of FSC-147 and your dataset (asking out of curiosity)? |
Thank you, I will expand the number of images in the dataset and look forward to better performance. |
Due to the large dataset and limited GPU computing power, it will take about a week to obtain the results. I will share the results at that time. |
Hi, I'm back again; I now have experimental results. Based on my experimental results, the following conclusions can be drawn:
If you have any other suggestions that may improve the performance of the model, I would greatly appreciate it!
Hi, thanks for these results. To clarify, are the MAE values of 2.14 and 2.29 on the joint dataset or just on your custom dataset? Those seem like pretty low error values. What is the average number of boxes per image in the dataset that produced those errors? You could also try data augmentation to artificially grow the size of your existing training set, but I am not sure how much that would help. Are you using any data augmentation in your current training pipeline? For example, this paper is about a data augmentation method that improves the performance of CounTR (the model CounTX is based on) on object counting. Are you using early stopping? Early stopping significantly improved CounTX's results. However, the results that you have might be the best that you can do with the existing method.
Hi @jwahnn are you still having issues? |
If you use the original train.py file, it already includes data augmentation and early stopping. I think increasing the number of samples from your specific dataset would be the best option. What differences do you see between your dataset and the new dataset?
Thank you; my specific dataset is difficult to expand due to some practical constraints. Intuitively, I feel there is not much difference between the two datasets. I have another idea: I want to pretrain on an extended box dataset and then finetune on my specific dataset. Do you think that is a sound approach? I can collect more box images than I currently have.
Yes, this is a good idea to try. I will attach a file here (it is not super neat and not ready for posting to the main GitHub) that shows how I trained on CARPK. I got the best performance not when finetuning on CARPK after training on FSC-147, but when jointly training on both CARPK and FSC-147. I used a sampling procedure that controlled the fraction of the batch made up of samples from CARPK. You could use a similar approach. You could compose each training batch such that 60% of the data comes from your specific dataset and 40% comes from the other boxes dataset (or use some other split). Again, we are just trying ideas at this point. I am not sure if these approaches will improve performance significantly.
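The batch-composition idea can be sketched roughly as follows (my own illustration, not the attached CARPK file):

```python
import random

# Compose each training batch with a fixed fraction per source, e.g. 60%
# from the specific box dataset and 40% from the auxiliary boxes dataset.
def sample_mixed_batch(primary, auxiliary, batch_size, primary_frac=0.6, rng=None):
    rng = rng or random.Random()
    n_primary = round(batch_size * primary_frac)
    batch = rng.choices(primary, k=n_primary)                   # specific dataset
    batch += rng.choices(auxiliary, k=batch_size - n_primary)   # auxiliary dataset
    rng.shuffle(batch)  # avoid a fixed ordering of sources within the batch
    return batch
```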
So far, I have followed the instructions for setting up the environment and completed the preparation steps.
Now, I am trying to run inference. I assume I have to change `--im_dir`, `--FSC147_anno_file`, and `--data_split_file`, but I am having trouble understanding what `--FSC147_anno_file` and `--data_split_file` refer to and where I can get them. Are these files generated during training? I wanted to use the existing baseline model for inference rather than training.
```shell
python test.py --data_split "val" --output_dir "./test" --resume "./results/checkpoint-1000.pth" --img_dir "/scratch/local/hdd/nikian/images_384_VarV2" --FSC147_anno_file "/scratch/local/hdd/nikian/annotation_FSC147_384.json" --FSC147_D_anno_file "./FSC-147-D.json" --data_split_file "/scratch/local/hdd/nikian/Train_Test_Val_FSC_147.json"
```