Last checkpoint for resuming training #71

Closed
agentmorris opened this issue May 20, 2023 · 3 comments

Comments

@agentmorris

Could you please publish the last checkpoint (for resuming training) for MDv5?


Issue cloned from Microsoft/CameraTraps, original issue posted by skye-glitch on Aug 03, 2022.

@agentmorris

Thanks for your interest in fine-tuning MDv5. I don't think you need a checkpoint for that. According to the YOLOv5 documentation, you would only really use a checkpoint to resume an interrupted training cycle, and that only applies if you have access to the original training data.

Assuming you're looking to fine-tune MDv5 on new data, I think what you want to do is just use MDv5a or MDv5b as the starting weights for a new training cycle, like this:

python train.py --data your_new_training_data.yaml --weights path/to/md_v5a.0.0.pt

I can't exactly find documentation for this, but the YOLOv5 developer provides very helpful instructions on this thread.

Of course, you're in uncharted territory in terms of what the ideal learning rate would be for fine-tuning, and whether you might want to freeze some layers (documented here).
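To make the learning-rate and layer-freezing options concrete, here is a minimal sketch that assembles the same train.py command with a few optional flags. It assumes YOLOv5's train.py flags --freeze, --hyp, and --epochs; the helper name build_finetune_command, the file name my_finetune_hyp.yaml, and the values 10 and 50 are hypothetical placeholders, not recommendations. The learning rate itself would be set inside the hyperparameter YAML (the lr0 key in YOLOv5's hyp files).

```python
import shlex


def build_finetune_command(data_yaml, weights, freeze=None, hyp_yaml=None, epochs=None):
    """Assemble a YOLOv5 train.py command line for fine-tuning (hypothetical helper)."""
    # Base command: same as the one shown above in this thread.
    cmd = ["python", "train.py", "--data", data_yaml, "--weights", weights]
    if freeze is not None:
        # Number of leading layers to freeze; the right value is an open question.
        cmd += ["--freeze", str(freeze)]
    if hyp_yaml is not None:
        # Hyperparameter file; the initial learning rate (lr0) lives in here.
        cmd += ["--hyp", hyp_yaml]
    if epochs is not None:
        cmd += ["--epochs", str(epochs)]
    return cmd


# Illustrative values only: nothing here has been validated for MDv5.
cmd = build_finetune_command(
    "your_new_training_data.yaml",
    "path/to/md_v5a.0.0.pt",
    freeze=10,
    hyp_yaml="my_finetune_hyp.yaml",  # hypothetical file name
    epochs=50,
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell from the YOLOv5 repo folder. Leaving out the optional arguments reproduces the plain command above.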

Let us know if that addresses your question? And let us know how it goes!

-Dan


(Comment originally posted by agentmorris)

@agentmorris

Thanks Dan. The training command works.
Please correct me if I am wrong: md_v5 is a YOLOv5 model, and I can do training/inference using scripts that work with YOLOv5. For the purpose of keeping files in a consistent format, running inference with run_detector_batch.py is recommended. For resuming training, there is no special requirement for the model, and we can just train with any YOLOv5 script?


(Comment originally posted by skye-glitch)

@agentmorris

> For the purpose of keeping files in a consistent format, running inference with run_detector_batch.py is recommended.

Yes, that's correct. You can use YOLOv5's inference scripts and you will get meaningful bounding boxes, but you won't get the file format - or even the output class identifiers - that all of the scripts in our repo work with, or that third-party tools for working with MD results expect.
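To illustrate the kind of mismatch involved, here is a rough sketch that converts one line of a YOLOv5 label-style text output into an MD-style detection record. It rests on several assumptions that should be checked against the MegaDetector output format documentation: that the YOLOv5 line is "class x_center y_center width height [conf]" in normalized coordinates with zero-based class indices, and that MD results use one-based string categories with boxes as [x_min, y_min, width, height]. The function name is hypothetical.

```python
def yolo_line_to_md_detection(line):
    """Convert one YOLOv5 text-output line to an MD-style detection dict (sketch)."""
    parts = line.split()
    cls = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])

    detection = {
        # Assumption: MD categories are one-based strings, YOLOv5 classes are zero-based.
        "category": str(cls + 1),
        # Assumption: MD boxes are [x_min, y_min, width, height], normalized.
        "bbox": [x_center - width / 2, y_center - height / 2, width, height],
    }
    # The confidence column is only present if YOLOv5 was asked to save it.
    if len(parts) > 5:
        detection["conf"] = float(parts[5])
    return detection


print(yolo_line_to_md_detection("0 0.5 0.5 0.25 0.5 0.875"))
```

This only covers a single detection. The full results file that run_detector_batch.py writes has more structure around it, which is why using that script directly is the simpler route.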

> For resuming training, there is no special requirement for the model, and we can just train with any YOLOv5 script?

As far as I know, that's true... but as far as I know, you're the first person to try this. :) Let everyone know how it goes!


(Comment originally posted by agentmorris)
