Saving Early Stopping Patience Value in last.pt Checkpoint #13173

mabubakarsaleem opened this issue Jul 7, 2024 · 2 comments
Labels
question Further information is requested Stale

Comments

@mabubakarsaleem

Search before asking

Question

Hello,

I have a question regarding the checkpointing mechanism in YOLOv5, specifically related to saving and resuming the training process.
When training a YOLOv5 model, the last.pt checkpoint saves the model's weights and optimizer state. However, it appears that training process parameters, such as the early stopping patience value, are not included in this checkpoint.
If my training is interrupted and I restart from the last.pt checkpoint, does the patience value reset to zero, or does it continue from the previously recorded value?

Additional

No response

@mabubakarsaleem mabubakarsaleem added the question Further information is requested label Jul 7, 2024
@glenn-jocher
Member

@mabubakarsaleem hello,

Thank you for your question and for thoroughly searching the issues and discussions beforehand!

Currently, the last.pt checkpoint in YOLOv5 saves the model's weights and optimizer state but does not include training process parameters such as the early stopping patience value. Therefore, if your training is interrupted and you restart from the last.pt checkpoint, the patience value will reset to its initial state rather than continuing from the previously recorded value.

To maintain the early stopping patience value across training sessions, you can manually track this parameter and adjust it when resuming training. Here's a simple way to do this:

  1. Save the Patience Value: Before interrupting the training, save the current patience value to a file.
  2. Load the Patience Value: When resuming training, read the saved patience value and set it accordingly.

Here's a code snippet to illustrate this:

# Assumes `early_stopping` is the early-stopping object used by your training loop.

# Save patience value before interrupting training
with open('patience_value.txt', 'w') as f:
    f.write(str(early_stopping.patience))

# Load patience value when resuming training
with open('patience_value.txt') as f:
    early_stopping.patience = int(f.read().strip())
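
Note that the configured patience is a fixed threshold; what changes during training is how many epochs have passed since the best result. The following is a minimal sketch of persisting that running state together with the threshold, assuming a simplified stopper. `PatienceStopper`, its `step`/`save`/`load` methods, and the `stopper_state.json` file name are all hypothetical illustrations and are not part of YOLOv5.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class PatienceStopper:
    """Hypothetical patience-based early stopper (illustration only, not YOLOv5's class)."""

    patience: int = 30         # threshold: epochs allowed without improvement
    best_fitness: float = 0.0  # best validation fitness seen so far
    best_epoch: int = 0        # epoch at which best_fitness was recorded

    def step(self, epoch: int, fitness: float) -> bool:
        """Record this epoch's fitness and return True when training should stop."""
        if fitness >= self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        return (epoch - self.best_epoch) >= self.patience

    def save(self, path) -> None:
        """Write the threshold and the running state to a JSON sidecar file."""
        Path(path).write_text(json.dumps(asdict(self)))

    @classmethod
    def load(cls, path) -> "PatienceStopper":
        """Rebuild a stopper from a file written by save()."""
        return cls(**json.loads(Path(path).read_text()))


# Assumed sidecar location, kept separate from the model checkpoint
state_file = Path(tempfile.gettempdir()) / "stopper_state.json"

# First session: the best fitness (0.60) occurs at epoch 1, then training is interrupted
stopper = PatienceStopper(patience=3)
for epoch, fitness in enumerate([0.50, 0.60, 0.55, 0.58]):
    stopper.step(epoch, fitness)
stopper.save(state_file)

# Resumed session: the restored stopper still remembers epoch 1 as the best epoch
resumed = PatienceStopper.load(state_file)
stop_resumed = resumed.step(4, 0.57)                    # True: 3 epochs since the best
stop_fresh = PatienceStopper(patience=3).step(4, 0.57)  # False: the count starts over
```

With the state restored, the fourth non-improving epoch triggers the stop, whereas a freshly constructed stopper treats epoch 4 as a new best and starts counting again. Keeping the state in a separate file avoids changing the checkpoint format.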

Additionally, I encourage you to verify that you are using the latest versions of torch and the YOLOv5 repository to ensure you have the most up-to-date features and bug fixes. You can update YOLOv5 with the following commands:

git pull  # update YOLOv5
pip install -U torch  # update PyTorch

If you have any further questions or need additional assistance, feel free to ask. The YOLO community and the Ultralytics team are here to help!

Contributor

github-actions bot commented Aug 8, 2024

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label Aug 8, 2024