
Retrain a custom model or train a new one #12883

Closed
1 task done
JrGxllxgo opened this issue Apr 4, 2024 · 6 comments
Labels
question Further information is requested Stale

Comments

@JrGxllxgo

Search before asking

Question

Hello everyone!

I have been training a segmentation model. My dataset consists of roughly 2,500 training images and 700 validation images, with 2 classes to detect. I have trained it for 150 epochs and roughly 60% of the predictions are correct, but I want better results.

Creating a new dataset that adds more images of the cases the model doesn't predict well and retraining the model might be a good idea, but I'm not sure; maybe training a new model from scratch on the expanded data is better.

What do you recommend?

Thank you all!

Additional

No response

@JrGxllxgo JrGxllxgo added the question Further information is requested label Apr 4, 2024
@glenn-jocher
Member

@JrGxllxgo hello! 😊

Great to hear about your progress with training a segmentation model. Improving model performance is often a journey of iteration and experimentation.

Given your scenario, expanding your dataset with more images, especially those where the model currently underperforms, is a good strategy. Instead of training a new model from scratch, retraining your existing model with this enriched dataset can help. This approach leverages what the model has already learned and further fine-tunes it on the new challenging cases, potentially leading to better results.

Here’s a simple way to proceed:

  1. Augment your dataset with more images, focusing on the cases with poor predictions.
  2. Continue training your current model with the expanded dataset for additional epochs.

This method is generally more efficient and effective than starting anew since it builds on the existing knowledge base of your model.
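As a sketch of step 1, hard-example mining can be scripted by ranking validation images by their per-image score and collecting the worst ones for review and further labeling. Everything here is hypothetical — the file names and the `scores` dict stand in for output from your own evaluation loop:

```python
from pathlib import Path
import shutil

def select_hard_examples(scores, threshold=0.5):
    """Return image names whose per-image score (e.g. mask IoU) falls
    below `threshold` -- candidates to collect more examples of."""
    return sorted(name for name, iou in scores.items() if iou < threshold)

# Hypothetical per-image validation scores from your own eval loop.
scores = {"img_001.jpg": 0.82, "img_002.jpg": 0.31, "img_003.jpg": 0.48}
hard = select_hard_examples(scores, threshold=0.5)
print(hard)  # → ['img_002.jpg', 'img_003.jpg']

# Copy the hard cases into a folder for review / relabeling (paths are placeholders):
# for name in hard:
#     shutil.copy(Path("val/images") / name, Path("hard_cases") / name)
```

The sorted list makes the selection deterministic, so the same evaluation run always produces the same fine-tuning set.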

If you have further questions or need clarification, feel free to ask. Happy training!

@JrGxllxgo
Author

Hello @glenn-jocher !!

Thanks for your recommendation! I'll augment my dataset. How many images do you recommend I scale up to?

When I start the retraining, I suppose the command to do it is "python segment/train.pt --data path_to/my_yaml.yaml --weights myTrainedModel.pt --img 640", is that right?

My last question is whether there is any tip or parameter to make training faster. Right now each epoch takes about 1 hour, and reducing that time would be amazing. Is there anything I can do?

Thank you for everything!!!

@glenn-jocher
Member

Hello @JrGxllxgo!

Thrilled to see your enthusiasm! Let's tackle your questions:

  1. Dataset Size: There's no one-size-fits-all number for how many images you should add. It heavily depends on your model's current performance and the variability of your data. As a general rule, try to balance or double the number of images in scenarios where your model underperforms. However, quality over quantity always matters. Ensure the added images genuinely introduce new information or challenge the model in meaningful ways.

  2. Retrain Command: Your command is almost right; you just need to point it at the Python script rather than a .pt checkpoint file. For a segmentation model, the training script is segment/train.py:

    python segment/train.py --data path_to/my_yaml.yaml --weights myTrainedModel.pt --img 640

    Make sure to replace segment/train.pt with segment/train.py.

  3. Speeding Up Training: To reduce training time, consider the following:

    • Tune Batch Size: A larger batch size (as large as your GPU memory allows) keeps the GPU better utilized and usually shortens each epoch; if you hit out-of-memory errors, reduce it or lower --img.
    • Use a Faster GPU: Hardware improvements can significantly reduce training time.
    • Freeze Early Layers: Freezing the initial layers of the model can speed up training, since these layers tend to learn generic features that don't need further updates during fine-tuning.

Remember, improving training time may sometimes come at the cost of model accuracy or generalization, so balance is key.

Hope this helps, and happy training!

@JrGxllxgo
Author

Thanks for your help! @glenn-jocher

I have an "NVIDIA GeForce RTX 3060 Laptop GPU". How can I improve training time with that? And one last question before my training run: how can I freeze those early layers?

@glenn-jocher
Member

@JrGxllxgo, glad to assist! 😊 With your NVIDIA GeForce RTX 3060, you're already equipped with a robust GPU for training. To further optimize training time:

  1. Batch Size and Workers: Increase your batch size and number of workers in your training command as long as your GPU can handle it without running out of memory. This utilizes your GPU more efficiently.

  2. Mixed Precision Training: Recent YOLOv5 releases enable automatic mixed precision (AMP) by default on NVIDIA GPUs that support it, which can significantly speed up training. Just make sure your PyTorch and CUDA installation supports AMP.
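For intuition, mixed precision follows the standard PyTorch autocast/GradScaler pattern. This is a generic sketch, not YOLOv5's actual training loop; the tiny model and random data are placeholders, and the AMP path only activates when a CUDA GPU is present:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # AMP speedups need a CUDA GPU; falls back cleanly on CPU

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)  # rescales losses to avoid fp16 underflow

x = torch.randn(8, 16, device=device)
y = torch.randint(0, 2, (8,), device=device)

for _ in range(3):
    optimizer.zero_grad()
    # Forward pass runs in reduced precision when AMP is enabled.
    with torch.autocast(device_type=device, enabled=use_amp):
        loss = nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()  # backward on the scaled loss
    scaler.step(optimizer)         # unscales gradients, then steps
    scaler.update()

print(float(loss))
```

The scaler is what keeps small gradients from flushing to zero in float16; with `enabled=False` it degenerates to a plain training step.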

Regarding freezing early layers, YOLOv5 supports this directly through the --freeze argument of the training script, e.g. --freeze 10 freezes the first 10 layers (the backbone) so only the remaining layers are updated. Frozen layers skip gradient updates, which reduces per-step computation, though fine-tuning the whole model usually yields the best accuracy.
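Under the hood, freezing amounts to switching off gradients for the chosen parameters. A generic PyTorch sketch with a toy stand-in model (the real YOLOv5 backbone is of course much larger):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3),   # "early" layers: pretend backbone
    nn.Conv2d(16, 32, 3),
    nn.Conv2d(32, 2, 1),   # "head": the part we keep training
)

# Freeze everything except the last layer.
for layer in list(model)[:-1]:
    for p in layer.parameters():
        p.requires_grad = False  # optimizer will skip these parameters

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable {trainable} / total {total}")
```

Only parameters with `requires_grad=True` accumulate gradients and get updated, so the backward pass through the frozen layers becomes much cheaper.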

Happy training! 🚀

Contributor

github-actions bot commented May 9, 2024

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the Ultralytics Docs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label May 9, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 19, 2024