Understanding the recall and precision curve #12627
@mansi-aggarwal-2504 hello! Thank you for your kind words and for reaching out with your question. The interpolation in `ap_per_class` maps the raw precision and recall curves onto a fixed confidence grid so metrics can be read off at any threshold. For your use case, if you're interested in the overall recall and precision of your model on the dataset at a specific confidence threshold, you can indeed use the `recall` and `precision` arrays. To get the overall metrics for your dataset, you would look at the values in these arrays corresponding to your chosen confidence threshold. If you want to calculate these metrics at a confidence threshold that wasn't directly evaluated during testing, you can use the interpolated values, which is what the code snippet you provided is doing. For more detailed information on how to interpret and use these metrics, please refer to our documentation at https://docs.ultralytics.com/yolov5/. I hope this helps, and if you have any more questions, feel free to ask. Happy detecting! 😊
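As an illustration of reading overall precision/recall at a chosen threshold from interpolated curves, here is a minimal sketch with toy numbers. It mirrors the `np.interp(-px, -conf, ...)` pattern used in `ap_per_class` (negation makes the x-axis increasing), but the detection results and ground-truth count are made up for the example:

```python
import numpy as np

# Hypothetical per-detection results, sorted by descending confidence.
conf = np.array([0.95, 0.90, 0.80, 0.60, 0.40, 0.20])
tp = np.array([1, 1, 0, 1, 0, 1])  # 1 = true positive, 0 = false positive
n_gt = 5                           # number of ground-truth objects

tpc = tp.cumsum()                  # cumulative true positives
fpc = (1 - tp).cumsum()            # cumulative false positives
recall = tpc / n_gt                # recall curve over the ranked detections
precision = tpc / (tpc + fpc)      # precision curve over the ranked detections

# Interpolate both curves onto a fixed confidence grid.
px = np.linspace(0, 1, 1000)
r_curve = np.interp(-px, -conf, recall, left=0)     # no detections above max conf
p_curve = np.interp(-px, -conf, precision, left=1)  # vacuous precision above max conf

# Read off the metrics at a chosen threshold, e.g. conf = 0.5.
i = np.abs(px - 0.5).argmin()
overall_recall, overall_precision = r_curve[i], p_curve[i]
```

The key point is that `r_curve` and `p_curve` are defined at every grid confidence, even ones never observed among the raw detections.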
Hello,

Results on test dataset 1: [screenshot not recovered]

And when I increased the max detections (test dataset 1): [screenshot not recovered]

Here, I see a jump in recall but not a significant change in F1. Is the confidence threshold the major driver here? How could I interpret this better, and enhance my overall F1 score?

(Side note and query: these results are from a model trained on more data. I increased the dataset since I wasn't getting good recall, thinking that more data would enhance feature extraction and help increase recall. However, the confidence threshold for the recall curve doesn't increase; the highest recall is still achieved at a low confidence.)

EDIT: Results with the default 300 max det (test dataset 2): [screenshot not recovered]

Results when I updated max det to 1000 (test dataset 2): [screenshot not recovered]

My recall increased, but why is my precision intact? It is picking up a lot more particles. Could someone help me understand this behaviour?
Hello again @mansi-aggarwal-2504,

It's great to see you're diving deep into the performance of your model! When you increase the maximum detections (`max_det`), the model is allowed to keep more predictions per image, which generally raises recall.

The confidence threshold is indeed a major driver in determining precision and recall. A lower threshold generally increases recall (more detections are considered), but it can also decrease precision (more false positives). Conversely, a higher threshold can increase precision (fewer false positives) but decrease recall (fewer detections overall). To enhance your overall F1 score, you could tune the confidence threshold toward the point where precision and recall balance, i.e. the peak of the F1-confidence curve.
Regarding the side note, adding more data can indeed improve recall if the additional data helps the model learn to detect objects it previously missed. However, the highest recall occurring at a low confidence threshold suggests that your model may be outputting many detections with low confidence that happen to be correct. This could be a sign that your model is uncertain and could benefit from further training or data.

For the EDIT part, the `max_det` setting caps how many detections are kept per image, so raising it lets more lower-ranked detections through. If your precision remains unchanged after increasing `max_det`, the extra detections are likely being matched to ground-truth objects at roughly the same rate as before, so the ratio of true positives to all detections holds steady. To better understand your model's behavior, you might want to look at the distribution of confidence scores for the detections and see if there's a natural cutoff point that could inform a more appropriate confidence threshold.

I hope this clarifies your questions. Keep up the good work! 😊
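Locating that balance point programmatically is straightforward once you have precision and recall on a confidence grid. This is a toy sketch using synthetic curves (not real model output) rather than the actual arrays produced during validation:

```python
import numpy as np

# Synthetic precision/recall curves on a confidence grid px, standing in
# for the interpolated curves a validation run would produce.
px = np.linspace(0, 1, 1000)
r_curve = np.clip(1.0 - px, 0, 1)        # recall falls as confidence rises
p_curve = np.clip(0.5 + 0.5 * px, 0, 1)  # precision rises with confidence

# F1 at every candidate threshold; the small epsilon avoids 0/0.
f1 = 2 * p_curve * r_curve / (p_curve + r_curve + 1e-16)

best = f1.argmax()
best_conf, best_f1 = px[best], f1[best]
```

For these synthetic curves the optimum lands at a fairly low confidence, which matches the observation in the thread that peak recall (and here peak F1) can occur well below conf = 0.5.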
Hello @glenn-jocher, thanks for this awesome YOLOv5 repo! I am working on a custom dataset and hoping I can reuse the existing evaluation code. Below are 3 screenshots of metric logs. [screenshots not recovered] I can wrap my head around the scenario where P and R are both 0 (case 1). I am a bit confused when the metrics say P=1 and R=0 (cases 2 & 3). Also, in cases 2 and 3, how should I interpret mAP being zero vs. non-zero?
Hello @ashwin-999! Thank you for your kind words and for reaching out with your evaluation questions 🙌. In evaluating object detection models like YOLOv5, Precision (P), Recall (R), and mAP (mean Average Precision) are the key metrics. Briefly, for your cases: P=1 with R=0 typically means that no detections survived at the reporting threshold, so precision is vacuously perfect (there are no false positives among zero detections) while every ground-truth object was missed. mAP can still be non-zero in that situation because it integrates over all confidence thresholds, so some detections may match ground truth at lower confidences; a zero mAP means no detections match at any confidence.
Interpreting these metrics effectively depends on analyzing them together and considering the balance (or imbalance) between detecting correctly (precision) and detecting most or all true objects (recall). High precision with low recall, or the reverse, usually indicates an area for model improvement. The key is finding a balance that suits your application's needs: sometimes detecting all objects (higher recall) is more critical than being highly accurate across fewer detections (precision), and other times the reverse is true. I hope this clarifies the metrics a bit! Keep experimenting and fine-tuning your model for the best balance that fits your use case. Happy detecting! 😊
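One way the P=1, R=0 corner case can arise mechanically is shown in this minimal sketch. It is not the exact YOLOv5 computation, just the raw-count definitions with the common convention that precision is 1.0 when there are no detections at all:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from raw counts.

    With zero detections (tp + fp == 0), precision is reported as 1.0
    by convention: there are no false positives to penalize. This is
    how a log line of P=1, R=0 can appear when nothing was detected.
    """
    p = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    r = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return p, r

# 5 wrong detections, none of 10 objects found -> P=0, R=0 (case 1 shape)
case1 = precision_recall(tp=0, fp=5, fn=10)

# No detections at all, 10 objects missed -> P=1, R=0 (cases 2/3 shape)
case23 = precision_recall(tp=0, fp=0, fn=10)
```

Note that R=0 together with P=1 strictly implies zero kept detections: if even one detection were a true positive, recall would be nonzero.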
Question
Hello!
Thank you for the great architecture.
I wanted to understand the `ap_per_class` function in the metrics.py module, particularly the following snippet: [code snippet not recovered]

Could someone help me understand the interpolation? If we want the absolute (?) recall and precision, can I simply use `recall` and `precision`?

Additional
I just have one class in my custom dataset and multiple instances of that class in each image. Say there are 100 images and 10 instances (arbitrary numbers) of the class in each image; I just want to know how many particles got detected out of these 1,000 instances, how many of the detections are correct, what we missed, etc. Basically, the recall and precision of the model for a given dataset (at a given confidence).
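For the single-class counting described above, the dataset-level metrics at a fixed confidence reduce to a few counts. This sketch uses entirely hypothetical numbers; the matching itself (which detection pairs with which ground truth, typically by IoU) is what the evaluation code actually does:

```python
import numpy as np

# Toy single-class evaluation at one fixed confidence threshold.
# Hypothetical: 100 images x 10 instances = 1000 ground-truth particles.
n_gt = 1000

# matched[i] is True if ground-truth instance i was matched to a detection.
matched = np.arange(n_gt) % 5 != 0  # pretend 4 of every 5 were found
n_spurious = 120                    # hypothetical detections matching nothing

tp = int(matched.sum())             # instances detected
fn = n_gt - tp                      # instances missed
fp = n_spurious                     # incorrect detections

recall = tp / (tp + fn)             # fraction of the 1000 instances found
precision = tp / (tp + fp)          # fraction of detections that are correct
```

With these made-up counts, `tp=800` and `fn=200`, so recall is 0.8 while precision depends only on how many of the kept detections were matched.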
Thanks.