How to print the TP,FP,FN,TN in the terminal? #1251

Closed
ZwNSW opened this issue Oct 31, 2020 · 37 comments · Fixed by #5727
Comments

@ZwNSW

ZwNSW commented Oct 31, 2020

❔Question

I want to get the TP values to analyze my own dataset, but I cannot figure out how to output them after running test.py.


@ZwNSW added the question label Oct 31, 2020
@github-actions
Contributor

github-actions bot commented Oct 31, 2020

Hello @ZwNSW, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook (Open In Colab), Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@glenn-jocher
Member

@ZwNSW TP and FP vectors are computed here:

yolov5/utils/general.py

Lines 250 to 319 in c8c5ef3

def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, fname='precision-recall_curve.png'):
    """ Compute the average precision, given the recall and precision curves.
    Source: https://github.com/rafaelpadilla/Object-Detection-Metrics.
    # Arguments
        tp: True positives (nparray, nx1 or nx10).
        conf: Objectness value from 0-1 (nparray).
        pred_cls: Predicted object classes (nparray).
        target_cls: True object classes (nparray).
        plot: Plot precision-recall curve at mAP@0.5
        fname: Plot filename
    # Returns
        The average precision as computed in py-faster-rcnn.
    """

    # Sort by objectness
    i = np.argsort(-conf)
    tp, conf, pred_cls = tp[i], conf[i], pred_cls[i]

    # Find unique classes
    unique_classes = np.unique(target_cls)

    # Create Precision-Recall curve and compute AP for each class
    px, py = np.linspace(0, 1, 1000), []  # for plotting
    pr_score = 0.1  # score to evaluate P and R https://github.com/ultralytics/yolov3/issues/898
    s = [unique_classes.shape[0], tp.shape[1]]  # number class, number iou thresholds (i.e. 10 for mAP0.5...0.95)
    ap, p, r = np.zeros(s), np.zeros(s), np.zeros(s)
    for ci, c in enumerate(unique_classes):
        i = pred_cls == c
        n_gt = (target_cls == c).sum()  # Number of ground truth objects
        n_p = i.sum()  # Number of predicted objects

        if n_p == 0 or n_gt == 0:
            continue
        else:
            # Accumulate FPs and TPs
            fpc = (1 - tp[i]).cumsum(0)
            tpc = tp[i].cumsum(0)

            # Recall
            recall = tpc / (n_gt + 1e-16)  # recall curve
            r[ci] = np.interp(-pr_score, -conf[i], recall[:, 0])  # r at pr_score, negative x, xp because xp decreases

            # Precision
            precision = tpc / (tpc + fpc)  # precision curve
            p[ci] = np.interp(-pr_score, -conf[i], precision[:, 0])  # p at pr_score

            # AP from recall-precision curve
            for j in range(tp.shape[1]):
                ap[ci, j], mpre, mrec = compute_ap(recall[:, j], precision[:, j])
                if j == 0:
                    py.append(np.interp(px, mrec, mpre))  # precision at mAP@0.5

    # Compute F1 score (harmonic mean of precision and recall)
    f1 = 2 * p * r / (p + r + 1e-16)

    if plot:
        py = np.stack(py, axis=1)
        fig, ax = plt.subplots(1, 1, figsize=(5, 5))
        ax.plot(px, py, linewidth=0.5, color='grey')  # plot(recall, precision)
        ax.plot(px, py.mean(1), linewidth=2, color='blue', label='all classes %.3f mAP@0.5' % ap[:, 0].mean())
        ax.set_xlabel('Recall')
        ax.set_ylabel('Precision')
        ax.set_xlim(0, 1)
        ax.set_ylim(0, 1)
        plt.legend()
        fig.tight_layout()
        fig.savefig(fname, dpi=200)

    return p, r, ap, f1, unique_classes.astype('int32')

@ZwNSW
Author

ZwNSW commented Nov 1, 2020

@glenn-jocher Thanks for your answer. When I run the code you pointed to, I get an error like this:
" ap[ci, j], mpre, mrec = compute_ap(recall[:, j], precision[:, j])
TypeError: cannot unpack non-iterable numpy.float64 object"
I guess this is an error during the assignment, but I don't know how to correct it.

@ZwNSW
Author

ZwNSW commented Nov 1, 2020

When I print i, tpc, fpc, and their types to the terminal, I can't tell which part is tpc and which is fpc, because the terminal output is as shown in the figure. I want to get TP, FP, and FN for each category, and I look forward to your answer.
[screenshot]

@github-actions
Contributor

github-actions bot commented Dec 2, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions bot added the Stale label Dec 2, 2020
github-actions bot closed this as completed Dec 7, 2020
@rita9410

rita9410 commented Mar 5, 2021

@ZwNSW did you manage to find out the values of tp, fp and fn?

@glenn-jocher
Member

@rita9410 TP and FP vectors are computed here:

yolov5/utils/general.py

Lines 250 to 319 in c8c5ef3


@rita9410

rita9410 commented Mar 5, 2021

@glenn-jocher OK, but how can I print these variables for each class?

@glenn-jocher
Member

glenn-jocher commented Mar 5, 2021

@rita9410 YOLOv5 TP and FP vectors are computed here:

yolov5/utils/general.py

Lines 250 to 319 in c8c5ef3


They don't print out by default; you'd have to introduce some custom code to see them.

fpc and tpc are the FP and TP arrays of shape (n, 10), for the 10 IoU thresholds of 0.5:0.95. The last row gives the cumulative FP and TP count per IoU threshold:

tpc.shape
Out[3]: (3444, 10)
fpc.shape
Out[4]: (3444, 10)
tpc[-1]
Out[5]: array([138, 124, 105,  91,  80,  66,  54,  38,  22,   9])
fpc[-1]
Out[6]: array([3306, 3320, 3339, 3353, 3364, 3378, 3390, 3406, 3422, 3435])

So at 0.5 iou and 0.001 confidence threshold, for class 0, dataset inference results in 138 TPs and 3306 FPs.
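If you just want these final counts printed per class, a minimal debugging sketch (not repository code) could go inside the class loop of ap_per_class(), right after tpc and fpc are computed, reusing the variable names from the snippet above:

# Hypothetical print statements for the per-class loop of ap_per_class().
# tpc[-1] / fpc[-1] are the cumulative TP/FP counts over all predictions of class c,
# one value per IoU threshold (0.50:0.95); FN at IoU=0.5 follows from the label count.
print(f'class {int(c)}: labels={int(n_gt)}, predictions={int(n_p)}')
print(f'  TP per IoU threshold: {tpc[-1].astype(int)}')
print(f'  FP per IoU threshold: {fpc[-1].astype(int)}')
print(f'  FN at IoU=0.5: {int(n_gt - tpc[-1, 0])}')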

@dariogonle

dariogonle commented Jun 10, 2021

@rita9410 YOLOv5 TP and FP vectors are computed here: yolov5/utils/general.py, Lines 250 to 319 in c8c5ef3 ... So at 0.5 IoU and 0.001 confidence threshold, for class 0, dataset inference results in 138 TPs and 3306 FPs.

@glenn-jocher, is there a way to print the TPs and FPs at a certain confidence? I mean, I'd like to know the number of TPs and FPs at 0.5 IoU and 0.60 confidence. Is that possible?

And what does the 3444 mean in the tpc.shape output?

@glenn-jocher
Member

@dariogonle the example you copied already prints them out at 10 different IoUs from 0.5 to 0.95. Results are evaluated at the --conf you supply.
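If you need the counts at some other confidence (say 0.60), a minimal sketch (hypothetical, not repository code) could be added inside the same per-class loop, reusing the interpolation trick the P/R curves already use; conf[i], tpc, and fpc are the local variables from the snippet above, the 3444 rows are simply the class-c predictions sorted by descending confidence, and 0.60 is just an example value:

# Hypothetical addition inside the per-class loop of ap_per_class(), after tpc/fpc
# (np is already imported in the module). Interpolating over -conf gives the
# cumulative TP/FP counts at any confidence cut-off.
conf_t = 0.60                                        # example confidence threshold
tp_at = np.interp(-conf_t, -conf[i], tpc[:, 0])      # TP count at IoU=0.5
fp_at = np.interp(-conf_t, -conf[i], fpc[:, 0])      # FP count at IoU=0.5
print(f'class {int(c)} @ conf>{conf_t}: TP={tp_at:.0f}, FP={fp_at:.0f}')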

@dariogonle

dariogonle commented Jun 10, 2021

@glenn-jocher thank you for your response, but I would like the following: when I run a test I get the following precision and recall.
[screenshot]

I'd like to know the number of TPs and FPs used to calculate that precision and recall. The precision and recall given are for a certain confidence (the one that maximizes F1), 0.75 in this case. When I run this test (default conf-thres = 0.001) I get the following TPs and FPs.

[screenshot]

So the supposed precision, for iou=0.5, should be P = 262/(262+1984) = 0.11, but the output precision is 0.89. If I print the n_l variable in ap_per_class I get 284, so the recall should be R = 262/284 = 0.92, but the output recall is 0.78.

I was wondering how recall and precision are calculated. I guess that the TPs and FPs shown in the second image are for all confidences, but the precision and recall shown in the first image are for a certain confidence level (0.75 in this case, because that is the value that maximizes F1). That's why I asked whether there is a way to output the TPs and FPs for a certain confidence.

I'm sure that I'm missing something, but I can't figure it out. Thank you in advance.

@glenn-jocher
Member

@dariogonle see metrics.py for P and R computation.

yolov5/utils/metrics.py

Lines 19 to 79 in 5c32bd3

def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir='.', names=()):
    """ Compute the average precision, given the recall and precision curves.
    Source: https://github.com/rafaelpadilla/Object-Detection-Metrics.
    # Arguments
        tp: True positives (nparray, nx1 or nx10).
        conf: Objectness value from 0-1 (nparray).
        pred_cls: Predicted object classes (nparray).
        target_cls: True object classes (nparray).
        plot: Plot precision-recall curve at mAP@0.5
        save_dir: Plot save directory
    # Returns
        The average precision as computed in py-faster-rcnn.
    """

    # Sort by objectness
    i = np.argsort(-conf)
    tp, conf, pred_cls = tp[i], conf[i], pred_cls[i]

    # Find unique classes
    unique_classes = np.unique(target_cls)
    nc = unique_classes.shape[0]  # number of classes, number of detections

    # Create Precision-Recall curve and compute AP for each class
    px, py = np.linspace(0, 1, 1000), []  # for plotting
    ap, p, r = np.zeros((nc, tp.shape[1])), np.zeros((nc, 1000)), np.zeros((nc, 1000))
    for ci, c in enumerate(unique_classes):
        i = pred_cls == c
        n_l = (target_cls == c).sum()  # number of labels
        n_p = i.sum()  # number of predictions

        if n_p == 0 or n_l == 0:
            continue
        else:
            # Accumulate FPs and TPs
            fpc = (1 - tp[i]).cumsum(0)
            tpc = tp[i].cumsum(0)

            # Recall
            recall = tpc / (n_l + 1e-16)  # recall curve
            r[ci] = np.interp(-px, -conf[i], recall[:, 0], left=0)  # negative x, xp because xp decreases

            # Precision
            precision = tpc / (tpc + fpc)  # precision curve
            p[ci] = np.interp(-px, -conf[i], precision[:, 0], left=1)  # p at pr_score

            # AP from recall-precision curve
            for j in range(tp.shape[1]):
                ap[ci, j], mpre, mrec = compute_ap(recall[:, j], precision[:, j])
                if plot and j == 0:
                    py.append(np.interp(px, mrec, mpre))  # precision at mAP@0.5

    # Compute F1 (harmonic mean of precision and recall)
    f1 = 2 * p * r / (p + r + 1e-16)
    if plot:
        plot_pr_curve(px, py, ap, Path(save_dir) / 'PR_curve.png', names)
        plot_mc_curve(px, f1, Path(save_dir) / 'F1_curve.png', names, ylabel='F1')
        plot_mc_curve(px, p, Path(save_dir) / 'P_curve.png', names, ylabel='Precision')
        plot_mc_curve(px, r, Path(save_dir) / 'R_curve.png', names, ylabel='Recall')

    i = f1.mean(0).argmax()  # max F1 index
    return p[:, i], r[:, i], ap, f1[:, i], unique_classes.astype('int32')

@dariogonle

dariogonle commented Jun 10, 2021

@glenn-jocher I understand how precision and recall are calculated, but I don't understand the number of TPs and FPs that I get.

If R = TP/Targets then TP = R*Targets = 0.78 * 284 = 222. Then if P = TP / (TP + FP) then FP = 27. With these theoretical values for TP and FP it is possible to get a precision of 0.89 and a recall of 0.78. But the output of tpc[-1] and fpc[-1] does not match these values.

@glenn-jocher
Member

glenn-jocher commented Jun 10, 2021

@dariogonle FPs and TPs are computed in the code I provided above on L53 and L54
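To reconcile the two sets of numbers, here is a small illustrative sketch (not repository code; the function name and inputs are made up for this example). tpc[-1] and fpc[-1] are cumulative counts over every prediction kept at conf-thres=0.001, whereas the reported P and R are read off the curves at the single confidence that maximizes mean F1, so the counts implied at that operating point follow from P, R, and the label count:

# Illustrative helper: back out TP/FP/FN at the reported operating point from the
# per-class precision, recall, and ground-truth label count,
# using R = TP / labels and P = TP / (TP + FP).
def counts_at_operating_point(precision, recall, n_labels):
    tp = recall * n_labels                 # TP = R * labels
    fp = tp / (precision + 1e-16) - tp     # FP = TP / P - TP
    fn = n_labels - tp                     # FN = labels - TP
    return tp, fp, fn

# Example with the numbers discussed above (P=0.89, R=0.78, 284 labels):
tp, fp, fn = counts_at_operating_point(0.89, 0.78, 284)
print(round(tp), round(fp), round(fn))     # -> 222 27 62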

@glenn-jocher
Member

@ZwNSW @dariogonle good news 😃! Your original issue may now be fixed ✅ in PR #5727. This PR explicitly computes TP and FP from the existing Labels, P, and R metrics:

TP = Recall * Labels
FP = TP / Precision - TP

These TP and FP per-class vectors are left in val.py for users to access if they want:

yolov5/val.py

Line 240 in 36d12a5

tp, fp, p, r, f1, ap, ap_class = ap_per_class(*stats, plot=plots, save_dir=save_dir, names=names)
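If you want these printed per class, a minimal sketch (hypothetical, not part of the PR) could go right after that line in val.py; it assumes names (the class-name mapping) and nt (the per-class label counts from the usual bincount of target classes) are available, as in val.py's standard summary printout:

# Hypothetical addition after the ap_per_class() call shown above.
# tp and fp are already per-class counts at the chosen operating point;
# FN is derived from the per-class label counts.
for k, c in enumerate(ap_class):
    print(f'{names[c]:>20s}  TP={tp[k]:.0f}  FP={fp[k]:.0f}  FN={nt[c] - tp[k]:.0f}')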

To receive this update:

  • Git – git pull from within your yolov5/ directory, or git clone https://github.com/ultralytics/yolov5 again
  • PyTorch Hub – force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
  • Notebooks – view the updated notebooks (Open In Colab, Open In Kaggle)
  • Docker – sudo docker pull ultralytics/yolov5:latest to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!

glenn-jocher linked a pull request Nov 20, 2021 that will close this issue
@a227799770055

@glenn-jocher
If I want to know which images were FN or FP, can I get the image id or image name by modifying the code?
Thanks a lot!

@glenn-jocher
Member

@a227799770055 we're working on better results introspection tools to allow you to see the worst performing images in a validation set. This isn't near to release yet but should be rolled out over the next few months. I'll add a note to allow for sorting by different metrics like FN, FP, mAP etc.

@kalenmike

@irgipaulius

@glenn-jocher is there any progress on this?

@muhilhamms21

Hi, it looks like there's no progress here, but I want to ask a question because I recently ran into this issue and don't know how to solve it. I already print the TP and FP to the terminal, but the numbers printed don't add up with the labels in the prediction images. The TP and FP counts are lower than the labels present in the prediction images, and there are even cases where one of my classes gets many FPs in the images while the FP number printed in the terminal is 0. Does anyone know what causes this? Thank you in advance.

@glenn-jocher
Member

@Rsphyxs hello! It's possible that there's a discrepancy between the TP and FP values you computed and the labels in the prediction images due to differences in how the metrics are computed. It's also possible that your implementation of printing TP and FP is not accounting for all cases.

One way to investigate this further would be to manually compare the labels in the prediction images with the computed TP and FP to see if there are any mismatches. Additionally, you can try checking your implementation of printing TP and FP for any errors or bugs.

Hope this helps! Let us know if you have any more questions.

@muhilhamms21

muhilhamms21 commented May 19, 2023

Thank you so much for your reply @glenn-jocher. I already checked my implementation of printing TP and FP, since I followed your guide for it, and I don't think there is an issue there. But I have a follow-up question. I'm a newbie in object detection, so there are several things I probably don't know, but I understand that the FP value is based on TP, and TP gets its value from recall * labels, right? I just realized that the class I mentioned got 0 TP and 0 FP even though there are both TP and FP predictions for that class, and when I checked, the corresponding class has 0 recall. How is that possible, since as far as I know recall is how many TPs the class got out of all the correct labels? Below are images attached to give more information about this issue. Thank you in advance again. For additional information: I tested several images with different scenarios, and some of them give the right TP and FP results, but for several others the labels in the prediction images and the printed results don't match.

Labeled image for validation (sorry, I had to censor the face because it is personal)
[screenshot]

Predicted image
[screenshot]

Result I got
[screenshot]

@glenn-jocher
Member

@Rsphyxs, it's possible for a class to have a 0 recall value even if there are labels and predictions for that class. Recall is the proportion of true positive predictions out of all the positive ground-truth instances, which means that if a class ends up with only false negatives (i.e. missed detections) and no true positives at the evaluated operating point, the recall value will be 0 even if there are false positive detections for that class.

In your case, it appears that the class you mentioned has false negatives and no true positives, which is resulting in a 0 recall value. This could explain why the computed TP and FP values differ from the labels in some cases.

To investigate this further, you could try comparing the labels and predictions for that class in the validation set to see if there are any missed true positive instances. Additionally, you can try tweaking the model hyperparameters or augmentation strategies to improve the detection performance for that class.

Hope this helps! Let us know if you have any more questions or concerns.

@muhilhamms21

muhilhamms21 commented May 19, 2023

@glenn-jocher, thanks for the reply again, but I guess there is some mistake, since the RAM class in the image I provided has 1 TP (the one in the person's hand, with a 0.6 confidence score), 1 FP, and 1 FN, hence P and R should be 0.5, right? My first assumption for why the RAM is not detected as a TP was that the IoU threshold is too high, so I tried lowering it, but I still got the same result. Am I missing something?

@glenn-jocher
Member

@Rsphyxs, thanks for bringing this to my attention. I apologize for any confusion caused earlier. You're correct that a class with 1 TP, 1 FP, and 1 FN should have precision and recall values of 0.5.

It's possible that other factors are affecting the TP values for this class, such as the object's size or the location of the bounding box. Additionally, it's possible that there are bugs or inaccuracies in the TP and FP calculation code that could be contributing to the mismatch between the predicted and computed values.

To investigate this further, you could try manually examining the images and labels to see if there are any mismatches or inaccuracies in the detection results. Additionally, you can try tweaking the model's hyperparameters or adjusting the IoU threshold to see if this improves the detection performance for this class.

Hope this helps! Let us know if you have any more questions or concerns.

@muhilhamms21

Hi @glenn-jocher, can I ask a question about the precision/recall formulation? Can you explain this line of code, because I don't really get it:
[screenshot]
I already asked ChatGPT about this; is its explanation valid?
[screenshot]

Thank you in advance!

@glenn-jocher
Member

Hi @Rsphyxs,

Sure, I'd be happy to explain the line of code you're asking about. It is a thresholding operation that marks predicted bounding boxes as true positives (TPs) or false positives (FPs) based on their Intersection over Union (IoU) overlap with the ground-truth bounding boxes.

The iou > iouv expression checks whether the IoU between the predicted and ground-truth boxes is greater than a given IoU threshold iouv. If it is, the predicted box is counted as a TP at that threshold; if it isn't, the predicted box is counted as an FP.

Regarding the question you asked in the chat, yes the result looks valid based on the values being printed. However, please note that the values depend on many factors such as the chosen IoU threshold and the accuracy of the model's predictions.
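As a standalone illustration of that iou > iouv comparison (the IoU values below are made up; iouv follows the usual 0.5:0.95 definition), this is the kind of True/False matrix that ends up as the n×10 tp input to ap_per_class:

import numpy as np

iouv = np.linspace(0.5, 0.95, 10)       # the 10 IoU thresholds for mAP@0.5:0.95
iou = np.array([0.92, 0.57, 0.43])      # made-up best-match IoUs for three predictions

# Broadcasting compares each prediction's IoU against every threshold: a prediction
# counts as a TP only at the thresholds its IoU exceeds, and as an FP elsewhere.
correct = iou[:, None] > iouv[None, :]  # shape (3, 10), dtype bool
print(correct.astype(int))
# [[1 1 1 1 1 1 1 1 1 0]    IoU 0.92: TP at every threshold except 0.95
#  [1 1 0 0 0 0 0 0 0 0]    IoU 0.57: TP only at 0.50 and 0.55
#  [0 0 0 0 0 0 0 0 0 0]]   IoU 0.43: FP at every threshold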

Hope this helps! Let me know if you have any more questions or if you need further clarification.

@muhilhamms21

muhilhamms21 commented May 23, 2023

@glenn-jocher That's really helpful, thank you so much for your answer. But I want to ask again because I just ran into this problem when running a test with val.py. I wanted to test one of my classes, named RAM, on 10 images, with a result like this:
[screenshot]
But when I add 1 new image whose labels contain another class and no RAM label, the precision and recall for the RAM class become 0, like this:
[screenshot]
I have been trying to troubleshoot it for a while and still have no idea, but so far I assume it is because of the p[ci] and r[ci] calculation (for which I sent you the image before) that makes the result 0, since the mAP still gets the right value even though it uses the same precision and recall values. Do you know what causes it?

@glenn-jocher
Member

@Rsphyxs hello!

I'm glad that my previous answer was helpful for you. Regarding your recent issue with the RAM class having 0 precision and recall when testing with a mixed image, it's possible that the calculation for precision and recall for the RAM class is being affected by the presence of other classes in the image.

This could be due to:

  1. The presence of other classes is causing misclassification of the RAM objects in the image: If other classes in the same image are being mistakenly detected as RAM objects, then the number of false positives (FP) will increase, which affects the precision calculation and could cause it to become 0. Similarly, if true RAM objects in the image are not being detected, then the number of true positives (TP) will be zero, which affects the recall calculation and could also cause it to become 0.

  2. The presence of other classes is affecting the calculation of TP and FP values: It's possible that the presence of other classes is affecting the calculation of TP and FP values for the RAM class. This could occur if the calculation of TP and FP values for each class is not being done independently of other classes in the same image. If that's the case, then the TP and FP values for the RAM class could be incorrectly calculated.

To investigate this issue further, I recommend that you compare the ground truth labels and the predicted labels for each image, and try to identify any misclassifications or missed detections. Additionally, you can try running the test with only those images that contain RAM class objects and see how the precision and recall values change.

If you're still having trouble, feel free to provide some additional information or share more details about your implementation. I'd be happy to assist you further.

@muhilhamms21

Hi @glenn-jocher, thanks again for your response, but I have a follow-up question. If the precision and recall are really 0, shouldn't the mAP be 0 too, since the mAP value comes from precision and recall? In the image I gave you before, where precision and recall are 0, the mAP still has a non-zero value. So I compared the Precision-Confidence curve from the test where the precision isn't 0
[P_curve]

with the one where the precision is 0
[P_curve]

and the graphs look pretty similar, so I believe the precision and recall are not actually 0 but are just printed as 0. Is that possible?

@glenn-jocher
Member

@Rsphyxs hello!

Regarding your question about the mAP value when the precision and recall are both 0, it is possible for the mAP to have a non-zero value even if the precision and recall for a class are 0. The mAP value is a composite metric that takes into account the precision and recall values for all classes, so the contribution of a single class could be outweighed by the contributions of other classes.

Moreover, the precision-confidence graph you shared for the class where the precision is 0 looks similar to that where the precision is non-zero, which suggests that the precision and recall for the class are not actually zero but rather the values have been truncated to zero. This could be due to a problem with the way the TP and FP values are being calculated or how the precision and recall values are being printed.

If you could provide more details about the implementation, including the code that computes the TP and FP values and how you are printing the precision and recall values, it would be helpful in figuring out the problem.

Hope this helps! Let me know if you have any more questions or concerns.

@muhilhamms21

muhilhamms21 commented May 24, 2023

Hi @glenn-jocher, the code is actually just like what YOLOv5 uses, because I'm using YOLOv7, which is based on YOLOv5. I already checked and it is pretty much the same.

[EDIT]
I just got a glimpse of what is happening. I realized that in this line of code
[screenshot]
we pick the precision value of each class based on the index of the maximum of the F1 curve averaged over all classes, right? So the reason for the 0 precision and recall in the RAM class is that when I add a new image with different classes, it shifts the max of the mean F1 curve, so a higher index is picked for every class, and the RAM class has a 0 value at that index, as in this image.

Before adding the new image, the index taken for the RAM precision is 141
[screenshot]
Here is the F1-Confidence graph
[F1_curve]
As we can see, there is still a value in the RAM curve, since it takes an early index.

But after adding the new image, the index changes from 141 to 818
[screenshot]
Here is the F1 graph
[F1_curve]
which is why the max F1 score shifts heavily toward a higher index, where there is no longer any value for RAM.

Is there any reason why the overall F1 score over all classes is used rather than each class's own? And is it possible to change the code so it takes the index of each class's own max F1 score instead of the mean of all F1 scores? And if I do that, is the result still valid?

@glenn-jocher
Member

Hi @Rsphyxs,

I understand that you're facing an issue with the precision and recall values for a particular class in your YOLOv7 implementation, which is based on the YOLOv5 codebase. From the information you've provided, it seems that the max F1 score is being used to pick the precision and recall values for each class.

You speculated that the 0 value in precision and recall for the RAM class might be due to the fact that the max F1 score has shifted to a higher index after adding a new image. This occurs because the F1 scores and the max F1 score are calculated over all classes and not per class.

Regarding your question about why the mean F1 score over all classes is used to pick the reported precision and recall, rather than each class's own maximum, the reason is that the YOLOv5 and YOLOv7 codebases report all per-class P, R, and F1 values at a single shared operating point: the confidence index where the class-averaged F1 curve peaks. F1 is used to pick this point because it balances precision and recall. However, this approach may not suit every use case, and you may consider selecting the index per class instead.

If you do switch to a per-class max-F1 (or max-precision) index, the per-class values are still valid in themselves, and the mAP is unaffected because it is computed from the full precision-recall curve before this indexing. Keep in mind, though, that each class would then be reported at a different confidence threshold, so the numbers are no longer directly comparable to the default printout.
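To make the two choices concrete, a small illustrative sketch (not repository code; p, r, and f1 are assumed to be the (nc, 1000) per-class curves built inside ap_per_class over the 1000-point confidence grid, before the final indexing):

import numpy as np

def pick_operating_points(p, r, f1):
    # Default behaviour: one shared confidence index for all classes, chosen where
    # the class-averaged F1 curve peaks (the f1.mean(0).argmax() seen in metrics.py).
    i_shared = f1.mean(0).argmax()
    p_shared, r_shared = p[:, i_shared], r[:, i_shared]

    # Alternative: each class reports P/R at its own F1 peak.
    i_per_class = f1.argmax(1)              # one index per class
    rows = np.arange(f1.shape[0])
    p_own, r_own = p[rows, i_per_class], r[rows, i_per_class]
    return (p_shared, r_shared), (p_own, r_own)

With the per-class variant, a rare class such as RAM is no longer reported at a shared confidence where it has no predictions left, which matches the behaviour described above.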

Hope this helps! Let me know if you have any more questions or concerns.

@muhilhamms21

muhilhamms21 commented May 24, 2023

Hello @glenn-jocher, that is very helpful, thank you. But is there a way I can get the TP and FP without the final precision/recall values, since I really need TP and FP for each class? Can I get them from tpc and fpc?

@enesayan

enesayan commented Jun 2, 2023

Hello @glenn-jocher, that is very helpful, thank you. But is there a way I can get the TP and FP without the final precision/recall values, since I really need TP and FP for each class? Can I get them from tpc and fpc?

Did you solve the problem? I have the same problem, and there is an inconsistency between the test results and the confusion matrix results.

@muhilhamms21

Sadly no; for now I'm just using mAP@0.5 as the main metric for my project.

@glenn-jocher
Member

@enesayan hello,

We're sorry to hear that you are facing issues with the precision and recall values for your YOLOv5 implementation. We understand that this is an important metric for your project, and we would like to help you resolve this issue.

Regarding your question, yes: you can obtain per-class TP and FP counts without going through the final precision/recall values, either from the cumulative tpc/fpc arrays inside ap_per_class (as discussed earlier in this thread) or from the confusion matrix that val.py builds, whose entries count detections per predicted/true class pair.
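As a rough sketch of the confusion-matrix route (assuming the usual YOLOv5 ConfusionMatrix layout, an (nc+1, nc+1) count matrix whose last row/column is the background class; treat the attribute name and orientation as assumptions if your fork differs):

import numpy as np

def per_class_counts(matrix):
    """matrix: (nc+1, nc+1) confusion-matrix counts, rows = predicted, cols = true."""
    m = np.asarray(matrix)
    tp = m.diagonal()[:-1]        # detections matched to the correct class
    fp = m.sum(1)[:-1] - tp       # predictions of each class that were wrong (incl. background)
    fn = m.sum(0)[:-1] - tp       # labels of each class that were missed or misclassified
    return tp, fp, fn

# Usage idea, with confusion_matrix being the ConfusionMatrix instance built in val.py:
# tp, fp, fn = per_class_counts(confusion_matrix.matrix)

Also keep in mind that the confusion matrix is accumulated at its own confidence and IoU thresholds, which is a common reason its counts differ from the P/R summary printed by val.py.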

However, if there is an inconsistency between the test results and the confusion matrix results, this could indicate a problem with how the TP and FP values are being computed. Some common issues that could cause such discrepancies include incorrect ground truth labels, incorrect predicted labels, and inconsistent naming of classes.

To investigate this further, we recommend that you carefully review the ground truth and predicted labels for each image, and compare them with the labels in the confusion matrix. Additionally, you can try running more tests with different images to see if the same issue persists.

If you're still having trouble, please provide more information about your implementation and the specific issues you are facing so that we can assist you better.

Thank you.
