
roc curve #12981

Closed
1 task done
gchinta1 opened this issue May 3, 2024 · 5 comments
Labels
question Further information is requested

Comments

@gchinta1

gchinta1 commented May 3, 2024

Search before asking

Question

Hello Glenn, how are you? I am trying to make a ROC curve after running validation, but I am not managing it. I read your answers about metrics and test.py but I cannot find a solution. I tried to do it from the metrics while running validation, thinking it might work like the confusion matrix, but nothing. I also tried to build it from the confusion matrix numbers, but I only get a single point on the graph. Can I make a ROC curve from only one example, or do I need to use them all together? Also, the output txt files contain confidence numbers. Can you help me please? What do I need to do to get a ROC curve for my project? Thank you.

Additional

No response

@gchinta1 gchinta1 added the question Further information is requested label May 3, 2024
@glenn-jocher
Member

Hello! 😊

For generating a ROC curve after model validation, you’ll need a set of predictions and corresponding ground truth labels to compare against. The ROC curve cannot be accurately created from a single example; it requires aggregate data to evaluate performance across different thresholds.

Here’s a basic concept of what you need to do:

  1. Use your model to make predictions over your validation dataset.
  2. Save the model’s confidence scores for each prediction and the actual labels.
  3. Calculate True Positive Rate (TPR) and False Positive Rate (FPR) at various threshold levels.
  4. Plot TPR against FPR to form the ROC curve.

The output .txt files contain confidence scores as you noted, along with class predictions, which are what you need. You’ll have to compile this data from multiple examples to plot the curve.

If you require further detailed code or method implementations, I recommend checking out Python libraries like sklearn.metrics.roc_curve, which can greatly simplify these tasks.
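As a rough illustration of those steps, here is a minimal sketch using sklearn (the y_true and y_scores arrays below are made-up placeholders; substitute your own ground-truth labels and per-prediction confidence scores):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Placeholder data: 1 = positive ground truth, 0 = negative,
# paired with the model's confidence score for each prediction
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_scores = np.array([0.10, 0.35, 0.80, 0.65, 0.45, 0.92, 0.55, 0.20])

# roc_curve sweeps the decision threshold and returns the FPR/TPR pairs
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
roc_auc = auc(fpr, tpr)
print(f"AUC = {roc_auc:.3f}")
```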

Good luck with your project! 👍

@gchinta1
Author

gchinta1 commented May 4, 2024

Hello, I have code for 3 examples and it works very well, but when I put it in a loop over all the CSV files I have (I converted the outputs to CSV to make them easier to process), I get a NaN result. Also, I only use one class, one thing after detection, so the class is '0'. This is my code for the 3 examples, and I want to make it work for 200:

```python
import pandas as pd
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt

file_path_predictions_1 = 'labelsval1\cju30ajhw09sx0988qyahx9s8.csv'
predictions_data_1 = pd.read_csv(file_path_predictions_1, header=None)

file_path_predictions_2 = 'labelsval1\cju16fpvhzypl0799p9phnlx6.csv'
predictions_data_2 = pd.read_csv(file_path_predictions_2, header=None)

file_path_predictions_3 = 'labelsval1\cju8dn0c3u2v50801k8rvq02f.csv'
predictions_data_3 = pd.read_csv(file_path_predictions_3, header=None)

# Confidence scores are in the second column (column index 1) of each CSV
y_scores_1 = predictions_data_1[1]
y_scores_2 = predictions_data_2[1]
y_scores_3 = predictions_data_3[1]

all_scores = y_scores_1.tolist() + y_scores_2.tolist() + y_scores_3.tolist()

# Ground-truth labels: 0 for the first file, 1 for the other two
y_true = [0] * len(y_scores_1) + [1] * len(y_scores_2) + [1] * len(y_scores_3)
print(y_true)

fpr, tpr, thresholds = roc_curve(y_true, all_scores)
roc_auc = auc(fpr, tpr)

plt.figure()
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic for Three CSVs')
plt.legend(loc="lower right")
plt.show()
```

thank you

@glenn-jocher
Member

Hello! 😊

It looks like you're on the right track. The issue of receiving NaN results might occur if any of your CSV files have missing data, or possibly if the confidence scores are incorrect for generating ROC curves. Ensure that the datasets are clean and properly formatted before processing. Also, validate that the '1's and '0's are correctly assigned in your y_true list based on your intended class labels.
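As a quick sanity check before looping, you can flag any file whose score column contains missing or non-numeric values (a sketch, assuming the confidence scores sit in the second column of each CSV, as in your snippet). Also note that roc_curve warns and produces NaN rates when y_true ends up containing only a single class, so the combined y_true list must include both 0s and 1s:

```python
import glob

import pandas as pd

# Flag any CSV whose score column (column index 1) has missing or non-numeric values
for csv_file in glob.glob('labelsval1/*.csv'):
    data = pd.read_csv(csv_file, header=None)
    scores = data[1]
    if scores.isna().any() or not pd.api.types.is_numeric_dtype(scores):
        print(f"{csv_file}: missing or non-numeric confidence scores")
```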

Here’s a streamlined way to handle multiple CSV files for your scenario:

```python
import pandas as pd
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
import glob

# This will hold all your scores and true values
all_scores = []
y_true = []

# Loop through all CSV files in your directory
for csv_file in glob.glob('labelsval1/*.csv'):
    data = pd.read_csv(csv_file, header=None)
    scores = data[1].tolist()
    all_scores.extend(scores)
    # Ensure to update y_true based on your actual data specifics, using 0 or 1 accordingly.
    y_true.extend([class_label] * len(scores))  # Replace class_label with 0 or 1 as appropriate.

# Calculate ROC
fpr, tpr, thresholds = roc_curve(y_true, all_scores)
roc_auc = auc(fpr, tpr)

# Plotting
plt.figure()
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
plt.show()
```

This code will concatenate all the prediction scores from different files and their corresponding true labels (make sure to set those correctly), then calculate and plot the ROC curve. Ensure the directory path and file patterns match your setup!
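One way to fill in that class_label placeholder, as a sketch: the file_labels dictionary below is hypothetical (seeded with the three filenames and 0/1 labels from your earlier snippet) and should be populated from wherever your ground truth actually lives for all ~200 files, or replaced by whatever rule separates your positive and negative files:

```python
import glob
import os

import pandas as pd

# Hypothetical mapping from CSV filename to its ground-truth class (0 or 1);
# replace these entries with your real labels.
file_labels = {
    'cju30ajhw09sx0988qyahx9s8.csv': 0,
    'cju16fpvhzypl0799p9phnlx6.csv': 1,
    'cju8dn0c3u2v50801k8rvq02f.csv': 1,
}

all_scores, y_true = [], []
for csv_file in glob.glob('labelsval1/*.csv'):
    name = os.path.basename(csv_file)
    if name not in file_labels:
        continue  # skip files without a known label
    data = pd.read_csv(csv_file, header=None)
    scores = data[1].tolist()
    all_scores.extend(scores)
    y_true.extend([file_labels[name]] * len(scores))
```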

Hope this helps! 😊👍

@gchinta1
Author

gchinta1 commented May 8, 2024

Thank you for the help again, everything is running well.

@gchinta1 gchinta1 closed this as completed May 8, 2024
@glenn-jocher
Member

@gchinta1 you're welcome! I'm glad to hear everything is running smoothly. If you have any more questions or need further assistance down the line, feel free to reach out. Happy coding! 😊👍
