
Yolov5 confusion matrix with background FP=1 and TN=0 #11194

Closed

yiluny217 opened this issue Mar 19, 2023 · 24 comments

@yiluny217

yiluny217 commented Mar 19, 2023

Hello,

I was training a model to detect trucks in pictures, and below is the resulting confusion matrix on my val data. Following the convention for reading a confusion matrix, I'll call the upper-left cell TP, the upper-right cell FP, the lower-left cell FN, and the lower-right cell TN. (Please ignore the classes 'bicycle' and 'person': the original dataset only has trucks labeled, but 'truck' was assigned class=2 during manual annotation.)

confusion_matrix

For the 'background' column, the values are 1 and 0. I searched online and found a lot of people having the same issue; here are some examples:
yolov5 issue 10365
yolov5 issue 1665
stackoverflow
In yolov5 issue 1665, I noticed @glenn-jocher gave a brief explanation that 'columns are normalized', but I'm still quite confused. May I get a clearer explanation of why this happens, and is there a possible way to fix it?

Another thing bothering me: I didn't actually have any annotations of background in my training data, so I guess that's why TN=0?

@github-actions
Contributor

github-actions bot commented Mar 19, 2023

👋 Hello @yiluny217, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all dependencies in requirements.txt installed, including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of our up-to-date verified environments, with all dependencies (including CUDA/CUDNN, Python and PyTorch) preinstalled.

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics

@github-actions
Contributor

github-actions bot commented Apr 19, 2023

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@github-actions github-actions bot added Stale and removed Stale labels Apr 19, 2023
@github-actions
Contributor

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label May 23, 2023
@jbezovsek

> (quoting @yiluny217's original post above)

I have the same difficulty interpreting these results. My interpretation is that the columns do not depend on each other the way you might assume in a simple 2x2 confusion matrix, which should ideally look like [1, 0; 0, 1]. Because the columns are normalized, each column has to sum to 1, while a row can sum to more than 1. In your case, I would say that when the actual object was a truck, the model predicted truck 74% of the time and background 26% of the time. I am still a little confused about the background, because @glenn-jocher said here: #1665 (comment) that background is never predicted, so could that be the reason for the values in the background column?

@glenn-jocher
Member

@jbezovsek thank you for sharing your concerns about interpreting the confusion matrix for YOLOv5. It can be confusing to understand the results of the matrix, especially when dealing with single-class detection and background.

You are correct that the columns of this matrix do not depend on each other the way they would in a simple 2x2 confusion matrix. In the YOLOv5 plot, the columns represent the true classes and the rows the predicted classes, and each column is normalized, so the sum of each column equals 1.
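To see why the background column always comes out as [1, 0], here is a minimal NumPy sketch of column normalization. The counts are hypothetical, and this mirrors the normalization idea rather than YOLOv5's exact plotting code:

import numpy as np

# Hypothetical raw counts for one class plus background.
# Rows = predicted (truck, background); columns = true (truck, background).
# The background/background cell stays 0 because the model never emits
# an explicit "background" prediction for a background region.
counts = np.array([
    [74.0, 14.0],  # predicted truck: 74 real trucks, 14 spurious boxes
    [26.0, 0.0],   # predicted background: 26 missed trucks, nothing else
])

# Normalize each column so it sums to 1, as in the plotted matrix.
normalized = counts / (counts.sum(axis=0, keepdims=True) + 1e-9)
print(normalized)
# [[0.74 1.  ]
#  [0.26 0.  ]]  -> the background column is always [1, 0]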

Regarding the background class: the model never explicitly predicts 'background', so the background/background cell can never receive a count. That is why the TN value is always 0, regardless of how many background-only images your validation set contains.

Once again, thank you for your question, and please let us know if you have any further concerns about YOLOv5 or vision AI in general.

@github-actions github-actions bot removed the Stale label May 28, 2023
@ilhamalvindo

ilhamalvindo commented May 31, 2023

Hello @glenn-jocher, does that mean that, for a single class, FP=1 and TN=0 in the background column occur because we don't have any background samples in the validation set, and not because our model is wrong?

I ask because I have the same issue: my FP=1 and TN=0 for a single-class label.

@glenn-jocher
Member

Hello @ilhamalvindo, thank you for reaching out with your question regarding YOLOv5's confusion matrix and understanding the results.

You are correct. In single-class detection with background, a TN value of 0 means there were no true-negative entries to count: the model never predicts 'background' for a background region, so that cell always stays empty. For the same reason, the FP value of 1 is an artifact of column normalization: every entry in the background column is a spurious prediction, so the column normalizes to 1 regardless of how many false positives there actually were. It does not necessarily imply that your model is wrong.

Please let me know if you have any further questions or concerns.

@ilhamalvindo

Thank you for the explanation! @glenn-jocher

@glenn-jocher
Member

@ilhamalvindo you're welcome, happy to be of help! If you have any further questions or issues with YOLOv5, please don't hesitate to ask.

@github-actions
Contributor

github-actions bot commented Jul 1, 2023

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.


@github-actions github-actions bot added the Stale label Jul 1, 2023
@github-actions github-actions bot closed this as not planned Jul 12, 2023
@RyanTNN

RyanTNN commented Nov 10, 2023

confusion_matrix
Hello @glenn-jocher. I have some questions; I'm quite confused about the background FP and the background FN. This confusion matrix shows background FP 0.77 and background FN 0.22.

  1. What exactly do background FP and background FN mean?
  2. Do they affect the predictions? Why or why not?
  3. As far as I can tell, changing conf between 0.25 and 0.9 only changes the object accuracy but does not change the background FP or FN. Why?
  4. How can I reduce background FP and FN?

Thank you!

@glenn-jocher
Member

@RyanTNN hi there! It seems like your link to the confusion matrix image is not accessible. However, I will still address your questions based on the information provided.

  1. In this matrix, 'background FP' counts predictions that do not correspond to any labeled object, i.e., cases where the model predicted an object where there is actually only background (a spurious detection). 'Background FN' counts labeled objects the model failed to detect, i.e., regions treated as background even though an object is present (a missed detection). A conceptual sketch follows this list.

  2. The background FP and FN can affect the overall performance of the model, especially if the background class is being misclassified frequently, which might lead to incorrect predictions for other classes as well. However, in some scenarios, particularly in single-class detection tasks, the impact might be minimal depending on the specific use case.

  3. Changing the confidence threshold (conf) primarily affects the object accuracy as it determines the minimum confidence score required for an object to be considered as detected. It might not directly impact the background FP or FN if the background class is not being considered in the confidence threshold settings.

  4. To reduce background FP and FN, you may try various techniques such as refining the training data to include more diverse backgrounds, enhancing the model architecture, adjusting training hyperparameters, and possibly introducing data augmentation to expose the model to a wider variety of background scenarios.
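To make item 1 concrete, here is a conceptual sketch of how unmatched predictions and unmatched labels end up in the background column and row. This is an illustration only, not the actual YOLOv5 implementation:

def update_confusion_matrix(matrix, matches, pred_cls, true_cls, nc):
    # matrix is (nc + 1) x (nc + 1); index nc is the "background" slot.
    # matches holds (pred_idx, label_idx) pairs that passed the IoU test.
    matched_preds = {i for i, _ in matches}
    matched_labels = {j for _, j in matches}
    for i, j in matches:                   # IoU-matched pairs
        matrix[pred_cls[i]][true_cls[j]] += 1
    for i in range(len(pred_cls)):         # predictions matching no label
        if i not in matched_preds:
            matrix[pred_cls[i]][nc] += 1   # background FP (spurious box)
    for j in range(len(true_cls)):         # labels matching no prediction
        if j not in matched_labels:
            matrix[nc][true_cls[j]] += 1   # background FN (missed object)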

Feel free to provide additional details or share the confusion matrix image for a more precise analysis or assistance.

@kevinkwabena

Why does the confusion matrix give me 100% in the background in YOLOv7 and v8?
confusion_matrix_normalized
@glenn-jocher

@glenn-jocher
Member

@kevinkwabena hello! A quick clarification: I'm the author and maintainer of the Ultralytics YOLOv5 repository. YOLOv7 is not an Ultralytics project; it is developed by a different team and may have a different implementation and behavior. (YOLOv8, announced above, is an Ultralytics model.)

Regarding your question about a confusion matrix showing 100% in the background: this typically indicates that the model is predicting background for all the samples, which could be due to several reasons such as model overfitting, incorrect labeling, or issues with the validation dataset.

For specific help with YOLOv7, I would recommend reaching out to its maintainers or checking its documentation and issues for similar cases and solutions.

If you have questions about YOLOv5, I'd be more than happy to assist you!

@Leonardbd

confusion_matrix (2)

@glenn-jocher Hi! It seems the issue still persists for YOLOv5: I'm not observing any True Negatives (TN) or False Positives (FP) in the confusion matrix. This occurs despite my validation set including both negative and positive images, though labels are only provided for the positive ones. My YAML file specifies only one class, "smoke".

The problem remains the same whether I'm running train.py or val.py.

@glenn-jocher
Member

@Leonardbd hello! Thanks for reaching out with your confusion matrix concern. 😊

Based on what you've described, it appears your model isn’t recognizing any True Negatives (TN) or False Positives (FP) because your validation set only includes labels for positive cases ("smoke"). For TN and FP to appear, there need to be instances where your model predicts "no smoke" correctly (TN) or incorrectly (FP), which requires labeled negative (no smoke) images in your dataset.

If you haven't already, ensure your dataset includes explicitly labeled negative images (images without smoke) and that they're correctly referenced in your dataset YAML file. This adjustment should help the model recognize and learn from both the presence and absence of smoke, potentially resolving the issue you're observing with the confusion matrix.

Quick tip: Make sure your 'train' and 'val' keys in the YAML file accurately reflect your dataset's structure, including paths to both positive and negative samples.
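For reference, a minimal single-class dataset YAML might look like the sketch below. The paths are hypothetical; negative images simply sit in the same image folders as the positives and carry empty label files:

path: datasets/smoke  # dataset root (hypothetical)
train: images/train   # positives and negatives mixed together
val: images/val
names:
  0: smoke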

Let me know if this helps, or if you have further questions!

@Leonardbd

@glenn-jocher thank you for your guidance!

Based on your advice, it seems I need to include a separate class in the YAML file for images without smoke, labeled as "nosmoke". I've updated my YAML file accordingly:

names:
  0: smoke
  1: nosmoke

However, I'm unsure about how to proceed with annotating the negative images (those without smoke). Since these images inherently lack the object of interest and thus don't have bounding boxes, how should their corresponding .txt annotation files be formatted? Should they simply include a "1" to denote the "nosmoke" class, and if so, how do we address the absence of bounding box coordinates?

I appreciate your assistance in clarifying this matter.

@glenn-jocher
Member

@Leonardbd glad to hear you're making progress! 😊

For annotating negative images (no object of interest like "nosmoke"), you should create an empty .txt annotation file for each image. There's no need to include a "1" or any other class identifier in these files. Simply having the empty .txt file corresponding to the image tells YOLOv5 that there are no objects present in that image.

Here's a quick example:

  • For a negative image image123.jpg, you'll have a corresponding image123.txt file that is empty.

This effectively informs the model that "image123.jpg" contains no objects to be detected, helping it learn the "nosmoke" instances without needing explicit bounding boxes.
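If you have many negative images, a small Python sketch like the one below can create the empty label files in bulk. The paths are hypothetical, and you should point it at your negative images only:

import os

image_dir = "datasets/smoke/images/negatives"  # hypothetical negatives folder
label_dir = "datasets/smoke/labels/negatives"  # matching labels folder
os.makedirs(label_dir, exist_ok=True)

for name in os.listdir(image_dir):
    stem, ext = os.path.splitext(name)
    if ext.lower() in {".jpg", ".jpeg", ".png"}:
        # An empty .txt file tells YOLOv5 the image contains no objects.
        open(os.path.join(label_dir, stem + ".txt"), "w").close()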

Keep up the good work, and feel free to reach out if you have more questions!

@Rashimingo

Confusion-matrix
I seem to have the same problem as the people before me, but I don't think I understand what is happening: I did include some true-negative images in my single-class dataset for classifying and predicting PCBs (Roboflow calls them "null"), yet the confusion matrix reads every picture as a PCB and the background cell is 0, which is kind of suspicious. @glenn-jocher, can you provide me with an explanation please? Thanks in advance.

@glenn-jocher
Member

Hi there! 👋

It sounds like you're experiencing an issue where the model predicts every image as containing a PCB. If your dataset includes negative samples ("null" in your case) but the confusion matrix still shows zero background detection, it might be due to how these "null" images are annotated or handled during training.

Ensure that:

  • Your "null" images have corresponding empty annotation files in your dataset. An empty .txt file for these images indicates to the model that there are no objects present.
  • Your training dataset configuration in your YAML does correctly reference these negative samples alongside the positive ones.

If both these are in place and you're still seeing issues, it might be worthwhile to double-check the balance and variety in your dataset or consider further tuning your model's hyperparameters.

Let me know if this helps or if there's anything else you'd like to explore! 😊

@Rashimingo

Rashimingo commented May 13, 2024 via email

@glenn-jocher
Member

Hi Rashed,

Good to hear from you! Regarding your question about removing the background row and column from the confusion matrix, it's technically feasible to exclude these in your visualization if they aren't relevant to your analysis. However, doing so might limit the insight you can gain regarding how well your model is distinguishing between PCBs and true negatives (background). Generally, it's useful to see all aspects of your model's performance, including background predictions, to assess any potential biases or issues.

If you still wish to modify the matrix, this typically involves adjusting the code that generates or visualizes the confusion matrix in your analysis scripts. You would remove the data associated with the background before plotting.
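As a rough illustration (not the actual YOLOv5 plotting code), you could drop the last row and column, which hold the background entries, before plotting:

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical normalized matrix; the last row/column are "background".
matrix = np.array([
    [0.95, 1.0],
    [0.05, 0.0],
])
names = ["PCB"]  # class names, without background

trimmed = matrix[:-1, :-1]  # slice off the background row and column

fig, ax = plt.subplots()
im = ax.imshow(trimmed, cmap="Blues", vmin=0, vmax=1)
ax.set_xticks(range(len(names)))
ax.set_xticklabels(names)
ax.set_yticks(range(len(names)))
ax.set_yticklabels(names)
ax.set_xlabel("True")
ax.set_ylabel("Predicted")
fig.colorbar(im)
plt.show()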

Let me know if you need further assistance or specific code help!

Cheers 😊

@vishakraj64

Hi @glenn-jocher,

I want to clarify this: for every anchor we get an objectness score that tells whether an object is present in that region. If that objectness score is 0, is the region considered background?

In this answer, you mentioned that there are no background-class samples annotated in the validation dataset [TN], which is why the FP shows 100% for the object class. Does that mean, for example:
- in single-class detection, if 14 (or any number of) anchor regions are detected as an object where there is actually only background, the FP for the object class will show 100%?
- in multi-class detection, if 15 anchor regions are detected as object-A and 30 anchor regions are detected as object-B where those regions are actually background, the FP for object-A is 0.33 and for object-B is 0.66?

Are the above statements correct? Could you clarify this? Thanks.

@glenn-jocher
Member

Hi @vishakraj64,

Thank you for your detailed question! Let's clarify how the object score and background detection work in YOLOv5.

Object Score and Background Detection

In YOLOv5, each anchor box predicts an objectness score, which indicates the likelihood of an object being present in that region. If the objectness score is low (close to 0), it suggests that the region is likely background. However, YOLOv5 does not explicitly classify regions as "background"; instead, it focuses on detecting objects of interest.

Single-Class Detection

For single-class detection, if your model incorrectly predicts objects in regions that are actually background, these will be counted as False Positives (FP). For example:

  • If 14 anchor regions are incorrectly predicted as containing an object when they are actually background, these will contribute to the FP count for that object class.

Multi-Class Detection

For multi-class detection, the FP calculation is similar but distributed across multiple classes. For example:

  • If 15 anchor regions are incorrectly predicted as object-A and 30 anchor regions are incorrectly predicted as object-B, the FP rate for each class will be calculated based on the total number of incorrect predictions relative to the total predictions made.

Clarification on Your Statements

Your understanding is mostly correct, but let's refine it:

  • In single-class detection, if 14 anchor regions are incorrectly predicted as containing an object when they are actually background, these will all contribute to the FP count for that single object class.
  • In multi-class detection, if 15 anchor regions are incorrectly predicted as object-A and 30 anchor regions are incorrectly predicted as object-B, the FP rates will be calculated based on the total number of incorrect predictions for each class.

Example Calculation

For multi-class detection:

  • If there are 45 incorrect predictions (15 for object-A and 30 for object-B) out of a total of 45 predictions, the FP rate for object-A would be 15/45 ≈ 0.33 and for object-B would be 30/45 ≈ 0.67.
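The same arithmetic in code, using the hypothetical counts above:

fp_a, fp_b = 15, 30  # spurious detections per class (hypothetical)
total = fp_a + fp_b  # 45 predictions made on background regions

print(fp_a / total)  # ~0.33 -> background-column share for object-A
print(fp_b / total)  # ~0.67 -> background-column share for object-B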

Next Steps

If you encounter any issues or bugs, please ensure you are using the latest versions of torch and YOLOv5 from our GitHub repository. If the issue persists, providing a minimum reproducible code example will help us investigate further. You can find guidance on creating one here.

Feel free to reach out with any more questions or clarifications!
