
Multiple polygon per object #11476

Closed · 1 task done
PixelFinder opened this issue May 3, 2023 · 20 comments

Labels
question Further information is requested · Stale

Comments

@PixelFinder

Search before asking

Question

How would you annotate heavily occluded objects that result in multiple polygons per object? I made the following example.

[image: options A, B, and C for annotating two heavily occluded cans]

When two cans are heavily occluded, as in A, how would you annotate the blue can? My intuition suggests option B: two polygons per instance (here, the blue can), with the same instance ID assigned to both polygons. Another possibility is option C, where you also annotate the non-visible part of the blue can. However, that assigns quite a lot of pixels to two instances, and I don't know how the model will react to this in terms of performance. In addition, annotation takes more effort when using the Roboflow smart polygon function.

But I can't seem to figure out how option B could work in YOLOv5. I know COCO allows multiple polygons per instance ID, but as far as I know YOLO does not. Would you recommend using another repo, or could YOLOv5 actually handle annotations like option B?

Additional

No response

@PixelFinder added the question (Further information is requested) label on May 3, 2023
@github-actions
Contributor

github-actions bot commented May 3, 2023

👋 Hello @PixelFinder, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all requirements.txt dependencies installed, including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics

@glenn-jocher
Member

Hi @PixelFinder,

Annotating heavily occluded objects can be a challenging task as each annotation option has its own drawbacks. However, as you mentioned, option B would be the suitable annotation for heavily occluded objects. In this case, you would need to create one annotation for each visible part of the object and assign the same instance ID to those annotations.

Regarding YOLOv5, it supports custom COCO-style annotations where multiple polygons can be assigned the same instance ID. You can create annotations for the visible and occluded parts of each object and assign the same instance ID to all polygons belonging to the same object. You can then use these annotations to train YOLOv5 for object detection tasks.

I hope this helps. Let me know if you have any further questions.

@PixelFinder
Author

Hi @glenn-jocher,
Thank you for your clear response. It is great to hear YOLOv5 supports custom COCO-style annotations. I'm still a bit puzzled how this could be merged into the current code. As far as my knowledge goes, the YOLO format has a .txt file per frame/image where each row is one annotation. Could I add an instance ID here? Or could I adjust the YOLOv5 code to use the COCO .json format, or maybe this is already possible without any alterations?

I couldn't find specific use cases of multiple polygons per object using YOLOv5 (but I have a strong preference for this repository). Do you know of any example, or could you provide me with one showing how to incorporate the custom COCO-style annotations?

@glenn-jocher
Member

Hi @PixelFinder,

You're welcome! I'm glad my response was helpful.

Regarding merging custom COCO-style annotations in YOLOv5, you can manually add the instance ID to each row of the TXT file annotation for each polygon representing an object. Each annotation is a line in train.txt, so you could add the same instance ID to all annotations that belong to the same object. YOLOv5 will use these annotations to train object detection models. Alternatively, you can use the JSON format of COCO annotations. YOLOv5 supports the COCO format, and its JSON annotations can be converted to the TXT format with the provided coco2yolo.py script.

As for multiple polygons per object, the best approach would be to create a separate row in the TXT file annotation for each visible polygon of the object and assign the same instance ID to all polygons corresponding to the same object. For example, if an object is partially occluded in an image, you can create a separate annotation row for each visible part and assign the same instance ID to all those rows.

I don't have a specific example of this approach, but you may find some useful resources in the COCO dataset annotation format documentation and COCO JSON annotation format documentation.

I hope this helps. Let me know if you have any further questions!

@PixelFinder
Author

PixelFinder commented May 4, 2023

Thank you @glenn-jocher! Just for my understanding (and maybe saving a lot of headaches at a later stage).
The format for instance segmentation with an instance ID should be:
<class> <instance_id> <x_1> <y_1> <x_2> <y_2> ... <x_n> <y_n>

Is that correct?

@glenn-jocher
Member

You're welcome, @PixelFinder!

Regarding your question, yes, that is correct. The format for instance segmentation with an instance ID should be:

<class> <instance_id> <x_1> <y_1> <x_2> <y_2> ... <x_n> <y_n>

Where <class> is the object class, <instance_id> is a unique identifier for the instance, and <x_i>, <y_i> are the pixel coordinates of the i-th vertex of the polygon annotation.

I hope this clears up any confusion. Let me know if you have any additional questions!
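To make the layout above concrete, here is a small sketch of how such extended rows could be parsed and grouped by instance ID. Note that this instance-ID column is not part of the standard YOLOv5 label format, so `parse_extended_label` and `group_by_instance` are hypothetical helpers for illustration only:

```python
from collections import defaultdict

def parse_extended_label(line):
    """Split one extended label row into (class_id, instance_id, polygon points)."""
    parts = line.split()
    cls, inst = int(parts[0]), int(parts[1])
    coords = [float(v) for v in parts[2:]]
    return cls, inst, list(zip(coords[0::2], coords[1::2]))  # (x, y) pairs

def group_by_instance(lines):
    """Collect all polygons that share the same (class, instance_id) key."""
    groups = defaultdict(list)
    for line in lines:
        cls, inst, pts = parse_extended_label(line)
        groups[(cls, inst)].append(pts)
    return dict(groups)
```

Two rows carrying the same instance ID would then collapse into one multi-polygon instance under a single key.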

@ryouchinsa

Related issue.
ultralytics/ultralytics#979

Using the JSON2YOLO script, you can merge multiple polygons in the COCO format into a single polygon in the YOLOv5/v8 format.

def merge_multi_segment(segments):
    """Merge multiple segments into one list.

    Find the coordinates with the minimum distance between each pair of
    segments, then connect these coordinates with one thin line to merge
    all segments into one.
    """

COCO format before converting:
{
    "annotations": [
    {
        "area": 594876,
        "bbox": [328, 832, 780, 2252],
        "category_id": 1,
        "id": 1,
        "image_id": 1,
        "iscrowd": 0,
        "segmentation": [
            [493, 985, 496, 961, 503, 926, 527, 881, 569, 848, 624, 832, 701, 838, 767, 860, 790, 931, 803, 963, 802, 972, 846, 970, 896, 969, 896, 977, 875, 982, 847, 984, 793, 987, 791, 1001, 783, 1009, 785, 1022, 791, 1024, 787, 1027, 795, 1041, 804, 1059, 811, 1072, 810, 1081, 800, 1089, 788, 1092, 783, 1098, 784, 1115, 780, 1120, 774, 1123, 778, 1126, 778, 1136, 775, 1140, 767, 1140, 763, 1146, 767, 1164, 754, 1181, 759, 1212, 751, 1264, 815, 1283, 839, 1303, 865, 1362, 880, 1442, 902, 1525, 930, 1602, 953, 1640, 996, 1699, 1021, 1773, 1039, 1863, 1060, 1920, 1073, 1963, 1089, 1982, 1102, 2013, 1107, 2037, 1107, 2043, 1099, 2046, 1097, 2094, 1089, 2123, 1074, 2137, 1066, 2153, 1033, 2172, 1024, 2166, 1024, 2166, 1023, 2129, 1019, 2093, 1004, 2057, 996, 2016, 1000, 1979, 903, 1814, 860, 1727, 820, 1647, 772, 1547, 695, 1637, 625, 1736, 556, 1854, 495, 1986, 459, 2110, 446, 1998, 449, 1913, 401, 1819, 362, 1720, 342, 1575, 328, 1440, 335, 1382, 348, 1330, 366, 1294, 422, 1248, 437, 1222, 450, 1190, 466, 1147, 482, 1107, 495, 1076, 506, 1019, 497, 1016],
            [878, 2293, 868, 2335, 855, 2372, 843, 2413, 838, 2445, 820, 2497, 806, 2556, 805, 2589, 809, 2622, 810, 2663, 807, 2704, 793, 2785, 772, 2866, 742, 2956, 725, 3000, 724, 3013, 740, 3024, 757, 3029, 778, 3033, 795, 3033, 812, 3032, 812, 3046, 803, 3052, 791, 3063, 771, 3069, 745, 3070, 733, 3074, 719, 3077, 702, 3075, 680, 3083, 664, 3082, 631, 3072, 601, 3061, 558, 3058, 553, 3039, 558, 3023, 566, 3001, 568, 2983, 566, 2960, 572, 2912, 571, 2859, 567, 2781, 572, 2698, 576, 2643, 583, 2613, 604, 2568, 628, 2527, 637, 2500, 636, 2468, 629, 2445, 621, 2423, 673, 2409, 726, 2388, 807, 2344, 878, 2293]
        ]
    }],
    "categories": [
    {
        "id": 1,
        "name": "person"
    }],
    "images": [
    {
        "file_name": "jesse-hammer-4fWuS52jENk-unsplash.jpg",
        "height": 3422,
        "id": 1,
        "width": 2738
    }]
}
YOLOv5/v8 format after converting:
0 0.373996 0.632963 0.373996 0.632963 0.37363 0.622151 0.372169 0.611631 0.366691 0.60111 0.363769 0.589129 0.36523 0.578317 0.329803 0.530099 0.314098 0.504676 0.299489 0.481297 0.281958 0.452075 0.253835 0.478375 0.228269 0.507306 0.203068 0.541788 0.180789 0.580362 0.167641 0.616598 0.162893 0.583869 0.163988 0.55903 0.146457 0.53156 0.132213 0.50263 0.124909 0.460257 0.119795 0.420807 0.122352 0.403857 0.1271 0.388662 0.133674 0.378141 0.154127 0.364699 0.159606 0.357101 0.164354 0.34775 0.170197 0.335184 0.176041 0.323495 0.180789 0.314436 0.184806 0.297779 0.181519 0.296902 0.180058 0.287843 0.181154 0.28083 0.183711 0.270602 0.192476 0.257452 0.207816 0.247808 0.227904 0.243133 0.256026 0.244886 0.280131 0.251315 0.288532 0.272063 0.29328 0.281414 0.292915 0.284044 0.308985 0.28346 0.327246 0.283168 0.327246 0.285506 0.319576 0.286967 0.30935 0.287551 0.289627 0.288428 0.288897 0.292519 0.285975 0.294857 0.286706 0.298656 0.288897 0.29924 0.287436 0.300117 0.290358 0.304208 0.293645 0.309468 0.296202 0.313267 0.295836 0.315897 0.292184 0.318235 0.287801 0.319112 0.285975 0.320865 0.28634 0.325833 0.284879 0.327294 0.282688 0.328171 0.284149 0.329047 0.284149 0.33197 0.283053 0.333139 0.280131 0.333139 0.278671 0.334892 0.280131 0.340152 0.275383 0.34512 0.27721 0.354179 0.274288 0.369375 0.297663 0.374927 0.306428 0.380771 0.315924 0.398013 0.321402 0.421391 0.329438 0.445646 0.339664 0.468147 0.348064 0.479252 0.363769 0.496493 0.3729 0.518118 0.379474 0.544418 0.387144 0.561075 0.391892 0.573641 0.397736 0.579193 0.402484 0.588252 0.40431 0.595266 0.40431 0.597019 0.401388 0.597896 0.400657 0.611923 0.397736 0.620397 0.392257 0.624489 0.389335 0.629164 0.377283 0.634717 0.373996 0.632963 0.320672 0.670076 0.31702 0.68235 0.312272 0.693162 0.307889 0.705143 0.306063 0.714494 0.299489 0.72969 0.294375 0.746932 0.29401 0.756575 0.295471 0.766219 0.295836 0.7782 0.294741 0.790181 0.289627 0.813852 0.281958 0.837522 0.271001 0.863822 0.264792 0.87668 0.264427 
0.880479 0.27027 0.883694 0.276479 0.885155 0.284149 0.886324 0.290358 0.886324 0.296567 0.886032 0.296567 0.890123 0.29328 0.891876 0.288897 0.895091 0.281592 0.896844 0.272096 0.897136 0.267714 0.898305 0.2626 0.899182 0.256392 0.898597 0.248356 0.900935 0.242513 0.900643 0.23046 0.897721 0.219503 0.894506 0.203798 0.893629 0.201972 0.888077 0.203798 0.883402 0.20672 0.876973 0.207451 0.871712 0.20672 0.864991 0.208912 0.850964 0.208546 0.835476 0.207085 0.812683 0.208912 0.788428 0.210373 0.772355 0.212929 0.763589 0.220599 0.750438 0.229364 0.738457 0.232652 0.730567 0.232286 0.721216 0.22973 0.714494 0.226808 0.708065 0.2458 0.703974 0.265157 0.697838 0.294741 0.68498 0.320672 0.670076 0.320672 0.670076

[screenshot (2023-05-07): the merged polygon displayed in an annotation tool]
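The min-distance idea in `merge_multi_segment` can be sketched for the two-segment case roughly as follows. This is a simplified illustration of the same approach (connect the closest vertices with a zero-width bridge), not the actual JSON2YOLO implementation:

```python
def closest_pair(seg_a, seg_b):
    """Indices (i, j) of the nearest vertices between two point lists."""
    best = None
    for i, (xa, ya) in enumerate(seg_a):
        for j, (xb, yb) in enumerate(seg_b):
            d = (xa - xb) ** 2 + (ya - yb) ** 2
            if best is None or d < best[0]:
                best = (d, i, j)
    return best[1], best[2]

def merge_two_segments(seg_a, seg_b):
    """Splice seg_b into seg_a through the closest vertex pair."""
    i, j = closest_pair(seg_a, seg_b)
    rotated_b = seg_b[j:] + seg_b[:j + 1]   # walk seg_b starting and ending at j
    # the bridge vertices seg_a[i] and seg_b[j] each appear twice,
    # forming the zero-width connecting line
    return seg_a[:i + 1] + rotated_b + seg_a[i:]
```

In the merged output each bridge vertex occurs twice, which is why the rendered polygon shows the thin zero-width lines mentioned later in this thread.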

@PixelFinder
Author

@glenn-jocher After some annotation work, I can't get training started with the instance ID. I get the error:
ValueError: not enough values to unpack (expected 3, got 0)

I have my annotations in the .txt file as the following:
0 1952 0.7642045454545454 0.4965277777777778 0.7471590909090909 0.4722222222222222 0.7443181818181818 0.4722222222222222 0.7215909090909091 0.4479166666666667 0.7102272727272727 0.4444444444444444 0.7073863636363636 0.4409722222222222 0.7017045454545454 0.4409722222222222 0.6988636363636364 0.4375 0.6789772727272727 0.4340277777777778 0.6761363636363636 0.4305555555555556 0.6704545454545454 0.4305555555555556 0.6676136363636364 0.4270833333333333 0.6619318181818182 0.4270833333333333
0 1952 0.3181818181818182 0.3784722222222222 0.3096590909090909 0.3784722222222222 0.3125 0.3854166666666667 0.30113636363636365 0.3993055555555556 0.29545454545454547 0.3993055555555556 0.29261363636363635 0.3958333333333333 0.2897727272727273 0.3993055555555556 0.2784090909090909 0.3993055555555556 0.2840909090909091 0.3993055555555556 0.2897727272727273 0.40625 0.3210227272727273 0.40625 0.3210227272727273 0.3958333333333333 0.3181818181818182 0.3923611111111111

I have two polygons of instance 1952 corresponding to class 0.
The error goes away when I remove all the instance IDs from the .txt label files, and training then works just fine. So I have traced the problem back to the instance ID in the labels.

How do I include the instance IDs into training?
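For reference, a standard YOLOv5 segmentation row is `<class>` followed by an even-length run of normalized x,y pairs; the extra instance-ID column breaks both the even pairing and the 0–1 value range, which is consistent with the loader rejecting these rows. The check below is an illustrative sketch of that row shape, not YOLOv5's actual parsing code:

```python
def check_seg_row(line):
    """Return (class_id, points) for a standard YOLO segmentation row,
    or raise ValueError when the row is malformed."""
    values = [float(v) for v in line.split()]
    cls, coords = int(values[0]), values[1:]
    # after the class, coordinates must come in normalized (x, y) pairs
    if len(coords) % 2 != 0 or any(not 0.0 <= c <= 1.0 for c in coords):
        raise ValueError(f"bad segmentation row: {line[:40]}...")
    return cls, list(zip(coords[0::2], coords[1::2]))
```

Inserting `1952` after the class makes the coordinate count odd and out of range, so a row like the ones above fails this shape check.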

@github-actions
Contributor

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label Jun 30, 2023
@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jul 10, 2023
@Altricch

Altricch commented Sep 7, 2023

> You're welcome, @PixelFinder!
>
> Regarding your question, yes, that is correct. The format for instance segmentation with an instance ID should be:
>
> <class> <instance_id> <x_1> <y_1> <x_2> <y_2> ... <x_n> <y_n>
>
> Where <class> is the object class, <instance_id> is a unique identifier for the instance, and <x_i>, <y_i> are the pixel coordinates of the i-th vertex of the polygon annotation.
>
> I hope this clears up any confusion. Let me know if you have any additional questions!

Hey Glenn,

I am not entirely sure how we can apply multiple polygons to the same instance for transfer learning in YOLOv8-seg.

Would appreciate any insights!

Thanks in advance,

@YoungjaeDev

@ryouchinsa

In conclusion, will there be any problem in training if the polygons that are separated for the same instance in yolov5-seg are written in the YOLO format as independent rows?

@ryouchinsa

Hi @youngjae-avikus,
Ideally, multiple polygons per instance should be merged into a single polygon using the JSON2YOLO script.
But if there are multiple polygons per instance and each polygon is independently written in the YOLO format in a different row, I think there are no problems. After training and processing an image, you need to post-process the results to merge the separated polygons into a single polygon.
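One hypothetical post-processing sketch along those lines: group predicted polygons of the same class whose bounding boxes overlap, and treat each group as one instance. The grouping criterion here (box overlap) is an assumption for illustration; a real pipeline might use mask IoU or distance thresholds instead:

```python
def bbox(poly):
    """Axis-aligned bounding box (x1, y1, x2, y2) of a point list."""
    xs, ys = [p[0] for p in poly], [p[1] for p in poly]
    return min(xs), min(ys), max(xs), max(ys)

def boxes_touch(a, b):
    """True when two (x1, y1, x2, y2) boxes overlap or touch."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2

def group_instances(polys):
    """Greedily merge polygons whose boxes touch into shared groups."""
    groups = []
    for poly in polys:
        b = bbox(poly)
        hit = [g for g in groups if any(boxes_touch(b, bbox(p)) for p in g)]
        merged = [poly] + [p for g in hit for p in g]
        groups = [g for g in groups if g not in hit] + [merged]
    return groups
```

Each resulting group would then be reported as one instance, mirroring option B from the original question.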

@YoungjaeDev

@ryouchinsa

I don't quite understand.

  1. First, isn't each row recognized as a different instance in the YOLO format?
  2. Then, is it possible to convert the YOLO format back to JSON?

@YoungjaeDev

@ryouchinsa
I understand that it is ultimately expressed as one row, but with one row, how can I know whether the polygon shape is B or C, as shown in the picture in the question?

@ryouchinsa

Hi @youngjae-avikus,
To understand this clearly, please prepare this kind of COCO format file using your annotation tool.
Label the upper polygon and the lower polygon, and export them as multiple polygons.

{
    "annotations": [
    {
        "area": 594876,
        "bbox": [328, 832, 780, 2252],
        "category_id": 1,
        "id": 1,
        "image_id": 1,
        "iscrowd": 0,
        "segmentation": [
            [493, 985, 496, 961, 503, 926, 527, 881, 569, 848, 624, 832, 701, 838, 767, 860, 790, 931, 803, 963, 802, 972, 846, 970, 896, 969, 896, 977, 875, 982, 847, 984, 793, 987, 791, 1001, 783, 1009, 785, 1022, 791, 1024, 787, 1027, 795, 1041, 804, 1059, 811, 1072, 810, 1081, 800, 1089, 788, 1092, 783, 1098, 784, 1115, 780, 1120, 774, 1123, 778, 1126, 778, 1136, 775, 1140, 767, 1140, 763, 1146, 767, 1164, 754, 1181, 759, 1212, 751, 1264, 815, 1283, 839, 1303, 865, 1362, 880, 1442, 902, 1525, 930, 1602, 953, 1640, 996, 1699, 1021, 1773, 1039, 1863, 1060, 1920, 1073, 1963, 1089, 1982, 1102, 2013, 1107, 2037, 1107, 2043, 1099, 2046, 1097, 2094, 1089, 2123, 1074, 2137, 1066, 2153, 1033, 2172, 1024, 2166, 1024, 2166, 1023, 2129, 1019, 2093, 1004, 2057, 996, 2016, 1000, 1979, 903, 1814, 860, 1727, 820, 1647, 772, 1547, 695, 1637, 625, 1736, 556, 1854, 495, 1986, 459, 2110, 446, 1998, 449, 1913, 401, 1819, 362, 1720, 342, 1575, 328, 1440, 335, 1382, 348, 1330, 366, 1294, 422, 1248, 437, 1222, 450, 1190, 466, 1147, 482, 1107, 495, 1076, 506, 1019, 497, 1016],
            [878, 2293, 868, 2335, 855, 2372, 843, 2413, 838, 2445, 820, 2497, 806, 2556, 805, 2589, 809, 2622, 810, 2663, 807, 2704, 793, 2785, 772, 2866, 742, 2956, 725, 3000, 724, 3013, 740, 3024, 757, 3029, 778, 3033, 795, 3033, 812, 3032, 812, 3046, 803, 3052, 791, 3063, 771, 3069, 745, 3070, 733, 3074, 719, 3077, 702, 3075, 680, 3083, 664, 3082, 631, 3072, 601, 3061, 558, 3058, 553, 3039, 558, 3023, 566, 3001, 568, 2983, 566, 2960, 572, 2912, 571, 2859, 567, 2781, 572, 2698, 576, 2643, 583, 2613, 604, 2568, 628, 2527, 637, 2500, 636, 2468, 629, 2445, 621, 2423, 673, 2409, 726, 2388, 807, 2344, 878, 2293]
        ]
    }]
}

Then convert the COCO format file using the JSON2YOLO script. The two polygons are merged into a single polygon connected by two narrow zero-width lines.
Using your annotation tool, you can view the merged polygon shape in the YOLO segmentation format.
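For illustration, the coordinate part of that conversion looks roughly like this: COCO stores absolute pixel coordinates as a flat list, while a YOLO row uses values normalized by image width and height. This sketch handles a single already-merged segment and assumes the class index is supplied by the caller (the real JSON2YOLO script also remaps COCO category IDs and merges multi-part segmentations first):

```python
def coco_segment_to_yolo_row(class_id, segment, img_w, img_h):
    """Format one flat COCO segment [x1, y1, x2, y2, ...] as a YOLO
    segmentation label row with normalized coordinates."""
    norm = []
    for x, y in zip(segment[0::2], segment[1::2]):
        norm += [x / img_w, y / img_h]
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in norm])
```

For the person annotation above, the divisors would be the image's width 2738 and height 3422.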

@YoungjaeDev

@ryouchinsa
I'm trying to understand something.
I am a Linux (Ubuntu) PC user. Is there an annotation tool you can recommend that lets me annotate in both COCO and YOLO formats at once and check the results?

@ryouchinsa

ryouchinsa commented Dec 7, 2023

Hi @youngjae-avikus,
Using Labelme and its group ID feature, you can label multiple polygons per instance and export them to the COCO format.
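For reference, a minimal sketch of what such a Labelme file could look like: the `shapes`, `label`, `points`, `group_id`, and `shape_type` fields follow Labelme's JSON layout, while the values here are invented for illustration. The two polygons sharing `group_id` 1 belong to the same instance:

```json
{
    "shapes": [
        {
            "label": "person",
            "group_id": 1,
            "shape_type": "polygon",
            "points": [[493, 985], [624, 832], [803, 963]]
        },
        {
            "label": "person",
            "group_id": 1,
            "shape_type": "polygon",
            "points": [[878, 2293], [820, 2497], [724, 3013]]
        }
    ]
}
```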

@glenn-jocher
Member

@ryouchinsa hello everyone,

I'd like to clarify a few points regarding the YOLOv5 annotation format and handling of multiple polygons per object:

  1. YOLOv5 Annotation Format: YOLOv5 typically uses a bounding box format for object detection, which is a single rectangle per object. For segmentation tasks, YOLOv5 can handle polygon annotations, but the standard format does not natively support multiple polygons for a single instance.

  2. Multiple Polygons per Object: If you have multiple non-contiguous regions for a single object due to occlusion, you would typically need to handle this during the annotation process. Some annotation tools may allow you to assign the same instance ID to multiple polygons, but this is not standard for YOLOv5.

  3. Training with Multiple Polygons: If you have multiple polygons for a single instance, you would need to merge them into a single polygon if possible, or treat them as separate instances during training. YOLOv5 does not natively support training with multiple polygons per instance.

  4. Post-Processing: If you decide to train with separate polygons as separate instances, you may need to implement post-processing steps to merge the detections into a single instance based on proximity, overlap, or other criteria.

  5. Annotation Conversion: Converting between COCO and YOLO formats can be done with various scripts and tools, but be aware that converting multiple polygons into a single instance may not be straightforward and could require custom processing.

  6. Annotation Tools: There are various annotation tools available that can export to COCO or YOLO formats. However, the ability to annotate multiple polygons per object and export them correctly for YOLOv5 training may vary. You may need to check the documentation of the specific tool you are using to understand its capabilities.

Remember, the key to successful training is consistent and accurate annotations. If your dataset requires multiple polygons per instance, ensure that your annotation process and training pipeline are aligned to handle this scenario effectively.

For more detailed information on YOLOv5's capabilities and how to prepare your data, please refer to the Ultralytics documentation.

@ryouchinsa

@glenn-jocher
Thanks for the detailed explanation.

For the merged polygon, if you could separate it into multiple polygons when training, that would be very helpful, because when converting the merged polygon to a mask image, narrow white lines sometimes appear in the background.
In our product RectLabel, when we read a merged polygon in the YOLO format, our algorithm separates it back into multiple polygons, so that when converting to the mask image, narrow lines do not appear in the background.

An example of a merged polygon and its mask image:
https://twitter.com/rectlabel/status/1749660321550360973
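One way such a separation could work, sketched here purely as an illustration (RectLabel's actual algorithm is not shown in this thread): in a merged polygon the zero-width bridges make vertices repeat, so splitting the vertex list at repeated points recovers the original rings. Real data would need a tolerance for floating-point comparisons:

```python
def split_merged_polygon(points):
    """Iteratively split a merged ring at repeated vertices; keep only
    rings with at least 3 points (the 2-point bridges are discarded)."""
    rings, stack = [], [points]
    while stack:
        ring = stack.pop()
        seen = {}
        for idx, pt in enumerate(ring):
            if pt in seen:
                i = seen[pt]
                stack.append(ring[i:idx])            # the spliced-in part
                stack.append(ring[:i] + ring[idx:])  # the remainder
                break
            seen[pt] = idx
        else:
            if len(ring) >= 3:
                rings.append(ring)
    return rings
```

Rendering each recovered ring as a separate polygon avoids the thin background lines that the single merged ring produces in the mask image.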

@glenn-jocher
Member

Hi @ryouchinsa,

I appreciate your feedback and the work you're doing with RectLabel. It's great to hear about tools that can handle the intricacies of annotation conversion effectively.

Regarding the separation of merged polygons during training, this is indeed a complex issue. The YOLOv5 segmentation models are designed to work with the standard YOLO format, which does not natively support multiple polygons per instance. However, the community is always evolving, and we welcome contributions that can enhance the capabilities of YOLOv5, including better handling of complex annotation scenarios.

For the issue of narrow white lines appearing in the mask image, this is a known challenge with merged polygons. It's encouraging to hear that RectLabel has an algorithm to separate merged polygons to avoid this problem.

As the YOLOv5 project continues to develop, we'll keep an eye on such enhancements and consider how they might be integrated into future versions. In the meantime, users should continue to use the best tools available to them for their specific annotation and training needs.

Thank you for sharing your insights and for contributing to the machine learning community. Your efforts help improve the overall experience for users working on object detection and segmentation tasks.

5 participants