
Finetuning Google Open Images Pretrained YOLO with MSCOCO #1444

Closed
saitarslanboun opened this issue Nov 19, 2020 · 40 comments
Labels: question (Further information is requested), Stale


@saitarslanboun commented Nov 19, 2020

❔Question

I am considering pretraining the YOLOv5 small model on the Google Open Images object detection dataset (https://storage.googleapis.com/openimages/web/download.html). The dataset covers general-domain categories with ~15M box annotations. After pretraining, I will fine-tune the model on the MSCOCO dataset.

I would only like to do this if I can improve AP by ~7%. Do you think this is possible, and is my expectation reasonable? Unfortunately, I could not find anyone who has tried training on MSCOCO from an Open Images pretrained object detector.

When I fine-tune, all layers will be initialized with the pretrained weights except the Detect layer, since the number of classes changes.
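For illustration, a minimal sketch of that initialization scheme, assuming plain PyTorch state_dicts (the toy modules and helper here are hypothetical, not YOLOv5's actual transfer-learning code):

import torch
import torch.nn as nn

def transfer_matching_weights(pretrained_sd, model):
    """Copy pretrained tensors whose name and shape match; layers whose shape
    differs (e.g. a Detect head sized for a different class count) keep their
    fresh initialization."""
    model_sd = model.state_dict()
    matched = {k: v for k, v in pretrained_sd.items()
               if k in model_sd and v.shape == model_sd[k].shape}
    model.load_state_dict(matched, strict=False)  # strict=False tolerates the skipped head
    return len(matched), len(model_sd)

# Toy demo: same backbone, different head size (601 OI classes vs 80 COCO classes)
def make_model(nc):
    return nn.Sequential(nn.Conv2d(3, 16, 3), nn.SiLU(), nn.Conv2d(16, 3 * (nc + 5), 1))

pretrained = make_model(601)  # stands in for the Open Images pretrained model
new_model = make_model(80)    # stands in for the COCO model to be fine-tuned
n, total = transfer_matching_weights(pretrained.state_dict(), new_model)
print(f'transferred {n}/{total} tensors')  # backbone transfers, head does not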

@saitarslanboun added the question label on Nov 19, 2020
@glenn-jocher (Member)

@saitarslanboun Sure, that sounds reasonable. OI and COCO have many intersecting classes. One issue with OI in general is that annotation quality varies greatly by image. More classes may have been annotated in later versions, because many images lack labels for some classes: faces, for example, are labelled in some images but not others. This makes the dataset difficult to use directly.

@saitarslanboun (Author)

Thanks for your answer, @glenn-jocher! Just to make sure, because it will probably take about a month: do you really believe I can increase the accuracy of small YOLO from 37 to 44 if I pretrain fully on Open Images for 300 epochs and then finetune on MSCOCO?

@glenn-jocher (Member)

Oh no, I'm not providing forward numerical projections; I'm simply agreeing that larger datasets improve results.

@cszer commented Nov 22, 2020

Don't use OpenImages; use https://www.objects365.org/overview.html instead. Judging by their paper, I think you will get some improvement on COCO.

@glenn-jocher (Member)

@cszer Interesting, thanks! It may make sense to provide pretrained YOLOv5 weights on the Objects365 dataset then, for improved finetuning performance on smaller datasets as their paper suggests.

We'd need to export their labels into YOLO format and set up some training runs...

@saitarslanboun (Author)

Yes @cszer, I also plan to do so. Let's see what happens. :) @glenn-jocher, I need to do that eventually as well, but I don't know when I will start training.

@glenn-jocher (Member)

@saitarslanboun Got it. We'd ideally want to make an objects365.yaml that auto-downloads the images and creates the labels in the right format, just like voc.yaml and coco.yaml. If you have free time and are working with this dataset, please consider submitting a PR in the future to help other users :)
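For illustration, a hypothetical sketch of what such a file could look like, modeled on the layout of voc.yaml/coco.yaml (the paths and truncated class list are placeholders, not an official file):

# objects365.yaml -- hypothetical sketch following the voc.yaml/coco.yaml layout
train: ../objects365/images/train/  # placeholder path
val: ../objects365/images/val/      # placeholder path

nc: 365  # number of classes

# first few class names only; the real file would list all 365
names: ['Person', 'Sneakers', 'Chair']

# download: (omitted -- see the discussion below about why auto-download is hard)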

@github-actions (Contributor)

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@Silmeria112

@saitarslanboun Hi, did you start the training? I'm very interested in the results if you can share them...

@saitarslanboun (Author)

Unfortunately @Silmeria112, I did not do that task.

@glenn-jocher (Member)

@Silmeria112 Objects365 looks very interesting. 2M images is about 20x larger than COCO, so this might use >400 GB of storage, with a single epoch taking about 20x one COCO epoch, though I'd imagine you could train far fewer epochs than 300 since the dataset is larger.

Ideally, X amount of time spent training Objects365 would be more beneficial than the same amount of time spent training COCO.

@glenn-jocher (Member)

@saitarslanboun @Silmeria112 If you guys get started training the Objects365 dataset, please consider submitting a PR with an objects365.yaml and a get_objects365.sh script to help everyone else get started more easily with the same trainings!

@Silmeria112

Hi @glenn-jocher, @saitarslanboun, I do want to start training on the Objects365 dataset. However, when I checked it, I found quite a few crowd bbox annotations, even on boxes that don't look very crowded to me. As I understand it, YOLO's current preprocessing ignores these crowd bboxes, right? That leaves a lot of objects with missing labels.

I don't have enough GPU resources currently, but maybe I can start the training two weeks later.

@saitarslanboun (Author)

@Silmeria112, here is a chance for you to make a contribution to YOLOv5 by adding crowded-box training functionality :)

@saitarslanboun (Author)

Or you could label them differently. For example, you would have two different classes for person: person and person (crowded). Then the model would learn crowded-person and single-person objects separately.
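A hedged sketch of that relabeling idea, assuming COCO-style annotation dicts (the offset and field names are illustrative):

# Map crowd boxes to a parallel '<class> (crowded)' id instead of dropping them
CROWD_OFFSET = 365  # crowd variants appended after the 365 original classes

def remap_category(annotation):
    cid = annotation['category_id']
    return cid + CROWD_OFFSET if annotation.get('iscrowd', 0) else cid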

@glenn-jocher (Member)

@Silmeria112 Yes, we've opted to ignore 'iscrowd' boxes in the COCO dataset, so we'd probably want the same behavior for Objects365. It's unfortunate that there are FNs (missed objects) in the dataset labels.

OIv6 has many missing objects as well. I think the earlier versions of that dataset may not have been fully labelled with all of the current classes, so you have to be very careful about which parts of the dataset you use, or train a teacher model on the well-labelled parts to review the less well-labelled parts.
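To make the "ignore 'iscrowd' boxes" behavior concrete, here is a minimal conversion sketch assuming COCO-format annotation dicts; this is an illustration, not the actual YOLOv5 preprocessing code:

def coco_box_to_yolo(box, img_w, img_h):
    """COCO [x, y, w, h] (top-left corner, pixels) -> YOLO [xc, yc, w, h] (normalized)."""
    x, y, w, h = box
    return [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]

def to_yolo_lines(annotations, img_w, img_h):
    """Emit YOLO label lines for one image, skipping 'iscrowd' boxes."""
    lines = []
    for a in annotations:
        if a.get('iscrowd', 0):  # drop crowd boxes, as discussed above
            continue
        xc, yc, w, h = coco_box_to_yolo(a['bbox'], img_w, img_h)
        lines.append(f"{a['category_id']} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines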

@ferdinandl007 (Contributor)

I'm currently training YOLOv5l on Objects365 and have it up to 0.35 mAP@0.5, slightly higher than the number in the Objects365 paper, after about a week of training on 8x A100s. I'm using the maximum batch size the GPUs allow, 58 per GPU. I'm also running a hyperparameter search on a subset of the dataset.
Does anyone have more tips to improve accuracy before I start another week-long training run?

@glenn-jocher (Member) commented Apr 24, 2021

@ferdinandl007 Wow!! A week on 8x A100s will get you about 500 epochs of YOLOv5x6 on COCO at 1280. How many epochs, and at what image size, are you training?

I would highly recommend running DDP from within our Docker container, even if you think you have a good Linux environment, as it produces the fastest trainings in our experience. It's really easy; you just need to pass in your dataset directory instead of /coco here:

t=ultralytics/yolov5:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/coco:/usr/src/coco $t

Lastly, can you submit a PR with your objects365.yaml file to help people get started faster on this dataset in the future? The recent VisDrone PR #2882 is a good example of how to do this, and if you have a conversion script to YOLO format you could place it in data/scripts.

@glenn-jocher (Member) commented Apr 24, 2021

@ferdinandl007 BTW, one of the reasons I'm asking is that we're trying to enable auto-download of the following datasets:

@ferdinandl007 (Contributor)

@glenn-jocher Sure, I can make a PR with the objects365.yaml and the hyperparameters I found. I currently have 280 generations done and am going to leave it running over the weekend, so it should be at about 500 by then.
I tried both Docker and bare metal and the results were basically identical; currently I'm doing the training on bare metal. It takes about an hour and 20 minutes to complete one epoch on the full dataset with --img 640 --batch 464 --sync-bn in DDP mode.
I might try training on 24x A100s, but from previous testing the data link between my machines is currently too slow to make multi-node feasible, so I'm using them for the hyperparameter search right now.

In terms of auto-downloading Objects365, that might be quite difficult: you have to have a WeChat account to authenticate the download. In addition, the connection to the downloads keeps failing; it took me about a week to download the whole thing. I tried scripting it, but that didn't really work, as the website did not let me download with wget (it basically always failed immediately), so I had to use Chrome and download the files one by one 😅

As for the conversion script, I can attach that to the PR when I get time.

@glenn-jocher (Member)

@ferdinandl007 Yeah, you're right; probably just an objects365.yaml with no download: field then. I tried to download the dataset earlier and ran into the same issues. I created an account even though I don't speak Chinese and managed to get a couple of the zip files downloaded, but gave up due to the complications.

If Docker and your local environment are the same speed, that means your local environment is very well configured!

If Objects365 is like COCO, then you will probably get better results training at larger image sizes with the P6 models, i.e. instead of python train.py --img 640 --cfg yolov5l.yaml you'll probably get better results with python train.py --img 1280 --cfg yolov5l6.yaml. You'll have to reduce your batch size by about 4x since the images have 4x as many pixels at 1280, but this is probably a good thing because your batch size may be too large.

I try to avoid training at batch sizes over 128, because the steps between optimizer updates become quite large and training actually starts to take longer (more epochs). There's a sweet spot somewhere in the --batch-size space, maybe around --batch 100, but pushing this to 464 is probably slowing down your training substantially, especially in the early epochs.

--sync-bn definitely helps in early training, but I think final mAP may be largely unaffected by it; we still need to do a study on this.

@ferdinandl007 (Contributor)

@glenn-jocher Thank you for the tips :) I did notice significantly faster training when using a batch size of 124, but it only utilised about 60% of system resources, so I increased the batch size to fully utilise the available resources; my thinking was that I would be less likely to get stuck on shallow gradients.
I will try smaller batch sizes in the future. Also, regarding yolov5l6: is it still capable of running in real time on iOS devices after pruning and quantisation?
And when training at larger resolutions, won't inference take a significant performance penalty?
Or can I continue using 640 resolution during inference? That worked pretty well for me previously in my applications.

@Silmeria112

@ferdinandl007 Great news that you're getting results on Objects365. I think many people would also like to know the transfer/generalization ability of pretrained weights from Objects365, especially people who want to train on custom datasets. Do you have any plan to check that, for example by comparing the VOC performance of YOLO pretrained on COCO vs. Objects365?

@glenn-jocher (Member) commented Apr 25, 2021

@ferdinandl007 Well, it's important to differentiate between GPU utilization and memory. I think you can still reach high utilization rates (i.e. around 90%) without saturating GPU memory. Especially with some of the high-end GPUs available today, like the 80 GB A100s, it won't always make sense to use up 100% of your memory.

In regards to speed, the P6 models run at about the same speed as the P5 models. Their main disadvantage is size: they have about 50% more parameters than the P5 models, but all of these extra parameters are in stride-64 convolution layers, which are very fast (the slowest convolutions are the P1 and P2 layer convolutions, which conversely have the fewest parameters).

Independently of model type, yes, larger images run inference more slowly, as the v5.0 README shows. But one major advantage of training at 1280 is that you can still run inference at lower sizes, i.e. 320, 640, 960, etc., up to 1280. If you train at 640 you will only get good inference results at 640 and below.

One last note: P6 models trained at 640 also produce better mAP than P5 models trained at 640.
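A small sketch of that multi-size flexibility with the PyTorch Hub interface (also used in the timing example below); the size argument sets the inference resolution per call:

import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s6')  # P6 model
img = 'https://ultralytics.com/images/zidane.jpg'
for size in (320, 640, 960, 1280):
    model(img, size=size).print()  # same weights, different inference size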

P5 vs P6 timing example is here:

# PyTorch Hub
import torch

# Model
model5 = torch.hub.load('ultralytics/yolov5', 'yolov5s')
model6 = torch.hub.load('ultralytics/yolov5', 'yolov5s6')

# Images
imgs = ['zidane.jpg', 'bus.jpg']
for f in imgs:  # download 2 images
    print(f'Downloading {f}...')
    torch.hub.download_url_to_file('https://github.com/ultralytics/yolov5/releases/download/v1.0/' + f, f)

# Inference (batch-size 20)
model5(imgs * 10).print()
model6(imgs * 10).print()

[Screenshot: timing output for the yolov5s vs yolov5s6 comparison above]

@ferdinandl007 (Contributor) commented Apr 26, 2021

@glenn-jocher Thank you for the clarification. I followed your suggestion and started training at 1280 with my previously trained yolov5l model, and noticed a significant mAP increase to 0.42 after one epoch. However, processing time is now about 5-6 hours per epoch at batch size 128 on 8x A100s, so I should probably allow about two weeks for training.
I will also try out the P6 models and report back on performance :)
What hardware do you typically use for training your models?
How well does it scale over an increasing number of GPUs (16+)? Has a study been done already?
Also, is there support for AMD ROCm GPUs (MI50/MI100)?

@Silmeria112 Right now there are no plans to test transfer-learning ability, but if I get time I may give it a shot and look at the performance increases. There should definitely be some, based on what I read in the Objects365 paper, where they did the same with R-CNN.

@Silmeria112

@ferdinandl007 @glenn-jocher Hi, I'm planning to start the training soon, and I did some quick statistics on the current v2 version of Objects365. For the training set:

  • total imgs: 1,742,292
  • imgs with at least one "iscrowd" bbox: 934,754
  • total bboxes: 25,407,633
  • "iscrowd" bboxes: 2,521,368

So the dataset is now much bigger than reported in the paper (608K imgs). However, there are a lot of "iscrowd" bboxes, which are ignored during YOLO preprocessing. There may be a few ways to tackle that (ignore anchors overlapping "iscrowd" bboxes / replace pixels in the "iscrowd" areas with a constant value...), but the simplest way is to not use images with any "iscrowd" bbox. That leaves 807,538 imgs, which is still more than the number in the paper; a sketch of this filtering follows below.
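A sketch of that simplest option, assuming the COCO-format fields of zhiyuan_objv2_train.json:

import json

with open('zhiyuan_objv2_train.json') as f:
    data = json.load(f)

# Drop every image that contains at least one crowd box
crowd_imgs = {a['image_id'] for a in data['annotations'] if a.get('iscrowd', 0)}
clean_imgs = [im for im in data['images'] if im['id'] not in crowd_imgs]
print(f"{len(clean_imgs)}/{len(data['images'])} images have no iscrowd boxes")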

A question for @ferdinandl007: what labels are you using for the val set? I only found a submission-sample JSON on the website, which doesn't seem to be the ground-truth labels.

@marvision-ai

@ferdinandl007 Very interesting! Do you think it would be possible to make the models trained on this dataset publicly available to play around with? Most of us don't have that hardware budget 😉

@ferdinandl007 (Contributor)

@Silmeria112 As far as I discovered, they included all labels in that single label file, the 5 GB JSON. But I was also confused about that at the beginning; it's not very well documented, I must say!
@marvision-ai In terms of publishing the weights, I have to check with some people at my company, as we are using this for some internal research.

@Silmeria112

Hi, I would like to share my test on yolov5s. First I trained on Objects365 samples without iscrowd bboxes for 50 epochs with the default settings (hyp.scratch.yaml), then used those weights as pretraining for COCO for 300 epochs, once with the default lr and once with a 0.1x smaller lr. Here are the results:

| Model                  | lr    | AP50 | mAP  |
| ---------------------- | ----- | ---- | ---- |
| yolov5s default        | 0.01  | 55.4 | 36.7 |
| yolov5s object365->coco | 0.01  | 56.9 | 36.7 |
| yolov5s object365->coco | 0.001 | 56.9 | 37.1 |
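A hedged command-line sketch of that two-stage recipe, using the train.py flags shown elsewhere in this thread (objects365.yaml is the file discussed above; the smaller-lr run would use a copy of data/hyp.scratch.yaml with lr0 lowered to 0.001):

# Stage 1: pretrain on Objects365 (50 epochs, default hyps)
python train.py --data objects365.yaml --cfg yolov5s.yaml --hyp data/hyp.scratch.yaml --epochs 50

# Stage 2: finetune on COCO from the Objects365 weights (300 epochs)
python train.py --data coco.yaml --weights runs/train/exp/weights/best.pt --epochs 300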

@ferdinandl007 (Contributor)

@Silmeria112 Awesome results! So there definitely was some gain, though not a significant one; very interesting.
What accuracy were you able to obtain on Objects365?

@ferdinandl007 (Contributor)

@Silmeria112 How did you end up tackling the iscrowd problem? Did you replace the pixels with a constant value, or just filter them out?

@Silmeria112 commented May 5, 2021

@ferdinandl007 Filtered them out. As for the accuracy on Objects365, I still cannot find the annotations for the val set, so I cannot measure it.

@ferdinandl007 (Contributor)

@Silmeria112 I think the main annotation file contains all annotations, as I have fewer than ~70,000 images unaccounted for, which is roughly the size of the validation set. When I did the conversion, I used those images to create a subset for my validation set, after downloading them and putting everything in the same folder structure.

@Silmeria112

I checked a few images from the val set and still cannot find their labels in the big JSON file (zhiyuan_objv2_train.json). Is this the file you're using?

@glenn-jocher (Member)

@ferdinandl007 I think zhiyuan_objv2_train.json contains labels for every image in the train set (the 50 patches), but it's a mystery to me where the validation image labels are. A test set would naturally be missing them, but a validation set normally comes with labels.

I think I'm just going to use our autosplit() function to create a 'YOLOv5 official' val split using 99% and 1% fractions.

yolov5/utils/datasets.py, lines 1047 to 1054 (commit 251aeaf):

def autosplit(path='../coco128', weights=(0.9, 0.1, 0.0), annotated_only=False):
    """ Autosplit a dataset into train/val/test splits and save path/autosplit_*.txt files
    Usage: from utils.datasets import *; autosplit('../coco128')
    Arguments
        path:            Path to images directory
        weights:         Train, val, test weights (list)
        annotated_only:  Only use images with an annotated txt file
    """

@krishnaadithya

@Silmeria112 Can you upload the pretrained yolov5s trained on the Objects365 dataset?

@Silmeria112

> Can you upload the pretrained yolov5s trained on the Objects365 dataset?

Sorry, I cannot access the pretrained weights now, so I can't upload them.

@wangsun1996

> I'm currently training YOLOv5l on Objects365 and have it up to 0.35 mAP@0.5, slightly higher than the number in the Objects365 paper, after about a week of training on 8x A100s. [...]

Could you provide a yolov5 .pt trained on Objects365 (such as yolov5s, yolov5s6, or another model)? Thank you very much!

@wangsun1996

> Hi, I would like to share my test on yolov5s. [...]

Could you provide any yolov5 .pt trained on Objects365 (such as yolov5s.pt or yolov5s6.pt)? Thank you very much!

@glenn-jocher (Member)

https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5m_Objects365.pt
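For anyone picking this up later, a hedged sketch of loading that checkpoint through PyTorch Hub's custom entry point, assuming the file has been downloaded locally:

import torch

# 'custom' loads local YOLOv5 weights instead of a named release model
model = torch.hub.load('ultralytics/yolov5', 'custom', path='yolov5m_Objects365.pt')
model('https://ultralytics.com/images/zidane.jpg').print()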
