Dataset Preparations

Get The PASCAL VOC Dataset:

We use the VOC 07+12 protocol: the training set contains VOC 2007 trainval + VOC 2012 trainval, and the test set contains VOC 2007 test.

wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
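As a rough sanity check on the 07+12 protocol, the sketch below gathers the image IDs listed in each year's `ImageSets/Main/trainval.txt` and counts the resulting training images. It assumes the archives were extracted into a `VOCdevkit` directory with the standard VOC layout; the function name `collect_trainval_ids` and the root path are illustrative and not part of this repository.

```python
from pathlib import Path


def collect_trainval_ids(devkit_root, years=("2007", "2012")):
    """Return (year, image_id) pairs from each year's trainval split file."""
    pairs = []
    for year in years:
        split_file = (
            Path(devkit_root) / f"VOC{year}" / "ImageSets" / "Main" / "trainval.txt"
        )
        # Each non-empty line holds one image ID (file name without extension).
        for line in split_file.read_text().splitlines():
            image_id = line.strip()
            if image_id:
                pairs.append((year, image_id))
    return pairs


if __name__ == "__main__":
    root = Path("VOCdevkit")  # assumed extraction directory
    if root.is_dir():
        ids = collect_trainval_ids(root)
        print(f"07+12 training images: {len(ids)}")
```

With complete downloads, the printed count can be compared against the PASCAL VOC training-set size listed in the table at the end of this document.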

Get The 3 Application Datasets:

Download from our links: Face Mask, Fruit, Helmet.

Download from Kaggle: Face Mask, Fruit, Helmet.

If you download from our links, unzip the archives and put them in the proper directory; they are then ready to use.

If you download from the Kaggle links and the labels are in YOLO format, you need to convert the annotations to XML format. There are also some considerations:

For the face mask dataset, we use the union of the train and valid sets for training.
For the fruit dataset, we found that some images have errors. Please use our `train.txt` and `test.txt` in `data/fruit/VOC2007/ImageSets/Main` so that only the valid images are used.
For the helmet dataset, we use the union of the valid and test sets for evaluation.
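The YOLO-to-XML conversion mentioned above could look roughly like the following. This is a minimal sketch, not the script used to build our datasets: it assumes each YOLO label line has the form `class_id cx cy w h` with coordinates normalized to the image size, and the function name `yolo_to_voc_xml`, the class list, and the choice of XML fields are illustrative. Whether coordinates should be 0-based or 1-based is not addressed here.

```python
import xml.etree.ElementTree as ET


def yolo_to_voc_xml(yolo_lines, class_names, filename, img_w, img_h):
    """Build a VOC-style annotation tree from YOLO label lines.

    Each YOLO line is "class_id cx cy w h", normalized to [0, 1].
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(img_w)
    ET.SubElement(size, "height").text = str(img_h)
    ET.SubElement(size, "depth").text = "3"  # assumption: RGB images

    for line in yolo_lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        cx, cy, w, h = (float(v) for v in parts[1:])
        # Convert normalized center/size to absolute corner coordinates,
        # clamped to the image bounds.
        xmin = max(0, round((cx - w / 2) * img_w))
        ymin = max(0, round((cy - h / 2) * img_h))
        xmax = min(img_w, round((cx + w / 2) * img_w))
        ymax = min(img_h, round((cy + h / 2) * img_h))

        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = class_names[class_id]
        ET.SubElement(obj, "difficult").text = "0"
        box = ET.SubElement(obj, "bndbox")
        for tag, value in (
            ("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax)
        ):
            ET.SubElement(box, tag).text = str(value)
    return ET.ElementTree(root)
```

The returned tree can be saved with `tree.write(path)`. The image width and height have to be read from the image itself, for example with Pillow's `Image.open(path).size`.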

Get The MS COCO Dataset:

We use COCO train2017 for training and val2017 for evaluation.

python tools/misc/download_dataset.py --dataset-name coco2017
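Once the COCO files are in place, a quick look at the annotation files can confirm they are complete. The sketch below counts the entries in a COCO-format JSON file; it relies only on the standard top-level keys `images`, `annotations`, and `categories`. The helper name `summarize_coco` and the path in the example are assumptions for illustration.

```python
import json
from pathlib import Path


def summarize_coco(annotation_file):
    """Return image, annotation, and category counts for a COCO-format JSON file."""
    with open(annotation_file, "r", encoding="utf-8") as f:
        data = json.load(f)
    return {
        "images": len(data.get("images", [])),
        "annotations": len(data.get("annotations", [])),
        "categories": len(data.get("categories", [])),
    }


if __name__ == "__main__":
    # Assumed location, following the directory layout in this document.
    ann = Path("data/coco/annotations/instances_val2017.json")
    if ann.is_file():
        print(summarize_coco(ann))
```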

Put all of the above datasets in the following directory structure:

mmdetection
├── data
    ├── VOCdevkit
        ├── VOC2007
            ├── Annotations
                ├── 00001.xml
                    00002.xml
                    ......
            ├── ImageSets
                ├── Main
                    ├── train.txt
                        test.txt
            ├── JPEGImages
                ├── 00001.jpg
                    00002.jpg
                    ......
            ├── labels
            ├── SegmentationClass
            ├── SegmentationObject
        ├── VOC2012
            ├── Annotations
            ├── ImageSets
            ├── JPEGImages
            ├── labels
            ├── SegmentationClass
            ├── SegmentationObject
    ├── facemask
        ├── VOC2007
            ├── Annotations
            ├── ImageSets
            ├── JPEGImages
    ├── fruit
        ├── VOC2007
            ├── Annotations
            ├── ImageSets
            ├── JPEGImages
    ├── helmet
        ├── VOC2007
            ├── Annotations
            ├── ImageSets
            ├── JPEGImages
    ├── coco
        ├── annotations
            ├── instances_train2017.json
                instances_val2017.json
        ├── train2017
            ├── 000000000009.jpg
                000000000025.jpg
                ......
        ├── val2017
            ├── 000000000139.jpg
                000000000285.jpg
                ......
        ├── test2017
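To catch a misplaced folder before training, the layout above can be checked with a few lines of Python. This is only a sketch: the list of expected directories is copied from the tree above, the helper name `missing_dirs` is hypothetical, and the `data` root is assumed to sit inside the `mmdetection` directory.

```python
from pathlib import Path

# Sub-directories taken from the tree above, relative to the `data` directory.
EXPECTED_DIRS = [
    "VOCdevkit/VOC2007/Annotations",
    "VOCdevkit/VOC2007/ImageSets/Main",
    "VOCdevkit/VOC2007/JPEGImages",
    "VOCdevkit/VOC2012/Annotations",
    "VOCdevkit/VOC2012/ImageSets",
    "VOCdevkit/VOC2012/JPEGImages",
    "facemask/VOC2007/Annotations",
    "facemask/VOC2007/ImageSets",
    "facemask/VOC2007/JPEGImages",
    "fruit/VOC2007/Annotations",
    "fruit/VOC2007/ImageSets",
    "fruit/VOC2007/JPEGImages",
    "helmet/VOC2007/Annotations",
    "helmet/VOC2007/ImageSets",
    "helmet/VOC2007/JPEGImages",
    "coco/annotations",
    "coco/train2017",
    "coco/val2017",
]


def missing_dirs(data_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("data")  # run from the mmdetection directory
    if missing:
        print("Missing directories:")
        for d in missing:
            print(f"  data/{d}")
    else:
        print("All expected dataset directories are present.")
```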

Brief information about the 5 datasets:

| Dataset | Training images | Test images | #Classes |
| --- | --- | --- | --- |
| PASCAL VOC | 16551 | 4952 | 20 |
| Face Mask | 5865 | 1035 | 2 |
| Fruit | 3836 | 639 | 11 |
| Helmet | 15887 | 6902 | 2 |
| MS COCO | 118291 | 5000 | 80 |
