[MOT] add MOT data (PaddlePaddle#2789)
* add mot data

* fix operators, source

* fix data source transform

* fix parse_dataset register_op

* fix scale_factor, RandomAffine

* add assert for check

* fix ci
nemonameless committed May 11, 2021
1 parent d5da5e6 commit 9ffa308
Showing 11 changed files with 1,272 additions and 6 deletions.
29 changes: 29 additions & 0 deletions configs/datasets/mot.yml
@@ -0,0 +1,29 @@
metric: MOTDet
num_classes: 1

TrainDataset:
  !MOTDataSet
    dataset_dir: dataset/mot
    image_lists: ['mot17.train', 'caltech.train', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train']
    data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']

EvalDataset:
  !MOTDataSet
    dataset_dir: dataset/mot
    image_lists: ['citypersons.val', 'caltech.val'] # for detection
    # image_lists: ['caltech.10k.val', 'cuhksysu.val', 'prw.val'] # for reid
    data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']

TestDataset:
  !ImageFolder
    dataset_dir: dataset/mot

EvalMOTDataset:
  !ImageFolder
    dataset_dir: dataset/mot
    keep_ori_im: False # set True to save visualization images or videos

TestMOTDataset:
  !MOTVideoDataset
    dataset_dir: dataset/mot
    keep_ori_im: False
222 changes: 222 additions & 0 deletions docs/tutorials/PrepareMOTDataSet.md
@@ -0,0 +1,222 @@
# MOT Dataset
* **MIXMOT**
We use the same training data as [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT), and we call the combination "MIXMOT". Please refer to their [DATA ZOO](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md) to download and prepare all the training data, including Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16.

* **2DMOT15 and MOT20**
[2DMOT15](https://motchallenge.net/data/2D_MOT_2015/) and [MOT20](https://motchallenge.net/data/MOT20/) can be downloaded from the official webpage of MOT challenge. After downloading, you should prepare the data in the following structure:
```
MOT15
   |——————images
   |        └——————train
   |        └——————test
   └——————labels_with_ids
            └——————train
MOT20
   |——————images
   |        └——————train
   |        └——————test
   └——————labels_with_ids
            └——————train
```
Annotations for these datasets are provided in a unified format. If you want to use them, please **follow their licenses**, and if you use any of these datasets in your research, please cite the original work (the BibTeX entries can be found at the bottom of this page).

## Data Format
All the datasets have the following structure:
```
Caltech
   |——————images
   |        └——————00001.jpg
   |        |—————— ...
   |        └——————0000N.jpg
   └——————labels_with_ids
            └——————00001.txt
            |—————— ...
            └——————0000N.txt
```
Every image has a corresponding annotation text file. Given an image path,
the annotation file path can be generated by replacing the string `images` with `labels_with_ids` and the extension `.jpg` with `.txt`.
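A minimal sketch of this mapping (the helper name is ours, not part of the codebase, and it assumes each path contains a single `images` directory segment):
```python
# Sketch: map an image path to its annotation path, following the
# `images` -> `labels_with_ids` convention described above.
def image_to_label_path(image_path: str) -> str:
    return image_path.replace('images', 'labels_with_ids', 1).replace('.jpg', '.txt')

# e.g. image_to_label_path('Caltech/images/00001.jpg')
#   -> 'Caltech/labels_with_ids/00001.txt'
```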

In the annotation text, each line describes a bounding box in the following format:
```
[class] [identity] [x_center] [y_center] [width] [height]
```
The field `[class]` should be `0`. Only single-class multi-object tracking is supported in this version.

The field `[identity]` is an integer from `0` to `num_identities - 1`, or `-1` if this box has no identity annotation.

**Note** that the values of `[x_center] [y_center] [width] [height]` are normalized by the width/height of the image, so they are floating-point numbers ranging from 0 to 1.
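
For illustration, here is a hedged sketch of reading one annotation line and converting the normalized box back to pixel coordinates (the function and variable names are ours, not part of PaddleDetection):
```python
# Sketch: parse one annotation line of the format
#   [class] [identity] [x_center] [y_center] [width] [height]
# and denormalize the box for an image of size (img_w, img_h).
def parse_annotation_line(line: str, img_w: int, img_h: int):
    fields = line.split()
    cls_id = int(fields[0])    # always 0: single-class tracking only
    identity = int(fields[1])  # -1 if the box has no identity annotation
    xc, yc, w, h = (float(v) for v in fields[2:6])  # normalized to [0, 1]
    # Convert to pixel-space [x_min, y_min, x_max, y_max].
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    x_max = (xc + w / 2) * img_w
    y_max = (yc + h / 2) * img_h
    return cls_id, identity, [x_min, y_min, x_max, y_max]
```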

## Final Dataset root
```
dataset/mot
  |——————image_lists
  |        |——————caltech.10k.val
  |        |——————caltech.train
  |        |——————caltech.val
  |        |——————citypersons.train
  |        |——————citypersons.val
  |        |——————cuhksysu.train
  |        |——————cuhksysu.val
  |        |——————eth.train
  |        |——————mot16.train
  |        |——————mot17.train
  |        |——————prw.train
  |        |——————prw.val
  |——————Caltech
  |——————Cityscapes
  |——————CUHKSYSU
  |——————ETHZ
  |——————MOT15
  |——————MOT16
  |——————MOT17
  |——————MOT20
  |——————PRW
```
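
Before training, a quick sanity check against this layout can catch missing pieces. The following is a hedged sketch, not part of the codebase; the expected names are taken from the tree above:
```python
import os

# Sketch: verify the expected MOT dataset root layout described above.
EXPECTED_LISTS = ['caltech.train', 'citypersons.train', 'cuhksysu.train',
                  'eth.train', 'mot17.train', 'prw.train']
EXPECTED_DIRS = ['Caltech', 'Cityscapes', 'CUHKSYSU', 'ETHZ', 'MOT17', 'PRW']

def check_mot_root(root='dataset/mot'):
    """Print any missing image lists or dataset directories under `root`."""
    missing = [os.path.join(root, 'image_lists', name)
               for name in EXPECTED_LISTS
               if not os.path.isfile(os.path.join(root, 'image_lists', name))]
    missing += [os.path.join(root, name) for name in EXPECTED_DIRS
                if not os.path.isdir(os.path.join(root, name))]
    for path in missing:
        print('missing:', path)
    return not missing
```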

## Download

### Caltech Pedestrian
Baidu NetDisk:
[[0]](https://pan.baidu.com/s/1sYBXXvQaXZ8TuNwQxMcAgg)
[[1]](https://pan.baidu.com/s/1lVO7YBzagex1xlzqPksaPw)
[[2]](https://pan.baidu.com/s/1PZXxxy_lrswaqTVg0GuHWg)
[[3]](https://pan.baidu.com/s/1M93NCo_E6naeYPpykmaNgA)
[[4]](https://pan.baidu.com/s/1ZXCdPNXfwbxQ4xCbVu5Dtw)
[[5]](https://pan.baidu.com/s/1kcZkh1tcEiBEJqnDtYuejg)
[[6]](https://pan.baidu.com/s/1sDjhtgdFrzR60KKxSjNb2A)
[[7]](https://pan.baidu.com/s/18Zvp_d33qj1pmutFDUbJyw)

Google Drive: [[annotations]](https://drive.google.com/file/d/1h8vxl_6tgi9QVYoer9XcY9YwNB32TE5k/view?usp=sharing);
please download all the image `.tar` files from [this page](http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/datasets/USA/) and extract the images under `Caltech/images`.

You may need [this tool](https://github.com/mitmul/caltech-pedestrian-dataset-converter) to convert the original data format to JPEG images.

Original dataset webpage: [CaltechPedestrians](http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/)
### CityPersons
Baidu NetDisk:
[[0]](https://pan.baidu.com/s/1g24doGOdkKqmbgbJf03vsw)
[[1]](https://pan.baidu.com/s/1mqDF9M5MdD3MGxSfe0ENsA)
[[2]](https://pan.baidu.com/s/1Qrbh9lQUaEORCIlfI25wdA)
[[3]](https://pan.baidu.com/s/1lw7shaffBgARDuk8mkkHhw)

Google Drive:
[[0]](https://drive.google.com/file/d/1DgLHqEkQUOj63mCrS_0UGFEM9BG8sIZs/view?usp=sharing)
[[1]](https://drive.google.com/file/d/1BH9Xz59UImIGUdYwUR-cnP1g7Ton_LcZ/view?usp=sharing)
[[2]](https://drive.google.com/file/d/1q_OltirP68YFvRWgYkBHLEFSUayjkKYE/view?usp=sharing)
[[3]](https://drive.google.com/file/d/1VSL0SFoQxPXnIdBamOZJzHrHJ1N2gsTW/view?usp=sharing)

Original dataset webpage: [Citypersons pedestrian detection dataset](https://bitbucket.org/shanshanzhang/citypersons)

### CUHK-SYSU
Baidu NetDisk:
[[0]](https://pan.baidu.com/s/1YFrlyB1WjcQmFW3Vt_sEaQ)

Google Drive:
[[0]](https://drive.google.com/file/d/1D7VL43kIV9uJrdSCYl53j89RE2K-IoQA/view?usp=sharing)

Original dataset webpage: [CUHK-SYSU Person Search Dataset](http://www.ee.cuhk.edu.hk/~xgwang/PS/dataset.html)

### PRW
Baidu NetDisk:
[[0]](https://pan.baidu.com/s/1iqOVKO57dL53OI1KOmWeGQ)

Google Drive:
[[0]](https://drive.google.com/file/d/116_mIdjgB-WJXGe8RYJDWxlFnc_4sqS8/view?usp=sharing)

Original dataset webpage: [Person Search in the Wild dataset](http://www.liangzheng.com.cn/Project/project_prw.html)

### ETHZ (overlapping videos with MOT-16 removed)
Baidu NetDisk:
[[0]](https://pan.baidu.com/s/14EauGb2nLrcB3GRSlQ4K9Q)

Google Drive:
[[0]](https://drive.google.com/file/d/19QyGOCqn8K_rc9TXJ8UwLSxCx17e0GoY/view?usp=sharing)

Original dataset webpage: [ETHZ pedestrian dataset](https://data.vision.ee.ethz.ch/cvl/aess/dataset/)

### MOT-17
Baidu NetDisk:
[[0]](https://pan.baidu.com/s/1lHa6UagcosRBz-_Y308GvQ)

Google Drive:
[[0]](https://drive.google.com/file/d/1ET-6w12yHNo8DKevOVgK1dBlYs739e_3/view?usp=sharing)

Original dataset webpage: [MOT-17](https://motchallenge.net/data/MOT17/)

### MOT-16 (for evaluation)
Baidu NetDisk:
[[0]](https://pan.baidu.com/s/10pUuB32Hro-h-KUZv8duiw)

Google Drive:
[[0]](https://drive.google.com/file/d/1254q3ruzBzgn4LUejDVsCtT05SIEieQg/view?usp=sharing)

Original dataset webpage: [MOT-16](https://motchallenge.net/data/MOT16/)


## Citation
Caltech:
```
@inproceedings{ dollarCVPR09peds,
author = "P. Doll\'ar and C. Wojek and B. Schiele and P. Perona",
title = "Pedestrian Detection: A Benchmark",
booktitle = "CVPR",
month = "June",
year = "2009",
city = "Miami",
}
```
Citypersons:
```
@INPROCEEDINGS{Shanshan2017CVPR,
Author = {Shanshan Zhang and Rodrigo Benenson and Bernt Schiele},
Title = {CityPersons: A Diverse Dataset for Pedestrian Detection},
Booktitle = {CVPR},
Year = {2017}
}
@INPROCEEDINGS{Cordts2016Cityscapes,
title={The Cityscapes Dataset for Semantic Urban Scene Understanding},
author={Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt},
booktitle={Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2016}
}
```
CUHK-SYSU:
```
@inproceedings{xiaoli2017joint,
title={Joint Detection and Identification Feature Learning for Person Search},
author={Xiao, Tong and Li, Shuang and Wang, Bochao and Lin, Liang and Wang, Xiaogang},
booktitle={CVPR},
year={2017}
}
```
PRW:
```
@inproceedings{zheng2017person,
title={Person re-identification in the wild},
author={Zheng, Liang and Zhang, Hengheng and Sun, Shaoyan and Chandraker, Manmohan and Yang, Yi and Tian, Qi},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={1367--1376},
year={2017}
}
```
ETHZ:
```
@InProceedings{eth_biwi_00534,
author = {A. Ess and B. Leibe and K. Schindler and L. van Gool},
title = {A Mobile Vision System for Robust Multi-Person Tracking},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08)},
year = {2008},
month = {June},
publisher = {IEEE Press},
keywords = {}
}
```
MOT-16&17:
```
@article{milan2016mot16,
title={MOT16: A benchmark for multi-object tracking},
author={Milan, Anton and Leal-Taix{\'e}, Laura and Reid, Ian and Roth, Stefan and Schindler, Konrad},
journal={arXiv preprint arXiv:1603.00831},
year={2016}
}
```
36 changes: 36 additions & 0 deletions ppdet/data/reader.py
@@ -271,3 +271,39 @@ def __init__(self,
        super(TestReader, self).__init__(sample_transforms, batch_transforms,
                                         batch_size, shuffle, drop_last,
                                         drop_empty, num_classes, **kwargs)


@register
class EvalMOTReader(BaseDataLoader):
    __shared__ = ['num_classes']

    def __init__(self,
                 sample_transforms=[],
                 batch_transforms=[],
                 batch_size=1,
                 shuffle=False,
                 drop_last=False,
                 drop_empty=True,
                 num_classes=1,
                 **kwargs):
        super(EvalMOTReader, self).__init__(sample_transforms, batch_transforms,
                                            batch_size, shuffle, drop_last,
                                            drop_empty, num_classes, **kwargs)


@register
class TestMOTReader(BaseDataLoader):
    __shared__ = ['num_classes']

    def __init__(self,
                 sample_transforms=[],
                 batch_transforms=[],
                 batch_size=1,
                 shuffle=False,
                 drop_last=False,
                 drop_empty=True,
                 num_classes=1,
                 **kwargs):
        super(TestMOTReader, self).__init__(sample_transforms, batch_transforms,
                                            batch_size, shuffle, drop_last,
                                            drop_empty, num_classes, **kwargs)
2 changes: 2 additions & 0 deletions ppdet/data/source/__init__.py
@@ -17,9 +17,11 @@
from . import widerface
from . import category
from . import keypoint_coco
from . import mot

from .coco import *
from .voc import *
from .widerface import *
from .category import *
from .keypoint_coco import *
from .mot import *
18 changes: 18 additions & 0 deletions ppdet/data/source/category.py
@@ -86,10 +86,28 @@ def get_categories(metric_type, anno_file=None, arch=None):
    elif metric_type.lower() == 'keypointtopdowncocoeval':
        return (None, {'id': 'keypoint'})

    elif metric_type.lower() in ['mot', 'motdet', 'reid']:
        return _mot_category()

    else:
        raise ValueError("unknown metric type {}".format(metric_type))


def _mot_category():
    """
    Get class id to category id map and category id
    to category name map of mot dataset
    """
    label_map = {'person': 0}
    label_map = sorted(label_map.items(), key=lambda x: x[1])
    cats = [l[0] for l in label_map]

    clsid2catid = {i: i for i in range(len(cats))}
    catid2name = {i: name for i, name in enumerate(cats)}

    return clsid2catid, catid2name


def _coco17_category():
    """
    Get class id to category id map and category id
16 changes: 11 additions & 5 deletions ppdet/data/source/dataset.py
@@ -87,8 +87,13 @@ def __getitem__(self, idx):
        return self.transform(roidb)

    def check_or_download_dataset(self):
-       self.dataset_dir = get_dataset_path(self.dataset_dir, self.anno_path,
-                                           self.image_dir)
+       if isinstance(self.anno_path, list):
+           for path in self.anno_path:
+               self.dataset_dir = get_dataset_path(self.dataset_dir, path,
+                                                   self.image_dir)
+       else:
+           self.dataset_dir = get_dataset_path(self.dataset_dir,
+                                               self.anno_path, self.image_dir)

    def set_kwargs(self, **kwargs):
        self.mixup_epoch = kwargs.get('mixup_epoch', -1)
@@ -134,19 +139,18 @@ class ImageFolder(DetDataset):
    def __init__(self,
                 dataset_dir=None,
                 image_dir=None,
                 anno_path=None,
                 sample_num=-1,
                 use_default_label=None,
                 keep_ori_im=False,
                 **kwargs):
        super(ImageFolder, self).__init__(
            dataset_dir,
            image_dir,
            anno_path,
            sample_num=sample_num,
            use_default_label=use_default_label)
        self.keep_ori_im = keep_ori_im
        self._imid2path = {}
        self.roidbs = None
        self.sample_num = sample_num

    def check_or_download_dataset(self):
        return
@@ -178,6 +182,8 @@ def _load_images(self):
            if self.sample_num > 0 and ct >= self.sample_num:
                break
            rec = {'im_id': np.array([ct]), 'im_file': image}
            if self.keep_ori_im:
                rec.update({'keep_ori_im': 1})
            self._imid2path[ct] = image
            ct += 1
            records.append(rec)