Towards High Quality Multi-Object Tracking and Segmentation without Mask Supervision (TIP 2024)


BoxMOTS

This is the official PyTorch implementation of our weakly supervised MOTS work, Towards High Quality Multi-Object Tracking and Segmentation without Mask Supervision. The project consists of four parts: the main model, the data association method, the optical flow model, and the shadow detection model.

Highlights

  • Box-supervised multi-object tracking and segmentation model: only bounding box labels are used in the training stage.
  • Superior performance to previous works: 12.4% improvement in sMOTSA, 7.3% in MOTSA, and 8.2% in MOTSP on the KITTI MOTS dataset.
  • Flexible modules: the optical flow and shadow detection models are used on demand, and can be replaced by more advanced optical flow/shadow detection models for better performance.

Visualization Results

BoxMOTS visualization results on KITTI MOTS, BDD100K MOTS, MOSE (a VOS dataset), and YouTube-VIS 2019 (a VIS dataset), from top to bottom. For MOSE and YouTube-VIS 2019, the BoxMOTS model trained on KITTI MOTS makes predictions directly, without any training on those two datasets.

Abstract

Recent studies have shown the potential of weakly supervised multi-object tracking and segmentation, but the drawbacks of coarse pseudo mask labels and limited use of temporal information remain unresolved. To address these issues, we present a framework that directly uses box labels to supervise the segmentation network without resorting to pseudo mask labels. In addition, we propose to fully exploit the temporal information from two perspectives. First, we integrate optical flow-based pairwise consistency to ensure mask consistency across frames, thereby improving mask quality for segmentation. Second, we propose a temporally adjacent pair-based sampling strategy to adapt instance embedding learning for data association in tracking. We combine these techniques into an end-to-end deep model, named BoxMOTS, which requires only box annotations without mask supervision. Extensive experiments demonstrate that our model surpasses the current state of the art by a large margin and produces promising results on KITTI MOTS and BDD100K MOTS.
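The temporally adjacent pair-based sampling strategy mentioned above can be illustrated with a minimal sketch: positive pairs for embedding learning are drawn only from detections that share a track ID and lie in nearby frames. The function name, arguments, and toy data below are illustrative assumptions, not the repository's actual API.

```python
def sample_adjacent_pairs(frame_ids, track_ids, max_gap=1):
    """Sample (anchor, positive) index pairs for embedding learning (sketch).

    A pair qualifies when two detections share a track ID and their frames
    are at most `max_gap` apart, i.e. temporally adjacent.
    """
    pairs = []
    n = len(frame_ids)
    for i in range(n):
        for j in range(i + 1, n):
            same_track = track_ids[i] == track_ids[j]
            adjacent = 0 < abs(frame_ids[i] - frame_ids[j]) <= max_gap
            if same_track and adjacent:
                pairs.append((i, j))
    return pairs

# Five detections: frames 0,0,1,1,2 with track IDs 7,8,7,8,7.
print(sample_adjacent_pairs([0, 0, 1, 1, 2], [7, 8, 7, 8, 7]))
# → [(0, 2), (1, 3), (2, 4)]
```

In practice the sampled pairs would feed a contrastive or triplet-style embedding loss; the sketch only shows the sampling rule itself.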

Main Model

The main model generates detection, segmentation, and object embedding results. This part is contained in the boxmots folder. Please go to the README file under that folder for usage details.
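Since the main model builds on the BoxInst implementation from AdelaiDet (see Acknowledgements), a minimal sketch of BoxInst-style projection supervision may help: the predicted mask's max-projections onto the x- and y-axes are compared with the box mask's projections, so the mask can be supervised from boxes alone. The function name and toy numpy inputs are assumptions for illustration, not the repo's actual code.

```python
import numpy as np

def projection_loss(mask_prob, box_mask, eps=1e-6):
    """BoxInst-style projection term (sketch).

    mask_prob: (H, W) predicted foreground probabilities in [0, 1].
    box_mask:  (H, W) binary mask that is 1 inside the ground-truth box.
    The max-projection of the prediction onto each axis should match the
    box mask's projection; disagreement is penalized with cross-entropy.
    """
    loss = 0.0
    for axis in (0, 1):
        p = mask_prob.max(axis=axis)  # projected prediction profile
        t = box_mask.max(axis=axis)   # projected box profile (0/1)
        loss += -(t * np.log(p + eps) + (1 - t) * np.log(1 - p + eps)).mean()
    return loss

box = np.zeros((8, 8))
box[2:6, 1:5] = 1.0                      # ground-truth box region
good = np.where(box > 0, 0.99, 0.01)     # prediction agreeing with the box
bad = np.where(box > 0, 0.01, 0.99)      # prediction disagreeing everywhere
print(projection_loss(good, box) < projection_loss(bad, box))  # → True
```

The full BoxInst loss also includes a color-similarity pairwise term; this sketch covers only the projection part.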

Data Association Method

We use DeepSORT for data association, based on both motion and appearance information. This part is contained in the StrongSORT folder. Please go to the README file under that folder for usage details.
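DeepSORT's association step blends an appearance distance (cosine distance between embeddings) with a motion distance, and gates out implausible pairs before matching. The sketch below shows only the cost-matrix construction; the weighting `lam`, the gate value, and the function names are illustrative assumptions (DeepSORT itself uses a Mahalanobis motion distance and a chi-square gate).

```python
import numpy as np

def cosine_dist(a, b):
    """Pairwise cosine distance between row-wise feature matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T

def association_cost(track_feats, det_feats, motion_dist, lam=0.5, gate=50.0):
    """Blend appearance and motion distances into one cost matrix (sketch).

    Pairs whose motion distance exceeds the gate are forbidden by assigning
    a prohibitively large cost, mimicking DeepSORT's gating step.
    """
    cost = lam * cosine_dist(track_feats, det_feats) + (1 - lam) * motion_dist
    cost[motion_dist > gate] = 1e5
    return cost

tracks = np.array([[1.0, 0.0]])                 # one track embedding
dets = np.array([[1.0, 0.0], [0.0, 1.0]])       # two detection embeddings
motion = np.array([[0.0, 100.0]])               # second detection is far away
print(association_cost(tracks, dets, motion))
```

The resulting cost matrix would then be fed to a matching algorithm such as the Hungarian method (e.g. `scipy.optimize.linear_sum_assignment`).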

Optical Flow Model

We use the GMA method to generate optical flow results for the KITTI MOTS and BDD100K MOTS training sets. The optical flow results are used to train the main model. This part is contained in the GMA folder. Please go to the README file under that folder for usage details.
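The flow-based pairwise consistency idea can be sketched as warping the frame-t mask to frame t+1 with the estimated flow and penalizing disagreement with the frame-(t+1) prediction. The nearest-neighbor scatter warp below is a toy assumption for illustration; a real implementation would use differentiable bilinear sampling (e.g. `torch.nn.functional.grid_sample`).

```python
import numpy as np

def warp_mask(mask, flow):
    """Warp a binary mask forward with per-pixel flow (nearest-neighbor sketch).

    mask: (H, W) binary mask at frame t.
    flow: (H, W, 2) with flow[y, x] = (dx, dy) mapping frame-t pixels to
    frame-(t+1) locations. Targets are rounded and clipped to the image.
    """
    H, W = mask.shape
    ys, xs = np.mgrid[0:H, 0:W]
    xt = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    yt = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    warped = np.zeros_like(mask)
    warped[yt, xt] = mask  # scatter frame-t labels to frame-(t+1) positions
    return warped

mask = np.zeros((5, 6))
mask[2, 3] = 1.0                       # one foreground pixel
flow = np.zeros((5, 6, 2))
flow[..., 0] = 1.0                     # uniform 1-pixel motion to the right
print(warp_mask(mask, flow)[2, 4])     # → 1.0
```

A consistency loss would then measure the disagreement between the warped mask and the next frame's predicted mask.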

Shadow Detection Model

We use the SSIS method to detect shadows and remove them from car-like objects' segmentation results. The shadow detection results are used during inference. This part is contained in the SSIS folder. Please go to the README file under that folder for usage details.
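The inference-time cleanup amounts to a mask subtraction: pixels the shadow detector flags are removed from the instance mask. A minimal sketch (function name and inputs are assumptions):

```python
import numpy as np

def remove_shadow(instance_mask, shadow_mask):
    """Drop shadow pixels from an instance segmentation mask (sketch).

    instance_mask: (H, W) boolean foreground mask from the main model.
    shadow_mask:   (H, W) boolean shadow mask from the shadow detector.
    Returns the foreground with shadow pixels suppressed.
    """
    return np.logical_and(instance_mask, np.logical_not(shadow_mask))

inst = np.array([[1, 1], [1, 0]], dtype=bool)
shadow = np.array([[0, 1], [0, 0]], dtype=bool)
print(remove_shadow(inst, shadow).sum())  # → 2
```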

TODO

  • Repo setup.
  • Add code of main model.
  • Add code of data association.
  • Add code of the optical flow model.
  • Add code of the shadow detection model.
  • Complete the full pipeline.

Citation

If you find this project helpful, feel free to cite our work.

@article{cheng2024towards,
  title={Towards High Quality Multi-Object Tracking and Segmentation without Mask Supervision},
  author={Cheng, Wensheng and Wu, Yi and Wu, Zhenyu and Ling, Haibin and Hua, Gang},
  journal={IEEE Transactions on Image Processing},
  year={2024},
  publisher={IEEE}
}

Acknowledgements

  • Thanks to AdelaiDet for the BoxInst implementation.
  • Thanks to StrongSORT for the DeepSORT implementation.
  • Thanks to GMA for the optical flow model.
  • Thanks to SSIS for the shadow detection model.
