We integrate two modalities (foreground motion and appearance), and then eliminate features that are not representative of the foreground through attention-module-guided selective-connection structures. The model is trained end to end and achieves scene adaptation in a plug-and-play style. Experiments show that the proposed method significantly outperforms state-of-the-art deep models and background subtraction methods on untrained scenes -- LIMU and LASIESTA.
Our work builds on our group's accepted foreground segmentation model, STAM. The code uses TensorFlow 1.13 and CUDA 10.1.
The structure of the proposed Hierarchical Optical Flow Attention Model (HOFAM).
Method | Mean Dice | Recall | Precision | F-measure |
---|---|---|---|---|
HOFAM | 0.9466 | 0.9661 | 0.9893 | 0.9776 |
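Assuming the F-measure column is the standard F1 score (the harmonic mean of precision and recall), the reported value is consistent with the reported precision and recall; a quick sanity check:

```python
# Check that the table's F-measure equals the harmonic mean of its
# precision and recall columns (standard F1 definition -- an assumption).
def f_measure(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f_measure(0.9893, 0.9661), 4))  # 0.9776, matching the table
```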
You need to download the checkpoint first and place it in the `checkpoint/` directory.
Refer to SelFlow to compute the different optical flows.
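The README does not spell out which frame pairs the "hierarchical" flows are computed between. Assuming one flow per temporal stride (a common construction; the strides below are illustrative, not taken from the paper), selecting the frame pairs could look like:

```python
def flow_pairs(frame_idx, strides=(1, 3, 5)):
    """Return (source, target) frame-index pairs, one per temporal stride.

    NOTE: the strides (1, 3, 5) are an assumption for illustration; the
    actual intervals must be taken from the HOFAM paper / SelFlow setup.
    Indices are clamped at 0 for the first frames of a video.
    """
    return [(max(frame_idx - s, 0), frame_idx) for s in strides]

print(flow_pairs(10))  # [(9, 10), (7, 10), (5, 10)]
```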
Merge the video frame, hierarchical optical flows, and ground truth into one image, like `dataset/demo_data/test_000155.png`.
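Merging everything into a single PNG suggests side-by-side concatenation; a minimal NumPy sketch, assuming the left-to-right layout frame | flows | ground truth (verify the order against `dataset/demo_data/test_000155.png`):

```python
import numpy as np

def merge_sample(frame, flows, gt):
    """Concatenate frame, flow visualizations, and ground truth horizontally.

    All inputs are H x W x 3 uint8 arrays. The layout
    (frame | flow_1 | ... | flow_n | gt) is an assumption -- match it to
    the demo sample shipped with the repo.
    """
    return np.concatenate([frame, *flows, gt], axis=1)

frame = np.zeros((240, 320, 3), dtype=np.uint8)
flows = [np.zeros((240, 320, 3), dtype=np.uint8)] * 3
gt = np.zeros((240, 320, 3), dtype=np.uint8)
merged = merge_sample(frame, flows, gt)
print(merged.shape)  # (240, 1600, 3): five 320-px-wide tiles in a row
```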
Prepare and generate the TFRecord file
Change the data path and run `tfrecode.py`.
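The essence of this step is serializing the merged PNGs into a TFRecord file. A minimal sketch of what `tfrecode.py` presumably does; the feature key `merged_image` is a guess and must match whatever key `model.py` parses:

```python
import tensorflow as tf

def write_tfrecord(png_paths, out_path):
    """Serialize merged PNG samples into one TFRecord file.

    Sketch only: stores each PNG's raw bytes under the hypothetical key
    'merged_image'. The real key and any extra features (e.g. filename,
    shape) must mirror the parsing code in model.py.
    """
    with tf.io.TFRecordWriter(out_path) as writer:
        for path in png_paths:
            with open(path, "rb") as f:
                raw = f.read()
            example = tf.train.Example(features=tf.train.Features(feature={
                "merged_image": tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[raw])),
            }))
            writer.write(example.SerializeToString())
```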
Parameter settings
1. Change the TFRecord file path in `model.py`, line 137
2. Change the train and test dataset paths in `model.py`, lines 209 and 659
3. Set `--phase` (train or test) in `main.py`, then run `main.py`
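Step 3 toggles the run mode via a `--phase` flag. A hedged sketch of how `main.py` might expose it (the flag name comes from the README; the choices and default are assumptions):

```python
import argparse

def build_parser():
    """Sketch of the --phase command-line flag described in step 3.

    The flag name is from the README; restricting it to train/test and
    defaulting to train are assumptions for illustration.
    """
    parser = argparse.ArgumentParser(description="HOFAM train/test entry point")
    parser.add_argument("--phase", choices=["train", "test"], default="train",
                        help="run training or evaluation")
    return parser

args = build_parser().parse_args(["--phase", "test"])
print(args.phase)  # test
```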
Start training or testing
$ python main.py
Hierarchical optical flow (orange border) and foreground segmentation results.
Visualization of Attention Module results.
Comparison results of foreground segmentation of small objects with different losses.
Comparison results of different models on the cross-scene dataset LIMU. Each group contains five images, from left to right: the video frame and the segmentation results of HOFAM, PSPNet, DeepLabV3+, and STAM. Green: false positives; red: false negatives.
Comparison results of different models on the cross-scene dataset LASIESTA.