Provide more training information #805

Open: wants to merge 21 commits into `master`
3 changes: 1 addition & 2 deletions .gitmodules
@@ -1,4 +1,3 @@
[submodule "caffe-fast-rcnn"]
path = caffe-fast-rcnn
url = https://github.com/rbgirshick/caffe-fast-rcnn.git
branch = fast-rcnn
url = https://github.com/denny1108/caffe-fast-rcnn.git
227 changes: 28 additions & 199 deletions README.md
@@ -1,217 +1,46 @@
### Disclaimer
### Multispectral Deep Neural Networks for Pedestrian Detection
Edited by Jingjing Liu, Rutgers University.

The official Faster R-CNN code (written in MATLAB) is available [here](https://github.com/ShaoqingRen/faster_rcnn).
If your goal is to reproduce the results in our NIPS 2015 paper, please use the [official code](https://github.com/ShaoqingRen/faster_rcnn).
<img src="examples/fusion_models.png" width="900px" height="250px"/>

This repository contains a Python *reimplementation* of the MATLAB code.
This Python implementation is built on a fork of [Fast R-CNN](https://github.com/rbgirshick/fast-rcnn).
There are slight differences between the two implementations.
In particular, this Python port
- is ~10% slower at test-time, because some operations execute on the CPU in Python layers (e.g., 220ms / image vs. 200ms / image for VGG16)
- gives similar, but not exactly the same, mAP as the MATLAB version
- is *not compatible* with models trained using the MATLAB code due to the minor implementation differences
- **includes approximate joint training** that is 1.5x faster than alternating optimization (for VGG16) -- see these [slides](https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0) for more information
Code used to reproduce the results in our paper [Multispectral deep neural networks for pedestrian detection](http://paul.rutgers.edu/~jl1322/papers/BMVC16_liu.pdf) by Jingjing Liu, Shaoting Zhang, Shu Wang, and Dimitris N. Metaxas, BMVC 2016. [[project link]](http://paul.rutgers.edu/~jl1322/multispectral.htm).

# *Faster* R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
This repository is a fork of the [official py-faster-rcnn code](https://github.com/rbgirshick/py-faster-rcnn), written by Ross Girshick. For how to install the required software (e.g., Caffe and pycaffe) and set up the code with the right configuration, please refer to the original [README.md](https://github.com/rbgirshick/py-faster-rcnn/blob/master/README.md).

By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun (Microsoft Research)
### Download pretrained models
[VGG16 model on caltech](https://drive.google.com/open?id=0ByrJI3mShdW6WVBxQldmdnE2S2s) trained on the Caltech pedestrian dataset.

This Python implementation contains contributions from Sean Bell (Cornell) written during an MSR internship.
[VGG16 model on kaist (RGB input)](https://drive.google.com/open?id=0ByrJI3mShdW6LWNqT0tYQ3JteW8) trained on the KAIST pedestrian dataset.

Please see the official [README.md](https://github.com/ShaoqingRen/faster_rcnn/blob/master/README.md) for more details.
[VGG16 model on kaist (multispectral input)](https://drive.google.com/open?id=0ByrJI3mShdW6R3R1dkE4QlNQUUk) trained on the KAIST multispectral dataset.

Faster R-CNN was initially described in an [arXiv tech report](http://arxiv.org/abs/1506.01497) and was subsequently published in NIPS 2015.
Save these models to `models/caltech/VGG16/`, `models/kaist/VGG16/`, and `models/kaist_fusion/VGG16/`, respectively.
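Since the Google Drive downloads above are manual, a small helper can create the three target directories and report which models are still missing. This is only an illustrative sketch (the function name and layout are not part of the repository):

```python
import os

# Expected locations for the three pretrained VGG16 models,
# relative to the repository root.
MODEL_DIRS = {
    "caltech": "models/caltech/VGG16",
    "kaist": "models/kaist/VGG16",
    "kaist_fusion": "models/kaist_fusion/VGG16",
}

def prepare_model_dirs(repo_root):
    """Create the model directories and return the names of models
    whose directory does not yet contain a .caffemodel file."""
    missing = []
    for name, rel in MODEL_DIRS.items():
        d = os.path.join(repo_root, rel)
        os.makedirs(d, exist_ok=True)
        if not any(f.endswith(".caffemodel") for f in os.listdir(d)):
            missing.append(name)
    return missing
```

Running it once before and once after copying a downloaded `.caffemodel` into place shows which directories remain empty.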

### License

Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).

### Citing Faster R-CNN

If you find Faster R-CNN useful in your research, please consider citing:

    @inproceedings{renNIPS15fasterrcnn,
        Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
        Title = {Faster {R-CNN}: Towards Real-Time Object Detection
                 with Region Proposal Networks},
        Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
        Year = {2015}
    }

### Contents
1. [Requirements: software](#requirements-software)
2. [Requirements: hardware](#requirements-hardware)
3. [Basic installation](#installation-sufficient-for-the-demo)
4. [Demo](#demo)
5. [Beyond the demo: training and testing](#beyond-the-demo-installation-for-training-and-testing-models)
6. [Usage](#usage)

### Requirements: software

1. Requirements for `Caffe` and `pycaffe` (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html))

**Note:** Caffe *must* be built with support for Python layers!

```make
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1
```

You can download my [Makefile.config](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/Makefile.config) for reference.
2. Python packages you might not have: `cython`, `python-opencv`, `easydict`
3. [Optional] MATLAB is required for **official** PASCAL VOC evaluation only. The code now includes unofficial Python evaluation code.

### Requirements: hardware

1. For training smaller networks (ZF, VGG_CNN_M_1024) a good GPU (e.g., Titan, K20, K40, ...) with at least 3G of memory suffices
2. For training Fast R-CNN with VGG16, you'll need a K40 (~11G of memory)
3. For training the end-to-end version of Faster R-CNN with VGG16, 3G of GPU memory is sufficient (using CUDNN)

### Installation (sufficient for the demo)

1. Clone the Faster R-CNN repository
```Shell
# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
```

2. We'll call the directory that you cloned Faster R-CNN into `FRCN_ROOT`

*Ignore notes 1 and 2 if you followed step 1 above.*

**Note 1:** If you didn't clone Faster R-CNN with the `--recursive` flag, then you'll need to manually clone the `caffe-fast-rcnn` submodule:
```Shell
git submodule update --init --recursive
```
**Note 2:** The `caffe-fast-rcnn` submodule needs to be on the `faster-rcnn` branch (or equivalent detached state). This will happen automatically *if you followed step 1 instructions*.

3. Build the Cython modules
```Shell
cd $FRCN_ROOT/lib
make
```

4. Build Caffe and pycaffe
```Shell
cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html

# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
```

5. Download pre-computed Faster R-CNN detectors
```Shell
cd $FRCN_ROOT
./data/scripts/fetch_faster_rcnn_models.sh
```

This will populate the `$FRCN_ROOT/data` folder with `faster_rcnn_models`. See `data/README.md` for details.
These models were trained on VOC 2007 trainval.

### Demo
### Run demos
Run `sh ./run_demo.sh caltech` for images from Caltech.

*After successfully completing [basic installation](#installation-sufficient-for-the-demo)*, you'll be ready to run the demo.
Run `sh ./run_demo.sh kaist-color` for images from Kaist.

To run the demo
```Shell
cd $FRCN_ROOT
./tools/demo.py
```
The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.

### Beyond the demo: installation for training and testing models
1. Download the training, validation, test data and VOCdevkit

```Shell
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```

2. Extract all of these tars into one directory named `VOCdevkit`

```Shell
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```

3. It should have this basic structure

```Shell
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories ...
```

4. Create symlinks for the PASCAL VOC dataset

```Shell
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012
6. [Optional] If you want to use COCO, please see some notes under `data/README.md`
7. Follow the next sections to download pre-trained ImageNet models

### Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the two networks described in the paper: ZF and VGG16.

```Shell
cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh
```
VGG16 comes from the [Caffe Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo), but is provided here for your convenience.
ZF was trained at MSRA.

### Usage

To train and test a Faster R-CNN detector using the **alternating optimization** algorithm from our NIPS 2015 paper, use `experiments/scripts/faster_rcnn_alt_opt.sh`.
Output is written underneath `$FRCN_ROOT/output`.

```Shell
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```
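The `--set ...` arguments are alternating KEY VALUE tokens folded into the global config. A simplified sketch of that merging, modeled loosely on `cfg_from_list` in `lib/fast_rcnn/config.py` (the plain-dict config here is illustrative, not the repository's actual `easydict` object):

```python
from ast import literal_eval

def set_cfg_options(cfg, opts):
    """Merge alternating KEY VALUE tokens (as passed via --set) into a
    nested config dict.  Dotted keys like TRAIN.SCALES descend into
    sub-dicts; values are parsed with literal_eval when possible."""
    assert len(opts) % 2 == 0, "--set expects KEY VALUE pairs"
    for key, raw in zip(opts[0::2], opts[1::2]):
        try:
            value = literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # keep as a plain string
        node = cfg
        parts = key.split(".")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return cfg
```

For example, `--set EXP_DIR seed_rng1701 RNG_SEED 1701` sets a string and an integer option respectively.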

("alt opt" refers to the alternating optimization training algorithm described in the NIPS paper.)
Run `sh ./run_demo.sh kaist-fusion` for multispectral images from Kaist.

To train and test a Faster R-CNN detector using the **approximate joint training** method, use `experiments/scripts/faster_rcnn_end2end.sh`.
Output is written underneath `$FRCN_ROOT/output`.
### Caltech results
<img src="examples/caltech_result_1.png" width="400px" height="400px"/> <img src="examples/caltech_result_2.png" width="400px" height="400px"/>

```Shell
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```

This method trains the RPN module jointly with the Fast R-CNN network, rather than alternating between training the two. It results in faster (~ 1.5x speedup) training times and similar detection accuracy. See these [slides](https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0) for more details.

Artifacts generated by the scripts in `tools` are written in this directory.
### KAIST results
<img src="examples/kaist_result_1.png" width="400px" height="400px"/> <img src="examples/kaist_result_2.png" width="400px" height="400px"/>

Trained Fast R-CNN networks are saved under:
### License

```
output/<experiment directory>/<dataset name>/
```
Our code is released under the MIT License (refer to the LICENSE file for details).

Test outputs are saved under:
### Citing our paper
If you find our work useful in your research, please consider citing:

```
output/<experiment directory>/<dataset name>/<network snapshot name>/
@article{liu2016multispectral,
  title={Multispectral deep neural networks for pedestrian detection},
  author={Liu, Jingjing and Zhang, Shaoting and Wang, Shu and Metaxas, Dimitris N},
  journal={arXiv preprint arXiv:1611.02644},
  year={2016}
}
```
2 changes: 1 addition & 1 deletion caffe-fast-rcnn
Submodule caffe-fast-rcnn updated 405 files
8 changes: 8 additions & 0 deletions data/demo_pedestrian/annotations/set06_V002_I00779.txt
@@ -0,0 +1,8 @@
% bbGt version=3
person 466 183 19 57 0 0 0 0 0 0 0
person 511 183 18 51 0 0 0 0 0 0 0
person 541 174 18 57 0 0 0 0 0 0 0
person 561 172 24 59 0 0 0 0 0 0 0
person 579 153 37 118 1 579 153 37 118 0 0
person 458 179 20 47 1 458 179 14 45 0 0
person 491 180 17 47 0 0 0 0 0 0 0
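These annotation files follow Piotr Dollár's bbGt (version 3) format: one object per line, with a label followed by eleven numbers — the bounding box (x, y, w, h), an occlusion flag, the visible box, an ignore flag, and an angle. A minimal parser sketch (the field names here are chosen for illustration):

```python
from collections import namedtuple

# Field order per the bbGt v3 convention: label, box, occlusion flag,
# visible box, ignore flag, angle.
BBox = namedtuple("BBox", "label x y w h occluded vx vy vw vh ignore angle")

def parse_bbgt(text):
    """Parse a bbGt v3 annotation file: a '% bbGt version=...' header
    followed by one object per line (label + 11 integers)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    assert lines[0].startswith("% bbGt version="), "missing bbGt header"
    boxes = []
    for line in lines[1:]:
        fields = line.split()
        label, nums = fields[0], [int(v) for v in fields[1:]]
        boxes.append(BBox(label, *nums))
    return boxes
```

Applied to the file above, this yields seven `person` boxes with their pixel coordinates.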
14 changes: 14 additions & 0 deletions data/demo_pedestrian/annotations/set07_V000_I01619.txt
@@ -0,0 +1,14 @@
% bbGt version=3
person 149 151 63 125 0 0 0 0 0 0 0
person 39 152 70 147 0 0 0 0 0 0 0
person 11 150 47 130 0 0 0 0 0 0 0
person 568 161 21 53 1 569 161 18 16 0 0
person 527 158 21 54 1 528 159 20 33 0 0
person 230 172 23 51 0 0 0 0 0 0 0
person 254 166 30 56 0 0 0 0 0 0 0
person 297 169 23 54 0 0 0 0 0 0 0
person 15 196 21 43 1 15 196 1 43 0 0
person 179 171 26 58 1 187 171 12 57 0 0
person 140 173 19 53 0 0 0 0 0 0 0
person 5 185 20 52 1 5 185 1 52 0 0
people 328 177 25 47 0 0 0 0 0 0 0
12 changes: 12 additions & 0 deletions data/demo_pedestrian/annotations/set07_V000_I01829.txt
@@ -0,0 +1,12 @@
% bbGt version=3
person 7 188 19 64 1 7 188 17 64 0 0
person 432 137 96 239 0 0 0 0 0 0 0
person 191 142 108 208 0 0 0 0 0 0 0
person 388 138 66 213 1 388 140 65 211 0 0
person 423 177 29 63 1 451 177 1 63 0 0
person 317 184 34 62 0 0 0 0 0 0 0
person 81 147 64 130 0 0 0 0 0 0 0
person 224 147 53 196 1 270 147 1 196 0 0
person 435 159 87 202 1 481 159 2 202 0 0
people 46 190 38 60 0 0 0 0 0 0 0
person 384 188 15 36 0 0 0 0 0 0 0
3 changes: 3 additions & 0 deletions data/demo_pedestrian/annotations/set07_V002_I01379.txt
@@ -0,0 +1,3 @@
% bbGt version=3
person 475 215 36 78 0 0 0 0 0 0 0
person 507 214 33 69 0 0 0 0 0 0 0
17 changes: 17 additions & 0 deletions data/demo_pedestrian/annotations/set08_V001_I02559.txt
@@ -0,0 +1,17 @@
% bbGt version=3
person 107 217 25 62 0 0 0 0 0 0 0
person 128 220 28 71 0 0 0 0 0 0 0
person 253 219 32 83 0 0 0 0 0 0 0
person 295 218 33 88 2 0 0 0 0 0 0
person 423 223 36 77 0 0 0 0 0 0 0
person 160 220 34 66 0 0 0 0 0 0 0
person 300 218 57 145 0 0 0 0 0 0 0
person 14 217 39 103 0 0 0 0 0 0 0
people 447 217 77 57 0 0 0 0 0 0 0
person 193 244 25 48 0 0 0 0 0 0 0
person 336 224 46 102 0 0 0 0 0 0 0
person 530 218 23 46 1 0 0 0 0 0 0
people 563 219 76 86 0 0 0 0 0 0 0
people 223 221 29 45 0 0 0 0 0 0 0
person 64 217 22 46 0 0 0 0 0 0 0
person 80 217 21 46 0 0 0 0 0 0 0
4 changes: 4 additions & 0 deletions data/demo_pedestrian/annotations/set09_V000_I00899.txt
@@ -0,0 +1,4 @@
% bbGt version=3
person 502 154 37 98 0 0 0 0 0 0 0
person 483 162 23 77 0 0 0 0 0 0 0
person 436 158 37 87 1 439 159 33 80 0 0
3 changes: 3 additions & 0 deletions data/demo_pedestrian/annotations/set09_V000_I00939.txt
@@ -0,0 +1,3 @@
% bbGt version=3
person 67 219 39 104 0 0 0 0 0 0 0
person 105 221 36 84 0 0 0 0 0 0 0
1 change: 1 addition & 0 deletions data/demo_pedestrian/annotations/set10_V001_I01159.txt
@@ -0,0 +1 @@
% bbGt version=3
Binary file added data/demo_pedestrian/color/set06_V002_I00779.jpg
Binary file added data/demo_pedestrian/color/set07_V000_I01829.jpg
Binary file added data/demo_pedestrian/color/set07_V002_I01379.jpg
Binary file added data/demo_pedestrian/color/set08_V001_I02559.jpg
Binary file added data/demo_pedestrian/color/set09_V000_I00899.jpg
Binary file added data/demo_pedestrian/color/set09_V000_I00939.jpg
Binary file added data/demo_pedestrian/color/set10_V001_I01159.jpg
Binary file added examples/caltech_result_1.png
Binary file added examples/caltech_result_2.png
Binary file added examples/caltech_result_3.png
Binary file added examples/caltech_result_4.png
Binary file added examples/fusion_models.png
Binary file added examples/kaist_result_1.png
Binary file added examples/kaist_result_2.png