Provide more training information #805

Open: wants to merge 21 commits into `master`
3 changes: 1 addition & 2 deletions .gitmodules
@@ -1,4 +1,3 @@
[submodule "caffe-fast-rcnn"]
path = caffe-fast-rcnn
url = https://github.com/rbgirshick/caffe-fast-rcnn.git
branch = fast-rcnn
url = https://github.com/denny1108/caffe-fast-rcnn.git
227 changes: 28 additions & 199 deletions README.md
@@ -1,217 +1,46 @@
### Disclaimer
### Multispectral Deep Neural Networks for Pedestrian Detection
Edited by Jingjing Liu, Rutgers University.

The official Faster R-CNN code (written in MATLAB) is available [here](https://github.com/ShaoqingRen/faster_rcnn).
If your goal is to reproduce the results in our NIPS 2015 paper, please use the [official code](https://github.com/ShaoqingRen/faster_rcnn).
<img src="examples/fusion_models.png" width="900px" height="250px"/>

This repository contains a Python *reimplementation* of the MATLAB code.
This Python implementation is built on a fork of [Fast R-CNN](https://github.com/rbgirshick/fast-rcnn).
There are slight differences between the two implementations.
In particular, this Python port
- is ~10% slower at test-time, because some operations execute on the CPU in Python layers (e.g., 220ms / image vs. 200ms / image for VGG16)
- gives similar, but not exactly the same, mAP as the MATLAB version
- is *not compatible* with models trained using the MATLAB code due to the minor implementation differences
- **includes approximate joint training** that is 1.5x faster than alternating optimization (for VGG16) -- see these [slides](https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0) for more information
Code used to reproduce the results in our paper [Multispectral deep neural networks for pedestrian detection](http://paul.rutgers.edu/~jl1322/papers/BMVC16_liu.pdf) by Jingjing Liu, Shaoting Zhang, Shu Wang, and Dimitris N. Metaxas, BMVC 2016. [[project link]](http://paul.rutgers.edu/~jl1322/multispectral.htm).

# *Faster* R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
This repository is a fork of the [official py-faster-rcnn code](https://github.com/rbgirshick/py-faster-rcnn), written by Ross Girshick. For how to install the required software (e.g., Caffe and pycaffe) and set up the code with the right configuration, please refer to the original [README.md](https://github.com/rbgirshick/py-faster-rcnn/blob/master/README.md).

By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun (Microsoft Research)
### Download pretrained models
[VGG16 model on caltech](https://drive.google.com/open?id=0ByrJI3mShdW6WVBxQldmdnE2S2s) trained on the Caltech pedestrian dataset.

This Python implementation contains contributions from Sean Bell (Cornell) written during an MSR internship.
[VGG16 model on kaist (RGB input)](https://drive.google.com/open?id=0ByrJI3mShdW6LWNqT0tYQ3JteW8) trained on the KAIST pedestrian dataset.

Please see the official [README.md](https://github.com/ShaoqingRen/faster_rcnn/blob/master/README.md) for more details.
[VGG16 model on kaist (multispectral input)](https://drive.google.com/open?id=0ByrJI3mShdW6R3R1dkE4QlNQUUk) trained on the KAIST multispectral dataset.

Faster R-CNN was initially described in an [arXiv tech report](http://arxiv.org/abs/1506.01497) and was subsequently published in NIPS 2015.
Save these models to `models/caltech/VGG16/`, `models/kaist/VGG16/`, and `models/kaist_fusion/VGG16/`, respectively.
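Since the Google Drive downloads above are manual, a small helper can create the three target directories and report which models are still missing. This is only an illustrative sketch (the function name and layout are not part of the repository):

```python
import os

# Expected locations for the three pretrained VGG16 models,
# relative to the repository root.
MODEL_DIRS = {
    "caltech": "models/caltech/VGG16",
    "kaist": "models/kaist/VGG16",
    "kaist_fusion": "models/kaist_fusion/VGG16",
}

def prepare_model_dirs(repo_root):
    """Create the model directories and return the names of models
    whose directory does not yet contain a .caffemodel file."""
    missing = []
    for name, rel in MODEL_DIRS.items():
        d = os.path.join(repo_root, rel)
        os.makedirs(d, exist_ok=True)
        if not any(f.endswith(".caffemodel") for f in os.listdir(d)):
            missing.append(name)
    return missing
```

Running it once before and once after copying a downloaded `.caffemodel` into place shows which directories remain empty.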

### License

Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).

### Citing Faster R-CNN

If you find Faster R-CNN useful in your research, please consider citing:

    @inproceedings{renNIPS15fasterrcnn,
        Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
        Title = {Faster {R-CNN}: Towards Real-Time Object Detection
                 with Region Proposal Networks},
        Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
        Year = {2015}
    }

### Contents
1. [Requirements: software](#requirements-software)
2. [Requirements: hardware](#requirements-hardware)
3. [Basic installation](#installation-sufficient-for-the-demo)
4. [Demo](#demo)
5. [Beyond the demo: training and testing](#beyond-the-demo-installation-for-training-and-testing-models)
6. [Usage](#usage)

### Requirements: software

1. Requirements for `Caffe` and `pycaffe` (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html))

**Note:** Caffe *must* be built with support for Python layers!

```make
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1
```

You can download my [Makefile.config](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/Makefile.config) for reference.
2. Python packages you might not have: `cython`, `python-opencv`, `easydict`
3. [Optional] MATLAB is required for **official** PASCAL VOC evaluation only. The code now includes unofficial Python evaluation code.

### Requirements: hardware

1. For training smaller networks (ZF, VGG_CNN_M_1024) a good GPU (e.g., Titan, K20, K40, ...) with at least 3G of memory suffices
2. For training Fast R-CNN with VGG16, you'll need a K40 (~11G of memory)
3. For training the end-to-end version of Faster R-CNN with VGG16, 3G of GPU memory is sufficient (using CUDNN)

### Installation (sufficient for the demo)

1. Clone the Faster R-CNN repository
```Shell
# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
```

2. We'll call the directory that you cloned Faster R-CNN into `FRCN_ROOT`

*Ignore notes 1 and 2 if you followed step 1 above.*

**Note 1:** If you didn't clone Faster R-CNN with the `--recursive` flag, then you'll need to manually clone the `caffe-fast-rcnn` submodule:
```Shell
git submodule update --init --recursive
```
**Note 2:** The `caffe-fast-rcnn` submodule needs to be on the `faster-rcnn` branch (or equivalent detached state). This will happen automatically *if you followed step 1 instructions*.

3. Build the Cython modules
```Shell
cd $FRCN_ROOT/lib
make
```

4. Build Caffe and pycaffe
```Shell
cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html

# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
```

5. Download pre-computed Faster R-CNN detectors
```Shell
cd $FRCN_ROOT
./data/scripts/fetch_faster_rcnn_models.sh
```

This will populate the `$FRCN_ROOT/data` folder with `faster_rcnn_models`. See `data/README.md` for details.
These models were trained on VOC 2007 trainval.

### Demo
### Run demos
Run `sh ./run_demo.sh caltech` for images from Caltech.

*After successfully completing [basic installation](#installation-sufficient-for-the-demo)*, you'll be ready to run the demo.
Run `sh ./run_demo.sh kaist-color` for images from Kaist.

To run the demo
```Shell
cd $FRCN_ROOT
./tools/demo.py
```
The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.

### Beyond the demo: installation for training and testing models
1. Download the training, validation, test data and VOCdevkit

```Shell
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```

2. Extract all of these tars into one directory named `VOCdevkit`

```Shell
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```

3. It should have this basic structure

```Shell
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories ...
```

4. Create symlinks for the PASCAL VOC dataset

```Shell
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012
6. [Optional] If you want to use COCO, please see some notes under `data/README.md`
7. Follow the next sections to download pre-trained ImageNet models

### Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the two networks described in the paper: ZF and VGG16.

```Shell
cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh
```
VGG16 comes from the [Caffe Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo), but is provided here for your convenience.
ZF was trained at MSRA.

### Usage

To train and test a Faster R-CNN detector using the **alternating optimization** algorithm from our NIPS 2015 paper, use `experiments/scripts/faster_rcnn_alt_opt.sh`.
Output is written underneath `$FRCN_ROOT/output`.

```Shell
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```
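The `--set ...` arguments are alternating KEY VALUE tokens folded into the global config. A simplified sketch of that merging, modeled loosely on `cfg_from_list` in `lib/fast_rcnn/config.py` (the plain-dict config here is illustrative, not the repository's actual `easydict` object):

```python
from ast import literal_eval

def set_cfg_options(cfg, opts):
    """Merge alternating KEY VALUE tokens (as passed via --set) into a
    nested config dict.  Dotted keys like TRAIN.SCALES descend into
    sub-dicts; values are parsed with literal_eval when possible."""
    assert len(opts) % 2 == 0, "--set expects KEY VALUE pairs"
    for key, raw in zip(opts[0::2], opts[1::2]):
        try:
            value = literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # keep as a plain string
        node = cfg
        parts = key.split(".")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return cfg
```

For example, `--set EXP_DIR seed_rng1701 RNG_SEED 1701` sets a string and an integer option respectively.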

("alt opt" refers to the alternating optimization training algorithm described in the NIPS paper.)
Run `sh ./run_demo.sh kaist-fusion` for multispectral images from Kaist.

To train and test a Faster R-CNN detector using the **approximate joint training** method, use `experiments/scripts/faster_rcnn_end2end.sh`.
Output is written underneath `$FRCN_ROOT/output`.
### Caltech results
<img src="examples/caltech_result_1.png" width="400px" height="400px"/> <img src="examples/caltech_result_2.png" width="400px" height="400px"/>

```Shell
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```

This method trains the RPN module jointly with the Fast R-CNN network, rather than alternating between training the two. It results in faster (~ 1.5x speedup) training times and similar detection accuracy. See these [slides](https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0) for more details.

Artifacts generated by the scripts in `tools` are written in this directory.
### KAIST results
<img src="examples/kaist_result_1.png" width="400px" height="400px"/> <img src="examples/kaist_result_2.png" width="400px" height="400px"/>

Trained Fast R-CNN networks are saved under:
### License

```
output/<experiment directory>/<dataset name>/
```
Our code is released under the MIT License (refer to the LICENSE file for details).

Test outputs are saved under:
### Citing our paper
If you find our work useful in your research, please consider citing:

```
output/<experiment directory>/<dataset name>/<network snapshot name>/
@article{liu2016multispectral,
  title={Multispectral deep neural networks for pedestrian detection},
  author={Liu, Jingjing and Zhang, Shaoting and Wang, Shu and Metaxas, Dimitris N},
  journal={arXiv preprint arXiv:1611.02644},
  year={2016}
}
```
2 changes: 1 addition & 1 deletion caffe-fast-rcnn
Submodule caffe-fast-rcnn updated 405 files
8 changes: 8 additions & 0 deletions data/demo_pedestrian/annotations/set06_V002_I00779.txt
@@ -0,0 +1,8 @@
% bbGt version=3
person 466 183 19 57 0 0 0 0 0 0 0
person 511 183 18 51 0 0 0 0 0 0 0
person 541 174 18 57 0 0 0 0 0 0 0
person 561 172 24 59 0 0 0 0 0 0 0
person 579 153 37 118 1 579 153 37 118 0 0
person 458 179 20 47 1 458 179 14 45 0 0
person 491 180 17 47 0 0 0 0 0 0 0
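These annotation files follow Piotr Dollár's bbGt (version 3) format: one object per line, with a label followed by eleven numbers — the bounding box (x, y, w, h), an occlusion flag, the visible box, an ignore flag, and an angle. A minimal parser sketch (the field names here are chosen for illustration):

```python
from collections import namedtuple

# Field order per the bbGt v3 convention: label, box, occlusion flag,
# visible box, ignore flag, angle.
BBox = namedtuple("BBox", "label x y w h occluded vx vy vw vh ignore angle")

def parse_bbgt(text):
    """Parse a bbGt v3 annotation file: a '% bbGt version=...' header
    followed by one object per line (label + 11 integers)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    assert lines[0].startswith("% bbGt version="), "missing bbGt header"
    boxes = []
    for line in lines[1:]:
        fields = line.split()
        label, nums = fields[0], [int(v) for v in fields[1:]]
        boxes.append(BBox(label, *nums))
    return boxes
```

Applied to the file above, this yields seven `person` boxes with their pixel coordinates.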
14 changes: 14 additions & 0 deletions data/demo_pedestrian/annotations/set07_V000_I01619.txt
@@ -0,0 +1,14 @@
% bbGt version=3
person 149 151 63 125 0 0 0 0 0 0 0
person 39 152 70 147 0 0 0 0 0 0 0
person 11 150 47 130 0 0 0 0 0 0 0
person 568 161 21 53 1 569 161 18 16 0 0
person 527 158 21 54 1 528 159 20 33 0 0
person 230 172 23 51 0 0 0 0 0 0 0
person 254 166 30 56 0 0 0 0 0 0 0
person 297 169 23 54 0 0 0 0 0 0 0
person 15 196 21 43 1 15 196 1 43 0 0
person 179 171 26 58 1 187 171 12 57 0 0
person 140 173 19 53 0 0 0 0 0 0 0
person 5 185 20 52 1 5 185 1 52 0 0
people 328 177 25 47 0 0 0 0 0 0 0
12 changes: 12 additions & 0 deletions data/demo_pedestrian/annotations/set07_V000_I01829.txt
@@ -0,0 +1,12 @@
% bbGt version=3
person 7 188 19 64 1 7 188 17 64 0 0
person 432 137 96 239 0 0 0 0 0 0 0
person 191 142 108 208 0 0 0 0 0 0 0
person 388 138 66 213 1 388 140 65 211 0 0
person 423 177 29 63 1 451 177 1 63 0 0
person 317 184 34 62 0 0 0 0 0 0 0
person 81 147 64 130 0 0 0 0 0 0 0
person 224 147 53 196 1 270 147 1 196 0 0
person 435 159 87 202 1 481 159 2 202 0 0
people 46 190 38 60 0 0 0 0 0 0 0
person 384 188 15 36 0 0 0 0 0 0 0
3 changes: 3 additions & 0 deletions data/demo_pedestrian/annotations/set07_V002_I01379.txt
@@ -0,0 +1,3 @@
% bbGt version=3
person 475 215 36 78 0 0 0 0 0 0 0
person 507 214 33 69 0 0 0 0 0 0 0
17 changes: 17 additions & 0 deletions data/demo_pedestrian/annotations/set08_V001_I02559.txt
@@ -0,0 +1,17 @@
% bbGt version=3
person 107 217 25 62 0 0 0 0 0 0 0
person 128 220 28 71 0 0 0 0 0 0 0
person 253 219 32 83 0 0 0 0 0 0 0
person 295 218 33 88 2 0 0 0 0 0 0
person 423 223 36 77 0 0 0 0 0 0 0
person 160 220 34 66 0 0 0 0 0 0 0
person 300 218 57 145 0 0 0 0 0 0 0
person 14 217 39 103 0 0 0 0 0 0 0
people 447 217 77 57 0 0 0 0 0 0 0
person 193 244 25 48 0 0 0 0 0 0 0
person 336 224 46 102 0 0 0 0 0 0 0
person 530 218 23 46 1 0 0 0 0 0 0
people 563 219 76 86 0 0 0 0 0 0 0
people 223 221 29 45 0 0 0 0 0 0 0
person 64 217 22 46 0 0 0 0 0 0 0
person 80 217 21 46 0 0 0 0 0 0 0
4 changes: 4 additions & 0 deletions data/demo_pedestrian/annotations/set09_V000_I00899.txt
@@ -0,0 +1,4 @@
% bbGt version=3
person 502 154 37 98 0 0 0 0 0 0 0
person 483 162 23 77 0 0 0 0 0 0 0
person 436 158 37 87 1 439 159 33 80 0 0
3 changes: 3 additions & 0 deletions data/demo_pedestrian/annotations/set09_V000_I00939.txt
@@ -0,0 +1,3 @@
% bbGt version=3
person 67 219 39 104 0 0 0 0 0 0 0
person 105 221 36 84 0 0 0 0 0 0 0
1 change: 1 addition & 0 deletions data/demo_pedestrian/annotations/set10_V001_I01159.txt
@@ -0,0 +1 @@
% bbGt version=3
Binary file added data/demo_pedestrian/color/set06_V002_I00779.jpg
Binary file added data/demo_pedestrian/color/set07_V000_I01829.jpg
Binary file added data/demo_pedestrian/color/set07_V002_I01379.jpg
Binary file added data/demo_pedestrian/color/set08_V001_I02559.jpg
Binary file added data/demo_pedestrian/color/set09_V000_I00899.jpg
Binary file added data/demo_pedestrian/color/set09_V000_I00939.jpg
Binary file added data/demo_pedestrian/color/set10_V001_I01159.jpg
Binary file added examples/caltech_result_1.png
Binary file added examples/caltech_result_2.png
Binary file added examples/caltech_result_3.png
Binary file added examples/caltech_result_4.png
Binary file added examples/fusion_models.png
Binary file added examples/kaist_result_1.png
Binary file added examples/kaist_result_2.png