Yen-Chia Hsu, Ting-Hao (Kenneth) Huang, Ting-Yao Hu, Paul Dille, Sean Prendi, Ry

![This figure shows different types of videos (high-opacity smoke, low-opacity smoke, steam, and steam with smoke).](back-end/data/dataset/2020-02-24/smoke-type.gif)

The following figures show how the [I3D model](https://arxiv.org/abs/1705.07750) recognizes industrial smoke. The heatmaps (red and yellow areas on top of the images) indicate where the model thinks smoke emissions are present. The examples come from the testing set and show camera views that the model never saw during training. These visualizations were generated using the [Grad-CAM](https://arxiv.org/abs/1610.02391) technique. The x-axis indicates time.

![Example of the smoke recognition result.](back-end/data/dataset/2020-02-24/0-1-2019-01-17-6007-928-6509-1430-180-180-3906-1547732890-1547733065-grad-cam.png)

```sh
sudo screen -X quit
# Keep looking at the screen log
tail -f screenlog.0
```
Process and save all videos into RGB frames (under deep-smoke-machine/back-end/data/rgb/) and optical flow frames (under deep-smoke-machine/back-end/data/flow/). Because computing optical flow takes a very long time, by default this script only processes RGB frames. If you need the optical flow frames, set flow_type to 1 in the [process_videos.py](back-end/www/process_videos.py) script; a sketch of this change follows.
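For reference, here is a sketch of that change (the exact place where flow_type appears inside process_videos.py may differ):
```python
# In back-end/www/process_videos.py (sketch, not the full script):
flow_type = 1  # 1 = also compute optical flow frames; the default processes RGB frames only
```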
```sh
python process_videos.py
```
```sh
python extract_features.py i3d-flow ../data/saved_i3d/af00751-i3d-flow/model/300
sh bg.sh python extract_features.py i3d-rgb
sh bg.sh python extract_features.py i3d-flow
```
Train the model with cross-validation on all dataset splits, using different hyper-parameters. The model will be trained on the training set and validated on the validation set. Pretrained weights are obtained from the [pytorch-i3d repository](https://github.com/piergiaj/pytorch-i3d). By default, information about the trained I3D model will be placed in the deep-smoke-machine/back-end/data/saved_i3d/ folder. For a description of the models, please refer to our technical report. Note that PyTorch [DistributedDataParallel](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) GPU parallel computing is enabled by default (see [i3d_learner.py](back-end/www/i3d_learner.py)); a rough sketch of this pattern follows.
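As a hedged illustration (a hypothetical helper, not the actual code in this repository), the core DistributedDataParallel setup typically looks like this:
```python
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_for_ddp(model, rank, world_size):
    # Hypothetical helper; assumes MASTER_ADDR and MASTER_PORT are set
    # in the environment so that each process can join the group.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    model = model.to(rank)  # move the model to this process's GPU
    return DDP(model, device_ids=[rank])
```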
```sh
python train.py [method] [optional_model_path]
```
Recommended training strategy:
5. Repeat steps 2, 3, and 4 until convergence

# <a name="code-structure"></a>Code infrastructure
This section explains the code infrastructure related to the I3D model training and testing in the [deep-smoke-machine/back-end/www/](back-end/www/) folder. Later in this section, I will describe how to build your own model and integrate it with the current pipeline. This code assumes that you are familiar with the [PyTorch deep learning framework](https://pytorch.org/). If you do not know PyTorch, I recommend checking [their tutorial page](https://pytorch.org/tutorials/) first.
- [base_learner.py](back-end/www/base_learner.py)
  - The abstract class for creating model learners. You will need to implement the fit and test functions (see the sketch after this list). This script provides shared functions, such as model loading, model saving, data augmentation, and progress logging.
- [i3d_learner.py](back-end/www/i3d_learner.py)
  - This script inherits the base_learner.py script for training the I3D models. This script contains code for back-propagation (e.g., loss function, learning rate scheduler, video batch loading) and GPU parallel computing (PyTorch DistributedDataParallel).
- [check_models.py](back-end/www/check_models.py)
- Check if a developed model runs in simple cases. This script is used for debugging when developing new models.
- [smoke_video_dataset.py](back-end/www/smoke_video_dataset.py)
  - Definition of the dataset. This script inherits the PyTorch Dataset class for creating the DataLoader, which can be used to provide batches iteratively when training the models.
- [opencv_functional.py](back-end/www/opencv_functional.py)
- A special utility function that mimics [torchvision.transforms.functional](https://pytorch.org/docs/stable/_modules/torchvision/transforms/functional.html), designed for processing video frames and augmenting video data.
- [video_transforms.py](back-end/www/video_transforms.py)
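If you want to add a new model, a minimal sketch of a learner that follows this structure might look like the following (the class name, import path, and method signatures are illustrative assumptions, not the actual API):
```python
from base_learner import BaseLearner  # assumed import path and class name

class MyModelLearner(BaseLearner):
    """Hypothetical learner for a custom smoke-recognition model."""

    def fit(self):
        # Build the model, iterate over training batches,
        # validate on the validation set, and save checkpoints.
        pass

    def test(self):
        # Load a trained model and evaluate it on the test set.
        pass
```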
If you want to develop your own model, here are the steps that I recommend.
We are working on a pipeline for recognizing smoke emissions using existing camera data, and we will release our best pre-trained models here.

# <a name="dataset"></a>Dataset
We include our publicly released dataset (a snapshot of the [smoke labeling tool](http://smoke.createlab.org/) on 2/24/2020) [metadata_02242020.json](back-end/data/dataset/2020-02-24/metadata_02242020.json) file under the deep-smoke-machine/back-end/data/dataset/ folder. The JSON file contains an array, with each element in the array representing the metadata for a video. Each element is a dictionary with keys and values, explained below:
- camera_id
- ID of the camera (0 means [clairton1](http://mon.createlab.org/#v=3703.5,970,0.61,pts&t=456.42&ps=25&d=2020-04-06&s=clairton1&bt=20200406&et=20200407), 1 means [braddock1](http://mon.createlab.org/#v=2868.5,740.5,0.61,pts&t=540.67&ps=25&d=2020-04-07&s=braddock1&bt=20200407&et=20200408), and 2 means [westmifflin1](http://mon.createlab.org/#v=1722.89321,1348.42994,0.806,pts&t=704.33&ps=25&d=2020-04-07&s=westmifflin1&bt=20200407&et=20200408))
- view_id
Expand All @@ -326,11 +326,11 @@ We include our publicly released dataset (a snapshot of the [smoke labeling tool
  - The format of the file_name is [camera_id]-[view_id]-[year]-[month]-[day]-[bound_left]-[bound_top]-[bound_right]-[bound_bottom]-[video_height]-[video_width]-[start_frame_number]-[start_epoch_time]-[end_epoch_time] (see the parsing sketch after this list)
  - bound_left, bound_top, bound_right, and bound_bottom indicate the bounding box of the video clip in the panorama
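For example, a small sketch (hypothetical helper) that splits a file name into these components, since every field is numeric:
```python
def parse_file_name(file_name):
    """Split a clip file name into its components, following the format above."""
    keys = ["camera_id", "view_id", "year", "month", "day",
            "bound_left", "bound_top", "bound_right", "bound_bottom",
            "video_height", "video_width", "start_frame_number",
            "start_epoch_time", "end_epoch_time"]
    return dict(zip(keys, (int(v) for v in file_name.split("-"))))

# One of the clips linked below:
print(parse_file_name(
    "0-7-2019-06-24-3504-1067-4125-1688-180-180-9722-1561410615-1561410790"))
```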

Note that the url_root and url_part point to videos with 180 by 180 resolution. We also provide a higher-resolution (320 by 320) version of the videos. Replace the "/180/" with "/320/" in the url_root, and also replace the "-180-180-" with "-320-320-" in the url_part. For example, see the following URLs and the loading sketch after them:
- URL for the 180 by 180 version: https://smoke.createlab.org/videos/180/2019-06-24/0-7/0-7-2019-06-24-3504-1067-4125-1688-180-180-9722-1561410615-1561410790.mp4
- URL for the 320 by 320 version: https://smoke.createlab.org/videos/320/2019-06-24/0-7/0-7-2019-06-24-3504-1067-4125-1688-320-320-9722-1561410615-1561410790.mp4
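A short sketch of loading the metadata and deriving both URL versions, assuming the full URL is the concatenation of url_root and url_part:
```python
import json

# Path follows the repository layout described above.
with open("back-end/data/dataset/2020-02-24/metadata_02242020.json") as f:
    videos = json.load(f)

for v in videos:
    url_180 = v["url_root"] + v["url_part"]  # assumed concatenation
    url_320 = (v["url_root"].replace("/180/", "/320/")
               + v["url_part"].replace("-180-180-", "-320-320-"))
```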

Each video is reviewed by at least two citizen science volunteers (or one researcher who received the [smoke reading training](https://www.eta-is-opacity.com/resources/method-9/)). Our technical report describes the details of the labeling and quality control mechanism. The state of the label (label_state and label_state_admin) in the metadata_02242020.json is briefly explained below.
- 23 : strong positive
- Two volunteers both agree (or one researcher says) that the video has smoke.
- 16 : strong negative
- -1 : no data, no discord
  - No data. If label_state_admin is -1, it means that the label is produced solely by citizen science volunteers. If label_state is -1, it means that the label is produced solely by researchers. Otherwise, the label is jointly produced by both citizen science volunteers and researchers. Please refer to our technical report for details about these three cases.

After running the [split_metadata.py](back-end/www/split_metadata.py) script, the "label_state" and "label_state_admin" keys in the dictionary will be aggregated into the final label, represented by the new "label" key (see the JSON files in the generated deep-smoke-machine/back-end/data/split/ folder). Positive (value 1) and negative (value 0) labels indicate whether the video clip has smoke emissions.
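For illustration, a simplified sketch of such an aggregation that keeps only the strong labels listed above, assuming researcher labels take precedence (the actual rules in split_metadata.py also cover the other label states):
```python
def aggregate_label(label_state, label_state_admin):
    """Simplified sketch: 23 = strong positive, 16 = strong negative."""
    for state in (label_state_admin, label_state):  # assumed precedence
        if state == 23:
            return 1  # the video clip has smoke
        if state == 16:
            return 0  # the video clip has no smoke
    return None  # not a strong label under this simplified rule
```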

Also, the dataset will be divided into several splits, based on camera views or dates. The file names (without the ".json" file extension) are listed below. The splits S<sub>0</sub>, S<sub>1</sub>, S<sub>2</sub>, S<sub>3</sub>, S<sub>4</sub>, and S<sub>5</sub> correspond to the ones indicated in the technical report.

The following shows the split of S<sub>3</sub> by time sequence, where the farth
- Testing set of S<sub>3</sub>
- 2019-02-03, 2019-02-04, 2019-03-14, 2019-04-01, 2019-04-07, 2019-04-09, 2019-05-15, 2019-06-24, 2019-07-26, 2019-08-11

The dataset contains 12,567 clips with 19 distinct views from cameras on three sites that monitored three different industrial facilities. The clips are from 30 days that span four seasons across two years, all in the daytime. The following provides examples and the distribution of labels for each camera view, with the format [camera_id]-[view_id]:

![This figure shows a part of the dataset.](back-end/data/dataset/2020-02-24/dataset_1.png)
