A deep learning model that uses a CNN and an LSTM to recognize activities in video. (Intended for benchmarking hardware accelerators.)

suraj-maniyar/Activity-Recognition-From-Video

Activity-Recognition-From-Video

Original Paper

The original data used for training by the paper can be found here

The videos are taken from the UCF-101 dataset. In this implementation I have focused on recognizing 7 activities from

The base model used throughout the code is VGG-16. First, a classification model is trained on the dataset to classify individual frames without any temporal information. This model checkpoint is then used to initialize an LRCN model, which takes temporal information across frames into account.

Some sample images from the dataset are shown below:

Keras

The Keras_V1 code uses in-memory NumPy arrays and the model.fit() function for training. Although it achieves similar performance, the fit() method does not allow parallel processing for data loading. To run the code, extract the downloaded zip file into the Keras_V1/data folder.

The Keras_V2 code uses model.fit_generator(), which can load and pre-process data in parallel. It also allowed me to train on a larger dataset, because the script loads the data into memory batch by batch. Use the data provided in the following link as the training data; it has already been split into train, val and test sets. Extract the data into the data/ folder of the main repository. The PyTorch implementation uses the same data.
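The batch-wise loading described above can be illustrated with a plain Python generator. This is a sketch of the general pattern only, not the repository's generator: `batch_generator`, `load_fn`, and the `clip_*.avi` file names are hypothetical, and a real pipeline would decode frames and stack them into NumPy arrays instead of returning lists.

```python
import random
from itertools import islice


def batch_generator(samples, batch_size, load_fn, shuffle=True, seed=0):
    """Yield (inputs, labels) batches forever, loading each batch lazily.

    samples: list of (path, label) pairs (hypothetical layout).
    load_fn: callable that turns a path into a decoded sample.
    """
    rng = random.Random(seed)
    order = list(range(len(samples)))
    while True:  # generators used for training typically loop indefinitely
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            # Only the samples of the current batch are loaded into memory.
            inputs = [load_fn(samples[i][0]) for i in idx]
            labels = [samples[i][1] for i in idx]
            yield inputs, labels


# Usage with a stand-in loader that just upper-cases the path.
samples = [(f"clip_{i}.avi", i % 7) for i in range(10)]
gen = batch_generator(samples, batch_size=4,
                      load_fn=lambda path: path.upper(), shuffle=False)
first_epoch = list(islice(gen, 3))  # 10 samples -> batches of 4, 4 and 2
```

Because the generator never holds more than one batch, the dataset size is limited by disk space, not by memory.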

PyTorch

The classification model is end-to-end trainable.
The convolutional model is used to extract features from an image. Since I was unable to download the models directly from my HPC cluster, I saved the convolutional part of the pretrained torchvision models and restored it later in the code. The pretrained models can be downloaded from here. Extract the contents of the downloaded folder into the pretrained/ directory.

Results

References
