This repository contains the frontend of the web application, implemented in ReactJS.
Backend using Flask: https://github.com/shakib1729/urban_sound_backend
Jupyter Notebooks: https://github.com/shakib1729/urban-sound-cnn-jupyter
Deployed on Heroku: https://urban-sound.herokuapp.com/
The goal of this project is to classify urban sounds into ten classes ('air_conditioner', 'car_horn', 'children_playing', 'dog_bark', 'drilling', 'engine_idling', 'gun_shot', 'jackhammer', 'siren', 'street_music').
The project first converts each sound into a visual representation and then classifies that image with a convolutional neural network (CNN).
These visual representations of sounds are called spectrograms.
In a spectrogram plot, one axis represents time, the other axis represents frequency, and the color represents the magnitude (amplitude) of each frequency at a particular time.
For example, the spectrogram of a 'siren' sound is:
Using these spectrogram images, we classify the sounds.
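To make the axes described above concrete, here is a minimal sketch of a magnitude spectrogram computed with NumPy only. It is not the project's own code (the project uses Librosa and Matplotlib for this step); the function name `spectrogram`, the frame length, the hop size, and the synthetic test tone are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Return a magnitude spectrogram of shape (frequency_bins, time_frames).

    Hypothetical helper: frame_len and hop are illustrative defaults,
    not values taken from the project.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames (one row per frame)
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real signal;
    # transpose so rows are frequencies and columns are time steps
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Synthetic input (assumption): 1 second of a 1000 Hz sine sampled at 8000 Hz
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)
# Frequency bin with the largest average magnitude across all time frames
peak_hz = freqs[spec.mean(axis=1).argmax()]
print(spec.shape, peak_hz)
```

Each column of `spec` is one time step and each row is one frequency bin, so plotting `spec` as an image with a color map gives the kind of picture described above. For the pure tone used here, the brightest row is the one at about 1000 Hz.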
UrbanSound8k dataset: https://urbansounddataset.weebly.com
It contains 8732 labeled sound excerpts (<=4s) of urban sounds from 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, and street_music.
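A CNN classifier needs these class names as numeric targets. The sketch below shows one common way to do that, mapping each name to an index and then to a one-hot vector. The alphabetical index order and the names `CLASS_NAMES`, `CLASS_TO_INDEX`, and `one_hot` are assumptions for illustration; the project's notebooks may encode labels differently.

```python
import numpy as np

# The ten classes listed above, in alphabetical order (assumed index order)
CLASS_NAMES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark",
    "drilling", "engine_idling", "gun_shot", "jackhammer",
    "siren", "street_music",
]
CLASS_TO_INDEX = {name: i for i, name in enumerate(CLASS_NAMES)}

def one_hot(labels):
    """Convert a list of class names to a (n_samples, 10) one-hot array."""
    indices = np.array([CLASS_TO_INDEX[label] for label in labels])
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(len(CLASS_NAMES), dtype=np.float32)[indices]

y = one_hot(["siren", "dog_bark"])
print(y.shape)            # one row per label, one column per class
print(y.argmax(axis=1))   # recovers the class indices
```

Taking `argmax` over the model's output vector and indexing into `CLASS_NAMES` reverses the mapping, which is how a predicted probability vector can be turned back into a class name.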
- Librosa - To load the sound files.
- Matplotlib - To save the spectrogram of the audio files.
- NumPy - For array manipulation.
- Keras - To build the Convolutional neural network and to load the image files.
- Scikit-learn - To split the dataset into training and testing sets and to analyze the performance of the model using sklearn.metrics.
- Python Imaging Library (PIL) - The load_img() function of Keras loads an image into PIL format.
- TensorFlow - Keras uses TensorFlow for its backend.
- Flask - To deploy the CNN model.
- Load the audio files using librosa and save their spectrograms using matplotlib.
- Create the 'X' and 'Y' training datasets by loading the saved spectrograms using Keras.
- Build and train the CNN model using the training dataset. After training, save the model.
- Build API for the CNN Model using Flask which will serve as the backend of the web application.
- Build the frontend part of the web application using ReactJS which calls the API created using Flask.
- Deploy the web application on Heroku.
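As an illustration of the step that prepares 'X' and 'Y' before training, the sketch below shuffles a dataset and splits it into training and testing parts. The project uses Scikit-learn for this; the NumPy-only function here is a stand-in, and its name `split_dataset`, the 20% test fraction, the 64x64 image size, and the random placeholder data are all assumptions.

```python
import numpy as np

def split_dataset(X, Y, test_fraction=0.2, seed=0):
    """Shuffle X and Y together and split them into train and test parts.

    Hypothetical stand-in for scikit-learn's train_test_split;
    test_fraction and seed are illustrative values.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))          # one shared shuffle for X and Y
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], Y[train_idx], Y[test_idx]

# Placeholder data (assumption): 50 fake 64x64 RGB "spectrogram images"
# with one-hot labels over 10 classes
X = np.random.default_rng(1).random((50, 64, 64, 3), dtype=np.float32)
Y = np.eye(10, dtype=np.float32)[np.arange(50) % 10]

X_train, X_test, Y_train, Y_test = split_dataset(X, Y)
print(X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
```

Using a single permutation for both arrays keeps every image paired with its own label, and a fixed seed makes the split reproducible between runs.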
- Model containing 2 dropout layers (dropout rates of 0.5 and 0.5 respectively) and learning_rate=0.001 (default value):
Train Accuracy: 64.6%
Test Accuracy: 59.9%
The plot of accuracy on the training and validation datasets over training epochs:
(This plot shows the accuracies recorded during training, when the dropout layers are active only for the training dataset. The training accuracy in the plot therefore differs from the value measured with dropout disabled.)
- Model containing 1 dropout layer (dropout rate of 0.5) and learning_rate=0.001 (default value):
Train Accuracy: 97%
Test Accuracy: 79.4%
The plot of accuracy on the training and validation datasets over training epochs:
- Model containing 2 dropout layers (dropout rates of 0.5 and 0.5 respectively) and learning_rate=0.0005:
Train Accuracy: 90.19%
Test Accuracy: 80.2%
The plot of accuracy on the training and validation datasets over training epochs:
- Model containing 2 dropout layers (dropout rates of 0.3 and 0.5 respectively) and learning_rate=0.0005:
Train Accuracy: 96.4%
Test Accuracy: 83.3%
The plot of accuracy on the training and validation datasets over training epochs:
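The note under the first model says that dropout is active only during training, which is why training accuracy in the plots can look lower than the final reported value. The sketch below illustrates that behaviour with a small NumPy function. It is not the Keras implementation; the function name `dropout` and the input values are assumptions, though the rescaling by `1 / (1 - rate)` follows the usual "inverted dropout" scheme.

```python
import numpy as np

def dropout(activations, rate, training, rng):
    """Illustrative inverted dropout (hypothetical helper, not Keras code)."""
    if not training:
        # Evaluation: every unit is kept and the values pass through unchanged
        return activations
    # Training: zero each unit with probability `rate`, then rescale the
    # survivors so the expected activation stays the same
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones((1000, 100))

train_out = dropout(x, rate=0.5, training=True, rng=rng)
eval_out = dropout(x, rate=0.5, training=False, rng=rng)

print(np.unique(train_out))   # only zeros and rescaled values remain
print(train_out.mean())       # close to the original mean of 1.0
print(np.array_equal(eval_out, x))
```

During training the network sees a noisy, partly zeroed version of its activations, so accuracy measured in that mode is typically lower than accuracy measured on the same data with dropout switched off.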