shakib1729/urban_sound_frontend

Sound Classification using Images - Urban Sound Classification


This repository contains the frontend of the web application, implemented in ReactJS.
Backend (Flask): https://github.com/shakib1729/urban_sound_backend
Jupyter Notebooks: https://github.com/shakib1729/urban-sound-cnn-jupyter

About the project:

The goal of this project is to classify urban sounds into ten classes ('air_conditioner', 'car_horn', 'children_playing', 'dog_bark', 'drilling', 'engine_idling', 'gun_shot', 'jackhammer', 'siren', 'street_music').
Each sound is first converted into a visual representation, and a CNN classifier is then applied to the resulting image.
These visual representations of sounds are called spectrograms.
In a spectrogram plot, one axis represents time, the other axis represents frequency, and the color represents the magnitude (amplitude) of each frequency at a particular time.
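
To make the time/frequency/magnitude description concrete, here is a minimal sketch that computes a spectrogram with a short-time Fourier transform in plain NumPy. It is not the project's code (the project loads audio with Librosa and saves plots with Matplotlib); the function name `spectrogram_db`, the frame length, the hop size, and the synthetic 1000 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np


def spectrogram_db(signal, frame_len=256, hop=128):
    """Return a (frequency_bins, time_frames) magnitude spectrogram in dB.

    frame_len and hop are illustrative defaults, not values from the project.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames (one row per frame).
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps the non-negative frequencies; transpose so rows are frequency.
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    # Convert to decibels; the small constant avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10)


# Synthetic input (assumption): one second of a 1000 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram_db(tone)
# The row with the highest average energy corresponds to the tone's frequency.
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / 256
print(spec.shape, peak_hz)
```

Each column of `spec` is one time frame and each row is one frequency bin, so plotting the array as an image with a color map gives the kind of picture described above.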

For example, the spectrogram of a 'siren' sound:

[Image: spectrogram of a 'siren' sound]

The spectrogram of a 'jackhammer' sound:

[Image: spectrogram of a 'jackhammer' sound]

Using these spectrogram images, we classify the sounds.

Dataset:

UrbanSound8k dataset: https://urbansounddataset.weebly.com
It contains 8732 labeled sound excerpts (<=4s) of urban sounds from 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, and street_music.

Libraries/Frameworks used:

  1. Librosa - To load the sound files.
  2. Matplotlib - To save the spectrogram of the audio files.
  3. NumPy - For array manipulation.
  4. Keras - To build the convolutional neural network and to load the image files.
  5. Scikit-learn - To split the dataset into training and test sets, and to analyze the performance of the model using sklearn.metrics.
  6. Python Imaging Library (PIL) - The load_img() function of Keras loads an image into PIL format.
  7. TensorFlow - Keras uses TensorFlow as its backend.
  8. Flask - To deploy the CNN model.

  React - To build the frontend of the web application.


Implementation:

  1. Load the audio files using librosa and save their spectrograms using matplotlib.
  2. Create the 'X' and 'Y' training datasets by loading the saved spectrograms using Keras.
  3. Build and train the CNN model using the training dataset. After training, save the model.
  4. Build an API for the CNN model using Flask, which serves as the backend of the web application.
  5. Build the frontend of the web application using ReactJS, which calls the API created with Flask.
  6. Deploy the web application on Heroku.
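
Step 2 above can be sketched as follows. This is only an illustration under stated assumptions: the project loads real spectrogram images with Keras and splits them with Scikit-learn, whereas this sketch uses placeholder zero arrays and a NumPy-only shuffle as a stand-in. The helper names `one_hot` and `shuffled_split`, the 64x64 image size, the one-hot label encoding, and the 80/20 split are hypothetical and do not come from the repository.

```python
import numpy as np

# The ten class names listed in this README.
CLASSES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
    "engine_idling", "gun_shot", "jackhammer", "siren", "street_music",
]
CLASS_TO_INDEX = {name: i for i, name in enumerate(CLASSES)}


def one_hot(labels):
    """Map class names to one-hot rows of length len(CLASSES)."""
    indices = np.array([CLASS_TO_INDEX[name] for name in labels])
    return np.eye(len(CLASSES), dtype=np.float32)[indices]


def shuffled_split(X, Y, test_fraction=0.2, seed=0):
    """NumPy-only stand-in for a train/test split (ratio is an assumption)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], Y[train_idx], Y[test_idx]


# Placeholder data (assumption): 50 blank 64x64 RGB "spectrogram images".
X = np.zeros((50, 64, 64, 3), dtype=np.float32)
labels = [CLASSES[i % len(CLASSES)] for i in range(50)]
Y = one_hot(labels)

X_train, X_test, Y_train, Y_test = shuffled_split(X, Y)
print(X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)
```

In the real pipeline, `X` would hold the pixel arrays of the saved spectrograms and `labels` would come from the dataset's metadata.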

Results:

  1. Model with 2 dropout layers (dropout rates 0.5 and 0.5) and learning_rate=0.001 (default value):
    Train Accuracy: 64.6%
    Test Accuracy: 59.9%
    [Plot: accuracy on the training and validation datasets over training epochs]

    (Dropout layers are active only during training, so the training accuracy shown in this plot differs from the training accuracy measured with dropout disabled.)

  2. Model with 1 dropout layer (dropout rate 0.5) and learning_rate=0.001 (default value):
    Train Accuracy: 97%
    Test Accuracy: 79.4%
    [Plot: accuracy on the training and validation datasets over training epochs]



  3. Model with 2 dropout layers (dropout rates 0.5 and 0.5) and learning_rate=0.0005:
    Train Accuracy: 90.19%
    Test Accuracy: 80.2%
    [Plot: accuracy on the training and validation datasets over training epochs]



  4. Model with 2 dropout layers (dropout rates 0.3 and 0.5) and learning_rate=0.0005:
    Train Accuracy: 96.4%
    Test Accuracy: 83.3%
    [Plot: accuracy on the training and validation datasets over training epochs]
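
The note under the first result says that dropout is active only during training. The sketch below illustrates why that changes the numbers seen during training, using the common "inverted dropout" formulation in NumPy. It is a standalone illustration, not the project's Keras model; the function name `dropout` and the toy activations are assumptions.

```python
import numpy as np


def dropout(activations, rate, training, rng):
    """Inverted dropout: randomly zero units in training, identity otherwise."""
    if not training:
        # At evaluation time every unit is kept and nothing is rescaled.
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    # Scale the surviving units so the expected activation stays the same.
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
activations = np.ones(1000)  # toy layer output (assumption)

train_out = dropout(activations, rate=0.5, training=True, rng=rng)
eval_out = dropout(activations, rate=0.5, training=False, rng=rng)

print("zeroed during training:", int((train_out == 0).sum()))
print("zeroed during evaluation:", int((eval_out == 0).sum()))
```

Because roughly half of the units are silenced on every training step with a rate of 0.5, accuracy logged during training is measured on a handicapped network, while evaluation uses the full network.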


