Speech-Classification

This repository contains the Machine Learning course project (2022).

Introduction

The goal of this project is to implement an audio classification system that:

reads in an audio clip,
and then predicts the class (label) of that clip.

Classes

The data is divided into four classes based on emotion:

anger, happiness, sadness, neutral

The data is also divided into two classes based on sex:

male, female

Method

Features: MFCCs (Mel-frequency cepstral coefficients), spectral centroid, spectral bandwidth, spectral rolloff, mel spectrogram, spectral contrast, and spectral flatness are computed from the raw audio using the librosa package.

Classifier: an SVM (Support Vector Machine) is used for gender classification, and a CNN (Convolutional Neural Network) is used for emotion classification.
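
As an illustration of the SVM step, the sketch below trains a gender classifier on a feature matrix with scikit-learn. The README does not say which library, kernel, or preprocessing the project uses, so scikit-learn, the RBF kernel, `C=1.0`, feature standardization, and the 80/20 split are assumptions, and the data here is synthetic rather than the project's voice data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def train_gender_svm(X: np.ndarray, y: np.ndarray, seed: int = 0):
    """Fit a scaled RBF-kernel SVM and return (model, held-out accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    # Standardize first: SVMs are sensitive to features on different scales.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


# Synthetic stand-in data: two Gaussian clusters, NOT the project's voice data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 10)), rng.normal(2.0, 1.0, (100, 10))])
y = np.array([0] * 100 + [1] * 100)  # 0 = male, 1 = female (arbitrary encoding)
model, test_acc = train_gender_svm(X, y)
```

With real data, `X` would hold one row of audio features per clip and `y` the corresponding labels.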

Result

Gender Classification

Train Acc.	Test Acc.	F1 Score
0.999	0.95440	0.9543
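
For reference, accuracy and F1 score can be computed from predictions as in the sketch below. The README does not state how the reported F1 was averaged, so the single-positive-class (binary) form shown here is an assumption, and the labels are toy values rather than project results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only.
y_true = ["male", "female", "female", "male", "female"]
y_pred = ["male", "female", "male", "male", "female"]
acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred, positive="female")
```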

Emotion Classification

Train Acc.	Test Acc.
0.601	0.4742

Download Data

Voice data can be downloaded from here.

Install dependencies

In a Python 3 virtual environment, run:

pip install -r requirements.txt
