Speech Commands Recognition

Train deep learning models on the Google Speech Commands Dataset, implemented in PyTorch.

Features

Training and testing basic ConvNets and TDNNs.
Standard Train, Test, Valid folders for the Google Speech Commands Dataset v0.02.
Dataset loader for standard Kaldi speech data folders (files and pipes).
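To illustrate what "files and pipes" means for Kaldi data folders, here is a minimal sketch of parsing `wav.scp` entries. In the Kaldi convention, each line holds an utterance ID followed by either a path to an audio file or a shell command ending in `|` whose output is the audio. The function names and example paths below are hypothetical and are not taken from this repository's loader, which may work differently.

```python
from typing import Dict, Tuple


def parse_wav_scp_line(line: str) -> Tuple[str, str, bool]:
    """Split one wav.scp line into (utterance_id, source, is_pipe)."""
    utt_id, source = line.strip().split(maxsplit=1)
    # Kaldi marks a command whose stdout is the audio with a trailing "|".
    is_pipe = source.endswith("|")
    if is_pipe:
        source = source[:-1].strip()
    return utt_id, source, is_pipe


def read_wav_scp(text: str) -> Dict[str, Tuple[str, bool]]:
    """Parse the contents of a wav.scp file into {utt_id: (source, is_pipe)}."""
    entries = {}
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            utt_id, source, is_pipe = parse_wav_scp_line(line)
            entries[utt_id] = (source, is_pipe)
    return entries


# Hypothetical example content: one plain file entry and one pipe entry.
example = """utt1 data/audio/yes_001.wav
utt2 sox data/audio/no_002.wav -t wav - speed 0.9 |
"""
entries = read_wav_scp(example)
for utt_id, (source, is_pipe) in entries.items():
    print(utt_id, "pipe" if is_pipe else "file", source)
```

A loader built on this would open file entries directly and run pipe entries as subprocesses, reading the audio from their standard output.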

Requirements

Python 3.6+
PyTorch
SoX

To install SoX on Mac with Homebrew:

brew install sox

On Linux (Debian/Ubuntu):

sudo apt-get install sox
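After installing, it can help to confirm that SoX is visible on the PATH before running anything that depends on it. The snippet below is a small sketch using only the Python standard library; the helper name `find_executable` is made up for this example.

```python
import shutil
from typing import Optional


def find_executable(name: str = "sox") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


sox_path = find_executable("sox")
if sox_path:
    print("SoX found at", sox_path)
else:
    print("SoX not found on PATH; install it with one of the commands above")
```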

Usage

Google Speech Commands Dataset (v0.02)

To download and extract the Google Speech Commands Dataset, run the following command:

./download_audio.sh

Training

Use python3 run.py --help for more parameters and options.

python3 run.py --arc VGG16 --checkpoint VGG16 --num_workers 10
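When launching several training runs from a script, the command above can be assembled programmatically. This is only a sketch: `build_train_command` is a hypothetical helper, and it uses only the flags that appear in this README (`--arc`, `--checkpoint`, `--num_workers`, `--train_path`). Check `python3 run.py --help` for the full list.

```python
import shlex
from typing import List, Optional


def build_train_command(arc: str, checkpoint: str, num_workers: int,
                        train_path: Optional[str] = None) -> List[str]:
    """Build the argument list for run.py from the flags shown in this README."""
    cmd = ["python3", "run.py",
           "--arc", arc,
           "--checkpoint", checkpoint,
           "--num_workers", str(num_workers)]
    if train_path is not None:
        cmd += ["--train_path", train_path]
    return cmd


cmd = build_train_command("VGG16", "VGG16", 10)
# Quote each argument so the printed line can be pasted into a shell.
printable = " ".join(shlex.quote(arg) for arg in cmd)
print(printable)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.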

Results (Isolated word recognition, Speech Commands v0.02, 36 words)

Accuracy on the validation and test sets using the default parameters (VGG16) and with speed-perturbation data augmentation (VGG16 + sp):

Model	Valid acc.	Test acc.	parameters and options
VGG16	96.3%	96.4%	default
VGG16 + sp	96.6%	96.7%	--train_path data/train_training_sp

The augmented training dataset train_training_sp is a speed-perturbed version of the train_training dataset. It was obtained using the Kaldi script perturb_data_dir_speed_3way.sh.
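As a rough illustration of what 3-way speed perturbation does to a Kaldi data folder, the sketch below expands each `wav.scp` entry into three: the original plus copies at speed factors 0.9 and 1.1, produced on the fly by the SoX `speed` effect. The `sp0.9-` / `sp1.1-` ID prefixes and the exact command layout are assumptions meant to approximate Kaldi's convention, not a copy of the script, and `perturb_wav_scp` is a hypothetical name. The real script also updates the other files in the data folder (for example utt2spk), which is not shown here.

```python
from typing import Dict, Sequence

# Assumed factors for 3-way perturbation: slower, original, faster.
SPEED_FACTORS = (0.9, 1.0, 1.1)


def perturb_wav_scp(wav_scp: Dict[str, str],
                    factors: Sequence[float] = SPEED_FACTORS) -> Dict[str, str]:
    """Return a new {utt_id: source} map with speed-perturbed pipe entries added."""
    out = {}
    for utt_id, source in wav_scp.items():
        for factor in factors:
            if factor == 1.0:
                out[utt_id] = source  # keep the original entry unchanged
                continue
            new_id = f"sp{factor}-{utt_id}"
            if source.rstrip().endswith("|"):
                # Already a pipe: append another SoX stage reading from stdin.
                out[new_id] = f"{source.rstrip()} sox -t wav - -t wav - speed {factor} |"
            else:
                # Plain file: let SoX read it and write WAV data to stdout.
                out[new_id] = f"sox -t wav {source} -t wav - speed {factor} |"
    return out


# Hypothetical single-utterance example.
perturbed = perturb_wav_scp({"utt1": "data/audio/yes_001.wav"})
for utt_id, source in perturbed.items():
    print(utt_id, source)
```

Because the perturbed entries are pipes, no extra audio is written to disk; the loader runs SoX each time an utterance is read.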
