GitHub - Jenarthanan14/Unified-Voice-Embedding-Using-Multi-task-Learning

Hive-MTL: Unified Voice Embedding through Multi-task Learning.

Speech technology has been a rapidly evolving and highly demanded area over the past few decades, driven by progress in machine learning. The past decade in particular has brought major advances, including the introduction of conversational agents. In this work we describe a multi-task deep metric learning system that learns a single unified audio embedding, which can be used to power multiple audio- and speaker-specific tasks. The solution not only allows us to train for multiple application objectives in a single deep neural network architecture, but also takes advantage of correlated information in the combined training data from all applications to generate a unified embedding that outperforms all specialized embeddings previously deployed for each audio- or speaker-specific task.
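To make the idea of one embedding serving several objectives concrete, here is a minimal sketch of a multi-task metric-learning loss. It is an illustration under assumptions, not the repository's implementation: it uses the cosine-similarity triplet loss described in the Deep Speaker paper, the task names (`speaker_id`, `gender`) and weights are hypothetical, and the margin is an illustrative default.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge loss that is zero once the anchor-positive similarity
    exceeds the anchor-negative similarity by at least `margin`."""
    return max(0.0, cosine(anchor, negative) - cosine(anchor, positive) + margin)


def multi_task_loss(triplets_by_task, task_weights):
    """Weighted sum of the mean triplet loss of each task.

    All triplets are embeddings from the same shared network, so every
    task pulls on the same embedding space. Task names are hypothetical.
    """
    total = 0.0
    for task, triplets in triplets_by_task.items():
        if not triplets:
            continue
        mean_loss = sum(triplet_loss(a, p, n) for a, p, n in triplets) / len(triplets)
        total += task_weights.get(task, 1.0) * mean_loss
    return total
```

In a real training loop the embeddings would come from the shared network and the loss would be computed with TensorFlow ops so that gradients flow back through it; the plain-Python version above only shows how the per-task terms combine.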

Architecture Diagram

Getting started

Install dependencies

Requirements

tensorflow>=2.0
keras>=2.3.1
python>=3.6
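The minimum versions above can be checked programmatically before training. The helper below is a small sketch using only the standard library; the function names are hypothetical, and the parser only understands plain numeric versions (it ignores suffixes such as `rc1`), so a dedicated library like `packaging` would be more robust.

```python
import sys

# Minimum versions copied from the requirements list above.
MINIMUMS = {"tensorflow": "2.0", "keras": "2.3.1"}


def parse_version(text):
    """Turn '2.3.1' into (2, 3, 1), keeping only leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, so '2.10.0' counts as newer than '2.3.1'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def python_ok(version_info=sys.version_info):
    """True when the interpreter is Python 3.6 or newer."""
    return tuple(version_info[:2]) >= (3, 6)
```

For example, `meets_minimum(tensorflow.__version__, MINIMUMS["tensorflow"])` could be called once TensorFlow has been imported.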

pip install -r requirements.txt

If you see this error: libsndfile not found, run this: sudo apt-get install libsndfile-dev.

Training

The code for training is available in this repository.

sudo chmod -R 777 hive-mtl/ # Give write permission to hive-mtl
pip uninstall -y tensorflow && pip install tensorflow-gpu
./hive-mtl download_librispeech # Download Librispeech dataset
./hive-mtl build_mfcc
./hive-mtl build_model_inputs
./hive-mtl train_mtl
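The four `hive-mtl` steps above have to run in order, and a later step is pointless if an earlier one failed. A small wrapper along the following lines could chain them and stop at the first failure. This is a sketch: the subcommand names are taken from the commands above, while `build_commands` and `run_pipeline` are hypothetical helpers that are not part of the repository.

```python
import subprocess

# Subcommand names as listed in the training commands above.
PIPELINE_STEPS = [
    "download_librispeech",
    "build_mfcc",
    "build_model_inputs",
    "train_mtl",
]


def build_commands(executable="./hive-mtl", steps=PIPELINE_STEPS):
    """Return one argument list per pipeline step, in execution order."""
    return [[executable, step] for step in steps]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    `runner` defaults to subprocess.run; it can be swapped for a stub
    to dry-run the pipeline. Returns the names of the completed steps.
    """
    completed = []
    for command in commands:
        result = runner(command)
        if result.returncode != 0:
            raise RuntimeError(
                "step %r failed with exit code %d" % (command[-1], result.returncode)
            )
        completed.append(command[-1])
    return completed
```

Calling `run_pipeline(build_commands())` from the repository root would then execute the same sequence as the shell commands, excluding the permission and TensorFlow install lines.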

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Authors

Rajenthiran Jenarthanan
Lakshikka Sithamparanathan
Saranya Uthayakumar

See also the list of contributors who participated in this project.

References

Deep Speaker: an End-to-End Neural Speaker Embedding System, by Chao Li, Xiaokong Ma, Bing Jiang, Xiangang Li.

Acknowledgments

Ketharan Suntharam
Sathiyakugan Balakirshnan

