Audio preprocessing #76

predestination · 2019-09-17T08:29:35Z

Hey, what audio preprocessing steps can be used to improve transcript quality? Is there a Python library for denoising or audio enhancement that doesn't use deep learning (it takes a lot of time even for a small audio clip)?

tonanhngo · 2019-09-18T22:09:10Z

Hi, if you expect most of your input to be noisy or unusual in some way (for example speaker accent or background noise), then it's better to train a custom acoustic model on that type of audio. IBM Debater uses this approach and was able to reduce the error rate to ~5%. If you only have a few audio clips and want to do noise reduction, I did a quick search and found a few options:

https://pypi.org/project/noisereduce/
https://pypi.org/project/logmmse/
https://docs.scipy.org/doc/scipy/reference/tutorial/signal.html
But it appears you would need the right reference noise audio to process against.
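
For the scipy option, one approach that does not need a reference noise clip is a plain band-pass filter that keeps only the rough speech band. The snippet below is a minimal sketch, not something tested against this model: the function name `bandpass_speech`, the 300-3400 Hz cutoffs, and the filter order are illustrative assumptions, and it may or may not help transcript quality on your audio.

```python
import numpy as np
from scipy import signal


def bandpass_speech(samples, sample_rate, low_hz=300.0, high_hz=3400.0, order=4):
    """Attenuate content outside an assumed speech band (cutoffs are illustrative)."""
    # Design a Butterworth band-pass as second-order sections for numerical stability.
    sos = signal.butter(
        order, [low_hz, high_hz], btype="bandpass", fs=sample_rate, output="sos"
    )
    # Forward-backward filtering gives zero phase shift.
    return signal.sosfiltfilt(sos, samples)


# Synthetic demo: a 1 kHz tone (inside the band) plus 50 Hz hum (outside the band).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
hum = 0.5 * np.sin(2 * np.pi * 50 * t)
filtered = bandpass_speech(tone + hum, sample_rate)
```

For a real clip you would load the samples first (for example with `scipy.io.wavfile.read`), convert them to float, filter, and write the result back out before sending it for transcription. This only removes noise outside the chosen band, so broadband noise that overlaps speech frequencies will remain.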

predestination · 2019-09-19T07:45:10Z

Thank you for the reply. I tried noisereduce and logmmse earlier, but they didn't improve the transcript quality. I will check scipy.signal.
