Flask API that receives ogg audio files and returns the text transcription

rmazzine/AudioTranscriberAPI

Audio Transcriber API

This is a simple Flask API that accepts base64-encoded OGG audio (other audio formats may also work), converts it to WAV using librosa and soundfile, transcribes it with Vosk, and returns the transcribed text.
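
The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the 16 kHz sample rate, and the `model_path` argument are assumptions made for the example. The librosa, soundfile, and Vosk imports sit inside the functions so the decoding helper can be used without them.

```python
import base64
import io
import json
import wave


def decode_audio_payload(payload: dict) -> bytes:
    """Return the raw audio bytes stored as base64 under the "data" key."""
    return base64.b64decode(payload["data"])


def audio_bytes_to_wav(audio_bytes: bytes, sample_rate: int = 16000) -> bytes:
    """Decode the audio, resample it to mono, and return 16-bit PCM WAV bytes.

    The 16 kHz rate is an assumption; use whatever the chosen model expects.
    """
    import librosa
    import soundfile as sf

    samples, _ = librosa.load(io.BytesIO(audio_bytes), sr=sample_rate, mono=True)
    buffer = io.BytesIO()
    sf.write(buffer, samples, sample_rate, format="WAV", subtype="PCM_16")
    return buffer.getvalue()


def transcribe_wav(wav_bytes: bytes, model_path: str) -> str:
    """Feed the WAV frames to a Vosk recognizer and return the final text."""
    from vosk import KaldiRecognizer, Model

    model = Model(model_path)  # model_path: folder of an unpacked Vosk model
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        recognizer = KaldiRecognizer(model, wav_file.getframerate())
        while True:
            frames = wav_file.readframes(4000)
            if not frames:
                break
            recognizer.AcceptWaveform(frames)
    # FinalResult() returns a JSON string; the transcript is under "text".
    return json.loads(recognizer.FinalResult()).get("text", "")
```

Loading the model once at startup, rather than on every call as in this sketch, would likely be preferable in a real server.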

The code currently uses the Brazilian Portuguese model, but it can easily be switched to another Vosk model (https://alphacephei.com/vosk/models).
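
One possible way to make the model swappable without editing code is to read its folder from an environment variable. This is only a sketch: the variable name `VOSK_MODEL_PATH` and the default folder `model` are hypothetical and do not come from this repository.

```python
import os


def resolve_model_path(default: str = "model") -> str:
    """Return the model folder from VOSK_MODEL_PATH (hypothetical), else the default."""
    return os.environ.get("VOSK_MODEL_PATH", default)


def load_model(path=None):
    """Load a Vosk model from an unpacked model folder."""
    from vosk import Model  # imported lazily so the helper above works without Vosk

    return Model(path or resolve_model_path())
```

With this in place, pointing the variable at the folder of a different downloaded model would be enough to change the transcription language.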

How to run (development env)

Install packages

pip install -r requirements

Go to the Flask API folder

cd ./flaskapp

Start the Flask server (http://localhost:5000)

flask run

How to run (production env)

Instead of running the Flask development server, use the gunicorn WSGI HTTP server

gunicorn -w 1 --bind 0.0.0.0:3800 wsgi

Create docker image

To create a docker image, build it with:

docker build -t audiotranscriberapi .

Then run it, publishing the required port

docker run -p 3800:3800 audiotranscriberapi

How to use

An API client such as Postman is recommended.

In Headers: include the key Content-Type with the value application/json, since the base64 audio data is sent as JSON.

In Body: create a JSON object whose data key holds the base64 audio data, for example:

{
  "data": "BASE64DATA"
}

Finally, in the URL field, select the POST method and send the JSON to http://localhost:5000/transcribe (use port 3800 instead when the API runs under gunicorn or Docker as described above).

If successful, it returns a JSON response with code 200 and the transcribed text in data.
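
The same request can also be sent from a script instead of Postman. The sketch below uses only the Python standard library; the helper names and the file name `voice_note.ogg` are illustrative assumptions, while the URL, the header, and the data key follow the description above.

```python
import base64
import json
import urllib.request

API_URL = "http://localhost:5000/transcribe"


def build_request(audio_bytes: bytes, url: str = API_URL) -> urllib.request.Request:
    """Wrap the audio as base64 in a JSON body and prepare a POST request."""
    body = json.dumps({"data": base64.b64encode(audio_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def transcribe_file(path: str, url: str = API_URL) -> str:
    """Send an audio file to the API and return the text found under "data"."""
    with open(path, "rb") as audio_file:
        request = build_request(audio_file.read(), url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]


if __name__ == "__main__":
    # voice_note.ogg is a placeholder; the server must already be running.
    print(transcribe_file("voice_note.ogg"))
```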
