ASR-Accuracy-Tool 🔈

🎙️ A powerful Flask-based web application that uses the latest Hugging Face ASR models to provide real-time speech-to-text (STT) transcripts, with an intuitive user interface for easy correction. It is well suited to improving the quality of training datasets for ASR models, building NLP applications driven by accurate text data, and much more.

Screenshots 🎥 of Application

Home Page - Shows a simple form where you choose the directory that contains your audio files. This directory may itself contain further subdirectories. Both relative and absolute paths are accepted.
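As a rough illustration of how a directory with nested subdirectories could be scanned, here is a minimal sketch using only the standard library. The function name `find_audio_files` and the `AUDIO_EXTENSIONS` set are assumptions made for this example and are not taken from the project's code; the formats the tool actually supports may differ.

```python
from pathlib import Path

# Assumed set of extensions for illustration; the tool's real list may differ.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def find_audio_files(directory):
    """Return audio files under `directory`, including nested subdirectories."""
    # expanduser() and resolve() turn a relative path into an absolute one,
    # so both kinds of input end up being handled the same way.
    root = Path(directory).expanduser().resolve()
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {root}")
    # rglob("*") walks the whole tree below root.
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )
```

Calling `find_audio_files("recordings")` or `find_audio_files("/data/recordings")` would return absolute `Path` objects in both cases, which keeps later processing independent of the working directory.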

Processing Page - A dynamic, real-time page backed by a Celery background task. It is updated every 10 seconds with new transcriptions (if any are available) and shows overall progress based on the total number of segments. It also contains an editable column that can be used for corrections, and it lets the user listen to the complete audio while transcripts continue to be generated.
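The progress figure described above can be thought of as finished segments divided by total segments. The sketch below shows one way a polling response could be assembled; the function `build_progress` and the field names in the returned dictionary are hypothetical and are not taken from the project's code.

```python
def build_progress(total_segments, transcripts):
    """Summarise progress for one polling response.

    `transcripts` maps a segment index to the text of each segment
    that has finished so far (hypothetical data shape).
    """
    done = len(transcripts)
    # Guard against division by zero when nothing has been queued yet.
    percent = round(100 * done / total_segments, 1) if total_segments else 0.0
    # Rows are ordered by segment index so the table stays stable between polls.
    rows = [
        {"segment": index, "transcript": transcripts[index]}
        for index in sorted(transcripts)
    ]
    return {"done": done, "total": total_segments, "percent": percent, "rows": rows}
```

A page that polls every 10 seconds could render `percent` as a progress bar and append any rows it has not shown yet to the editable table.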

🎬 Video Demo Coming Soon...

Features:

Real-time audio-to-text conversion using state-of-the-art ASR models from Hugging Face
User-friendly interface for reviewing and correcting transcripts
Seamless integration with Hugging Face's model hub for easy model selection and updates
Export corrected transcripts in common formats for training and analysis
Built with scalability in mind for handling large datasets
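To make the export feature more concrete, here is a minimal sketch that writes corrected transcripts as CSV. CSV is only an assumed example of a "common format", and the function `transcripts_to_csv` and the column names `audio_file`, `transcript`, and `corrected` are hypothetical rather than taken from the project.

```python
import csv
import io

# Hypothetical column names chosen for this example.
FIELDNAMES = ["audio_file", "transcript", "corrected"]


def transcripts_to_csv(rows):
    """Serialise a list of row dictionaries to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        # Missing keys are written as empty cells; the csv module quotes
        # values that contain commas or quotes.
        writer.writerow(row)
    return buffer.getvalue()
```

Keeping the model output and the human correction in separate columns would make it possible to measure how often the model was wrong, as well as to train on the corrected text.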

Why Use It:

Enhance the accuracy of your ASR models by easily creating high-quality training datasets. Correct and fine-tune ASR transcripts with ease, all powered by cutting-edge Hugging Face models.

Stay Updated: ⭐

🔍 Stay tuned for regular updates as we incorporate the latest advancements in ASR technology!

To-Do Improvements 🚧

This project is open to the community, and you are welcome to join me. I am primarily focusing on the following improvements.

Add custom models for speech recognition
Add support for macOS and Windows platforms
Optimize memory by sharing resources instead of keeping a separate model instance for each concurrent instance inside Celery
Add support for more audio extensions
Add automatic setup and configuration scripts that make the tool more robust to changes
Improve this documentation

Other contributions are also welcome. They will have slightly lower priority, but thank you for your input.

Contributions Welcome:

👩‍💻 Contributions from the community are welcome to make this tool even more powerful and accessible to everyone. Join me in building a better world of ASR use cases!
