Handwritten text is translated to the audio output of several languages using Deep learning and Google Translate

BhargavMaganti/Text-to-speech

Text_to_Speech

A combination of Deep Learning and Google Translate to convert handwritten text to audio output.

This project takes documented and handwritten text as input and produces translated audio output in any of 108 languages. The backbone of the project is the handwritten text detection model, which is trained using transfer learning on ResNet50.

The training data combines a dataset available on Kaggle with the MNIST dataset, and all images were resized to (32, 32). The model was trained on a total of 442,451 images.
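The resizing step described above can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing code: the function name `preprocess`, the division by 255, and the replication of the grayscale channel three times (so that the array matches the three-channel input ResNet50 expects) are all assumptions.

```python
TARGET_SIZE = (32, 32)  # (width, height), the size stated in this README


def preprocess(image):
    """Return a float32 array of shape (32, 32, 3) from a grayscale or BGR image.

    Hypothetical helper: the scaling and channel handling are assumptions.
    """
    # Imported lazily so the module can be loaded without OpenCV installed.
    import cv2
    import numpy as np

    # Collapse colour images to a single channel first.
    if image.ndim == 3:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    resized = cv2.resize(image, TARGET_SIZE)

    # Assumption: pixel values are scaled to [0, 1] before training.
    scaled = resized.astype(np.float32) / 255.0

    # Repeat the single channel three times to obtain an RGB-shaped array.
    return np.repeat(scaled[..., np.newaxis], 3, axis=-1)
```

With this sketch, a 28x28 MNIST digit and a larger scanned character both end up with the same (32, 32, 3) shape, so the two data sources can be stacked into one training array.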

The model was trained for 50 epochs with the SGD optimizer, reaching a training accuracy of 96.53% and a validation accuracy of 96.81%.
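A transfer-learning setup of the kind described above could look roughly like this in Keras. Only the ResNet50 backbone, the SGD optimizer, the 50 epochs, and the 32x32 input size come from this README. The classification head, the learning rate, the momentum, the loss, and the names `build_model`, `num_classes`, and `freeze_base` are assumptions made for illustration.

```python
EPOCHS = 50  # number of epochs stated in this README


def build_model(num_classes, input_shape=(32, 32, 3), freeze_base=True,
                weights="imagenet"):
    """Build a ResNet50-based character classifier (illustrative sketch)."""
    # Imported lazily so the module can be loaded without TensorFlow installed.
    import tensorflow as tf

    # ResNet50 without its ImageNet classification head; 32x32 is the
    # smallest spatial input size the Keras implementation accepts.
    base = tf.keras.applications.ResNet50(
        include_top=False, weights=weights, input_shape=input_shape
    )
    base.trainable = not freeze_base

    # Assumption: a simple pooling + softmax head on top of the backbone.
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

    # Assumption: learning rate, momentum, and loss are not given in the README.
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training would then be a call such as `model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=EPOCHS)`, where the array names are placeholders.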

The classification report for every character:

The model was trained using TensorFlow 2.1.0 and OpenCV 4.2.0.

The trained model file is available at https://github.com/sanskar-hasija/Text_to_Speech/blob/main/Trained%20Model/model.h5
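Once downloaded, the `model.h5` file can be loaded with Keras and used to classify a single character image. The sketch below is hedged: the label set (digits followed by uppercase letters) and its ordering are assumptions, since this README does not list the classes, and `decode_prediction` and `predict_character` are hypothetical helper names.

```python
import string

# Assumption: 36 classes, digits 0-9 followed by letters A-Z, in this order.
CLASS_LABELS = string.digits + string.ascii_uppercase


def decode_prediction(probabilities, labels=CLASS_LABELS):
    """Map a vector of class probabilities to (label, confidence)."""
    if len(probabilities) != len(labels):
        raise ValueError("probability vector and label set differ in length")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], float(probabilities[best])


def predict_character(model_path, batch):
    """Load the saved model and classify the first image in `batch`.

    `batch` is expected to have shape (n, 32, 32, 3).
    """
    # Imported lazily so the module can be loaded without TensorFlow installed.
    import tensorflow as tf

    model = tf.keras.models.load_model(model_path)
    probabilities = model.predict(batch)[0]
    return decode_prediction(probabilities)
```

If the saved model uses a different class order, only `CLASS_LABELS` needs to change.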

In addition, documented text is extracted with the Pytesseract library and then translated. One example of documented-text detection is as follows:

The translated output for the above image is:
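The documented-text path (OCR, then translation, then audio) can be sketched as a small pipeline. Only Pytesseract and Google Translate are named in this README. The use of gTTS for speech synthesis is an assumption, and the translation step is left as a caller-supplied function because the README does not say which Google Translate client is used. All function names here are hypothetical.

```python
def tesseract_ocr(image_path):
    """Extract text from an image file with Pytesseract."""
    # Imported lazily so the module can be loaded without these packages.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path))


def gtts_speak(text, lang, out_path):
    """Save `text` as spoken audio. Assumption: gTTS is the speech backend."""
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(out_path)
    return out_path


def document_to_speech(image_path, target_lang, translate,
                       ocr=tesseract_ocr, speak=gtts_speak,
                       out_path="output.mp3"):
    """Run OCR -> translation -> speech and return the audio file path.

    `translate` is a callable (text, target_lang) -> translated text,
    supplied by the caller.
    """
    text = ocr(image_path).strip()
    if not text:
        raise ValueError("no text was recognised in the image")
    translated = translate(text, target_lang)
    return speak(translated, target_lang, out_path)
```

Passing the OCR, translation, and speech steps as functions keeps each stage replaceable, for example to swap in the handwritten-text model in place of Pytesseract.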
