This project focuses on two main parts: training a model to recognize American Sign Language (ASL) gestures using a dataset, and predicting gestures through a camera feed while converting predictions into speech.
The `model.py` script is responsible for training a convolutional neural network (CNN) model to recognize ASL gestures. The key steps include:
- Importing necessary libraries.
- Loading and preprocessing the ASL gesture dataset.
- Defining and compiling the CNN model architecture.
- Training the model using the preprocessed dataset.
- Saving the trained model as `smnist.h5`.
To train the model, execute:
python model.py
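The steps above can be sketched as follows. The layer sizes, the 25-unit output (Sign-MNIST-style labels run 0-24 and skip 9/J, since J requires motion), and the synthetic placeholder data are assumptions for illustration; the actual `model.py` loads and preprocesses the Kaggle dataset instead.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Assumption: 25 output units cover labels 0-24 (J is skipped, Z is omitted,
# as both letters require motion); adjust to the dataset's label range.
NUM_CLASSES = 25

def build_model():
    """Build and compile a small CNN for 28x28 grayscale gesture images."""
    model = Sequential([
        Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation="relu"),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation="relu"),
        Dropout(0.2),
        Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    # Placeholder data standing in for the preprocessed Kaggle images
    # (pixel values scaled to [0, 1]); replace with the real dataset.
    x = np.random.rand(64, 28, 28, 1).astype("float32")
    y = np.random.randint(0, NUM_CLASSES, size=64)
    model = build_model()
    model.fit(x, y, epochs=1, batch_size=16, verbose=0)
    model.save("smnist.h5")
```
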
The `prediction.py` script predicts ASL gestures in real time from a camera feed and converts the predictions into speech using the Google Text-to-Speech (gTTS) library. The script's steps are as follows:
- Importing Required Libraries: Import the libraries the script needs.
- Loading the Pre-trained CNN Model: Load the pre-trained CNN model (`smnist.h5`) for gesture recognition.
- Initializing the Camera Feed: Open the camera to capture video frames.
- Capturing and Processing Video Frames: Continuously capture and preprocess video frames from the camera.
- Gesture Prediction: Press the Space key to predict the ASL gesture; the predicted gesture is displayed on the screen.
- Closing the Window: Press the ESC key to close the video feed window.
- Converting Predictions into Speech: Use the gTTS library to convert predictions into speech.
- Playing the Generated Speech: Play the generated speech with an appropriate audio player.
To predict and convert ASL gestures, run the following command:
python prediction.py
- Python (version 3.x)
- Required libraries (install using pip):
pip install numpy pandas matplotlib seaborn keras scikit-learn opencv-python mediapipe gTTS
This project makes use of the Mediapipe library for hand gesture recognition and the Google Text-to-Speech (gTTS) library for speech synthesis. The ASL gesture dataset used for this project was obtained from Kaggle and serves as the primary dataset for training and testing the ASL gesture recognition model.
This project is licensed under the MIT License. Refer to the `LICENSE` file for more information.
Feel free to contribute and adapt this project to enhance accessibility and communication for individuals using American Sign Language.