Perform Optical Character Recognition (OCR) on images using the Tesseract OCR engine with OpenCV preprocessing.
This project provides a Python script that runs OCR on images with the Tesseract OCR engine, using OpenCV to preprocess each image first. It extracts text from images in various formats, can display the processed image with bounding boxes around the recognized text, and can save the output as a text file.
- Preprocessing: Uses OpenCV to preprocess the input image for better OCR accuracy.
- Text Extraction: Extracts text from images using the Tesseract OCR engine.
- Output Formatting: Displays the recognized text and optionally the processed image with bounding boxes around recognized text.
- Output Saving: Saves the extracted text to a text file for further analysis or use.
- Live Camera OCR: Implement functionality to enable real-time OCR as an option.
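The features above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: it assumes the script talks to Tesseract through the pytesseract wrapper, and the function names (preprocess, word_boxes, run_ocr), the Otsu thresholding step, the 60% confidence cutoff, and the output.txt default are all hypothetical choices made for the example.

```python
def preprocess(image_path):
    """Load an image and return it with a binarized grayscale copy for OCR."""
    import cv2  # imported lazily so word_boxes() works without OpenCV

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # With THRESH_OTSU the threshold is chosen automatically; the 0 is ignored.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return image, binary


def word_boxes(data, min_conf=60.0):
    """Turn Tesseract's word-level table into (x, y, w, h, text) tuples.

    `data` is the dict returned by pytesseract.image_to_data(..., Output.DICT).
    The min_conf cutoff is an assumption for this sketch.
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        # Rows that are not words (blocks, lines) carry empty text and conf -1.
        if text.strip() and float(data["conf"][i]) >= min_conf:
            boxes.append((data["left"][i], data["top"][i],
                          data["width"][i], data["height"][i], text))
    return boxes


def run_ocr(image_path, output_path="output.txt"):
    """Preprocess, extract text, draw boxes, and save the text to a file."""
    import cv2
    import pytesseract
    from pytesseract import Output

    image, binary = preprocess(image_path)
    text = pytesseract.image_to_string(binary)
    data = pytesseract.image_to_data(binary, output_type=Output.DICT)
    for x, y, w, h, _ in word_boxes(data):
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(text)
    return text, image
```

Calling run_ocr("sample.png") would return the recognized text and an annotated image that can be shown with cv2.imshow; the file name here is only a placeholder.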
- Clone the repository:
git clone https://github.com/real0x0a1/ocr-opencv.git
- Install dependencies:
pip3 install -r requirements.txt
- Run the script main.py:
python main.py
- Follow the prompts to provide the path to the image file and select options for display and output.
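If the script fails to start, one thing worth checking is whether the Tesseract program itself is installed, since Python wrappers such as pytesseract only call an existing tesseract executable and pip does not install it. The check below is a small sketch under that assumption; the helper name tesseract_available is hypothetical and not part of this project.

```python
import shutil


def tesseract_available(executable="tesseract"):
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("Tesseract found on PATH.")
    else:
        print("Tesseract not found; install it with your system package manager.")
```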
Contributions are welcome! Fork the repository and submit a pull request.
Please open an issue on the GitHub repository for any bugs or feature requests.
real0x0a1 (Ali)