An attention-LSTM-based OCR model, with an API and a simple web page that let users upload cropped images and receive a JSON response.

abhisharsinha/OCR_project

HOW TO RUN


# Python 3.7.5
pip install -r requirements.txt

To generate annotations for an image directory (writes one "image_filename predicted_text" line per image to the output file):
./my_run.sh <Image_Dir> <Output_File_Path>
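
The output file can be read back with a few lines of Python. The following is a minimal sketch, not part of the repository: the helper name `parse_predictions` is hypothetical, and it assumes that image filenames contain no spaces and that a single space separates the filename from the predicted text.

```python
from typing import Dict, Iterable


def parse_predictions(lines: Iterable[str]) -> Dict[str, str]:
    """Map image filename -> predicted text.

    Assumes each line looks like "image_filename predicted_text" and that
    filenames contain no spaces (an assumption, not confirmed by the repo).
    """
    predictions = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first space only, since predicted text may contain spaces.
        filename, _, text = line.partition(" ")
        predictions[filename] = text
    return predictions


# Illustrative sample lines (made up for this example).
sample = ["img_001.png hello world\n", "img_002.png 9876543210\n", "img_003.png\n"]
result = parse_predictions(sample)
```

To read a real output file, pass an open file object instead of the sample list, for example `parse_predictions(open(path, encoding="utf-8"))`.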

To run the web app:

  1. Using default model:
    python app.py ./exported-model/default-model
  2. Using modified model:
    python app.py ./exported-model/custom-model

PROJECT STRUCTURE


├── attention-ocr ---> library to train and test the OCR model
├── checkpoints ---> model checkpoints at different training steps
├── datasets ---> training and test data
├── exported-model ---> models exported in SavedModel format
├── history ---> loss vs. steps data and plot, and model config for every training run
├── static
│   └── index.html ---> web page where users can upload images and get a JSON response
├── test_logs ---> predicted vs. actual texts for each test run
├── text_renderer ---> library to generate text images
├── utility_scripts
│   ├── create_annotate_data.py ---> converts the annotation file generated by text_renderer into the format used by attention-ocr
│   ├── crop_backgrounds.py ---> splits an image into a 6x6 grid and saves each part
│   ├── gen_email_mob.py ---> generates fake email addresses and mobile numbers
│   ├── get_alphabet.py ---> builds the alphabet set by traversing the labels
│   ├── merge_tfrecords.py ---> merges multiple TFRecord files into a single file
│   ├── only_english.py ---> removes non-English and non-numeric labels from annotations
│   └── testing_inception.ipynb ---> used to find the index of the mixed5 layer in the layers list of the Keras InceptionV3 model
├── app.py ---> Flask web app that lets users upload text images and returns a JSON response
├── daily_logs.md ---> daily log of completed tasks
├── instructions.txt ---> instructions to run the app
├── make_predictions.py ---> generates predictions in "<img_name> <predicted_text>" format
├── my_run.sh ---> runs make_predictions.py
├── requirements.txt ---> requirements to run the web app
└── train_model.sh ---> used to train the model
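
As an illustration of the 6x6 grid split described for crop_backgrounds.py, the sketch below computes the crop boxes for an image of a given size. It is not the script's actual code: the function name `grid_boxes` is hypothetical, and how the script treats sizes that do not divide evenly by 6 is an assumption (here the integer edges are chosen so the tiles cover the whole image). Boxes use the (left, upper, right, lower) order that Pillow's `Image.crop` accepts.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower)


def grid_boxes(width: int, height: int, rows: int = 6, cols: int = 6) -> List[Box]:
    """Return crop boxes in row-major order that tile a width x height image."""
    boxes = []
    for r in range(rows):
        # Integer edges; the last row always ends exactly at `height`.
        upper = r * height // rows
        lower = (r + 1) * height // rows
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            boxes.append((left, upper, right, lower))
    return boxes


boxes = grid_boxes(600, 300)  # 36 boxes, each 100 x 50 pixels
```

With Pillow installed, each part could then be saved with something like `image.crop(box).save(...)`; the output naming scheme is left open because the README does not describe it.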
