Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO and PaddlePaddle. (The PaddleOCR models have been converted and are run with ONNXRuntime, which makes inference very fast.)

RapidAI/RapidOCR

Open source OCR for the security of the digital world
 

Introduction

💖 RapidOCR is, as far as we know, the fastest, most widely supported, fully open-source and free multi-platform, multi-language OCR toolkit, and it supports rapid offline deployment. It runs inference with the ONNXRuntime engine, which is 4~5 times faster than the PaddlePaddle inference engine and has no memory leak problem.

Supported Languages: Chinese and English by default. Recognizing other languages requires converting the models yourself; see here for details.

Motivation: PaddleOCR is not well engineered. To make OCR inference easier on a variety of devices, we converted the PaddleOCR models to ONNX format and ported them to multiple platforms using Python, C++, Java and C#.

Origin of the Name: light, fast, economical and smart. Built on deep learning, the toolkit plays to the strengths of AI with small models, treating speed as its mission and recognition quality as its priority.

Usage:

  • If the existing models in this repo meet your requirements → deploy with RapidOCR.
  • If they do not → fine-tune PaddleOCR on your own data → deploy with RapidOCR.

If this repo is helpful to you, please give it a star ⭐!

Visualization

Demo

Installation

pip install rapidocr_onnxruntime

Usage

from rapidocr_onnxruntime import RapidOCR

engine = RapidOCR()

img_path = 'tests/test_files/ch_en_num.jpg'
result, elapse = engine(img_path)
print(result)
print(elapse)
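
The snippet below is a minimal sketch of post-processing the output. It assumes each entry of result is a [box, text, score] triple and that result can be None when no text is detected; check the documentation for your installed version. The helper name extract_text, the min_score threshold and the sample data are hypothetical and for illustration only.

```python
from typing import List, Optional, Tuple


def extract_text(result: Optional[list], min_score: float = 0.5) -> List[Tuple[str, float]]:
    """Return (text, score) pairs whose score is at least min_score.

    Assumption: every entry looks like [box, text, score], where box is a
    list of four [x, y] corner points.
    """
    if not result:  # covers both None and an empty list
        return []

    pairs = []
    for _box, text, score in result:
        score = float(score)  # tolerate scores given as strings
        if score >= min_score:
            pairs.append((text, score))
    return pairs


# Hypothetical sample data in the assumed shape, not real engine output.
sample_result = [
    [[[10, 10], [100, 10], [100, 30], [10, 30]], "Hello", 0.98],
    [[[10, 40], [100, 40], [100, 60], [10, 60]], "wor1d", "0.42"],
]

print(extract_text(sample_result))
```

With a real image, the value returned by engine(img_path) would be passed in place of sample_result.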

Documentation

Full documentation (in Chinese) can be found in the docs.

Acknowledgements

  • Many thanks to DeliciaLaniD for fixing the misplaced start position of the scan animation in ocrweb.
  • Many thanks to zhsunlight for suggesting parameterized GPU inference and for careful, thoughtful testing.
  • Many thanks to lzh111222334 for fixing several bugs in rec preprocessing in the Python version.
  • Many thanks to AutumnSun1996 for the suggestion in #42.
  • Many thanks to DeadWood8 for providing the document on packaging rapidocr_web into an exe with Nuitka.
  • Many thanks to Loovelj for fixing the text box sorting bug. For details, see issue 75.

Code Contributors

Important

If you want to sponsor the project, click the Buy me a coffee image and leave a note (e.g. your GitHub account name) so that you can be added to the sponsorship list below.

Sponsor Applied Products
-

Citation

If you find this project useful in your research, please consider citing it:

@misc{RapidOCR2021,
    title={{Rapid OCR}: OCR Toolbox},
    author={RapidAI Team},
    howpublished = {\url{https://github.com/RapidAI/RapidOCR}},
    year={2021}
}

Stargazers over time

License

The copyright of the OCR model is held by Baidu, while the copyrights of all other engineering scripts are retained by the repository's owner.

This project is released under the Apache 2.0 license.