Some questions #1
Comments
Hi, thank you for your feedback and questions. I will explain my method here:
As for your questions, I will try to give some responses:
I haven't tested it yet, but I think that if you train YOLO for text detection, it will give great results; after that, even EasyOCR or Tesseract could be used for text recognition. My colleague uses YOLOv7 to detect text in official documents, and it works great. You could use it to complete the pipeline. For training, you could use the SynthTabNet dataset, since it contains a bounding box for each piece of text in the cells.
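To make the suggested pipeline a bit more concrete, here is a minimal sketch of the last step: putting recognized text into table cells. Everything in it is an assumption for illustration, not this repository's code: `fill_cells`, `center`, and `contains` are hypothetical helpers, boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples, and `recognize` stands in for whatever crops a box and runs EasyOCR or Tesseract on it. A text box is assigned to the cell that contains its center, and words inside a cell are ordered left to right, which only makes sense for single-line cells.

```python
from typing import Callable, Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in image pixels.
Box = Tuple[float, float, float, float]


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def contains(cell: Box, point: Tuple[float, float]) -> bool:
    """Check whether a point lies inside a cell box (borders included)."""
    x0, y0, x1, y1 = cell
    px, py = point
    return x0 <= px <= x1 and y0 <= py <= y1


def fill_cells(
    cell_boxes: List[Box],
    text_boxes: List[Box],
    recognize: Callable[[Box], str],
) -> Dict[int, str]:
    """Assign each detected text box to the first cell containing its center.

    `recognize` is a placeholder for the OCR step (crop the box, run the
    recognizer); it is passed in so the detector and recognizer stay swappable.
    Text boxes whose center falls in no cell are skipped.
    """
    words: Dict[int, List[Tuple[float, str]]] = {
        i: [] for i in range(len(cell_boxes))
    }
    for text_box in text_boxes:
        point = center(text_box)
        for i, cell in enumerate(cell_boxes):
            if contains(cell, point):
                # Keep x_min so words can be ordered left to right later.
                words[i].append((text_box[0], recognize(text_box)))
                break
    return {
        i: " ".join(text for _, text in sorted(items, key=lambda t: t[0]))
        for i, items in words.items()
    }
```

In a real run, `cell_boxes` would come from the table structure model and `text_boxes` from the trained text detector; multi-line cells would need an extra sort by row before joining.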
Before I ask my questions, let me report what I found when I tested the model you trained.
Questions:
Overall, I'm actually impressed with your model's training results: even though it was trained on only a small part of Pub1M, it's impressive that it did not overfit. I've trained PaddleOCR for table recognition, and somehow it always overfit.