-
Notifications
You must be signed in to change notification settings - Fork 417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Orient images before processing OCR #324
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the PR 🙏
Two high-level comments:
- might be better to split your main function & focus this PR on angle estimation (as a standalone feature with your method, which we'll integrate in the predictors in another PR)
- as mentioned, the angle estimation part is not a reading feature so we'd better move this to
doctr.documents.utils
ordoctr.models.utils
🤷♂️
Once this is done, we're gonna have to add typing annotations & a small unittest 👍
Let me know what you think !
Hello @fg-mindee
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the edits! First, could you merge "main" into your branch? There are some conflicts apparently. Sorry my previous review wasn't clear enough, but I believe it would be better to consider several PRs here: 1 to estimate the orientation (this one), another later on to integrate it in the predictor and making sure we can reproject the end results, and a last one to check whether we should make this a default by doing a performance check.
I might be wrong but I suspect that this method might work quite well for some specific documents with well-defined lines (french ID cards for instance), but it would lack generalization to other use cases. So let's investigate this carefully :) (and make unittests for each PR)
Hello @fg-mindee |
Codecov Report
@@ Coverage Diff @@
## main #324 +/- ##
==========================================
+ Coverage 94.22% 94.76% +0.53%
==========================================
Files 66 83 +17
Lines 2788 3359 +571
==========================================
+ Hits 2627 3183 +556
- Misses 161 176 +15
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost good to merge! A few adjustments and the rest looks good to me!
Great ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the edits!
The repo already has an evaluation script in |
Following our discussion on #225 I propose the following PR to orientate the document before the processing with the ocr_predictor. In my experience, this helps both the
detection_predictor
and therecognition_predictor
.