Orient images before processing OCR #324

Rob192 · 2021-06-22T19:24:52Z

Following our discussion on #225, I propose this PR to orient the document before processing it with the ocr_predictor. In my experience, this helps both the detection_predictor and the recognition_predictor.

fg-mindee

Thanks a lot for the PR 🙏

Two high-level comments:

It might be better to split your main function and focus this PR on angle estimation (as a standalone feature using your method, which we'll integrate into the predictors in another PR).
As mentioned, angle estimation is not a reading feature, so we'd better move it to doctr.documents.utils or doctr.models.utils 🤷‍♂️

Once this is done, we're gonna have to add typing annotations & a small unittest 👍

Let me know what you think !
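
As a rough illustration of what a standalone angle-estimation utility could look like, the sketch below takes the median angle of text-line segments. It is not the method from this PR: the name `estimate_page_angle` and the assumption that line segments have already been extracted from the image are both hypothetical.

```python
import math
from statistics import median
from typing import List, Tuple

# A segment is a pair of (x, y) endpoints, e.g. the long axis of a text line.
Segment = Tuple[Tuple[float, float], Tuple[float, float]]


def estimate_page_angle(segments: List[Segment]) -> float:
    """Return the median segment angle in degrees, folded into (-90, 90].

    Hypothetical helper: assumes the segments were extracted beforehand.
    The sign convention follows the coordinate system of the inputs
    (with image coordinates, the y axis points down).
    """
    angles = []
    for (x1, y1), (x2, y2) in segments:
        if (x1, y1) == (x2, y2):
            continue  # a degenerate segment has no direction
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        # A segment and its reverse describe the same line: fold by 180 degrees.
        if angle > 90:
            angle -= 180
        elif angle <= -90:
            angle += 180
        angles.append(angle)
    # The median is robust to a few outliers (stamps, vertical rules, ...).
    return median(angles) if angles else 0.0
```

Keeping the function free of any predictor logic makes it easy to unit test on its own, which matches the split suggested above.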

doctr/documents/reader.py

Rob192 · 2021-06-23T12:31:24Z

Hello @fg-mindee
Thank you for the thorough review. I implemented most of your comments. I am still thinking about how to do two things:

I need to find a way to rebuild the original image at the end of the OCRPredictor. My first idea is to change the DocumentBuilder to accept a page_angle parameter describing the global page orientation. I would then modify visualize_page in doctr.utils.visualization to rotate everything back. Please tell me what you think; I am afraid I do not have the complete picture here.
I still need to add some tests. I will do that ASAP.
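
To make the "rotate everything back" idea concrete, here is a minimal sketch of rotating predicted points by the opposite of the page angle. The helper `rotate_points` is hypothetical and not part of doctr; it assumes absolute pixel coordinates and a rotation about a known center, and it ignores any canvas resizing done when the image was rotated.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotate_points(points: List[Point], angle_deg: float, center: Point) -> List[Point]:
    """Rotate points by angle_deg around center (hypothetical helper).

    Assumes absolute (pixel) coordinates. With relative coordinates on a
    non-square page, scale by the page width and height first.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        # Standard 2D rotation about the center.
        rotated.append((cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t))
    return rotated


# Usage: if the page was rotated by page_angle before OCR, applying the
# opposite angle maps predicted box corners back to the original frame.
page_angle = 30.0
box = [(10.0, 20.0), (30.0, 20.0), (30.0, 40.0), (10.0, 40.0)]
in_rotated_frame = rotate_points(box, page_angle, center=(50.0, 50.0))
restored = rotate_points(in_rotated_frame, -page_angle, center=(50.0, 50.0))
```

Storing only the angle on the page and applying the inverse rotation at export or visualization time keeps the predictors unaware of the reprojection.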

fg-mindee

Thanks for the edits! First, could you merge "main" into your branch? There are apparently some conflicts. Sorry my previous review wasn't clear enough, but I believe it would be better to split this into several PRs: one to estimate the orientation (this one), a later one to integrate it into the predictor and make sure we can reproject the end results, and a last one to check whether we should make this the default by doing a performance check.

I might be wrong, but I suspect that this method might work quite well for some specific documents with well-defined lines (French ID cards for instance) but generalize poorly to other use cases. So let's investigate this carefully :) (and write unit tests for each PR)

doctr/models/_utils.py

doctr/models/utils.py

doctr/models/_utils.py

Rob192 · 2021-07-01T08:08:19Z

Hello @fg-mindee
Thank you for your detailed review. I split the PR into three separate ones.
Could you please point me to a dataset for benchmarking the image rotation changes? Or even better, to an evaluation script that you would use before releasing an update?
Thanks :)

codecov · 2021-07-01T13:42:00Z

Codecov Report

Merging #324 (98c7efd) into main (5e16bf4) will increase coverage by 0.53%.
The diff coverage is 96.00%.

@@            Coverage Diff             @@
##             main     #324      +/-   ##
==========================================
+ Coverage   94.22%   94.76%   +0.53%     
==========================================
  Files          66       83      +17     
  Lines        2788     3359     +571     
==========================================
+ Hits         2627     3183     +556     
- Misses        161      176      +15

Flag	Coverage Δ
unittests	`94.76% <96.00%> (+0.53%)`	⬆️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
doctr/models/_utils.py	`93.93% <96.00%> (+0.60%)`	⬆️
doctr/datasets/utils.py	`92.68% <0.00%> (-1.77%)`	⬇️
doctr/utils/geometry.py	`100.00% <0.00%> (ø)`
doctr/models/__init__.py	`100.00% <0.00%> (ø)`
doctr/models/recognition/zoo.py	`100.00% <0.00%> (ø)`
doctr/models/recognition/__init__.py	`100.00% <0.00%> (ø)`
doctr/transforms/modules/__init__.py	`100.00% <0.00%> (ø)`
doctr/transforms/functional/__init__.py	`100.00% <0.00%> (ø)`
doctr/transforms/functional/tensorflow.py	`100.00% <0.00%> (ø)`
doctr/models/recognition/sar.py
... and 34 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5e16bf4...98c7efd.

fg-mindee

Almost good to merge! A few adjustments and the rest looks good to me!

test/test_models.py

doctr/models/_utils.py

Rob192 · 2021-07-01T14:02:43Z

Great !

fg-mindee

Thanks a lot for the edits!

fg-mindee · 2021-07-01T15:30:33Z

> Hello @fg-mindee
> Thank you for your detailed review. I split the PR into three separate ones.
> Could you please point me to a dataset for benchmarking the image rotation changes? Or even better, to an evaluation script that you would use before releasing an update?
> Thanks :)

The repo already has an evaluation script in scripts/evaluate.py with support for several public datasets! But the tricky part is that those aren't rotated, so we'll have to add some test-time augmentation to check this.
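
One way to picture that check: rotate each upright evaluation sample by known angles and measure how far the estimated angle is from the applied one. The sketch below is a generic harness under that assumption. It does not use the interface of scripts/evaluate.py, and `mean_abs_angle_error`, `rotate`, and `estimate` are hypothetical names supplied by the caller.

```python
from typing import Callable, Sequence, TypeVar

Sample = TypeVar("Sample")


def mean_abs_angle_error(
    samples: Sequence[Sample],
    rotate: Callable[[Sample, float], Sample],
    estimate: Callable[[Sample], float],
    angles: Sequence[float],
) -> float:
    """Mean absolute angle error in degrees over rotated copies of each sample.

    Hypothetical harness: assumes every sample is upright before rotation,
    rotate(sample, angle) returns a rotated copy, and estimate(sample)
    returns an angle in degrees.
    """
    errors = []
    for sample in samples:
        for angle in angles:
            predicted = estimate(rotate(sample, angle))
            # Compare modulo 180 degrees: a text line has no direction.
            diff = (predicted - angle + 90.0) % 180.0 - 90.0
            errors.append(abs(diff))
    if not errors:
        raise ValueError("need at least one sample and one angle")
    return sum(errors) / len(errors)
```

Comparing modulo 180 degrees is a deliberate choice here: a line-based estimator cannot tell a page from its upside-down copy, so that ambiguity would need a separate check.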

Feat: First implementation of orient_image

fea6ab7

fg-mindee self-assigned this Jun 22, 2021

fg-mindee added the "type: enhancement" and "module: models" labels Jun 22, 2021

fg-mindee suggested changes Jun 22, 2021

refactor: move estimate_orientation inside OCRPredictor

9425f92

fg-mindee suggested changes Jun 24, 2021

Rob192 added 8 commits June 29, 2021 15:59

style: format code back to original style

fd788ae

refactor: correct typing

da5e040

feat: Improve estimation of orientation

941bffc

Merge branch 'main' into orient_image

a1859b2

fix: remove rotation of documents

b3200c2

style: fix formatting issues

66c8e5c

test: add test for estimate_orientation

0a86240

style: code cleaning

626950e

fg-mindee reviewed Jul 1, 2021

Rob192 and others added 2 commits July 1, 2021 15:50

fix: typing annotations

7dfc831

fix: reuse mock_image to create mock_bitmap

98c7efd

fg-mindee added this to the 0.3.1 milestone Jul 1, 2021

fg-mindee approved these changes Jul 1, 2021

fg-mindee merged commit aaebbe9 into mindee:main Jul 1, 2021

fg-mindee mentioned this pull request Jul 1, 2021

[models] detect page orientation #225

Closed
