feat: Alignment during reconstruction of image #1657
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1657      +/-   ##
==========================================
+ Coverage   96.35%   96.40%   +0.04%
==========================================
  Files         164      164
  Lines        7773     7780       +7
==========================================
+ Hits         7490     7500      +10
+ Misses        283      280       -3
Flags with carried forward coverage won't be shown.
Hi @SkaarFacee 👋, thanks for the PR.
@@ -38,14 +38,18 @@ def synthesize_page(
    # Draw each word
    for block in page["blocks"]:
        for line in block["lines"]:
            line_ymin = min(int(round(h * word["geometry"][0][1])) for word in line["words"])
I would suggest the following:
def synthesize_page(
    page: Dict[str, Any],
    draw_proba: bool = False,
    font_family: Optional[str] = None,
    adjust_to_line: bool = False,
) -> np.ndarray:
    """Draw the content of the element page (OCR response) on a blank page.

    Args:
    ----
        page: exported Page object to represent
        draw_proba: if True, draw words in colors to represent confidence. Blue: p=1, red: p=0
        font_size: size of the font, default font = 13
        font_family: family of the font
        adjust_to_line: if True, adjust y coordinates to line geometry

    Returns:
    -------
        the synthesized page
    """
    # Draw template
    h, w = page["dimensions"]
    response = 255 * np.ones((h, w, 3), dtype=np.int32)

    # Draw each word
    for block in page["blocks"]:
        multiline = len(block["lines"]) > 1
        for line in block["lines"]:
            for word in line["words"]:
                # Get absolute word geometry
                (xmin, ymin), (xmax, ymax) = word["geometry"]
                xmin, xmax = int(round(w * xmin)), int(round(w * xmax))
                if multiline and adjust_to_line:
                    # Snap the word vertically to the geometry of its enclosing line
                    ymin = int(round(h * line["geometry"][0][1]))
                    ymax = int(round(h * line["geometry"][1][1]))
                else:
                    ymin, ymax = int(round(h * ymin)), int(round(h * ymax))
This way the user can still decide, and adjusting only makes sense if we have lines (so resolve_lines=True).
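To make the suggestion concrete, here is a minimal, self-contained sketch of the coordinate logic only. The helper name `word_box_px` and the toy line data are hypothetical and not part of the library; the sketch assumes geometries are relative `((xmin, ymin), (xmax, ymax))` pairs and that dimensions are `(height, width)`, as in the snippet above.

```python
from typing import Any, Dict, Tuple


def word_box_px(
    word: Dict[str, Any],
    line: Dict[str, Any],
    dims: Tuple[int, int],
    multiline: bool,
    adjust_to_line: bool = False,
) -> Tuple[int, int, int, int]:
    """Return (xmin, ymin, xmax, ymax) in pixels for one word.

    Hypothetical helper that mirrors the branch in the suggested snippet.
    """
    h, w = dims
    (xmin, ymin), (xmax, ymax) = word["geometry"]
    # Horizontal extent always comes from the word itself
    xmin_px, xmax_px = int(round(w * xmin)), int(round(w * xmax))
    if multiline and adjust_to_line:
        # Vertical extent is taken from the enclosing line
        ymin_px = int(round(h * line["geometry"][0][1]))
        ymax_px = int(round(h * line["geometry"][1][1]))
    else:
        # Vertical extent is taken from the word
        ymin_px, ymax_px = int(round(h * ymin)), int(round(h * ymax))
    return xmin_px, ymin_px, xmax_px, ymax_px


# Toy data, made up for illustration
line = {
    "geometry": ((0.1, 0.2), (0.9, 0.3)),
    "words": [
        {"geometry": ((0.1, 0.21), (0.4, 0.29))},
        {"geometry": ((0.5, 0.2), (0.9, 0.3))},
    ],
}
dims = (100, 200)  # (height, width)

aligned = [word_box_px(wd, line, dims, multiline=True, adjust_to_line=True) for wd in line["words"]]
raw = [word_box_px(wd, line, dims, multiline=True) for wd in line["words"]]
```

With `adjust_to_line=True` both words in the toy line share the same vertical extent, while the default keeps each word's own box.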
Okay, I will do that right away
# White drawing context adapted to font size, 0.75 factor to convert pts --> pix
font = get_font(font_family, int(0.75 * (ymax - ymin)))
ymin, ymax = line_ymin, line_ymax
calculate_font_size = int(0.75 * (ymax - ymin))
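A rough sketch of what this hunk changes for the font size: with per-word heights, words on the same line can get different sizes, whereas a shared line extent gives one size. The helper `font_size_px` and the pixel values are hypothetical; only the `int(0.75 * (ymax - ymin))` expression comes from the diff, and computing `line_ymax` as the maximum of the words' `ymax` is an assumption, since only `line_ymin` is visible above.

```python
def font_size_px(ymin: int, ymax: int) -> int:
    """Font size from a box height, using the 0.75 factor from the diff."""
    return int(0.75 * (ymax - ymin))


# Made-up (ymin, ymax) pixel extents for three words on one line
word_boxes = [(21, 29), (20, 30), (22, 30)]

# Shared line extent (assumption: line_ymax is the max of the words' ymax)
line_ymin = min(y0 for y0, _ in word_boxes)
line_ymax = max(y1 for _, y1 in word_boxes)

per_word = [font_size_px(y0, y1) for y0, y1 in word_boxes]
per_line = font_size_px(line_ymin, line_ymax)
```

Here `per_word` varies between words while `per_line` is a single value applied to the whole line.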
That is interesting. I will debug this and see why the issue occurs.
Hey there, what models did you use for these?
Hey :)
fast_base and parseq, and db_mobilenet_v3_large and crnn_mobilenet_v3_large
No description provided.