performance with high-res images #85

Open
bertsky opened this issue Aug 16, 2022 · 3 comments
Labels
question Further information is requested

Comments

@bertsky
Contributor

bertsky commented Aug 16, 2022

Sometimes the input comes with DPI 600 or beyond. It seems to me this makes eynollah much slower. Higher resolution might be needed for newspapers, but there is always a point beyond which result quality does not increase anymore. I would assume that a single downscaling interpolation after import should not be too costly.

The documentation of allow_scaling says that it would also scale down images. But the implementation does not look like that's the case:

```python
if dpi < DPI_THRESHOLD:
    img_new, num_column_is_classified = self.calculate_width_height_by_columns(img, num_col, width_early, label_p_pred)
    image_res = self.predict_enhancement(img_new)
    is_image_enhanced = True
else:
    num_column_is_classified = True
    image_res = np.copy(img)
    is_image_enhanced = False
```

IIUC, only too-small images get upsampled. I'd expect a secondary DPI_THRESHOLD2 above which downsampling would begin.
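As a sketch of what such a secondary threshold could look like (all names here, i.e. DPI_THRESHOLD2 and downsample_if_needed, are hypothetical and not eynollah code; a real implementation would rather call cv2.resize with cv2.INTER_AREA):

```python
import numpy as np

DPI_THRESHOLD2 = 400  # hypothetical upper bound, not an eynollah constant

def downsample_if_needed(img, dpi, target_dpi=300):
    """Downscale a too-high-resolution image once, right after import.

    Sketch only: block averaging over integer factors stands in for a
    proper cv2.resize(..., interpolation=cv2.INTER_AREA) call.
    """
    if dpi <= DPI_THRESHOLD2:
        return img, dpi
    factor = round(dpi / target_dpi)          # e.g. 600 DPI -> factor 2
    h = img.shape[0] - img.shape[0] % factor  # crop to a multiple of factor
    w = img.shape[1] - img.shape[1] % factor
    img = img[:h, :w]
    # average each factor x factor block (area interpolation)
    blocks = img.reshape((h // factor, factor, w // factor, factor) + img.shape[2:])
    img = blocks.mean(axis=(1, 3)).astype(img.dtype)
    return img, dpi // factor
```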

@vahidrezanezhad
Member

> Sometimes the input comes with DPI 600 or beyond. It seems to me this makes eynollah much slower. Higher resolution might be needed for newspapers, but there is always a point beyond which result quality does not increase anymore. I would assume that a single downscaling interpolation after import should not be too costly.
>
> The documentation of allow_scaling says that it would also scale down images. But the implementation does not look like that's the case:
>
> ```python
> if dpi < DPI_THRESHOLD:
>     img_new, num_column_is_classified = self.calculate_width_height_by_columns(img, num_col, width_early, label_p_pred)
>     image_res = self.predict_enhancement(img_new)
>     is_image_enhanced = True
> else:
>     num_column_is_classified = True
>     image_res = np.copy(img)
>     is_image_enhanced = False
> ```
>
> IIUC, only too-small images get upsampled. I'd expect a secondary DPI_THRESHOLD2 above which downsampling would begin.

Two points about your comment. First, a DPI of 600 alone cannot make eynollah slower. The problem with high-resolution documents (without the allow_scaling option) is that they cannot be scaled down automatically. allow_scaling should be True, and if the columns are detected correctly, downscaling can happen.

Second, allow_scaling lets you scale down documents with a DPI greater than 300. But downscaling will only happen if it is needed. This means that if the scale of the document is much bigger than the "training scale", then downscaling will be applied.

@bertsky
Contributor Author

bertsky commented Aug 30, 2022

@vahidrezanezhad please help me understand:

> First, a DPI of 600 alone cannot make eynollah slower.

How is that? I can see lots of CPU-bound image processing. Most algorithms are O(n²). And even for the GPU-bound parts: they each need to downscale to the fixed input size of the respective model.
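For scale: pixel count grows quadratically with DPI, so doubling the DPI quadruples the work of even a linear pass over the pixels, and much more for the quadratic algorithms (page size below is A4, purely for illustration):

```python
def megapixels(width_in, height_in, dpi):
    """Pixel count of a scanned page, in megapixels."""
    return width_in * dpi * height_in * dpi / 1e6

# An A4 page (8.27 x 11.69 inches):
print(round(megapixels(8.27, 11.69, 300), 1))  # ~8.7 MP
print(round(megapixels(8.27, 11.69, 600), 1))  # ~34.8 MP, 4x the pixels
```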

> The problem with high-resolution documents (without the allow_scaling option) is that they cannot be scaled down automatically.

Why not? Downsampling (with a suitable interpolation algorithm) should be trivial – as opposed to upsampling, for which you built an elaborate model.
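A toy 1-D example (not eynollah code) of why the interpolation choice matters when shrinking: naive decimation aliases a fine alternating pattern into a constant, while area averaging preserves the mean intensity.

```python
import numpy as np

# high-resolution "image": alternating black/white pixels
signal = np.tile([0.0, 1.0], 100)   # 200 samples, mean 0.5

# naive 2:1 decimation (nearest neighbour): keeps every 2nd sample,
# so the alternating pattern aliases to all-black
nearest = signal[::2]

# area interpolation: average each pair before discarding,
# preserving the original mean intensity
area = signal.reshape(-1, 2).mean(axis=1)
```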

> allow_scaling lets you scale down documents with a DPI greater than 300. But downscaling will only happen if it is needed. This means that if the scale of the document is much bigger than the "training scale", then downscaling will be applied.

I am confused. Where does this actually happen?

@bertsky
Contributor Author

bertsky commented Feb 16, 2023

> I am confused. Where does this actually happen?

Here:

```python
if self.allow_scaling:
    img_org, img_res, is_image_enhanced = self.resize_image_with_column_classifier(is_image_enhanced, img_bin)
```

and:

```python
img_new, _ = self.calculate_width_height_by_columns(img, num_col, width_early, label_p_pred)

if label_p_pred[0][int(num_col - 1)] < 0.9 and img_w_new < width_early:
    img_new = np.copy(img)
    num_column_is_classified = False
else:
    img_new = resize_image(img, img_h_new, img_w_new)
    num_column_is_classified = True
```

(So, essentially, if the column detector is confident enough, there can be downsampling.)
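Stripped of context, that gate can be paraphrased as follows (a simplified sketch; choose_image and the nearest-neighbour stand-in for resize_image are illustrative, not eynollah's actual helpers):

```python
import numpy as np

def choose_image(img, new_shape, confidence, width_early, threshold=0.9):
    """Accept a proposed resize only if the column classifier is confident
    enough, or if the resize would not shrink below the original width."""
    img_h_new, img_w_new = new_shape
    if confidence < threshold and img_w_new < width_early:
        return np.copy(img), False          # keep the original size
    # accept: nearest-neighbour resize stands in for resize_image()
    ys = np.linspace(0, img.shape[0] - 1, img_h_new).round().astype(int)
    xs = np.linspace(0, img.shape[1] - 1, img_w_new).round().astype(int)
    return img[ys][:, xs], True
```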

@cneud added the question label Aug 17, 2023