Qiqqa Expedition Explorer: clicking Refresh seems to (re)trigger OCR activity for many files in the library #73

Open
GerHobbelt opened this issue Oct 3, 2019 · 1 comment
Labels
🐛bug Something isn't working 🕵investigate Needs further analysis to find the root cause. ⛷performance Anything that's related to UX: speed of response; I/O speed, etc.

Comments

@GerHobbelt
Collaborator

See https://github.com/GerHobbelt/qiqqa-open-source/issues/1

GerHobbelt added a commit to GerHobbelt/qiqqa-open-source that referenced this issue Oct 3, 2019
…/qiqqa-open-source/issues/1 : QiqqaOCR is rather fruity when it comes to generating rectangle areas to OCR. This is now sanitized.
@GerHobbelt GerHobbelt added 🐛bug Something isn't working 🤔question Further information is requested or this is a support question ⛷performance Anything that's related to UX: speed of response; I/O speed, etc. labels Oct 4, 2019
@GerHobbelt
Collaborator Author

This issue is partly solved: the fix for one important root cause was added in release v82 (see also #74).

The other reason this happens is that Qiqqa does a full DB scan in this scenario, and any page for which the OCR process has not produced any words triggers a re-run of OCR.

Naturally, this retriggers OCR for any document that has empty or image-only pages without any text. In that case it is not QiqqaOCR's fault that nothing was produced, yet OCR is retriggered anyway.
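
To make the mechanism concrete, here is a minimal Python sketch of the rescan logic described above. It is not Qiqqa's actual code (Qiqqa is a C# application); `Page`, `pages_to_ocr_naive` and `pages_to_ocr_tracked` are hypothetical names made up for illustration. The sketch assumes one possible way out: keeping "OCR never ran" distinct from "OCR ran and found no words".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Page:
    # Hypothetical model: None means OCR never ran for this page,
    # an empty list means OCR ran and legitimately found no words.
    words: Optional[List[str]] = None


Library = Dict[str, List[Page]]


def pages_to_ocr_naive(library: Library) -> List[Tuple[str, int]]:
    """Queue every page without words, as in the behaviour described above."""
    return [
        (doc, number)
        for doc, pages in library.items()
        for number, page in enumerate(pages, start=1)
        if not page.words  # true for None *and* for an empty list
    ]


def pages_to_ocr_tracked(library: Library) -> List[Tuple[str, int]]:
    """Queue only pages for which OCR has never produced a result."""
    return [
        (doc, number)
        for doc, pages in library.items()
        for number, page in enumerate(pages, start=1)
        if page.words is None
    ]


library: Library = {
    # Page 2 is image-only: OCR ran but produced no words.
    "scanned.pdf": [Page(["hello", "world"]), Page([])],
    # OCR has not run yet for this document.
    "new.pdf": [Page()],
}

naive_queue = pages_to_ocr_naive(library)
tracked_queue = pages_to_ocr_tracked(library)
```

With the naive check, the image-only page of `scanned.pdf` is queued again on every refresh; with the tracked check, only `new.pdf` is queued.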

@GerHobbelt GerHobbelt mentioned this issue Oct 4, 2019
@GerHobbelt GerHobbelt added 🕵investigate Needs further analysis to find the root cause. and removed 🤔question Further information is requested or this is a support question labels Oct 4, 2019
@GerHobbelt GerHobbelt changed the title Qiqqa Explorer: clicking Refresh seems to (re)trigger OCR activity for many files in the library Qiqqa Expedition Explorer: clicking Refresh seems to (re)trigger OCR activity for many files in the library Oct 9, 2019
@GerHobbelt GerHobbelt added this to the Our Glorious Future milestone Oct 9, 2019
GerHobbelt added a commit to GerHobbelt/qiqqa-open-source that referenced this issue Mar 23, 2020
… SINGLE don't deliver due to, for example, encrypted PDF source. This is a temporary hack to ensure Qiqqa doesn't repeat OCR activities ad nauseam (jimmejardine#129 , jimmejardine#135 , jimmejardine#73 , etc.)

- the previously added extra OCR text files' sanity checks (zero-sized areas of words, etc.) seem to pay off. At least we've observed quite a few OCR files/pages being retriggered for OCR as Qiqqa uncovers these zero-sized word areas while refreshing for Expeditions
- added a few more UI-thread-or-not Assertions.
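
The zero-sized word area check mentioned in the first bullet could look roughly like the sketch below. This is an illustration only, not the code from the commit: `WordBox` and `has_zero_sized_word` are hypothetical names, and the box layout (left, top, width, height) is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class WordBox:
    # Hypothetical OCR word record: the text plus its bounding box.
    text: str
    left: float
    top: float
    width: float
    height: float


def has_zero_sized_word(words: Iterable[WordBox]) -> bool:
    """Return True if any word has a degenerate (zero or negative) area."""
    return any(w.width <= 0 or w.height <= 0 for w in words)


good_page = [WordBox("Qiqqa", 0.1, 0.1, 0.2, 0.05)]
bad_page = [WordBox("Qiqqa", 0.1, 0.1, 0.2, 0.05), WordBox("OCR", 0.4, 0.1, 0.0, 0.05)]
```

A page like `bad_page` would fail the check and be a candidate for another OCR run, which matches the retriggering observed during the Expedition refresh.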