BibTeX Sniffer: Sometimes clicking on a PDF does show the PDF but DOES NOT import it in your library #54

GerHobbelt · 2019-08-17T15:49:07Z

Happens on rare occasions, e.g. bad connections or some other "weird failures".

The recurring theme here is that going back and forth in the browser pane is of no use: the PDF will load and show, but will NOT import. 😭

GerHobbelt · 2019-08-17T15:52:35Z

Fixed in commit SHA-1: c28eb11. I hope.

The side effect is that for proper PDF fetches (via the PDFInterceptor class), the PDF is fetched from the website twice. This is not a problem, though: Qiqqa includes dedup logic, so it does download the second copy but then discards it, because its URL+fingerprint matches the PDF imported just before.
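The dedup behaviour described above can be illustrated with a minimal sketch. This is not Qiqqa's code: the `Library` class, the `import_pdf` method, and the use of SHA-1 as the fingerprint are all assumptions made for illustration, since the thread does not show how Qiqqa computes its fingerprint.

```python
import hashlib


class Library:
    """Hypothetical in-memory stand-in for a document library (not Qiqqa's API)."""

    def __init__(self):
        # Remember every (url, fingerprint) pair that has been imported.
        self._seen = set()

    def import_pdf(self, url: str, data: bytes) -> bool:
        """Import the PDF unless the same URL+fingerprint was imported before.

        Returns True when the document was added, False when it was discarded
        as a duplicate.
        """
        # Assumption: a SHA-1 digest of the content serves as the fingerprint.
        fingerprint = hashlib.sha1(data).hexdigest()
        key = (url, fingerprint)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

With this shape, fetching the same PDF twice costs one extra download but leaves only one copy in the library: the second `import_pdf` call for the same URL and content returns `False`.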

Snippet from the GoogleBibTexSnifferControl code:

// When PDFs are viewed in Gecko/Firefox and somehow things went wrong the first time around,
// but **not enough wrong** so to speak, then the PDF is **cached** by Gecko/FireFox and it WILL NOT
// show up as one of the URIs being fetched for a page reload! The PDF will only show up **here**,
// as a completely loaded document.
//
// Meanwhile the Acrobat Reader in there will cause the `ObjWebBrowser.CurrentPageHTML` to render
// something like this:
//
// <html><head><meta content="width=device-width; height=device-height;" name="viewport"></head>
// <body marginheight="0" marginwidth="0"><embed type="application/pdf" 
//    src ="https://escholarship.org/content/qt0cs6v2w7/qt0cs6v2w7.pdf" 
//    name ="plugin" height="100%" width="100%"></body></html>
//
// !Yay!          /sarcasm!/
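Since the cached PDF only shows up as an `<embed type="application/pdf">` wrapper page, one way to recover its URL is to parse that wrapper HTML. The following is a rough, language-neutral sketch in Python using only the standard library; the names `EmbeddedPdfFinder` and `find_embedded_pdf_urls` are hypothetical and do not come from the Qiqqa code base, and the actual fix in the commit may work differently.

```python
from html.parser import HTMLParser


class EmbeddedPdfFinder(HTMLParser):
    """Collects the src of every <embed type="application/pdf"> element."""

    def __init__(self):
        super().__init__()
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "embed":
            return
        attributes = dict(attrs)
        mime_type = (attributes.get("type") or "").lower()
        src = attributes.get("src")
        if mime_type == "application/pdf" and src:
            self.pdf_urls.append(src)


def find_embedded_pdf_urls(page_html: str) -> list:
    """Return the PDF URLs embedded in a viewer wrapper page, in document order."""
    finder = EmbeddedPdfFinder()
    finder.feed(page_html)
    finder.close()
    return finder.pdf_urls
```

Feeding it the wrapper page quoted in the comment would yield the `escholarship.org` URL, which could then be handed to the regular download-and-import path.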

- Gecko these days crashes on ContentDispositionXXXX member accesses: Exception thrown: 'System.Runtime.InteropServices.COMException' in Geckofx-Core.dll. I'm not sure why; the only change I know of is an update of MSVS2019. :-S
- implement the logic for the BibTeXSniffer 'Has OCR' checkbox filter criterion. It's useful but the zillion file accesses slow the response down too much for my taste. :-S

GerHobbelt · 2019-08-22T10:01:34Z

The fix impacts #52. Double-check that one before marking this one fixed.

GerHobbelt · 2019-08-26T18:54:47Z

Related: #56 -- another case of not fetching the PDF

GerHobbelt · 2019-08-27T23:22:20Z

I bet this got fixed as part of the #56 fix activity in commit SHA-1: b3f1f2d

GerHobbelt · 2019-08-27T23:24:06Z

Marking this one FIXED given the commit mentioned above: web imports shouldn't be silent anymore. And if they are, I'd better file a fresh PR.

…sty PDF URIs which weren't recognized as such before. Right now we're pretty aggressive as we fetch almost everything that crosses our path; once fetched, we check if it's actually a valid PDF file after all. CiteSeerX and other sites now deliver once again...
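The "fetch first, validate afterwards" approach from this commit message can be sketched as a simple content check. The thread does not show how Qiqqa validates a download, so this is an assumption: the hypothetical `looks_like_pdf` helper just looks for the `%PDF-` header marker near the start of the data, allowing some leading junk as many PDF readers do.

```python
def looks_like_pdf(data: bytes, search_window: int = 1024) -> bool:
    """Heuristically decide whether downloaded bytes are a PDF file.

    A PDF file starts with a header such as "%PDF-1.7". Some servers prepend
    whitespace or other bytes, so the marker is searched for within the first
    `search_window` bytes instead of requiring it at offset 0.
    """
    return b"%PDF-" in data[:search_window]
```

A check like this lets the downloader ignore the URL and the `Content-Type` header entirely: an HTML error page fetched from a `.pdf` URL is rejected, while a real PDF served from a URL without "pdf" in it is accepted.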

GerHobbelt · 2019-08-28T00:37:57Z

Closing (and decluttering the issue list so it stays workable for me): fixed in https://github.com/GerHobbelt/qiqqa-open-source mainline (master branch), pending #15 / any maintainer rights/actions.


GerHobbelt changed the title ~~BibTeX Sniffer: Sometimes clicking on a PDF does show the PDF but DOES NOT import it in your library~~ ✅BibTeX Sniffer: Sometimes clicking on a PDF does show the PDF but DOES NOT import it in your library Aug 27, 2019

GerHobbelt closed this as completed Aug 28, 2019

This was referenced Sep 3, 2019

Qiqqa: crashes/fails to import PDF from ill-configured servers #63

Closed

BibTeX Sniffer: PDFs are not downloaded into the Qiqqa library when their URL does not include the "pdf" characters #67

Closed

GerHobbelt added the 🐛bug Something isn't working label Oct 4, 2019

GerHobbelt changed the title ~~✅BibTeX Sniffer: Sometimes clicking on a PDF does show the PDF but DOES NOT import it in your library~~ BibTeX Sniffer: Sometimes clicking on a PDF does show the PDF but DOES NOT import it in your library Oct 4, 2019

GerHobbelt added this to the v82 milestone Oct 4, 2019

GerHobbelt mentioned this issue Oct 5, 2019

v82pre: some PDFs are downloaded twice from Sniffer #83

Closed

GerHobbelt mentioned this issue Dec 10, 2019

upgrade the embedded browser (xulrunner) to the latest version #2

Open
