Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PDFDocument is passed by reference across thread boundaries but is not threadsafe: Qiqqa crashes 💥 #101

Closed
GerHobbelt opened this issue Oct 9, 2019 · 2 comments
Labels
🐛bug Something isn't working duplicate This issue or pull request already exists 🕵investigate Needs further analysis to find the root cause.
Milestone

Comments

@GerHobbelt
Copy link
Collaborator

💥

Another instance of PDFDocument causing trouble due to missing protection for cross-thread-boundary travels, where a reader thread will still expect constant (= unmodified) Dictionary, List, whathaveyou type(s).

Extract for the exception at hand, this time in DeepClone()'s internals: Serialize there discovering an edit by another thread at the same time --> 💣 💥 ::

System.ArgumentException
  HResult=0x80070057
  Message=Destination array is not long enough to copy all the items in the collection. Check array index and length.
  Source=mscorlib
  StackTrace:
   at System.ThrowHelper.ThrowArgumentException(ExceptionResource resource)
   at System.Collections.Generic.Dictionary`2.CopyTo(KeyValuePair`2[] array, Int32 index)
   at System.Collections.Generic.Dictionary`2.GetObjectData(SerializationInfo info, StreamingContext context)
   at System.Runtime.Serialization.Formatters.Binary.WriteObjectInfo.InitSerialize(Object obj, ISurrogateSelector surrogateSelector, StreamingContext context, SerObjectInfoInit serObjectInfoInit, IFormatterConverter converter, ObjectWriter objectWriter, SerializationBinder binder)
   at System.Runtime.Serialization.Formatters.Binary.ObjectWriter.Write(WriteObjectInfo objectInfo, NameInfo memberNameInfo, NameInfo typeNameInfo)
   at System.Runtime.Serialization.Formatters.Binary.ObjectWriter.Serialize(Object graph, Header[] inHeaders, __BinaryWriter serWriter, Boolean fCheck)
   at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Serialize(Stream serializationStream, Object graph, Header[] headers, Boolean fCheck)
   at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Serialize(Stream serializationStream, Object graph)
   at Utilities.Cloning.DeepObjectCloner.DeepClone[T](T object_to_clone) in W:\Users\Ger\Projects\sites\library.visyond.gov\80\lib\tooling\qiqqa\Utilities\Cloning\DeepObjectCloner.cs:line 16
   at Utilities.Misc.DictionaryBasedObject.Clone() in W:\Users\Ger\Projects\sites\library.visyond.gov\80\lib\tooling\qiqqa\Utilities\Misc\DictionaryBasedObject.cs:line 172
   at Qiqqa.Documents.PDF.PDFDocument.QueueToStorage() in W:\Users\Ger\Projects\sites\library.visyond.gov\80\lib\tooling\qiqqa\Qiqqa\Documents\PDF\PDFDocument.cs:line 877
   at Qiqqa.Documents.PDF.PDFDocument.bindable_PropertyChanged(Object sender, PropertyChangedEventArgs e) in W:\Users\Ger\Projects\sites\library.visyond.gov\80\lib\tooling\qiqqa\Qiqqa\Documents\PDF\PDFDocument.cs:line 1002

Dictionary:
keys
    [0] "FileType"  string
    [1] "Fingerprint" string
    [2] "DateAddedToDatabase" string
    [3] "DateLastModified"  string
    [4] "DownloadLocation"  string
    [5] "BibTex"  string
    [6] "Comments"  string
    [7] "Tags"  string
    [8] "DateLastRead"  string
    [9] "AutoSuggested_PDFMetadata" string
    [10]  "AuthorsSuggested"  string
    [11]  "YearSuggested" string
    [12]  "AutoSuggested_OCRFrontPage"  string
    [13]  "TitleSuggested"  string
    [14]  "AutoSuggested_BibTeXSearch"  string

values
    [0] "pdf" object {string}
    [1] "1CC8863D52A759FF98D5622178BEBC088CFEBF"  object {string}
    [2] "20191009002444426" object {string}
    [3] "20191009002444426" object {string}
    [4] "W:\\Sopkonijn\\!QIQQA-pdf-watch-dir\\!1\\!from-old-drive\\Guest\\documents\\1\\1CC8863D52A759FF98D5622178BEBC088CFEBF.pdf" object {string}
    [5] null  object
    [6] null  object
    [7] ""  object {string}
    [8] null  object
    [9] true  object {bool}
    [10]  "GruhlA"  object {string}
    [11]  "2012"  object {string}
    [12]  true  object {bool}
    [13]  "A Novel Error-Tolerant Anonymous Linking Code" object {string}
    [14]  true  object {bool}

Preliminary Analysis

Same root cause as #96: PDFDocument (and probably the other serialized objects in Qiqqa code) is not protected against simultaneous access by multiple threads. (As described above, even a read operation, i.e. a list/table iteration, is already very dangerous in C#.

@GerHobbelt GerHobbelt added 🐛bug Something isn't working 🕵investigate Needs further analysis to find the root cause. labels Oct 9, 2019
@GerHobbelt GerHobbelt added this to the v82 milestone Oct 9, 2019
@GerHobbelt
Copy link
Collaborator Author

GerHobbelt added a commit to GerHobbelt/qiqqa-open-source that referenced this issue Oct 9, 2019
@GerHobbelt
Copy link
Collaborator Author

Considered a dupe of #96 as far as root cause is concerned. Further work will reference that one.

@GerHobbelt GerHobbelt added the duplicate This issue or pull request already exists label Oct 9, 2019
GerHobbelt added a commit to GerHobbelt/qiqqa-open-source that referenced this issue Oct 22, 2019
…ardine#112 ); PDF/OCR threads got locked up via WaitForUIThreadActivityDone() call inside a lock, which would indirectly result in other threads to run into that lock and the main thread locked up in WM message pipe handling triggering a FORCED FLUSH, which, under the hood, would run into that same lock and thus block forever. The lock footprint has been significantly reduced as we can do now that we already have made PDFDocument largely thread-safe ( jimmejardine#101 ).

Also moved the Application Shutting Down signalling forward in time: already signal this when the user/run-time zips through the 'Do you really want to exit' dialog, i.e. near the end of the Closing event for the main window.

`WaitForUIThreadActivityDone` now does not sit inside a lock any more so everyone is free to call it, while in shutdown state, when the WM message pipe is untrustworthy, we leave the relinquishing to standard lib `Thread.Sleep(50)` to cope with: the small delay should be negligible while we are guaranteed to not run into issues around the exact message pipe state: we also ran into issues invoking some flush/actions via a Dispatcher while the app was clearly shortly into shutdown, so we try to be safe there too.

Also patched the StatusManager to not clutter the message pipeline at shutdown by NOT showing any status updates in the UI any more once the user has closed the app window. The status messages will continue to be logged as usual, we only do *not* try to update UI any more. This saves a bundle in cross-thread dispatches in the termination phase of the app when large numbers of pending changes and/or libraries are flushed to disk.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛bug Something isn't working duplicate This issue or pull request already exists 🕵investigate Needs further analysis to find the root cause.
Projects
None yet
Development

No branches or pull requests

1 participant