-
Notifications
You must be signed in to change notification settings - Fork 722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQLite tests having intermittent failures in the threading test #1011
Comments
I enabled TSan and ran the test suite and it pointed to data races on properties of Enable-TSan.diffdiff --git i/Apollo.xcodeproj/xcshareddata/xcschemes/Apollo.xcscheme w/Apollo.xcodeproj/xcshareddata/xcschemes/Apollo.xcscheme
index 1e839927..a375da33 100644
--- i/Apollo.xcodeproj/xcshareddata/xcschemes/Apollo.xcscheme
+++ w/Apollo.xcodeproj/xcshareddata/xcschemes/Apollo.xcscheme
@@ -27,6 +27,7 @@
selectedDebuggerIdentifier = "Xcode.DebuggerFoundation.Debugger.LLDB"
selectedLauncherIdentifier = "Xcode.DebuggerFoundation.Launcher.LLDB"
shouldUseLaunchSchemeArgsEnv = "NO"
+ enableThreadSanitizer = "YES"
codeCoverageEnabled = "YES"
onlyGenerateCoverageForSpecifiedTargets = "YES">
<MacroExpansion>
@@ -81,6 +82,7 @@
buildConfiguration = "Debug"
selectedDebuggerIdentifier = "Xcode.DebuggerFoundation.Debugger.LLDB"
selectedLauncherIdentifier = "Xcode.DebuggerFoundation.Launcher.LLDB"
+ enableThreadSanitizer = "YES"
launchStyle = "0"
useCustomWorkingDirectory = "NO"
ignoresPersistentStateOnLaunch = "NO" This naïve introduction of a lock around GraphQLQueryWatcher-locking.diffdiff --git i/Sources/Apollo/GraphQLQueryWatcher.swift w/Sources/Apollo/GraphQLQueryWatcher.swift
index 52e1b6a3..7a1cb221 100644
--- i/Sources/Apollo/GraphQLQueryWatcher.swift
+++ w/Sources/Apollo/GraphQLQueryWatcher.swift
@@ -2,6 +2,8 @@ import Dispatch
/// A `GraphQLQueryWatcher` is responsible for watching the store, and calling the result handler with a new result whenever any of the data the previous result depends on changes.
public final class GraphQLQueryWatcher<Query: GraphQLQuery>: Cancellable, ApolloStoreSubscriber {
+ private let lock: NSRecursiveLock
+
weak var client: ApolloClientProtocol?
let query: Query
let resultHandler: GraphQLResultHandler<Query.Data>
@@ -21,6 +23,9 @@ public final class GraphQLQueryWatcher<Query: GraphQLQuery>: Cancellable, Apollo
public init(client: ApolloClientProtocol,
query: Query,
resultHandler: @escaping GraphQLResultHandler<Query.Data>) {
+ lock = NSRecursiveLock()
+ lock.name = "GraphQLQueryWatcher.lock"
+
self.client = client
self.query = query
self.resultHandler = resultHandler
@@ -34,9 +39,15 @@ public final class GraphQLQueryWatcher<Query: GraphQLQuery>: Cancellable, Apollo
}
func fetch(cachePolicy: CachePolicy) {
+ lock.lock()
+ defer { lock.unlock() }
+
fetching = client?.fetch(query: query, cachePolicy: cachePolicy, context: &context, queue: .main) { [weak self] result in
guard let `self` = self else { return }
+ self.lock.lock()
+ defer { self.lock.unlock() }
+
switch result {
case .success(let graphQLResult):
self.dependentKeys = graphQLResult.dependentKeys
@@ -50,6 +61,9 @@ public final class GraphQLQueryWatcher<Query: GraphQLQuery>: Cancellable, Apollo
/// Cancel any in progress fetching operations and unsubscribe from the store.
public func cancel() {
+ lock.lock()
+ defer { lock.unlock() }
+
fetching?.cancel()
client?.store.unsubscribe(self)
}
@@ -57,6 +71,9 @@ public final class GraphQLQueryWatcher<Query: GraphQLQuery>: Cancellable, Apollo
func store(_ store: ApolloStore,
didChangeKeys changedKeys: Set<CacheKey>,
context: UnsafeMutableRawPointer?) {
+ lock.lock()
+ defer { lock.unlock() }
+
if context == &self.context { return }
guard let dependentKeys = dependentKeys else { return } |
Looking at func fetch(cachePolicy: CachePolicy) {
fetching?.cancel() // <- missing?
fetching = client?.fetch(query: query, cachePolicy: cachePolicy, context: &context, queue: .main) { ... }
} |
@RolandasRazma - ooh, that could be it - and that's probably a better way to do it than locking the cache up. Will give it a try! |
Performing that cancellation exposes a data race on Enable-TSan-and-Cancel-Fetch.diffdiff --git i/Apollo.xcodeproj/xcshareddata/xcschemes/Apollo.xcscheme w/Apollo.xcodeproj/xcshareddata/xcschemes/Apollo.xcscheme
index 1e839927..a375da33 100644
--- i/Apollo.xcodeproj/xcshareddata/xcschemes/Apollo.xcscheme
+++ w/Apollo.xcodeproj/xcshareddata/xcschemes/Apollo.xcscheme
@@ -27,6 +27,7 @@
selectedDebuggerIdentifier = "Xcode.DebuggerFoundation.Debugger.LLDB"
selectedLauncherIdentifier = "Xcode.DebuggerFoundation.Launcher.LLDB"
shouldUseLaunchSchemeArgsEnv = "NO"
+ enableThreadSanitizer = "YES"
codeCoverageEnabled = "YES"
onlyGenerateCoverageForSpecifiedTargets = "YES">
<MacroExpansion>
@@ -81,6 +82,7 @@
buildConfiguration = "Debug"
selectedDebuggerIdentifier = "Xcode.DebuggerFoundation.Debugger.LLDB"
selectedLauncherIdentifier = "Xcode.DebuggerFoundation.Launcher.LLDB"
+ enableThreadSanitizer = "YES"
launchStyle = "0"
useCustomWorkingDirectory = "NO"
ignoresPersistentStateOnLaunch = "NO"
diff --git i/Sources/Apollo/GraphQLQueryWatcher.swift w/Sources/Apollo/GraphQLQueryWatcher.swift
index 52e1b6a3..4612c748 100644
--- i/Sources/Apollo/GraphQLQueryWatcher.swift
+++ w/Sources/Apollo/GraphQLQueryWatcher.swift
@@ -34,6 +34,7 @@ public final class GraphQLQueryWatcher<Query: GraphQLQuery>: Cancellable, Apollo
}
func fetch(cachePolicy: CachePolicy) {
+ fetching?.cancel()
fetching = client?.fetch(query: query, cachePolicy: cachePolicy, context: &context, queue: .main) { [weak self] result in
guard let `self` = self else { return }
|
Eesh, setting that in a getter seems like a bad idea in general, that's one hell of a side effect. It does seem like that cancellation helps the issue at hand (although I still have no idea why it was only happening on SQLite test runs). I'll see what I can figure out about why that getter is doing what it is. I might wind up doing this as separate PRs though. |
Interestingly even if I comment out that whole There's obviously something else going on there - my hunch is something in our |
Thanks to @martijnwalraven's hard work, all our tests are now passing with TSAN as of version 0.38.0! If you run into TSAN issues in that version or later, please open a new issue. Thank you! |
Background: While investigating #993 I determined that the intermittent failures of
testThreadedCache
appear to only be happening in the SQLite tests when theDispatchGroup
getsleave
called more times than it should. I added counters to the three closures and was seeing wayyyy more than the expected 1001leave
calls, which would cause a crash sinceenter
andleave
calls toDispatchGroup
need to be balanced.In the discussion for #1010, there was a general agreement that this test uncovered existing incorrect behavior rather than having the incorrect behavior introduced in the corresponding PR, and this test was updated to not run on SQLite tests in order to keep further forward movement unblocked.
I have no idea why a) This is happening b) It only seems to be happening on the SQLite tests, even though the client in the
testThreadedCache
is hard-coded to be an in-memory cache. The failures are intermittent, but they do still happen even when the test is run in isolation, so it doesn't seem like it's leakage from another test.Any help we can get on this would be really great.
The text was updated successfully, but these errors were encountered: