Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't run in-memory cache tests during SQLite cache testing #1010

Merged
merged 2 commits into from
Feb 12, 2020

Conversation

designatednerd
Copy link
Contributor

It looks like there's some super-weirdness going on because the threaded cache test is hard-coded to use an in-memory cache, but there's a bunch of other SQLite caches going on.

I think this is #993, but I've said that before and been wrong. Let's spin the Wheel of CI!

@designatednerd
Copy link
Contributor Author

This is soooort of a cheat but @RolandasRazma @AnthonyMDev I think this is where the problem is coming from in terms of the tests being super flaky. I know there's A LOT more underlying stuff but this at least gets us to something shippable.

@RolandasRazma
Copy link
Contributor

but that just hide the problem, no?

@designatednerd
Copy link
Contributor Author

I mean, it does hide the problem - the issue is that I don't think the problem this is causing is a real one. It's only crashing out when trying to run watcher on an in-memory store with SQLite tests. I think what's happening is that there's stuff still finishing on other threads from other tests, and that's causing overcalling of the closures.

We definitely still need to take a look at what the shit is going on with watchers, particularly in regards to SQLite, that's making it do this, but I don't think it has anything to do with the changes that have been merged recently.

@designatednerd
Copy link
Contributor Author

Also I tried adding an autoreleasepool and there were definitely still a lot of instances where one or more of the three completion closures would get called more than 1001 times on the SQLite tests and only on the SQLite tests.

That's what, at the top level, is causing this crash, since you have to enter -> leave the same number of times for a group.

@RolandasRazma
Copy link
Contributor

I guess as it allows to move forward its fine, but I have bad feeling about this :)

@designatednerd
Copy link
Contributor Author

What's your specific concern in terms of what this could be hiding? I am more than willing to admit it if I'm wrong, but I need something more specific to investigate

@RolandasRazma
Copy link
Contributor

not saying you wrong, it just doesn't "feel right". Not talking in particular about this PR, more about "completion closures would get called more than 1001 times"

@designatednerd
Copy link
Contributor Author

Right- this test has definitely uncovered a problem, the question is whether it's a new problem or an old one.

@AnthonyMDev
Copy link
Contributor

Yeah. This 100% just hides a problem, not solves it. If you need to stop the flaky test from being flaky for now, go for it. But this is an issue that should still be looked into.

This test should actually be fine to run on the SQL Cache as well and it should pass or else there are other race conditions there.

If watch is being overcalled, that's another bug too... but regardless, I have no idea why watching an in memory cache would get overcalled from other tests running on a client using a totally separate store and cache for SQL. They shouldn't be interacting with each other at all? Am I missing something.

Lastly, if a test only fails when run with other tests, it means you have not isolated your tests properly. That's another big issue. Code from the SQL tests should have all finished executing before this test starts, and this test should be done with all threads before the next test starts. Threads from different tests should never be executing in parallel.

@designatednerd
Copy link
Contributor Author

I think there's definitely other race conditions going on in the SQLite stuff for sure. For what it's worth it is failing in isolation (though not consistently, similarly to how it's not failing consistently on CI), so I'm likely wrong on the cause of the over-calling.

Looking at the changes that touched anything anywhere near this in the comparison to 0.21.0, the only things that have changed around the store or cache were the PR that added this test and one around fixing a cache key bug that was hitting the failures in the same place as well.

@AnthonyMDev @RolandasRazma would you all be okay with me opening a new issue to come back to this problem since it appears to be existing incorrect behavior? Or is there something you're seeing here where the changes we've made could have caused this to happen?

@AnthonyMDev
Copy link
Contributor

Sounds good. Definitely should be another issue opened to address this. I wish I didn't have a full time job so I could really dig into this for you haha. My responsibilities at work are so crazy right now. :/

@designatednerd
Copy link
Contributor Author

LOL this is my full time job and I still don't have time to look into it, so I feel you.

@designatednerd
Copy link
Contributor Author

#1011 now covers the issues being glossed over here, and is tagged as help-wanted because seriously, what on earth is going on here.

Gonna merge this one.

@designatednerd designatednerd merged commit 56072f0 into master Feb 12, 2020
@RolandasRazma
Copy link
Contributor

Hi, I would definitely look into it if we would be seeing something odd in app, but otherwise it's as always - "not a priority"... You definitely should open issue with pointer what to disable (this PR) for tests to start failing again

@designatednerd
Copy link
Contributor Author

@RolandasRazma Yep, #1011!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants