Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

watcher: Fix integration tests to ensure correct start/stop of Watcher #35271

Merged
merged 5 commits into from
Nov 7, 2018

Conversation

jakelandis
Copy link
Contributor

@jakelandis jakelandis commented Nov 5, 2018

Ensure that Watcher is correctly started and stopped between tests for
SmokeTestWatcherWithSecurityIT,
SmokeTestWatcherWithSecurityClientYamlTestSuiteIT,
SmokeTestWatcherTestSuiteIT, WatcherRestIT,
XDocsClientYamlTestSuiteIT, and XPackRestIT

The change here is to throw an AssertionError instead of break; to
allow the assertBusy() to continue to busy wait until the desired
state is reached.

closes #33291, closes #29877, closes #34462, closes #30705, closes #34448

more details: #33291 (comment)

Note - I wasn't able to reproduce the actual test failures, but this is a strong candidate as the root cause and optimistically closing the open issues

EDIT: updated list of impacted classes
EDIT2: added XDocsClientYamlTestSuiteIT, and XPackRestIT and optimistically closing #34448

Ensure that Watcher is correctly started and stopped between tests for
SmokeTestWatcherWithSecurityIT and
SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.

The change here is to throw an `AssertionError` instead of `break;` to
allow the `assertBusy()` to continue to busy wait until the desired
state is reached.

closes elastic#33291
closes elastic#29877
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@spinscale
Copy link
Contributor

how did you relate #34462 to this?

@jakelandis
Copy link
Contributor Author

@spinscale - thanks for the review. I just realized I missed the same issue in two different classes, can you take another look ? e230732

how did you relate #34462 to this?

The association is merely that it is the impacted class and 503 hints at something not started. However, there isn't enough information on the issue to take any action (the stack trace is gone) and it is not reproducible, so I think it best to close it with this change since it impacts the this class and something to do with a service not started. If it occurs again, we can re-open.

FWIW, this is generally the logic I applied to all in the closes list.

@spinscale
Copy link
Contributor

LGTM if CI is happy, thanks for peeking into it!

@jakelandis
Copy link
Contributor Author

@spinscale - apologies for the death by a thousand pin pricks here... I found two more places. 8d75479

Also, add #34448 to the optimistic list of closes.

@jakelandis jakelandis changed the title watcher: Fix Smoketests to ensure correct start/stop of Watcher watcher: Fix integration tests to ensure correct start/stop of Watcher Nov 6, 2018
@jakelandis jakelandis merged commit d7a4cef into elastic:master Nov 7, 2018
jasontedor added a commit to martijnvg/elasticsearch that referenced this pull request Nov 8, 2018
* master: (24 commits)
  Replicate index settings to followers (elastic#35089)
  Rename RealmConfig.globalSettings() to settings() (elastic#35330)
  [TEST] Cleanup FileUserPasswdStoreTests (elastic#35329)
  Scripting: Add back lookup vars in score script (elastic#34833)
  watcher: Fix integration tests to ensure correct start/stop of Watcher (elastic#35271)
  Remove ALL shard check in CheckShrinkReadyStep (elastic#35346)
  Use soft-deleted docs to resolve strategy for engine operation (elastic#35230)
  [ILM] Check shard and relocation status in AllocationRoutedStep (elastic#35316)
  Ignore date ranges containing 'now' when pre-processing a percolator query (elastic#35160)
  Add a frozen engine implementation (elastic#34357)
  Put a fake allocation id on allocate stale primary command (elastic#34140)
  [CCR] Enforce auto follow pattern name restrictions (elastic#35197)
  [ILM] rolling upgrade tests (elastic#35328)
  [ML] Add Missing data checking class (elastic#35310)
  Apply `ignore_throttled` also to concrete indices (elastic#35335)
  Make version field names more meaningful  (elastic#35334)
  [CCR] Added HLRC support for pause follow API (elastic#35216)
  [Docs] Improve Convert Processor description (elastic#35280)
  [Painless] Removes extraneous compile method (elastic#35323)
  [CCR] Fail with a better error if leader index is red (elastic#35298)
  ...
@alpar-t
Copy link
Contributor

alpar-t commented Nov 8, 2018

@jakelandis I saw this failure on 6.5 also, please consider back-porting there as well.
I was about to do the back-ports but held off as there's a possibly related failure I described in #35361.

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Nov 8, 2018
* elastic/master: (25 commits)
  Fixes fast vector highlighter docs per issue 24318. (elastic#34190)
  [ML] Prevent notifications on deletion of a non existent job (elastic#35337)
  [CCR] Auto follow Coordinator fetch cluster state in system context (elastic#35120)
  Mute test for elastic#35361
  Preserve `date_histogram` format when aggregating on unmapped fields (elastic#35254)
  Test: Mute failing SSL test
  Allow unmapped fields in composite aggregations (elastic#35331)
  [RCI] Add IndexShardOperationPermits.asyncBlockOperations(ActionListener<Releasable>) (elastic#34902)
  HLRC: reindex API with wait_for_completion false (elastic#35202)
  Add docs on JNA temp directory not being noexec (elastic#35355)
  [CCR] Adjust list of dynamic index settings that should be replicated (elastic#35195)
  Replicate index settings to followers (elastic#35089)
  Rename RealmConfig.globalSettings() to settings() (elastic#35330)
  [TEST] Cleanup FileUserPasswdStoreTests (elastic#35329)
  Scripting: Add back lookup vars in score script (elastic#34833)
  watcher: Fix integration tests to ensure correct start/stop of Watcher (elastic#35271)
  Remove ALL shard check in CheckShrinkReadyStep (elastic#35346)
  Use soft-deleted docs to resolve strategy for engine operation (elastic#35230)
  [ILM] Check shard and relocation status in AllocationRoutedStep (elastic#35316)
  Ignore date ranges containing 'now' when pre-processing a percolator query (elastic#35160)
  ...
jakelandis added a commit that referenced this pull request Nov 8, 2018
#35271)

Ensure that Watcher is correctly started and stopped between tests for
SmokeTestWatcherWithSecurityIT,
SmokeTestWatcherWithSecurityClientYamlTestSuiteIT,
SmokeTestWatcherTestSuiteIT, WatcherRestIT,
XDocsClientYamlTestSuiteIT, and XPackRestIT

The change here is to throw an `AssertionError` instead of `break;` to
allow the `assertBusy()` to continue to busy wait until the desired
state is reached.

closes #33291, closes #29877, closes #34462, closes #30705, closes #34448
jakelandis added a commit that referenced this pull request Nov 8, 2018
#35271)

Ensure that Watcher is correctly started and stopped between tests for
SmokeTestWatcherWithSecurityIT,
SmokeTestWatcherWithSecurityClientYamlTestSuiteIT,
SmokeTestWatcherTestSuiteIT, WatcherRestIT,
XDocsClientYamlTestSuiteIT, and XPackRestIT

The change here is to throw an `AssertionError` instead of `break;` to
allow the `assertBusy()` to continue to busy wait until the desired
state is reached.

closes #33291, closes #29877, closes #34462, closes #30705, closes #34448
@jakelandis
Copy link
Contributor Author

@atorok pushed this back to 6.5 and 6.x, as on #35361 (comment) the linked failure was for build prior to including these changes. While not perfectly confident this change fixes all the issue, I am confident it does not introduce new issues.

pgomulka pushed a commit to pgomulka/elasticsearch that referenced this pull request Nov 13, 2018
elastic#35271)

Ensure that Watcher is correctly started and stopped between tests for
SmokeTestWatcherWithSecurityIT,
SmokeTestWatcherWithSecurityClientYamlTestSuiteIT,
SmokeTestWatcherTestSuiteIT, WatcherRestIT,
XDocsClientYamlTestSuiteIT, and XPackRestIT

The change here is to throw an `AssertionError` instead of `break;` to
allow the `assertBusy()` to continue to busy wait until the desired
state is reached.

closes elastic#33291, closes elastic#29877, closes elastic#34462, closes elastic#30705, closes elastic#34448
@jkakavas
Copy link
Member

@jakelandis
Copy link
Contributor Author

@jkakavas - I had optimistically closed a few of the non-reproducible issue that may have been fixed with this change. I re-opened #29877 (comment) which has the same signature as the referenced build failure.

pgomulka added a commit that referenced this pull request Jan 10, 2019
Reenabling the test on 6.x as  #35271 looks like fixed the problem. If that helps on 6.x it will be back ported to 6.6, 6.5 and previous versions where this test is muted.

closes #33753
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment