
Extend full upgrade to any version upgrade of non-HA ES #5408

Merged
merged 5 commits on Feb 24, 2022

Conversation

pebrc
Collaborator

@pebrc pebrc commented Feb 23, 2022

Elasticsearch documentation implies that two-node clusters cannot be upgraded in a rolling fashion, because doing so would remove more than 50% of the master-eligible nodes during the upgrade. A stricter validation of this invariant, introduced in 7.14, was causing the failures we have seen in our e2e test pipelines once we had fixed the faulty non-HA upgrade test. This affects any 7.x version, but as of 7.14 it is a hard error and the nodes will not boot up.
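The majority constraint behind this can be sketched with a few lines of Go (illustrative only; `quorum` is not a function from this codebase): a master election needs a strict majority of the master-eligible nodes, so a two-node cluster cannot lose a single node, while a three-node cluster can.

```go
package main

import "fmt"

// quorum returns the minimum number of master-eligible nodes that must stay
// available for the cluster to keep electing a master: a strict majority.
func quorum(masterNodes int) int {
	return masterNodes/2 + 1
}

func main() {
	for _, n := range []int{2, 3} {
		q := quorum(n)
		// A rolling upgrade takes one node down at a time, leaving n-1.
		fmt.Printf("%d masters: quorum=%d, %d left after removing one, quorum held: %v\n",
			n, q, n-1, n-1 >= q)
	}
}
```

With two masters the quorum is also two, so removing either node during a rolling restart breaks the voting majority; that is why the full-restart path below is needed.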

In light of this information, this PR extends the full-restart upgrade mode to any version upgrade of non-HA Elasticsearch clusters. It also adjusts the setting of initial master nodes during the 6.x-to-7.x transition to include two-node clusters. This gives us a uniform way of handling those upgrades.

There is still a case where an upgrade can get stuck that this PR does not cover: a user combining a version upgrade with an upscale/downscale operation, e.g. by bumping the Elasticsearch version and renaming the nodeSet at the same time. This leads to something that looks like a rolling upgrade, as we remove the old nodes and add new nodes one by one. The upgrade can still get stuck in that scenario, and I don't see any easy way to fix it (other than through documentation).

One potential way of how the upgrade might get stuck is as follows:

  • we create a new node running the new version, but the image needs to be pulled or the startup is otherwise delayed
  • we delete one of the existing nodes (in accordance with the change budget)
  • the old node terminates before the new node joins the cluster; if the old node happened to be the master at the time, the cluster now breaks down

I think this scenario can be avoided by setting a change budget with maxUnavailable: 0 or by avoiding the combination of nodeSet addition/removal with a version upgrade.
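For reference, the change-budget workaround mentioned above would look roughly like this in an Elasticsearch manifest (a minimal sketch; field names follow the ECK `Elasticsearch` CRD, and the cluster name and version are placeholders):

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: example
spec:
  version: 7.17.0
  updateStrategy:
    changeBudget:
      maxSurge: 1
      # never take an old node down before its replacement has joined
      maxUnavailable: 0
  nodeSets:
  - name: default
    count: 3
```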

@pebrc pebrc added the >enhancement Enhancement of existing functionality label Feb 23, 2022
@barkbay
Contributor

barkbay commented Feb 24, 2022

the old node terminates before the new node joins the cluster and if the old node happened to be the master at the time the cluster now breaks down

I think I'm missing something: assuming the upscaled+renamed nodeSet is configured with 3 replicas, we would still have 2 nodes running, which I think would be enough to trigger a new election?
If the upscaled+renamed nodeSet is configured with 2 replicas, it would then be restarted because isNonHACluster would return true. Both situations seem to be covered here?

@pebrc
Collaborator Author

pebrc commented Feb 24, 2022

If the upscaled+renamed nodeSet is configured with 2 replicas, it would then be restarted because isNonHACluster would return true. Both situations seem to be covered here?

I think the problem is that once you rename a nodeSet, we are no longer executing the upgrade path of the reconciliation loop; instead we have two independent operations that are only coordinated through the maxUnavailable change budget: a downscale of the old nodeSet and an upscale of the new one. Neither the downscale nor the upscale logic benefits from the changes in this PR, so we are again in a situation where things can go wrong.
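For context, the check named in the discussion can be sketched roughly like this (the name `isNonHACluster` is taken from the conversation above, but the signature here is illustrative; the real function inspects the cluster spec, not a plain count). The issue described in this comment is that the independent downscale/upscale paths never go through this upgrade-specific logic:

```go
package main

import "fmt"

// isNonHACluster sketches the idea discussed above: with fewer than three
// master-eligible nodes, a rolling restart cannot preserve a voting majority,
// so the operator must fall back to a full restart instead.
func isNonHACluster(masterEligibleNodes int) bool {
	return masterEligibleNodes < 3
}

func main() {
	fmt.Println(isNonHACluster(2)) // two-node cluster: full-restart path
	fmt.Println(isNonHACluster(3)) // three masters: rolling upgrade is safe
}
```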

Contributor

@barkbay barkbay left a comment


LGTM from a code perspective.
I'm running TestVersionUpgrade* e2e tests locally, will ✅ once it's done.

@@ -75,7 +76,7 @@ func shouldSetInitialMasterNodes(es esv1.Elasticsearch, k8sClient k8s.Client, no
return true, nil
}
// - we're upgrading (effectively restarting) a single zen1 master to zen2
Contributor


nit: I think this comment should be updated

@naemono naemono added the v2.1.0 label Feb 24, 2022
@barkbay
Contributor

barkbay commented Feb 24, 2022

TestVersionUpgradeTwoNodes68xTo7x and TestVersionUpgradeTwoNodesToLatest7x failed:

=== RUN   TestVersionUpgradeTwoNodes68xTo7x/Stopping_to_pod_count_for_change_budget:_expect_to_stay_within_the_change_budget
    checks_budget.go:105: 
        	Error Trace:	checks_budget.go:105
        	            				watcher.go:83
        	Error:      	change budget violated
        	Test:       	TestVersionUpgradeTwoNodes68xTo7x/Stopping_to_pod_count_for_change_budget:_expect_to_stay_within_the_change_budget
        	Messages:   	ready pod count 0 when allowed min was 1
=== RUN   TestVersionUpgradeTwoNodesToLatest7x/Stopping_to_pod_count_for_change_budget:_expect_to_stay_within_the_change_budget
    checks_budget.go:105: 
        	Error Trace:	checks_budget.go:105
        	            				watcher.go:83
        	Error:      	change budget violated
        	Test:       	TestVersionUpgradeTwoNodesToLatest7x/Stopping_to_pod_count_for_change_budget:_expect_to_stay_within_the_change_budget
        	Messages:   	ready pod count 0 when allowed min was 1

I think we have to use the WithChangeBudget(2, 2) "trick":

diff --git a/test/e2e/es/version_upgrade_test.go b/test/e2e/es/version_upgrade_test.go
index 113fac858..63b9e8705 100644
--- a/test/e2e/es/version_upgrade_test.go
+++ b/test/e2e/es/version_upgrade_test.go
@@ -55,6 +55,10 @@ func TestVersionUpgradeTwoNodes68xTo7x(t *testing.T) {
        // due to minimum_master_nodes=2, the cluster is unavailable while the first master is upgraded
        initial := elasticsearch.NewBuilder("test-version-up-2-68x-to-7x").
                WithVersion(srcVersion).
+               // 7x non-HA upgrades cannot honour a change budget with maxUnavailable 1 because we are rolling both nodes at once
+               // setting the change budget accordingly allows this test to pass as otherwise the changeBudgetWatcher step would
+               // fail as it would detect a change budget violation.
+               WithChangeBudget(2, 2).
                WithESMasterDataNodes(2, elasticsearch.DefaultResources)
 
        mutated := initial.WithNoESTopology().
@@ -173,6 +177,10 @@ func TestVersionUpgradeTwoNodesToLatest7x(t *testing.T) {
 
        initial := elasticsearch.NewBuilder("test-version-up-2-to-7x").
                WithVersion(srcVersion).
+               // 7x non-HA upgrades cannot honour a change budget with maxUnavailable 1 because we are rolling both nodes at once
+               // setting the change budget accordingly allows this test to pass as otherwise the changeBudgetWatcher step would
+               // fail as it would detect a change budget violation.
+               WithChangeBudget(2, 2).
                WithESMasterDataNodes(2, elasticsearch.DefaultResources)
 
        mutated := initial.WithNoESTopology().

@pebrc
Collaborator Author

pebrc commented Feb 24, 2022

I think we have to use the WithChangeBudget(2, 2) "trick"

Good catch. I thought I tested that but apparently not 😞

Contributor

@barkbay barkbay left a comment


LGTM !

@pebrc pebrc merged commit bc4e28a into elastic:main Feb 24, 2022
fantapsody pushed a commit to fantapsody/cloud-on-k8s that referenced this pull request Feb 7, 2023