
One-node clusters with a yellow health cannot be upgraded #4625

Closed
sebgl opened this issue Jul 9, 2021 · 2 comments · Fixed by #4787
Labels
>bug Something isn't working

Comments

@sebgl
Contributor

sebgl commented Jul 9, 2021

Deploy a one-node cluster and ingest data into it so that the cluster becomes yellow (indices configured with one replica that cannot be allocated on the single node).
One quick way to do that is to deploy a cluster with stack monitoring enabled, with the monitoring data sent to the cluster itself.
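
For illustration, a minimal sketch of such a manifest, assuming the spec.monitoring fields available in recent ECK versions; the cluster name and version are placeholders:

```yaml
# Sketch only: a single-node cluster that ships its own monitoring data to itself.
# As described above, the indices created for the monitoring data end up with an
# unassignable replica on the lone node, so the cluster health turns yellow.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.14.0
  monitoring:
    metrics:
      elasticsearchRefs:
      - name: quickstart   # self-reference: metrics are sent to this same cluster
    logs:
      elasticsearchRefs:
      - name: quickstart   # self-reference: logs are sent to this same cluster
  nodeSets:
  - name: default
    count: 1
```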

Once that single-node cluster is yellow, there is no way to upgrade it anymore. Any change to the Elasticsearch spec that requires a rolling upgrade will wait until the cluster becomes green:

2021-07-09T11:37:03.750+0200	INFO	driver	Cannot restart some nodes for upgrade at this time	{"service.version": "1.7.0-SNAPSHOT+30579acb", "namespace": "default", "es_name": "quickstart", "failed_predicates": {"if_yellow_only_restart_upgrading_nodes_with_unassigned_replicas":["quickstart-es-default-0"]}}

I think we should make an exception for single-node clusters. There is no point in waiting for them to report a green health before rotating the single Pod, since we know availability will be broken anyway.

@barkbay
Contributor

barkbay commented Sep 9, 2021

Sorry, I'm late to the party. IIUC, a generic way to think about this problem is: there are n nodes in a tier but number_of_replicas is set to m >= n. For example, an index with 1 replica moving to the warm tier while there is only 1 node in that tier ends up in the same situation. I can also think of another case where there is 1 data node and 1 dedicated ML node. Do we want to allow the upgrade, given that the ML node can't hold the shard?
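
As an illustration of the tier case (a sketch only; the name, version, and role lists are arbitrary examples), a hot-warm topology where the warm tier consists of a single node:

```yaml
# Sketch: the warm "tier" is a single node. An index with one replica that is
# allocated to the warm tier (for example via
# index.routing.allocation.include._tier_preference) can never assign its replica,
# so the cluster stays yellow even though it has three nodes in total.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: hot-warm
spec:
  version: 7.14.0
  nodeSets:
  - name: hot
    count: 2
    config:
      node.roles: ["master", "data_hot", "data_content", "ingest"]
  - name: warm
    count: 1
    config:
      node.roles: ["data_warm"]
```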

I believe we don't want to go into that level of detail, and should just allow a single, multi-role node to be upgraded. However, it would be great to mention this exception in our documentation: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-orchestration.html#k8s-orchestration-limitations as it might raise some questions about which cases are supported:


  • Rolling upgrades can only make progress if the Elasticsearch cluster health is green. There are exceptions to this rule if the cluster health is yellow and if the following conditions are satisfied:

    • A cluster version upgrade is in progress and some Pods are not up to date.
    • There are no initializing or relocating shards.
    • other conditions/exceptions here

@CamiloSierraH
Contributor

Hey @barkbay !!

Do we want to allow the upgrade, given that the ML node can't hold the shard?

I agree with you 100%. I think this will be great for monitoring clusters: most of them use a single node and users do not change the default replica value, so most of them are yellow (this is what we see about 90% of the time in support).

As you mentioned, we have ML nodes, data tiers, and also dedicated coordinating or dedicated master nodes.
In my opinion, the problem with handling all of those cases is that we also have the shard allocation filtering feature, which was used before we released the data tier roles (so it is well known by the community and remains relevant in the latest versions). That allocation filtering would be a nightmare to manage, since a user can set any number of attributes, each with any number of different values, and then assign those to indices.
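
To make that concrete, a sketch with arbitrary, user-chosen attribute names and values; the matching per-index constraint (for example index.routing.allocation.require.zone) is set through the index settings API and is not visible in the manifest:

```yaml
# Sketch: user-defined allocation attributes. The operator only sees opaque
# key/value pairs here; the constraints they imply live in per-index settings
# such as index.routing.allocation.require.zone: zone-a, so it cannot easily
# predict whether a replica could ever be allocated elsewhere.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: filtered
spec:
  version: 7.14.0
  nodeSets:
  - name: zone-a
    count: 1
    config:
      node.attr.zone: zone-a
  - name: zone-b
    count: 1
    config:
      node.attr.zone: zone-b
```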

I will create a new PR to add the details to the documentation, either mentioning that rolling upgrades need a green cluster or listing the exceptions that exist in the upgrade_predicates.go file.

I'll keep you posted!
