Enable to sort kube_control_plane including the first node #9866

Conversation

@HoKim98 (Contributor) commented Mar 8, 2023

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

The first Kubernetes control plane node plays a critical role in the overall Kubespray workflow. However, if the first node falls out of step with the others (reordered in the inventory, or out of service), the Kubernetes cluster configuration can fail catastrophically; for example, certificates can be lost through an accidental renewal of the etcd and Kubernetes SSL keys.

Keeping the first node's position stable is therefore very important, but human error can always break it. So I want to force the dynamic discovery logic to treat one of the still-functioning Kubernetes control plane nodes as the first node, so that node ordering cannot cause these issues in the first place.

Fortunately, I found that such attempts have already been made in a few places; this PR generalizes them. (#7989)

> BEFORE

  • groups['kube_control_plane'][0]
  • groups['kube_control_plane'] | first

> AFTER

  • first_kube_control_plane
    • If one or more control plane nodes are already running: choose the first of them
    • If no nodes are running (fresh install): choose the first entry of kube_control_plane (see the sketch below)
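
A minimal sketch of this selection logic, assuming a hypothetical running_control_planes list filled in by an earlier discovery step (the PR's actual variable and task names may differ):

- name: Select the first functioning control plane node
  set_fact:
    first_kube_control_plane: >-
      {{ (groups['kube_control_plane'] | select('in', running_control_planes) | list
          or groups['kube_control_plane']) | first }}
  vars:
    # running_control_planes is a hypothetical list of control plane hosts that
    # answered an earlier liveness probe; it is empty on a fresh install, so the
    # expression falls back to the plain inventory order.
    running_control_planes: "{{ discovered_running_nodes | default([]) }}"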

And if the control planes are changed via cluster.yml etc., the cluster-info ConfigMap is automatically updated to point at the first control plane node.
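
As a hedged sketch of what that update amounts to (updated_kubeconfig is a hypothetical variable holding the rewritten kubeconfig; the PR's actual tasks appear in the diffs below):

- name: Point the cluster-info ConfigMap at the first control plane
  command: >-
    {{ kubectl }} -n kube-public patch configmap cluster-info
    --type merge -p {{ {'data': {'kubeconfig': updated_kubeconfig}} | to_json | quote }}
  when: inventory_hostname == first_kube_control_plane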

> BUG FIXES

Which issue(s) this PR fixes:

Fixes #3471 (only for kube_control_plane)
Fixes #9863

Special notes for your reviewer:

This PR changes many usages of kube_control_plane | first.

Because so many scenarios are affected, extensive testing is needed; I cannot be sure these changes are complete yet, so I would like to review them with others.

Does this PR introduce a user-facing change?:

The order of kube_control_plane can now be changed freely.
The cluster-info ConfigMap is automatically updated to point at the first kube_control_plane.

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 8, 2023
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 8, 2023
@k8s-ci-robot (Contributor)

Hi @kerryeon. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Mar 8, 2023
@HoKim98 HoKim98 force-pushed the raplace-first-kube-control-plane branch from c92bb0d to e822eec Compare March 8, 2023 08:57
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 8, 2023
@@ -158,7 +158,7 @@
     filename: "{{ kube_config_dir }}/kdd-crds.yml"
     state: "latest"
   when:
-    - inventory_hostname == groups['kube_control_plane'][0]
+    - inventory_hostname == first_kube_control_plane
Review comment (Member):

It is not appropriate to apply these yaml files on every node here, because there may be conflicts and errors when applying them concurrently, and it also wastes time.
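
A generic sketch of the pattern the reviewer is defending, reusing the {{ kubectl }} and {{ kube_config_dir }} variables seen elsewhere in this PR (not the PR's actual task): run cluster-scoped applies from exactly one host so concurrent applies cannot conflict.

- name: Apply kdd-crds.yml from a single control plane node
  command: "{{ kubectl }} apply -f {{ kube_config_dir }}/kdd-crds.yml"
  run_once: true
  delegate_to: "{{ groups['kube_control_plane'] | first }}"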

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 11, 2023
@k8s-ci-robot (Contributor)

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 9, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 9, 2023
@@ -0,0 +1,83 @@
---
- name: Load ca.crt
  shell: set -o pipefail && cat "{{ kube_apiserver_client_cert }}" | base64 --wrap=0
Review comment (Contributor):

That could use slurp rather than shell. If you have unwanted wrapping, just remove it with a filter.

  changed_when: false
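
A minimal sketch of that slurp alternative (the register and fact names here are hypothetical):

- name: Load ca.crt
  slurp:
    src: "{{ kube_apiserver_client_cert }}"
  register: apiserver_client_cert

# ansible.builtin.slurp returns the file base64-encoded in .content as a
# single line, so no `base64 --wrap=0` (and no shell) is needed.
- name: Expose the certificate content as a fact
  set_fact:
    kube_apiserver_client_cert_b64: "{{ apiserver_client_cert.content }}"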

- name: Get current kubernetes cluster-info
  command: "{{ kubectl }} -n kube-public get configmap cluster-info -o jsonpath --template '{.data.kubeconfig}'"
Review comment (Contributor):
Same here, why not use k8s.core.k8s_info?

- kube-cluster-info
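
A sketch of that suggestion, assuming the kubernetes.core collection and the Python kubernetes client are available on the target host (the register name is hypothetical):

- name: Get current kubernetes cluster-info
  kubernetes.core.k8s_info:
    kind: ConfigMap
    namespace: kube-public
    name: cluster-info
  register: cluster_info_cm

# The kubeconfig payload is then cluster_info_cm.resources[0].data.kubeconfig,
# with no jsonpath parsing needed.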

- name: Update kubernetes cluster-info
  command:
Review comment (Contributor):
Same as above.
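
A sketch of the same substitution for the update task, again assuming kubernetes.core is available (updated_kubeconfig is a hypothetical variable):

- name: Update kubernetes cluster-info
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: cluster-info
        namespace: kube-public
      data:
        kubeconfig: "{{ updated_kubeconfig }}"  # hypothetical: the rewritten kubeconfig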

Comment on lines -551 to +567
-kube_apiserver_global_endpoint: |-
-  {% if loadbalancer_apiserver is defined -%}
-  https://{{ apiserver_loadbalancer_domain_name }}:{{ loadbalancer_apiserver.port|default(kube_apiserver_port) }}
-  {%- elif loadbalancer_apiserver_localhost and (loadbalancer_apiserver_port is not defined or loadbalancer_apiserver_port == kube_apiserver_port) -%}
-  https://localhost:{{ kube_apiserver_port }}
-  {%- else -%}
-  https://{{ first_kube_control_plane_address }}:{{ kube_apiserver_port }}
-  {%- endif %}
-kube_apiserver_endpoint: |-
-  {% if loadbalancer_apiserver is defined -%}
-  https://{{ apiserver_loadbalancer_domain_name }}:{{ loadbalancer_apiserver.port|default(kube_apiserver_port) }}
-  {%- elif not is_kube_master and loadbalancer_apiserver_localhost -%}
-  https://localhost:{{ loadbalancer_apiserver_port|default(kube_apiserver_port) }}
-  {%- elif is_kube_master -%}
-  https://{{ kube_apiserver_bind_address | regex_replace('0\.0\.0\.0','127.0.0.1') }}:{{ kube_apiserver_port }}
-  {%- else -%}
-  https://{{ first_kube_control_plane_address }}:{{ kube_apiserver_port }}
-  {%- endif %}
+# kube_apiserver_global_endpoint: |-
+#   {% if loadbalancer_apiserver is defined -%}
+#   https://{{ apiserver_loadbalancer_domain_name }}:{{ loadbalancer_apiserver.port|default(kube_apiserver_port) }}
+#   {%- elif loadbalancer_apiserver_localhost and (loadbalancer_apiserver_port is not defined or loadbalancer_apiserver_port == kube_apiserver_port) -%}
+#   https://localhost:{{ kube_apiserver_port }}
+#   {%- else -%}
+#   https://{{ first_kube_control_plane_address }}:{{ kube_apiserver_port }}
+#   {%- endif %}
+# kube_apiserver_endpoint: |-
+#   {% if loadbalancer_apiserver is defined -%}
+#   https://{{ apiserver_loadbalancer_domain_name }}:{{ loadbalancer_apiserver.port|default(kube_apiserver_port) }}
+#   {%- elif not is_kube_master and loadbalancer_apiserver_localhost -%}
+#   https://localhost:{{ loadbalancer_apiserver_port|default(kube_apiserver_port) }}
+#   {%- elif is_kube_master -%}
+#   https://{{ kube_apiserver_bind_address | regex_replace('0\.0\.0\.0','127.0.0.1') }}:{{ kube_apiserver_port }}
+#   {%- else -%}
+#   https://{{ first_kube_control_plane_address }}:{{ kube_apiserver_port }}
+#   {%- endif %}
Review comment (Contributor):
Why are you converting this to a fact? Since variables are lazy, that should not be necessary.
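
For context, a minimal self-contained illustration (a hypothetical playbook, not from the PR) of what "variables are lazy" means: a plain variable is re-templated on each reference, so set_fact is only needed when a value must be frozen at a specific point in the play.

- hosts: localhost
  gather_facts: false
  vars:
    # Re-templated on every reference, so each use runs `date +%s` again.
    now_lazy: "{{ lookup('pipe', 'date +%s') }}"
  tasks:
    - name: Freeze the value at this point in the play
      set_fact:
        now_frozen: "{{ now_lazy }}"
    - name: Both values are usable; only the fact is pinned
      debug:
        msg: "lazy={{ now_lazy }} frozen={{ now_frozen }}"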

Comment on lines +31 to +37
+- name: create fallback_ips_parsed
+  set_fact:
+    fallback_ips_parsed: "{{ hostvars.localhost.fallback_ips_base | from_yaml }}"
+
 - name: set fallback_ips
   set_fact:
-    fallback_ips: "{{ hostvars.localhost.fallback_ips_base | from_yaml }}"
+    fallback_ips: "{{ fallback_ips_parsed if fallback_ips_parsed is mapping else {} }}"
Review comment (Contributor):
It's not super clear why you need to do this, could you explain it?
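
One plausible reading, offered as an assumption rather than the author's confirmed intent: from_yaml returns None for empty input, and the mapping check guards against that.

- name: set fallback_ips defensively
  set_fact:
    fallback_ips: "{{ parsed if parsed is mapping else {} }}"
  vars:
    # '' | from_yaml yields None, which is not a mapping, so the guard falls
    # back to an empty dict instead of breaking later dict lookups.
    parsed: "{{ hostvars.localhost.fallback_ips_base | default('') | from_yaml }}"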

-when: inventory_hostname == groups['kube_control_plane'][0]
+when: inventory_hostname == groups['kube_control_plane'] | first
Review comment (Contributor):
Is that necessary to make the PR work? It does not change the semantics, right?

If not, I'd drop those; the PR is already big enough! ^^

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kerryeon
Once this PR has been reviewed and has the lgtm label, please assign cristicalin for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@VannTen (Contributor) commented Jan 19, 2024 via email

@k8s-ci-robot k8s-ci-robot reopened this Jan 19, 2024
@k8s-ci-robot (Contributor)

@VannTen: Reopened this PR.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 18, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 18, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
