feat: change default blockSize for calico #9055

Merged
merged 1 commit into from
Jul 19, 2022


cyclinder
Contributor

@cyclinder cyclinder commented Jul 5, 2022

Signed-off-by: cyclinder qifeng.guo@daocloud.io

What type of PR is this?

/kind feature

What this PR does / why we need it:

According to the official Calico recommendation, the default value of calico_blocksize_ipv4 is 26 and that of calico_blocksize_ipv6 is 122.

The current default value of calico_blocksize_ipv4 in kubespray is 24. If the cluster pod CIDR has an 18-bit mask, there are at most 2^(24-18)=64 blocks of 256 addresses each. Because each node needs at least one block, this also limits the cluster to a maximum of 64 nodes; beyond that there will be problems, see projectcalico/calico#6160.
Also, by default each node runs at most 110 pods (kubelet_max_pods), so it cannot use up all 256 addresses.

So we should change the default value of calico_block_size to 26. With an 18-bit mask, there will then be at most 2^(26-18)=256 blocks of 64 addresses each. When a node has more than 64 pods, Calico assigns a second block to it (Calico assigns one or more blocks to each node).
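The block arithmetic above can be checked with a short sketch. This is only an illustration, not kubespray or Calico code: the function names `block_stats` and `blocks_needed` are hypothetical, and the pod CIDR `10.233.64.0/18` is an assumed example of an 18-bit pod network.

```python
import ipaddress
import math
from typing import Tuple


def block_stats(pod_cidr: str, block_size: int) -> Tuple[int, int]:
    """Return (number of blocks, addresses per block) for a pod CIDR.

    Hypothetical helper for illustration only.
    """
    net = ipaddress.ip_network(pod_cidr)
    if block_size < net.prefixlen:
        raise ValueError("block size must not be shorter than the pool prefix")
    num_blocks = 2 ** (block_size - net.prefixlen)
    addrs_per_block = 2 ** (net.max_prefixlen - block_size)
    return num_blocks, addrs_per_block


def blocks_needed(max_pods: int, block_size: int, max_prefixlen: int = 32) -> int:
    """Blocks a node needs to hold max_pods addresses (IPv4 by default)."""
    return math.ceil(max_pods / 2 ** (max_prefixlen - block_size))


POD_CIDR = "10.233.64.0/18"  # assumed example pod network with an 18-bit mask

print(block_stats(POD_CIDR, 24))  # (64, 256)
print(block_stats(POD_CIDR, 26))  # (256, 64)
print(blocks_needed(110, 26))     # 2
```

With a block size of 26, a node at the default limit of 110 pods needs two blocks, which matches the description above.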

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Adjust the default value of calico blockSize ipv4 to 26, and ipv6 to 122.

@k8s-ci-robot added the kind/feature, cncf-cla: yes, and size/S labels Jul 5, 2022
@k8s-ci-robot added the size/XS label and removed the size/S label Jul 5, 2022
@cristicalin
Contributor

Hi @cyclinder, good idea but it looks like you might have broken some assumption in the CI tests, please address the CI job failures so the PR can be accepted.

@cyclinder
Contributor Author

> Hi @cyclinder, good idea but it looks like you might have broken some assumption in the CI tests, please address the CI job failures so the PR can be accepted.

Thanks! Let me see what's wrong.

@cyclinder
Contributor Author

@cristicalin I found a ci bug:

- hosts: kube_node
  tasks:
    - name: Test tunl0 routes
      shell: "set -o pipefail && ! /sbin/ip ro | grep '/26 via' | grep -v tunl0"
      args:
        executable: /bin/bash
      when:
        - (ipip|default(true) or cloud_provider is defined)
        - kube_network_plugin|default('calico') == 'calico'

There are two changes needed here:

  • change grep '/26 via' to grep '/{{ calico_pool_blocksize | default(26) }} via'
  • change ipip|default(true) to calico_ipip_mode != 'Never'

The reason CI kept passing before is that the default calico_pool_blocksize was 24, so grep '/26 via' never matched anything.

If you agree, I'll fix it.
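To illustrate why the check passed vacuously, here is a minimal sketch that mimics the grep pipeline from the task in Python. The function name `offending_routes` and the sample route lines are hypothetical; they are not taken from the CI logs.

```python
from typing import List


def offending_routes(route_lines: List[str], block_size: int) -> List[str]:
    """Mimic: ip ro | grep '/<block_size> via' | grep -v tunl0.

    Returns block routes that do NOT go through tunl0. The CI task
    succeeds only when this list is empty.
    """
    needle = f"/{block_size} via"
    return [
        line for line in route_lines
        if needle in line and "tunl0" not in line
    ]


# Hypothetical routes on a node whose pool uses /24 blocks.
ROUTES = [
    "10.233.90.0/24 via 10.0.0.12 dev eth0 proto bird",
    "10.233.91.0/24 via 10.0.0.13 dev tunl0 proto bird onlink",
]

# Hard-coded /26: nothing matches, so the test passes without checking anything.
print(offending_routes(ROUTES, 26))  # []

# Using the real block size: the route that bypasses tunl0 is flagged.
print(offending_routes(ROUTES, 24))
```

Parameterizing the pattern with the configured block size is what makes the test exercise real routes.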

@k8s-ci-robot added the size/S label and removed the size/XS label Jul 6, 2022
@cyclinder
Contributor Author

https://github.com/kubernetes-sigs/kubespray/pull/9055/checks?check_run_id=7217609171

I don't know why the cluster created by this CI job still uses a 24-bit calico_pool_blockSize. This causes Calico to fail when reapplying the IP pool, since the default blockSize is now 26.

I checked the packet_debian11-calico job, which already uses a 26-bit block size.

Signed-off-by: cyclinder qifeng.guo@daocloud.io
@cyclinder
Contributor Author

Hi @cristicalin, all the tests passed, PTAL :)

@oomichi
Contributor

oomichi commented Jul 8, 2022

Nice work!

/lgtm

@k8s-ci-robot added the lgtm label Jul 8, 2022
@@ -19,7 +19,7 @@ calico_cni_name: k8s-pod-network
# calico_pool_name: "default-pool"

# add default ippool blockSize (defaults kube_network_node_prefix)
# calico_pool_blocksize: 24
calico_pool_blocksize: 26
Member

Is it appropriate to modify the existing env directly when the kubespray version is upgraded?

kubelet_max_pods: 250

Contributor Author

I think it's appropriate in most scenarios, but there may be some variables that can't be updated on an existing cluster, such as calico_pool_blocksize.

@cristicalin
Contributor

Thanks @cyclinder for this work!

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cristicalin, cyclinder

@k8s-ci-robot added the approved label Jul 19, 2022
@k8s-ci-robot merged commit 2e1863a into kubernetes-sigs:master Jul 19, 2022
@floryut mentioned this pull request Sep 19, 2022
nolimitkun pushed a commit to nolimitkun/kubespray that referenced this pull request Mar 19, 2023
LuckySB pushed a commit to southbridgeio/kubespray that referenced this pull request Jul 2, 2023
LuckySB pushed a commit to southbridgeio/kubespray that referenced this pull request Jul 7, 2023
@rptaylor
Contributor

rptaylor commented Jan 4, 2024

@cyclinder @cristicalin It looks like this causes upgrades to kubespray 2.20 to fail. When running the kubespray/upgrade-cluster.yml playbook:

TASK [network_plugin/calico : Check if inventory match current cluster configuration] ***************************************************************************************
fatal: [cluster-dev-k8s-master-nf-1]: FAILED! => {
    "assertion": "calico_pool_conf.spec.blockSize|int == (calico_pool_blocksize | default(kube_network_node_prefix) | int)",
    "changed": false,
    "evaluated_to": false,
    "msg": "Your inventory doesn't match the current cluster configuration"
}

For any cluster built with the default config before this change, calico_pool_conf.spec.blockSize is 24. But with this PR changing calico_pool_blocksize to 26 it causes the assertion to fail here: https://github.com/kubernetes-sigs/kubespray/blob/release-2.20/roles/network_plugin/calico/tasks/check.yml#L158

How are clusters meant to be upgraded to Kubespray 2.20.0? Is there a bug fix for this that can be cherry-picked?

Or is there a manual procedure that is required to update the block size of the calico pool? There is nothing in the 2.20 release notes about this.
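The failing assertion can be modelled with a small sketch. It is not the kubespray implementation: `inventory_matches_cluster` is a hypothetical name, `None` stands in for an undefined Ansible variable, and the fallback of 24 for kube_network_node_prefix is an assumption.

```python
from typing import Optional


def inventory_matches_cluster(
    cluster_block_size: int,
    calico_pool_blocksize: Optional[int] = None,
    kube_network_node_prefix: int = 24,  # assumed fallback value
) -> bool:
    """Model of the assertion:

    calico_pool_conf.spec.blockSize|int ==
        (calico_pool_blocksize | default(kube_network_node_prefix) | int)

    None plays the role of an undefined variable, which is when the
    Jinja default() filter takes effect.
    """
    expected = (
        calico_pool_blocksize
        if calico_pool_blocksize is not None
        else kube_network_node_prefix
    )
    return int(cluster_block_size) == int(expected)


# Cluster built with /24 blocks, inventory picks up the new default of 26.
print(inventory_matches_cluster(24, calico_pool_blocksize=26))  # False

# Same cluster, inventory explicitly set to the existing block size.
print(inventory_matches_cluster(24, calico_pool_blocksize=24))  # True
```

The first call corresponds to the upgrade failure reported above: the live pool still has blockSize 24 while the effective inventory value is 26.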

@rptaylor
Contributor

rptaylor commented Jan 4, 2024

Looks like #10516 could fix it but that isn't even in 2.24 yet.
I guess existing clusters need to keep the same value of calico_pool_blocksize in their inventory to avoid breakage from the change of default value?

@cyclinder
Contributor Author

> I guess existing clusters need to keep the same value of calico_pool_blocksize in their inventory to avoid breakage from the change of default value?

Yes, you should set calico_pool_blocksize explicitly in your inventory when you upgrade your cluster. Otherwise, calico_pool_conf.spec.blockSize will not match calico_pool_blocksize. We can consider cherry-picking #10516.
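As a rough sketch of that workaround, the snippet below derives the inventory line to pin from an IPPool resource in JSON form. The helper name `pinned_inventory_line` and the sample pool are hypothetical, and how the JSON is obtained (for example by exporting the pool with calicoctl) is an assumption, not something stated in this thread.

```python
import json


def pinned_inventory_line(ippool_json: str) -> str:
    """Build the inventory line that keeps the existing pool block size."""
    pool = json.loads(ippool_json)
    block_size = int(pool["spec"]["blockSize"])
    return f"calico_pool_blocksize: {block_size}"


# Hypothetical IPPool as exported from a cluster built before this change.
SAMPLE_IPPOOL = """
{
  "apiVersion": "projectcalico.org/v3",
  "kind": "IPPool",
  "metadata": {"name": "default-pool"},
  "spec": {"cidr": "10.233.64.0/18", "blockSize": 24}
}
"""

print(pinned_inventory_line(SAMPLE_IPPOOL))  # calico_pool_blocksize: 24
```

Adding the printed line to the cluster inventory keeps the inventory value equal to the block size the pool already uses.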
