Upgrade SONiC to Debian Bullseye #8191

Merged
merged 39 commits into from
Nov 10, 2021

Conversation

saiarcot895
Contributor

@saiarcot895 saiarcot895 commented Jul 15, 2021

Why I did it

Upgrade SONiC's base system to Debian Bullseye, which was released in August 2021. The major changes from Buster are:

  1. Kernel is now based on 5.10.x (currently, official Debian Bullseye is publishing the 5.10.70 kernel)
  2. Most Python 2 packages (as well as pip2.7) have been removed from Bullseye. The Python 2 interpreter is still available.

How I did it

The kernel has been upgraded to 5.10.46, and the base system is now based on Bullseye. Containers are still based on Buster.

How to verify it

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106

Description for the changelog

A picture of a cute animal (not mandatory but encouraged)

@lguohan
Collaborator

lguohan commented Jul 16, 2021

can you resolve the conflict so that the build can proceed?

@saiarcot895
Contributor Author

Conflicts have been resolved. The expectation is that the kvm image should build (none of the containers will get built), broadcom will fail to build (the vendor platform modules need changes), and mellanox will likely fail to build as well.

@saiarcot895 saiarcot895 changed the base branch from master to bullseye July 21, 2021 15:52
@saiarcot895 saiarcot895 changed the base branch from bullseye to master July 21, 2021 15:54
@lguohan
Collaborator

lguohan commented Jul 24, 2021

/azp run

@azure-pipelines

Pull request contains merge conflicts.

@lguohan
Collaborator

lguohan commented Jul 24, 2021

Can you resolve the merge conflict? I think the last commit on swss-common is no longer needed.

@saiarcot895
Contributor Author

The libswsscommon, python-swsscommon, and python3-swsscommon packages are all installed on the base image itself, not just in the containers, so I do think they need the Bullseye changes.

Currently, though, the CI run is reporting a test failure in the target/python-wheels/sonic_config_engine-1.0-py3-none-any.whl target:

test_render_template (tests.test_minigraph_case.TestCfgGenCaseInsensitive) ... Traceback (most recent call last):
  File "/sonic/src/sonic-config-engine/tests/../sonic-cfggen", line 443, in <module>
    main()
  File "/sonic/src/sonic-config-engine/tests/../sonic-cfggen", line 405, in main
    env = _get_jinja2_env(paths)
  File "/sonic/src/sonic-config-engine/tests/../sonic-cfggen", line 237, in _get_jinja2_env
    redis_bcc = RedisBytecodeCache(SonicV2Connector(host='127.0.0.1'))
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1697, in __init__
    for db_name in self.get_db_list():
  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1633, in get_db_list
    return _swsscommon.SonicV2Connector_Native_get_db_list(self)
RuntimeError: Sonic database config file doesn't exist at /var/run/redis/sonic-db/database_config.json

@lguohan
Collaborator

lguohan commented Jul 26, 2021

The swss-common submodule has already been updated in the master branch, so you do not need to update it again.

rules/swsssdk-py2.mk Outdated
@lguohan
Collaborator

lguohan commented Jul 27, 2021

looks like we still have the sonic-cfggen unit test failure.

@saiarcot895
Contributor Author

Build failures are caused by the sonic-telemetry build/test overwriting /var/run/redis/sonic-db/database_config.json, and then removing the /var/run/redis/sonic-db/ directory. Because the same slave container is used for all of the builds, these changes persist across builds.

The commits responsible for this change are sonic-net/sonic-telemetry@4bb02f5 (overwriting /var/run/redis/sonic-db/database_config.json) and sonic-net/sonic-telemetry@6df988c (removing the /var/run/redis/sonic-db/ directory).

 $ grep database_config.json target/debs/buster/*.log
target/debs/buster/libswsscommon_1.0.0_amd64.deb-install.log:Configuration file '/var/run/redis/sonic-db/database_config.json', does not exist on system.
target/debs/buster/libswsscommon_1.0.0_amd64.deb.log: /usr/bin/install -c -m 644 database_config.json '/sonic/src/sonic-swss-common/debian/tmp/var/run/redis/sonic-db'
target/debs/buster/sonic-telemetry_0.1_amd64.deb.log:sudo cp ./testdata/database_config.json /var/run/redis/sonic-db/
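
The same search can be scripted when the set of logs is large. The following is a minimal sketch rather than part of the SONiC build system: `find_logs_mentioning` is a hypothetical helper name, and the only assumption is that the build logs are `*.log` files in a single directory, as in the grep output above.

```python
from pathlib import Path


def find_logs_mentioning(log_dir, needle):
    """Return {log file name: [matching lines]} for every *.log file in log_dir."""
    hits = {}
    for log_path in sorted(Path(log_dir).glob("*.log")):
        # Build logs may contain arbitrary bytes, so decode leniently.
        text = log_path.read_text(encoding="utf-8", errors="replace")
        matches = [line for line in text.splitlines() if needle in line]
        if matches:
            hits[log_path.name] = matches
    return hits
```

Calling `find_logs_mentioning("target/debs/buster", "database_config.json")` would list every package build whose log touches the shared file, which is how the sonic-telemetry build stands out here.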

@saiarcot895
Contributor Author

saiarcot895 commented Jul 29, 2021

Waiting for #8282 to be merged to fix build issue.

@@ -19,7 +19,7 @@ endif
$(DOCKER_SYNCD_BFN_RPC)_CONTAINER_NAME = syncd
$(DOCKER_SYNCD_BFN_RPC)_VERSION = 1.0.0-rpc
$(DOCKER_SYNCD_BFN_RPC)_PACKAGE_NAME = syncd
$(DOCKER_SYNCD_BFN_RPC)_RUN_OPT += --net=host --privileged -t
Collaborator

Could you share more insight into why "--net=host is specified twice"? Maybe we should fix the other place.

Contributor Author

@saiarcot895 saiarcot895 Aug 6, 2021

Commit c948379 made some changes in files/build_templates/docker_image_ctl.j2. Specifically, it added code that explicitly specifies (with the --net= flag) which network namespace the container should run in: the host namespace (so that all of the host network interfaces are accessible within the container), its own namespace connected to the Docker bridge with veth interfaces (this is the Docker default), or the namespace of another container.

If the container isn't a per-ASIC container (meaning there's only one for the system), then it'll put it in the host namespace. Otherwise, if it's in a multi-ASIC environment and the database container is being started, then that container will be connected in its own network namespace via the default Docker bridge and a veth interface. Otherwise, it'll use the namespace of the database container for that ASIC.

This change also removed all of the explicit --net=host flags specified in the run options for other containers (see the files changed in rules/ directory in that commit), but didn't handle the ones that were defined in the platform/ directory. So I'm replicating the change made in that commit for these other containers.
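
The three-way selection described above can be sketched as a small function. This is an illustration only, not the actual template logic in docker_image_ctl.j2: the function name, its parameters, and the `database<asic_id>` container naming are assumptions made for the example. The `--net=host`, `--net=bridge`, and `--net=container:<name>` values are standard `docker run` options.

```python
def docker_net_option(is_per_asic, is_multi_asic, container_name, asic_id=None):
    """Pick the --net= option following the three cases described above."""
    if not is_per_asic:
        # Only one instance for the whole system: share the host's network namespace.
        return "--net=host"
    if is_multi_asic and container_name == "database":
        # The per-ASIC database container gets its own namespace on the default bridge.
        return "--net=bridge"
    # Any other per-ASIC container joins the namespace of that ASIC's database
    # container (the "database<asic_id>" name is an assumption for this sketch).
    return f"--net=container:database{asic_id}"
```

Because the option is derived in one place, a container definition that also hard-codes `--net=host` in its run options ends up passing the flag twice, which is why the explicit flags are being dropped from the platform/ rules.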

@lguohan
Collaborator

lguohan commented Aug 7, 2021

I can see src/sonic-swss-common showing up as a conflict, and it blocks the build. Can you consider merging your changes in src/sonic-swss-common into that repo so that we won't have such a conflict in the future?

src/ifupdown2/Makefile Outdated
@alexrallen
Contributor

Hi @saiarcot895 if possible could you please avoid force pushing to this branch?

I know it's generally not a problem on feature branches, but I am currently working on aligning Mellanox code to this branch in parallel, and it would be extremely helpful for me to have a consistent history so that I can better merge in changes from your end as they come in.

Please let me know if there are any problems or anything I can do to help.

@lguohan
Collaborator

lguohan commented Nov 9, 2021

can you fix the lgtm?

saiarcot895 and others added 18 commits November 9, 2021 09:51
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
ISSU will likely be broken. As of right now, the issu-version file is
not being generated during build.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
…te file instead of the individual container definitions

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Add an include in saibcm-modules and saibcm-modules-dnx that are now
needed due to Mellanox kernel patches.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Ported Iptables patch for supporting fullcone NAT option to 5.10 kernel.

Signed-off-by: Kiran Kella <kiran.kella@broadcom.com>
1. Fix build for armhf and arm64
2. upgrade centec tsingma bsp support to 5.10 kernel
3. modify centec platform driver for linux 5.10

Co-authored-by: Shi Lei <shil@centecnetworks.com>
Upgrade DellEMC platforms to bullseye.
Also add an out-of-tree pca9548 mux driver that uses platform data to map the i2c bus to the front panel ports.

Signed-off-by: Jakkapan Jangmuang <jjangmua@celestica.com>

Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Allow mellanox platform to build and successfully switch packets in
Debian 11

Upgraded

* Mellanox SDK
* Mellanox Hardware Management
* Mellanox Firmware
* Mellanox Kernel Patches

Adjusted build system to support host system running bullseye and
dockers running buster.
Fixes sonic-net#9011.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
* Upgrade new DellEMC platforms to Bullseye

* Update s5212f kernel module
* Make necessary changes to mellanox platform code to build on Debian 11

* Revert use of backported kernel to build mft and elect to only build kernel module under bullseye
Update Barefoot platform support for Bullseye and 5.10 kernel, and add
python3-venv.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
ipmihelper files are duplicated across a few DellEMC platforms. Removed the
files in sonic_platform, since ipmihelper is copied to the necessary
directory as part of the debian rules.
Updated the hw-mgmt pointer to include some bugfixes related to power supply voltages.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
@lgtm-com

lgtm-com bot commented Nov 9, 2021

This pull request fixes 5 alerts when merging 42855e3 into a8ae39d - view on LGTM.com

fixed alerts:

  • 4 for Unused import
  • 1 for Syntax error

@lguohan lguohan linked an issue Nov 10, 2021 that may be closed by this pull request
@saiarcot895 saiarcot895 linked an issue Nov 10, 2021 that may be closed by this pull request
Successfully merging this pull request may close these issues.

  • [bullseye] kdump-tools is not built
  • [bullseye] fdisk and gpg missing from host
10 participants