Merge from Azure to VY #3

Merged
merged 51 commits into from
Jan 24, 2022

VadymYashchenko
Owner

Why I did it

How I did it

How to verify it

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106

Description for the changelog

A picture of a cute animal (not mandatory but encouraged)

ganglyu and others added 30 commits January 12, 2022 10:23
#### Why I did it
AAA yang model is not up to date.

#### How I did it
Add fallback and trace field, and replace boolean_type

#### How to verify it
Run UT for sonic_yang_models.
Follow the steps from #9710
Enable pca954x idle_disconnect to avoid possible I2C device address conflict.

How I did it
Change pca954x device_attr idle_state to -2 (MUX_IDLE_DISCONNECT).

How to verify it
Cat pca954x device_attr idle_state and confirm the value is -2.

Signed-off-by: Sean Wu <sean_wu@edge-core.com>
#### Why I did it
resolves #8779
snmpd writes the below error message in syslog :
snmp#snmpd[27]: truncating integer value > 32 bits
This message is written to syslog when hrSystemUptime (1.3.6.1.2.1.25.1.1.0, system uptime) or sysUpTime (1.3.6.1.2.1.1.3, uptime of the network management portion, i.e. snmpd) is queried after either counter has overflowed its 32-bit range. This happens when the device uptime or the snmpd uptime exceeds 497 days.
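
The 497-day figure follows from the SNMP TimeTicks type, which counts hundredths of a second in an unsigned 32-bit value. A minimal sketch of the arithmetic (the function name is illustrative only):

```python
# SNMP TimeTicks count hundredths of a second in an unsigned 32-bit value.
TICKS_PER_SECOND = 100
SECONDS_PER_DAY = 24 * 60 * 60
MAX_TICKS = 2 ** 32


def days_until_wrap(max_ticks=MAX_TICKS):
    """Return how many days pass before a TimeTicks counter overflows."""
    return max_ticks / TICKS_PER_SECOND / SECONDS_PER_DAY


print(f"{days_until_wrap():.1f} days")  # prints "497.1 days"
```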

#### How I did it
Reference: https://access.redhat.com/solutions/367093 and https://linux.die.net/man/1/snmpcmd

To avoid seeing this message if the counter grows, the snmpd error log level is changed to display LOG_EMERG, LOG_ALERT, LOG_CRIT, and LOG_DEBUG.

Without this change, LOG_ERR and LOG_WARNING would also be logged in syslog.

#### How to verify it
On a device that has been up for more than 497 days, modify supervisord.conf with the change and restart snmp.
Query 1.3.6.1.2.1.1.3 and verify that the log message is not seen.
…nexthop (#9707)

What I did:-

Enhanced minigraph parser to parse interface name associated with static route nexthop

Why I did:-

One use case for supporting the interface name is Chassis Packet, where static routes are configured to route traffic across line cards. If FRR programs a static route without the interface name and the IP interface associated with the nexthop goes down, FRR resolves the static route nexthop over the default route, because the FRR config contains ip nht-resolve-via-default. This causes undesired behavior. Specifying the interface name with the static route prevents the recursive lookup on the default route.

How I verify:

Updated unit-test cases
Manual verification
457e94d51 [macsec_linux]: Fixbug cannot dump the PN due to type error (#42)
f7c073323 Disable P2P module (#41)
7b3b777e2 [ci]: use native arm64 and armhf build pool (#40)
d4e91d66c [sonic_operator]: Increase wait timeout (#39)
43611ef88e [sonic_operators]: Add log in sonic operators (#43)

Signed-off-by: Ze Gan <ganze718@gmail.com>
- Why I did it
Add sensor conf for MSN4600C A1 platform

- How I did it
Add a new sensor conf file and relevant scripts to support two different versions of the platform

- How to verify it
Run "sensors" cmd to check the output on the A1 platform to see whether it's as expected.
Signed-off-by: Kebo Liu <kebol@nvidia.com>
  [submodule update] sonic-sairedis

    d5866a3 (HEAD -> master, origin/master, origin/HEAD) [vslib]: fix create MACsec SA error (#986)
    f36f7ce Added Support for enum query capability of Nexthop Group Type. (#989)
    323b89b Support for MACsec statistics (#892)
    26a8a12 Prevent other notification event storms to keep enqueue unchecked and drained all memory that leads to crashing the switch router (#968)
    0cb253a Fix object availability conversion (#974)
   [Submodule update] sonic-swss

    c78aa1b (HEAD -> master, origin/master, origin/HEAD) OA changes to support Ordered ECMP and DVS test for same. (#2092)
    b4b0003 Handling Invalid CRM configuration gracefully (#2109)
    d240cb2 [Mellanox] '_8lane' not added to Mellanox 5xxx models with 800G (#2090)
    8fd6e48 [pfcwd] Add vs test infrastructure (#2077)
    b96ee54 [vnetorch] Advertise vnet tunnel routes (#2058)
Why I did it
The existing log file size in SONiC is 1 Mb. Over time this leads to a huge number of log files, which becomes difficult for monitoring applications to handle.
Instead of a large number of small files, the log file size is now set to 16 Mb, which reduces the number of files over the same period.

How I did it
Changed the size parameter and related macros in logrotate config for rsyslog

How to verify it
Execute logrotate manually and verify the limit when the file gets rotated.

Signed-off-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
#### Why I did it
As a fix for #9574

#### How I did it
Enhance yang model for networking-metadata

#### How to verify it
Unit testing
Improve the Linux kernel build cache hit rate.
Currently the hit rate is around 85.8% (based on the last 3 months: 3479 PR builds in total, 494 of which did not hit the cache).
We can improve the hit rate to 95% or better.
The Linux kernel build takes a very long time, and most PRs have nothing to do with kernel changes. The remaining cache options should be enough to detect the Linux kernel cache status (dirty or not).
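
The quoted hit rate can be reproduced from the two build counts above. A minimal sketch; the helper names are illustrative and not part of the build system:

```python
TOTAL_PR_BUILDS = 3479   # PR builds over the last 3 months (from the text)
CACHE_MISSES = 494       # PR builds that did not hit the kernel cache


def hit_rate_percent(total, misses):
    """Cache hit rate as a percentage of all builds."""
    return (total - misses) / total * 100


def max_misses_for_target(total, target_percent):
    """Largest number of misses that still meets the target hit rate."""
    return total * (100 - target_percent) // 100


print(f"{hit_rate_percent(TOTAL_PR_BUILDS, CACHE_MISSES):.1f}%")  # prints "85.8%"
print(max_misses_for_target(TOTAL_PR_BUILDS, 95))                 # prints 173
```

Under these numbers, reaching 95% would mean cutting the misses from 494 to at most 173 over the same number of builds.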
Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
* fix workdir for seastone2

Signed-off-by: Viktor Ekmark <viktor@ekmark.se>

* seastone2: Add I2C SFP definition for SFP1

Signed-off-by: Christian Svensson <blue@cmd.nu>

* [device/cel_seastone_2] sfputil logic for SFP1

Earlier logic resulted in the name of SFP1 being SFP33, which is not
correct. The canonical source is the seastone2_fpga module and it calls it
SFP1, so ensure the logic does as well.

Signed-off-by: Christian Svensson <blue@cmd.nu>

* [device/cel_seastone_2] sysfs paths for SFP1

Various changes that plumb the correct port presence and DOM decoding
for the SFP1 port.

Signed-off-by: Christian Svensson <blue@cmd.nu>

Co-authored-by: Christian Svensson <blue@cmd.nu>
* Add intel_iommu=off to installer.conf

* This solves the flood of DMAR error messages: "handling fault status reg 2"

Signed-off-by: Sean Wu <sean_wu@edge-core.com>

* remove customized at24 driver

* Use the upstream at24 driver from kernel 5.10.46 directly. The ADDR16 issue
of the old driver is gone.

Signed-off-by: Sean Wu <sean_wu@edge-core.com>

* pin I2C-0/I2C-1 bus order

* otherwise, sometimes I2C-0/I2C-1 will be assigned to the undesired one.

Signed-off-by: Sean Wu <sean_wu@edge-core.com>

* fix i2c bus num for fan driver

Signed-off-by: Sean Wu <sean_wu@edge-core.com>

* backward compatible with R0A/R0B HW

Signed-off-by: Sean Wu <sean_wu@edge-core.com>
#### Why I did it
Build failed.

According to the error message, the build failed because the `pyversions` utility was not found.

`pyversions` utility is a part of `python2-minimal` package and it wasn't installed.


#### How I did it
To avoid installing python2, specify the Python version explicitly with `--with python3` and use the py3 build system with `--buildsystem=pybuild`.

#### How to verify it
Run build
* Update multiarch related command.
* [BFN] Updated platform APIs impl

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

* Extended BFN platform SFP APIs implementation

* Update sfp.py

* [BFN] Extended SFP platform plugin implementation

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

* [BFN] Extended Fans platform plugin implementation

* [BFN] divided classes Fan and  FanDrawer into 2 files

* Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

What I did
	Add get_model() function
	Add get_low_critical_threshold() function
	Change __get(...) function.
How I did it
	Difference from the previous implementation of __get(...): it returns the real value, or -9999.9 if the value is not provided by the thrift API

* Add get_presence() function and revised __get() function

Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* [BFN] Updated PSU platform APIs impl

Signed-off-by: Dmytro Lytvynenko <dmytrox.lytvynenko@intel.com>

* Added BFN PSU cache (#9)

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

* [BFN]  Fans and Fantray platform APIs update (#7)

* [BFN] Updated SFP platform APIs (#10)

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>

* [BFN] Updated platform API for thermal (#8)

* Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revert "[BFN]  Fans and Fantray platform APIs update (#7)" (#11)

This reverts commit c62a733.

* Add support health monitor system (#15)

Signed-off-by: Petro Bratash <petrox.bratash@intel.com>

* Update chassis.py

* [BFN] Updated FANs and FAN Tray platform API (#14)

* Fix fix_alignment (#17)

Signed-off-by: Petro Bratash <petrox.bratash@intel.com>

* [BFN] Improvement show environment (#16)

* Added PSU temperature skip into platform.json (#18)

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

* Do not skip psud on Newport

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

* [BFN] fix fan status from Not OK to Ok (#19)

* [BFN] Updated SFP platform plugin (#13)

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>

* [DPB] Fix typo for Ethernet0 2x200G[100G,40G] breakout mode (#21)

Signed-off-by: Mykola Gerasymenko <mykolax.gerasymenko@intel.com>

* [barefoot] Tmp fix vendor_rev (#22)

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>

* Fixed python issues in sonic_platform/fan_drawer.py

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

* Updated fan_drawer.py

* Fixing trailing white spaces in fan_drawer.py

* [BFN] Fix thrift for SFPs API

Signed-off-by: Volodymyr Boyko <volodymyrx.boiko@intel.com>

* In platform.json, replaced 'false' with '0' to workaround ast.literal_eval() issue

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

* [Newport] Thermal manager  (#23)

* Signed-off-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>

* Revert "In platform.json, replaced 'false' with '0' to workaround ast.literal_eval() issue"

This reverts commit 1e73127.

* Removed 'controllable' options from platform.json to fix factory default config generation

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

* Update thermal_manager.py

* Migrated SFP plugin to sonic_xcvr API (#30)

Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>

Co-authored-by: KostiantynYarovyiBf <kostiantynx.yarovyi@intel.com>
Co-authored-by: Vadym Yashchenko <vadymx.yashchenko@intel.com>
Co-authored-by: Dmytro Lytvynenko <dmytrox.lytvynenko@intel.com>
Co-authored-by: Volodymyr Boiko <volodymyrx.boiko@intel.com>
Co-authored-by: Petro Bratash <petrox.bratash@intel.com>
Co-authored-by: Mykola Gerasymenko <mykolax.gerasymenko@intel.com>
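
The `ast.literal_eval()` workaround mentioned in the commit list above (replacing 'false' with '0' in platform.json, later reverted) can be illustrated in general terms. The source does not show where `ast.literal_eval()` is called, so this is only a sketch of the function's standard behavior on a JSON-style boolean; `parse_literal` is a hypothetical helper:

```python
import ast
import json


def parse_literal(text):
    """Parse text as a Python literal; return the exception on failure."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        return exc


# JSON spells booleans in lowercase, which is not a Python literal.
print(type(parse_literal("false")).__name__)  # prints "ValueError"
print(parse_literal("False"))                 # prints "False"
print(parse_literal("0"))                     # prints "0"

# A JSON parser accepts the lowercase spelling directly.
print(json.loads("false"))                    # prints "False"
```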
#9727)

[image]: Prevent radius passkey and snmp community string into syslog.  (#9727)

#### Why I did it
    Prevent the radius passkey and snmp community string from being written to syslog.

#### How I did it
    Add radius and snmp config command to PASSWD_CMDS

#### How to verify it
    Run and pass all UTs.

#### Which release branch to backport (provide reason below if selected)


- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106

#### Description for the changelog
    Add the radius and snmp config commands to PASSWD_CMDS to prevent the radius passkey and snmp community string from being written to syslog.

#### A picture of a cute animal (not mandatory but encouraged)
[submodule]: update sonic-utilities
* Description: Currently IPv4 routes with IPv6 link local next hops are
not properly installed in FPM.
Reason is the netlink decoding truncates the ipv6 LL address to 4 byte
ipv4 address.

Ex : fe80:: is directly converted to ipv4 and it results in 254.128.0.0
as next hop for below routes

show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup

B>* 2.1.0.0/16 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight 1,
02:22:26
B>* 5.1.0.0/16 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight 1,
02:22:26
B>* 10.1.0.2/32 [200/0] via fe80::268a:7ff:fed0:d40, Ethernet0, weight
1, 02:22:26

Hence this fix converts the ipv6-LL address to ipv4-LL (169.254.0.1)
address before sending it to FPM. This is inline with how these types of
routes are currently programmed into kernel.
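
The truncation described above can be reproduced with Python's standard `ipaddress` module. This is only an illustration of the byte-level effect and of the substitution the fix describes; the function names are hypothetical and this is not the FPM or netlink code itself:

```python
import ipaddress

# IPv4 link-local address that the fix sends to FPM instead (from the text).
IPV4_LINK_LOCAL_NEXTHOP = ipaddress.IPv4Address("169.254.0.1")


def truncate_to_ipv4(v6_text):
    """Mimic the bug: keep only the first 4 bytes of an IPv6 address."""
    packed = ipaddress.IPv6Address(v6_text).packed
    return ipaddress.IPv4Address(packed[:4])


def substitute_link_local(v6_text):
    """Mimic the fix: map an IPv6 link-local nexthop to 169.254.0.1.

    Returns None for addresses that are not link-local, since the text
    only describes the link-local case.
    """
    if ipaddress.IPv6Address(v6_text).is_link_local:
        return IPV4_LINK_LOCAL_NEXTHOP
    return None


nexthop = "fe80::268a:7ff:fed0:d40"
print(truncate_to_ipv4(nexthop))       # prints "254.128.0.0"
print(substitute_link_local(nexthop))  # prints "169.254.0.1"
```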

Signed-off-by: Nikhil Kelapure <nikhil.kelapure@broadcom.com>
- Why I did it
MSN4700 platform has 8 lanes per port and thus can support 2x40G with each lane running at 10G

- How I did it
Added 40G to 2x200G breakout mode in platform.json

- How to verify it
Run config int break Ethernet0 2x40G[200G,100G,50G,25G,10G,1G]
And verify the command runs successfully and the port speed was set to 40G with a 2x breakout.
[submodule]: update sonic-mgmt-common
As part of this, update the isc-dhcp package to match the Bullseye
version (this fixes some compile errors related to BIND), clean up some
of the build dependencies and runtime dependencies for debian packaging,
and use the default Boost version to compile against instead of
explicitly saying using 1.74.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
```
d9f3afe [fdbshow] Adding more options for fdbshow and show mac (#1982)
902e14f Revert "Revert "[Barefoot] Added CLI to list/set P4 profile (#1951)"" (#2019)
5cc9dd5 Revert "Revert "[sonic-package-manager] support sonic-cli-gen and packages with YANG model (#1650)" (#1972)" (#1994)
```
…/MSN4700 (#9728)

- Why I did it
MSN4410/MSN4600/MSN4700 now support fetching the PSU voltage threshold, so there is no need to skip the PSU voltage check in system health monitoring. Update the system health monitoring configuration file for these platforms accordingly.

- How I did it
Remove the PSU skip config from the system_health_monitoring_config.json file

- How to verify it
Build an image and run it on these platforms; system health monitoring should not report errors against PSU voltage

Signed-off-by: Kebo Liu <kebol@nvidia.com>
Tested on a Celestica Seastone2 DX030 switch

Testing scenarios:
- Various QSFP ports in both normal and breakout config.
- 100G and 40G link speed show different colors.
- SFP1 port works.

Signed-off-by: Christian Svensson <blue@cmd.nu>
Junchao-Mellanox and others added 21 commits January 19, 2022 11:44
- Why I did it
Optimize thermal control policies to simplify the logic and add more protection code in policies to make sure it works even if kernel algorithm does not work.

- How I did it
Reduce unused thermal policies
Add a periodic ASIC temperature check in the thermal policy to make sure ASIC temperature and fan speed are coordinated
The minimum allowed fan speed is now calculated as the maximum of the expected fan speeds among all policies
Move some logic from fan.py to thermal.py to make it more readable
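
The minimum-speed rule in the list above can be sketched as follows. The policy names and speed values are hypothetical and only illustrate taking the maximum across policies; this is not the actual thermal control code:

```python
def minimum_allowed_fan_speed(expected_speeds):
    """Return the highest fan speed (in percent) requested by any policy.

    expected_speeds maps a policy name to the fan speed that policy expects.
    """
    return max(expected_speeds.values())


# Hypothetical policies and values, for illustration only.
expected = {
    "asic_temperature": 70,
    "psu_absent": 100,
    "default": 60,
}
print(minimum_allowed_fan_speed(expected))  # prints 100
```

Taking the maximum means the most demanding policy always wins, so no policy's requirement is undercut by another.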

- How to verify it
1. Manual test
2. Regression
c4127c2 [psud] Fix PSU log issue (#235)
07542cb [pmon][xcvrd]xcvrd process show backtrace on the internal port. (#233)
3e432e7 [Y-Cable] Increased unit test coverage of y_cable_helper.py (#229)
7c363f5 [ledd] prevent led crash on recirc port event (#232)
e9ccd82 [sonic-platform-daemons] fix dependency issue on py2 wheels by correcting the path (#234)
2b0acfb [sfp-refactoring] xcvrd: add initial support for CMIS application initialization (#217)
Why I did it
Need to be able to run smartctl when pmon docker is not running.

How I did it
Removed the pmon dependency for pmon as well as the command wrapper and added it to the debian-extension.

How to verify it
Stop pmon
Run smartctl from the host and verify it runs without error
Why I did it
end2end test is blocked by Yang model for BGP monitor.

How I did it
Create new yang files for BGP monitor, and add UT.

How to verify it
Follow the steps in #9711.
Run UT for sonic-yang-models.

Signed-off-by: Gang Lv <ganglv@microsoft.com>
Why I did it
ConfigDB schema generated by minigraph parser can't pass yang validation.

How I did it
Modify minigraph.py, and use 'state' to replace 'status'.

How to verify it
Run UT for sonic-config-engine.
Use minigraph parser to generate ConfigDB schema, and run yang validation.

Signed-off-by: Gang Lv <ganglv@microsoft.com>
Support bullseye for docker-sonic-restapi docker-sonic-telemetry
Upgrade to bullseye and Golang-1.15 to support FIPS.
[sonic-linkmgrd][master] submodule update

Commits added:
0c23756 Jing Zhang      2022-01-19      Linkmgrd subscribing State DB route event  (#13)
12b9951 Longxiang Lyu   2021-12-13      Add TLV support to ICMP payload (#11)
3eedda3 Longxiang Lyu   2022-01-06      Add missing intermediate states (#16)
8da4982 Ying Xie        2022-01-04      [linkmgrd] update README, set coding style guidance (#15)
a897cf8 Longxiang Lyu   2021-12-13      Improve PR template (#16)
6fec701 Jing Zhang      2021-12-06      Add pull request template for linkmgrd repo (#9)


Signed-off-by: Jing Zhang <zhangjing@microsoft.com>
  - External PHY is managed via gearbox (gbsyncd docker container) in SONiC
  - Enhanced 'External PHY management' from SONiC's single-ASIC environment to multi-ASIC
  - Enhanced gbsyncd docker container from single-namespace to multi-namespace mode
  - Added gbsyncd.service.j2 on a per-namespace basis.
  - Each namespace/ASIC now has its own gbsyncd<ASIC#> docker container with its
    own Gearbox table, redis-DB

Signed-off-by: Shyam Kumar <shyakuma@cisco.com>
…dump (#9521)

Why I did it
Eliminate benign firsttime boot error reported when running on platforms that do not support kdump.

How I did it
Change rc.local to check for presence of the file /etc/default/kdump-tools before referencing it.

How to verify it
Install a new image on an armhf or arm64 platform and check for a failed reference to /etc/default/kdump-tools on firsttime boot.
Why I did it
ACL does have an ACCEPT action, but the YANG model doesn't support it.

How I did it
Add 'ACCEPT' enum to sonic-types.yang.j2

How to verify it
Run the YANG model unit tests
Why I did it
sonic-broadcom-dnx.bin should be installable on DNX-supported platforms, but currently it is not.

How I did it
Changed CONFIGURED_PLATFORM to TARGET_MACHINE to distinguish broadcom and broadcom-dnx

How to verify it
Untar sonic-broadcom-dnx.bin and verify its platforms_asic contains the dnx platforms
Also verify on an image for another asic that there is no regression.
Why I did it
Update Broadcom SAI to version 6.0.0.13, SDK 6.5.24, saibcm-modules to 6.5.24.gpl

How I did it
Brcm SAI 6.0 EA with fixes for CS00012203367, CS00012219613, CS00012213974, CS00012218290, CS00012217169, CS00012211718, CS00012213944, CS00012215529, CS00012218100, CS00012214196, CS00012212681, CS00012205138, CS00012208537, CS00012185316, CS00012208524, CS00012203367, CS00012197364.
…cause it will be change to 755 by debian build and cause dirty image version. (#9821)

#### Why I did it
    The file mode of src\tacacs\bash_tacplus\debian\rules is 644, and the debian build changes it to 755, which causes the image version to contain 'dirty'

#### How I did it
    Change src\tacacs\bash_tacplus\debian\rules file mode to 755

#### How to verify it
    Check that the image version does not contain 'dirty'
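
The mode change itself can be illustrated on a scratch file. A minimal sketch assuming a POSIX filesystem; it works on a temporary file rather than the real debian/rules file, and `file_mode` is an illustrative helper:

```python
import os
import stat
import tempfile


def file_mode(path):
    """Return only the permission bits of a file, e.g. 0o644."""
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    rules = os.path.join(tmp, "rules")  # stand-in for debian/rules
    with open(rules, "w") as fh:
        fh.write("#!/usr/bin/make -f\n")

    os.chmod(rules, 0o644)   # mode as originally committed
    before = file_mode(rules)

    os.chmod(rules, 0o755)   # mode the debian build would leave behind
    after = file_mode(rules)

print(oct(before), oct(after))  # prints "0o644 0o755"
```

Committing the file with mode 755 up front means the build no longer modifies a tracked file, so the working tree stays clean.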

#### Which release branch to backport (provide reason below if selected)

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [x] 202111

#### Description for the changelog
    Change src\tacacs\bash_tacplus\debian\rules file mode to 755

#### A picture of a cute animal (not mandatory but encouraged)
Update the sonic-swss submodule. The following are new commits in the submodule:

6cb43ee [p4orch] Fix handlePortStatusChangeNotification status deserialize (#2111)
863f0f1 [azp]: Enable PR diff coverage (#2083)
bf4cd4a Fix the unsafe usage of strncpy in portsorch.cpp (#2110)
c1b4b40 support port isolation group in BFN platform (#1940)


Signed-off-by: Andriy Kokhan <andriyx.kokhan@intel.com>
This is to fix the issue of phy-credo package in bullseye.
* use py3 for dh build
* use pybuild for python packages
Why I did it
The old fan driver fails to build under kernel 5.10 with the error message below.
/sonic/platform/broadcom/sonic-platform-modules-accton/as7312-54xs/modules/accton_as7312_54x_fan.c:483:5: error: implicit declaration of function 'set_fs'; did you mean 'sget_fc'? [-Werror=implicit-function-declaration]
set_fs(KERNEL_DS);
^~~~~~
sget_fc

How I did it
This code belongs to an old design and is no longer needed, so remove it.

Signed-off-by: Jostar Yang <jostar_yang@accton.com>
…ingle boot (#9608)

Why I did it
Requirements from Microsoft for fwutil update all state that all firmwares which support this upgrade flow must support upgrade within a single boot cycle. This conflicted with a number of Mellanox upgrade flows which have been revised to safely meet this requirement.

How I did it
Added --no-power-cycle flags to SSD and ONIE firmware scripts
Modified Platform API to call firmware upgrade flows with this new flag during fwutil update all
Added a script to our reboot plugin to handle installing firmwares in the correct order prior to reboot
How to verify it
Populate platform_components.json with firmware for CPLD / BIOS / ONIE / SSD
Execute fwutil update all fw --boot cold
CPLD will burn / ONIE and BIOS images will stage / SSD will schedule for reboot
Reboot the switch
SSD will install / CPLD will refresh / switch will power cycle into ONIE
ONIE installer will upgrade ONIE and BIOS / switch will reboot back into SONiC
In SONiC run fwutil show status to check that all firmware upgrades were successful
@VadymYashchenko VadymYashchenko merged commit 99acd67 into VadymYashchenko:master Jan 24, 2022