[supervisor][sfm]Fix the issue of swss.sh shows backtrace when shutdown a SFM #18393

Merged
merged 1 commit into from
Mar 30, 2024

Conversation

mlok-nokia
Contributor

Why I did it

On a Supervisor card of a VOQ chassis, when a Fabric card (SFM) is removed or shut down, swss.sh logs a stack trace in the syslog file for every related empty SFM slot. This PR fixes #18384.

Work item tracking
  • Microsoft ADO (number only):

How I did it

In asic_status.py, the swss.sh instance for each empty SFM slot sits in a wait state, waiting for the SFM presence event (a "SET" operation). The subscriber event handler also receives the "DEL" operation that is issued when an SFM is shut down or removed. When an SFM is shut down, the swss.sh instances for all empty slots receive that "DEL" event even though it is not meant for them. The current implementation does not check whether the "DEL" operation applies to its own slot, so these instances exit the wait state and proceed to docker-wait-any with the wrong operation for the wrong slot. docker-wait-any then raises the backtrace.
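
The missing check described above can be illustrated with a minimal sketch. This is not the code from this PR or from asic_status.py: the function names, the `(key, operation)` event shape, and the `asicN` key format are all hypothetical and chosen only for illustration. The idea is that a waiting instance ignores any event, "DEL" included, whose key is not its own.

```python
from typing import Iterable, Optional, Tuple

# (table key, operation) pairs; the "asicN" key format is an assumption.
Event = Tuple[str, str]


def is_event_for_me(my_key: str, event: Event) -> bool:
    """Return True only if the event's key matches this instance's key."""
    key, _op = event
    return key == my_key


def wait_for_own_event(my_key: str, events: Iterable[Event]) -> Optional[Event]:
    """Consume events until one addressed to my_key arrives.

    Events for other keys (including "DEL") are skipped, so the caller
    stays in the wait state instead of proceeding for the wrong slot.
    Returns None if the stream ends without a matching event.
    """
    for event in events:
        if is_event_for_me(my_key, event):
            return event
    return None


if __name__ == "__main__":
    # Hypothetical scenario: asic7 sits in an empty slot and is waiting;
    # the SFM that owns asic5 is shut down first.
    stream = [("asic5", "DEL"), ("asic7", "SET")]
    print(wait_for_own_event("asic7", stream))  # prints ('asic7', 'SET')
```

With this filter in place, the "DEL" for `asic5` no longer wakes the instance waiting on `asic7`.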

How to verify it

  1. On a chassis that has one or more empty SFM slots, remove or shut down an SFM. No related stack trace should appear in syslog.
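
One possible way to automate the syslog check is sketched below. It assumes the backtrace is a standard Python traceback, so it looks for the header line that CPython prints for an uncaught exception. The helper name and the sample lines are made up for illustration and are not real log output. On a device, the lines would come from the syslog file (commonly `/var/log/syslog`, which is an assumption here).

```python
from typing import Iterable, List

# Header that CPython prints at the start of an uncaught-exception traceback.
TRACEBACK_MARKER = "Traceback (most recent call last)"


def find_tracebacks(lines: Iterable[str], needle: str = "") -> List[str]:
    """Return the log lines that contain a traceback header.

    If `needle` is non-empty, only lines that also contain it are kept,
    which allows narrowing the search to one service or container.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if TRACEBACK_MARKER in line and needle in line
    ]


if __name__ == "__main__":
    # Made-up sample lines standing in for syslog content.
    sample = [
        "host INFO example: service started\n",
        "host ERR example: Traceback (most recent call last):\n",
    ]
    hits = find_tracebacks(sample)
    print(len(hits))  # prints 1
```

After the fix, running such a check over the syslog captured during an SFM shutdown should return no hits for the empty slots.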

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

…wn a SFM.

Signed-off-by: mlok <marty.lok@nokia.com>
@mlok-nokia
Contributor Author

@arlakshm This PR fixes the backtrace that we saw in the syslog file when testing the shutdown of an SFM. Please review it.

@mlok-nokia
Contributor Author

@judyjoseph @rlhui Please take a look at this PR and decide whether it should be in the next 202205 image build. Thanks.

Contributor

@arlakshm arlakshm left a comment


lgtm.

@lguohan lguohan added the Chassis for 202205 branch PRs needed for 202205 branch in msft repo label Mar 30, 2024
@lguohan lguohan merged commit a56cf79 into sonic-net:master Mar 30, 2024
19 checks passed
mlok-nokia added a commit to mlok-nokia/sonic-buildimage that referenced this pull request Jun 5, 2024
…wn a SFM. (sonic-net#18393)

@mlok-nokia mlok-nokia deleted the fix-sfm-shutdown-backtrace branch September 27, 2024 15:05