Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CoPP] Fix the bug that copp configuration will cause failed logs during warm restart #2084

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chaoskao
Copy link
Contributor

What I did
Let CoppMgr skip configuration which are already in APPL_DB after warm restart.

Why I did it
After warm start, the CoppMgr will set configuration into APPL_DB which already exist, but this behavior is not necessary and it will cause CoPPorch to do a modification event.

How I verified it
With this enhancement, the syslog will not appear the error messages about Copp during warm restart.

Details if related

restart

- What I do:
  Let CoppMgr skip configuration which are already in APPL_DB after warm restart.

- Why I did it:
  After warm start, the CoppMgr will set configuration into APPL_DB which already exist, but this behavior is not necessary and it will cause CoPPorch to do a modification event.

- How I verify it:
  With this enhancement, the syslog will not appear the error messages about Copp during a warm restart.
@chaoskao chaoskao changed the title Fix the bug that copp configuration will cause failed logs during warm restart [CoPP] Fix the bug that copp configuration will cause failed logs during warm restart Dec 20, 2021
@dgsudharsan
Copy link
Collaborator

Can you please paste the error messages seen after warmboot here?

@chaoskao
Copy link
Contributor Author

Dec 9 12:38:52.857966 as7726-32x-4 ERR swss#orchagent: :- meta_generic_validation_set: SAI_POLICER_ATTR_METER_TYPE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Dec 9 12:38:52.857966 as7726-32x-4 ERR swss#orchagent: :- trapGroupUpdatePolicer: Failed to apply attribute[2].id=0 to policer for trap group:default, error:-5
Dec 9 12:38:52.857966 as7726-32x-4 ERR swss#orchagent: :- doTask: Processing copp task item failed, exiting.

@prsunny
Copy link
Collaborator

prsunny commented Jan 24, 2022

@dgsudharsan , could you please check the latest comment?

@dgsudharsan
Copy link
Collaborator

According to existing logic we delete the APP_DB will be cleared during an upgrade which includes warm-upgrade scenario. Without the APP_DB being set the feature wouldn't work during warm upgrades.
Please revisit the logic to make sure it works for both warm restart and warm upgrade.

Please check here

https://github.com/Azure/sonic-utilities/blob/d9f3afe5b34ef0f03f4137f607bc73bc625631ed/scripts/db_migrator.py#L166

Copy link
Collaborator

@dgsudharsan dgsudharsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider the warm upgrade scenario as requested.

@dgsudharsan
Copy link
Collaborator

Dec 9 12:38:52.857966 as7726-32x-4 ERR swss#orchagent: :- doTask: Processing copp task item failed, exiting.

@chaoskao Can you please explain the entire scenario. I tried with default copp configurations and I don't see any issues. Are you adding additional Copp configurations?

@chaoskao
Copy link
Contributor Author

@dgsudharsan
I following the test steps in https://github.com/Azure/sonic-mgmt/blob/master/tests/vrf/test_vrf.py#L1020, enable the swss warm reboot and restart the swss container by CLI command.

The CoPP configuration shows as following
{ "COPP_GROUP": { "default": { "queue": "0", "meter_type":"packets", "mode":"sr_tcm", "cir":"600", "cbs":"600", "red_action":"drop" } } }

@dgsudharsan
Copy link
Collaborator

@dgsudharsan I following the test steps in https://github.com/Azure/sonic-mgmt/blob/master/tests/vrf/test_vrf.py#L1020, enable the swss warm reboot and restart the swss container by CLI command.

The CoPP configuration shows as following { "COPP_GROUP": { "default": { "queue": "0", "meter_type":"packets", "mode":"sr_tcm", "cir":"600", "cbs":"600", "red_action":"drop" } } }

I believe the usecase is swss warm-restart. Can you please check if warm-reboot itself is impacted? If swss warm-restart we may have to deal it differently.

@chaoskao
Copy link
Contributor Author

@dgsudharsan
warm-reboot will not show the error message, the error log only shows in syslog when service swss restart

@dgsudharsan
Copy link
Collaborator

@dgsudharsan warm-reboot will not show the error message, the error log only shows in syslog when service swss restart

With your changes warmboot and warmupgrade will be broken. A suggestion would be to check if app_db is populated already and ignore setting app db.
Another suggestion is to remove app_db entries as in warm reboot @prsunny Can you please suggest your opinion?

@chaoskao
Copy link
Contributor Author

@dgsudharsan warm-reboot will not show the error message, the error log only shows in syslog when service swss restart

With your changes warmboot and warmupgrade will be broken. A suggestion would be to check if app_db is populated already and ignore setting app db. Another suggestion is to remove app_db entries as in warm reboot @prsunny Can you please suggest your opinion?

Do you means check the each attribute of CoPP rules from App_DB, if attribute already exist just ignore this attribute or just check the key of rules?

@dgsudharsan
Copy link
Collaborator

Do you means check the each attribute of CoPP rules from App_DB, if attribute already exist just ignore this attribute or just check the key of rules?

You can check the key alone within warm restart check that you have introduced. You can use m_coppTable for this purpose.

@prsunny
Copy link
Collaborator

prsunny commented Jan 27, 2022

@dgsudharsan warm-reboot will not show the error message, the error log only shows in syslog when service swss restart

With your changes warmboot and warmupgrade will be broken. A suggestion would be to check if app_db is populated already and ignore setting app db. Another suggestion is to remove app_db entries as in warm reboot @prsunny Can you please suggest your opinion?

Looks like the change is impacting regular warmboot. As such, we cannot have this change merged. Can you please check on the options provided by @dgsudharsan.

@chaoskao
Copy link
Contributor Author

chaoskao commented Jan 28, 2022

The modification is just for the warm-restart case which will set a "swss warm restart" flag in STATUS_DB.
I want to avoid the case in which the coppmgr will try to set the same configuration from config_DB into APPL_DB which is already existed during warm restart. When user enable "warm_restart" on swss and APPL_DB does not be cleared, this code will skipped the CoPP configuration which from config_DB to APPL_DB

In the db_migratror case mentioned by @dgsudharsan, the enhancement will still apply the configuration from copp_cfg.json into APPL_DB because there is no entry in APPL_DB which is cleared in db_migrator.

@dgsudharsan
Copy link
Collaborator

dgsudharsan commented Feb 26, 2022

The modification is just for the warm-restart case which will set a "swss warm restart" flag in STATUS_DB. I want to avoid the case in which the coppmgr will try to set the same configuration from config_DB into APPL_DB which is already existed during warm restart. When user enable "warm_restart" on swss and APPL_DB does not be cleared, this code will skipped the CoPP configuration which from config_DB to APPL_DB

In the db_migratror case mentioned by @dgsudharsan, the enhancement will still apply the configuration from copp_cfg.json into APPL_DB because there is no entry in APPL_DB which is cleared in db_migrator.

Have you tested this scenario?
Can you please perform a warm boot with your fix and check if entries are populated after regular warm boot?

@chaoskao
Copy link
Contributor Author

chaoskao commented Mar 7, 2022

I try the warm start scenario and the test steps as following:
step 1. sudo config warm_restart enable swss
step 2. db_migrator.py -o migrate
step 3. sudo service swss restart

on step 2, I check APPL_DB has been cleared, and after the step3 the entries of APPL_DB has been populated again

Another warm reboot scenario has same result, the test steps as following:

step 1. sudo config warm_restart enable swss
step 2. db_migrator.py -o migrate
step 3. sudo warm-reboot

@dgsudharsan
Copy link
Collaborator

I try the warm start scenario and the test steps as following: step 1. sudo config warm_restart enable swss step 2. db_migrator.py -o migrate step 3. sudo service swss restart

on step 2, I check APPL_DB has been cleared, and after the step3 the entries of APPL_DB has been populated again

Another warm reboot scenario has same result, the test steps as following:

step 1. sudo config warm_restart enable swss step 2. db_migrator.py -o migrate step 3. sudo warm-reboot

If using db_migrator helps should we change sonic-mgmt test script rather than changing code here?

@chaoskao
Copy link
Contributor Author

chaoskao commented Apr 7, 2022

Hi, @dgsudharsan
Sorry, I don't know what are you mean, I just simulate the scenario that you say the entries will not be populated due to this code change, so I use db_migrator.py to clear CoPP setting from Appl_DB to check does this issue exist or not.

@dgsudharsan
Copy link
Collaborator

Hi, @dgsudharsan Sorry, I don't know what are you mean, I just simulate the scenario that you say the entries will not be populated due to this code change, so I use db_migrator.py to clear CoPP setting from Appl_DB to check does this issue exist or not.

I believe after using DB migrator you don't see the issue correct? If that's the case we need to use DB migrator logic in the test scripts to avoid the CoPP error logs. @prsunny what is your opinion on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants