[VoQ][config] Multiasic Supervisor card fails to load config_db#.json in chassis when system is reboot #10106

Merged on May 9, 2022 (1 commit)

mlok-nokia
Contributor

Why I did it

The Supervisor card in a chassis can fail to load config_db#.json when the system reboots. On a Supervisor card with 16 ASICs, the first one or two instances fail to load their config_db#.json after a reboot because the /var/run/redis#/sonic-db/database_config.json file of some other instances is not yet available. The issue is intermittent. Fixes #10105

How I did it

On a multi-ASIC platform, the database.sh instances start in no fixed order. Each one creates the redis socket and database for its namespace, then calls sonic-cfggen to load its config_db#.json file if the file is present. sonic-cfggen calls load_sonic_global_db_config() -> initializeGlobalConfig() -> SonicDBConfig_initializeGlobalConfig() to initialize the database config. SonicDBConfig_initializeGlobalConfig() always checks for sonic-db/database_config.json under every instance's /var/run/redis# directory, even when only a single database is being created. Each /var/run/redis#/sonic-db/database_config.json file is created when that instance's database container initializes. When the first instance checks for the other instances' database_config.json files, they have not been created or are not ready yet. As a result, the following exception is raised and sonic-cfggen fails to load its config_db#.json file.

Feb 28 22:15:16.576986 supervisor INFO database.sh[6624]: Traceback (most recent call last):
Feb 28 22:15:16.577157 supervisor INFO database.sh[6624]:   File "/usr/local/bin/sonic-cfggen", line 445, in <module>
Feb 28 22:15:16.577598 supervisor INFO database.sh[6624]:     main()
Feb 28 22:15:16.577742 supervisor INFO database.sh[6624]:   File "/usr/local/bin/sonic-cfggen", line 430, in main
Feb 28 22:15:16.578163 supervisor INFO database.sh[6624]:     SonicDBConfig.load_sonic_global_db_config(namespace=args.namespace)
Feb 28 22:15:16.578314 supervisor INFO database.sh[6624]:   File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1249, in load_sonic_global_db_config
Feb 28 22:15:16.578786 supervisor INFO database.sh[6624]:     SonicDBConfig.initializeGlobalConfig(global_db_file_path)
Feb 28 22:15:16.578927 supervisor INFO database.sh[6624]:   File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1244, in initializeGlobalConfig
Feb 28 22:15:16.579382 supervisor INFO database.sh[6624]:     return _swsscommon.SonicDBConfig_initializeGlobalConfig(*args, **kwargs)
Feb 28 22:15:16.579524 supervisor INFO database.sh[6624]: RuntimeError: Sonic database config file syntax error >> Sonic database config file syntax error >> parse error - unexpected end of input
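
The failure mode can be illustrated with a small Python sketch. It is not the swsscommon code: `load_all_instance_configs` is a hypothetical stand-in for the global initialization, Python's `json` module stands in for the real parser, and the sketch assumes a not-yet-ready file shows up as an empty file. The point is only that one unready instance file makes the whole initialization fail, even for an instance whose own file is fine.

```python
import json
import os
import tempfile


def load_all_instance_configs(paths):
    """Parse every per-instance config file; fail if any one is not ready.

    Hypothetical stand-in for the global initialization described above,
    not the swsscommon implementation.
    """
    configs = {}
    for path in paths:
        with open(path) as f:
            # json.load raises on an empty or partially written file.
            configs[path] = json.load(f)
    return configs


def demo():
    """Return the name of the exception raised when one file is unready."""
    with tempfile.TemporaryDirectory() as root:
        # Mirror the /var/run/redis#/sonic-db/ layout under a temp directory.
        ready = os.path.join(root, "redis0", "sonic-db", "database_config.json")
        pending = os.path.join(root, "redis1", "sonic-db", "database_config.json")
        for path in (ready, pending):
            os.makedirs(os.path.dirname(path))
        with open(ready, "w") as f:
            json.dump({"INSTANCES": {}}, f)
        # Assumption: the other instance's file exists but has no content yet.
        open(pending, "w").close()
        try:
            load_all_instance_configs([ready, pending])
        except ValueError as err:
            return type(err).__name__
    return None
```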

This commit adds a call to waitForAllInstanceDatabaseConfigJsonFilesReady() in database.sh. The function waits until the /var/run/redis#/sonic-db/database_config.json file of every instance is available, then proceeds to run SONIC_CFGGEN to load the instance's config_db#.json file.
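
The wait-for-all-files idea can be sketched in Python as below. This is an illustration under assumptions, not the shell function added to database.sh: the name `wait_for_files`, the 60-second default timeout, and the 1-second poll interval are all hypothetical choices made for the example.

```python
import os
import time


def wait_for_files(paths, timeout_s=60.0, poll_s=1.0):
    """Block until every path exists as a regular file, or the timeout elapses.

    Returns True if all files appeared in time, False otherwise.
    Hypothetical helper; timeout and poll interval are assumed values.
    """
    deadline = time.monotonic() + timeout_s
    pending = list(paths)
    while True:
        # Keep only the files that are still missing.
        pending = [p for p in pending if not os.path.isfile(p)]
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

For the 16-ASIC case described above, a caller might build the list as `[f"/var/run/redis{i}/sonic-db/database_config.json" for i in range(16)]` and run the config load only when the function returns True. An existence check alone does not prove a file has been completely written, so a real implementation may need an extra guard such as a short delay or a parse check.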

How to verify it

After the supervisor card reboots, run the CLI command "show interfaces status" on the supervisor card. The following should be shown:

admin@supervisor:~$ show interfaces status
  Interface    Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin    Type    Asym PFC
-----------  -------  -------  -----  -----  -------  ------  ------  -------  ------  ----------

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@mlok-nokia mlok-nokia requested a review from lguohan as a code owner March 1, 2022 00:21
@mlok-nokia
Contributor Author

@judyjoseph This PR addresses the issue where the Supervisor card fails to load the per-instance config_db#.json file. Thanks

@rlhui rlhui added the Chassis 🤖 Modular chassis support label Mar 16, 2022
@rlhui rlhui added the P0 Priority of the issue label Mar 16, 2022
… in chassis when system is reboot

Signed-off-by: mlok <marty.lok@nokia.com>
@mlok-nokia
Contributor Author

@judyjoseph I just updated the PR with the 60-second change. Thanks

@mlok-nokia
Contributor Author

Hi Judy, please take a look at this PR with the updated change. Thanks

@judyjoseph judyjoseph merged commit 23f9126 into sonic-net:master May 9, 2022
@judyjoseph judyjoseph added Request for 202111 Branch For PRs being requested for 202111 branch Included in 202111 Branch labels May 9, 2022
judyjoseph pushed a commit that referenced this pull request May 16, 2022
… in chassis when system is reboot (#10106)

Supervisor card fails to load config_db#.json in chassis when system reboot. 
This is an intermittent issue, fixes #10105
@mlok-nokia mlok-nokia deleted the supervisor-load-config branch January 26, 2023 16:19