Cannot start system-health daemon #12802

Open
Pesa opened this issue Nov 22, 2022 · 6 comments
Labels
NVIDIA, Triaged

Comments

@Pesa

Pesa commented Nov 22, 2022

Description

The system-health daemon fails to start. It crashes immediately with the following error:

Nov 22 18:31:32 sonic healthd[5354]: Starting up...
Nov 22 18:31:32 sonic healthd[5354]: sonic_platform package not installed. Cannot start system-health daemon
Nov 22 18:31:32 sonic healthd[5354]: Traceback (most recent call last):
Nov 22 18:31:32 sonic healthd[5354]:   File "/usr/local/bin/healthd", line 115, in <module>
Nov 22 18:31:32 sonic healthd[5354]:     main()
Nov 22 18:31:32 sonic healthd[5354]:   File "/usr/local/bin/healthd", line 111, in main
Nov 22 18:31:32 sonic healthd[5354]:     health_monitor.run()
Nov 22 18:31:32 sonic healthd[5354]:   File "/usr/local/bin/healthd", line 92, in run
Nov 22 18:31:32 sonic healthd[5354]:     sysmon.task_stop()
Nov 22 18:31:32 sonic healthd[5354]: UnboundLocalError: local variable 'sysmon' referenced before assignment

This repeats in a loop: systemd keeps trying to restart the service, and it crashes every time.
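
The traceback suggests that a cleanup call touches sysmon on a path where it was never assigned. The following is a minimal sketch of that failure mode and one possible guard. It is not the actual healthd source: SysMonitor, run_buggy, run_guarded, and the platform_available flag are hypothetical names used only for illustration, and the failed import is simulated by raising ImportError directly.

```python
class SysMonitor:
    """Hypothetical stand-in for the object healthd refers to as `sysmon`."""

    def __init__(self):
        self.stopped = False

    def task_stop(self):
        self.stopped = True


def run_buggy(platform_available):
    """Cleanup references `sysmon` even when it was never assigned."""
    try:
        if not platform_available:
            # Simulates `import sonic_platform.platform` failing.
            raise ImportError("sonic_platform package not installed")
        sysmon = SysMonitor()
    except ImportError:
        pass  # the daemon would log its "Cannot start" message here
    # On the failure path `sysmon` is unbound, so this raises UnboundLocalError.
    sysmon.task_stop()


def run_guarded(platform_available):
    """Same flow, but `sysmon` is bound up front and checked before cleanup."""
    sysmon = None
    try:
        if not platform_available:
            raise ImportError("sonic_platform package not installed")
        sysmon = SysMonitor()
    except ImportError:
        return 1  # non-zero exit code, no secondary traceback
    finally:
        if sysmon is not None:
            sysmon.task_stop()
    return 0
```

With a guard like this, the missing package would still stop the daemon, but the log would show only the "not installed" message rather than a second, misleading exception.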

Steps to reproduce the issue:

  1. Start the sonic-vs image in qemu, e.g.: qemu-system-x86_64 -machine q35 -m 4096 -smp 4 -hda sonic-vs.img -nographic -netdev user,id=sonic0,hostfwd=tcp::5555-:22 -device e1000,netdev=sonic0
  2. Check the logs with sudo journalctl -f

Describe the results you received:

Describe the results you expected:

Output of show version:

SONiC Software Version: SONiC.master.178296-283de9ac8
Distribution: Debian 11.5
Kernel: 5.10.0-18-2-amd64
Build commit: 283de9ac8
Build date: Tue Nov 22 16:15:26 UTC 2022
Built by: AzDevOps@vmss-soni0000N4

Platform: x86_64-kvm_x86_64-r0
HwSKU: Force10-S6000
ASIC: vs
ASIC Count: 1
Serial Number: N/A
Model Number: N/A
Hardware Revision: N/A
Uptime: 18:42:12 up 20 min,  1 user,  load average: 6.31, 7.77, 7.20
Date: Tue 22 Nov 2022 18:42:12

Docker images:
REPOSITORY                    TAG                       IMAGE ID       SIZE
docker-orchagent              latest                    7ab4a4170e53   525MB
docker-orchagent              master.178296-283de9ac8   7ab4a4170e53   525MB
docker-fpm-frr                latest                    0d885b42a1c8   536MB
docker-fpm-frr                master.178296-283de9ac8   0d885b42a1c8   536MB
docker-teamd                  latest                    ca68274f07e0   506MB
docker-teamd                  master.178296-283de9ac8   ca68274f07e0   506MB
docker-macsec                 latest                    23ac43281082   508MB
docker-dhcp-relay             latest                    ce7cd58e629a   499MB
docker-eventd                 latest                    78582709616f   490MB
docker-eventd                 master.178296-283de9ac8   78582709616f   490MB
docker-gbsyncd-vs             latest                    97d8b631f724   498MB
docker-gbsyncd-vs             master.178296-283de9ac8   97d8b631f724   498MB
docker-snmp                   latest                    4d29ccd76c32   536MB
docker-snmp                   master.178296-283de9ac8   4d29ccd76c32   536MB
docker-sonic-p4rt             latest                    4becdb7b78cb   572MB
docker-sonic-p4rt             master.178296-283de9ac8   4becdb7b78cb   572MB
docker-platform-monitor       latest                    90ac8b98d60f   617MB
docker-platform-monitor       master.178296-283de9ac8   90ac8b98d60f   617MB
docker-database               latest                    75f9b286de92   490MB
docker-database               master.178296-283de9ac8   75f9b286de92   490MB
docker-sonic-telemetry        latest                    94a2669fb792   784MB
docker-sonic-telemetry        master.178296-283de9ac8   94a2669fb792   784MB
docker-router-advertiser      latest                    ef2fe08f818b   490MB
docker-router-advertiser      master.178296-283de9ac8   ef2fe08f818b   490MB
docker-mux                    latest                    2aef3beb3e4a   538MB
docker-mux                    master.178296-283de9ac8   2aef3beb3e4a   538MB
docker-lldp                   latest                    6a5ab217cc73   532MB
docker-lldp                   master.178296-283de9ac8   6a5ab217cc73   532MB
docker-nat                    latest                    b4609e334e73   478MB
docker-nat                    master.178296-283de9ac8   b4609e334e73   478MB
docker-sflow                  latest                    c97435eb87e8   476MB
docker-sflow                  master.178296-283de9ac8   c97435eb87e8   476MB
docker-syncd-vs               latest                    1730878544d6   473MB
docker-syncd-vs               master.178296-283de9ac8   1730878544d6   473MB
docker-sonic-mgmt-framework   latest                    12cf63b0ab87   609MB
docker-sonic-mgmt-framework   master.178296-283de9ac8   12cf63b0ab87   609MB

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

@azure-pipelines-wrapper

Thanks for opening this issue!

@tjchadaga
Contributor

@keboliu - please help take a look

tjchadaga added the Triaged and NVIDIA labels on Dec 7, 2022
@Junchao-Mellanox
Collaborator

Junchao-Mellanox commented Dec 9, 2022

Hi @tjchadaga, sysmon.task_stop() is part of the SYSTEM READY feature, which was contributed by BRCM. The HLD is here: https://github.com/sonic-net/SONiC/blob/master/doc/system_health_monitoring/system-ready-HLD.md

@sg893052, could you please take a look?

@sg893052
Contributor

sg893052 commented Dec 9, 2022

Sure, I will take a look at it. Thanks!

@Junchao-Mellanox
Collaborator

Junchao-Mellanox commented Dec 9, 2022

Hi @Pesa, it failed because of this line: import sonic_platform.platform. It seems your system does not have the platform API installed. Could you please check?

Nov 22 18:31:32 sonic healthd[5354]: sonic_platform package not installed. Cannot start system-health daemon
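
One way to run that check is to ask the interpreter whether the package can be located at all. This is a small sketch using the standard library's importlib.util.find_spec; the helper name platform_api_installed is hypothetical, and the result only reflects the Python interpreter the script is run with, which is assumed here to be the same one healthd uses.

```python
import importlib.util


def platform_api_installed(package="sonic_platform"):
    """Return True if the top-level package can be found on sys.path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if platform_api_installed():
        print("sonic_platform is importable")
    else:
        print("sonic_platform is not installed for this interpreter")
```

Checking only the top-level package name avoids importing any platform code, so the check itself cannot fail on missing hardware.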

@bluecmd
Contributor

bluecmd commented Apr 14, 2023

Same issue for me running the latest master build. Indeed, the sonic_platform package is not installed, and no platform packages are available:

$ ls /host/image-master.252369-280499876/platform/
grub

UPDATE: I got a bit further by running ln -sf /usr/share/sonic/device/x86_64-kvm_x86_64-r0/sonic_platform /usr/local/lib/python3.9/dist-packages/, but that makes system-health complain about missing EEPROM I2C devices.

UPDATE #2: Seems that directory was a red herring. It was not supposed to even be there in the first place; see PR #14667

UPDATE #3: Seems to be the same issue as #7862
