[Monit] Monitor multiple processes with the same name but using different arguments. #4257

Open
wants to merge 9 commits into
base: master
from
7 changes: 7 additions & 0 deletions dockers/docker-dhcp-relay/base_image_files/monit_dhcp_relay
@@ -0,0 +1,7 @@
###############################################################################
## Monit configuration for dhcp_relay container
## process list
## dhcrelay
###############################################################################
check program monit_dhcrelay with path "/usr/bin/monit_multiprocesses.py --container-name dhcp_relay"
if status != 0 then alert
7 changes: 7 additions & 0 deletions dockers/docker-teamd/base_image_files/monit_teamd
@@ -0,0 +1,7 @@
###############################################################################
## Monit configuration for teamd container
## process list:
## teamd
###############################################################################
check program monit_teamd with path "/usr/bin/monit_multiprocesses.py --container-name teamd"
if status != 0 then alert
2 changes: 2 additions & 0 deletions files/build_templates/sonic_debian_extension.j2
@@ -185,6 +185,8 @@ sudo cp $IMAGE_CONFIGS/monit/monitrc $FILESYSTEM_ROOT/etc/monit/
sudo chmod 600 $FILESYSTEM_ROOT/etc/monit/monitrc
sudo cp $IMAGE_CONFIGS/monit/conf.d/* $FILESYSTEM_ROOT/etc/monit/conf.d/
sudo chmod 600 $FILESYSTEM_ROOT/etc/monit/conf.d/*
sudo cp $IMAGE_CONFIGS/monit/monit_multiprocesses.py $FILESYSTEM_ROOT/usr/bin/
sudo chmod 755 $FILESYSTEM_ROOT/usr/bin/monit_multiprocesses.py

# Copy crontabs
sudo cp -f $IMAGE_CONFIGS/cron.d/* $FILESYSTEM_ROOT/etc/cron.d/
88 changes: 88 additions & 0 deletions files/image_config/monit/monit_multiprocesses.py
@@ -0,0 +1,88 @@
#!/usr/bin/python
Contributor

I think this file should be broken into two separate files, one for teamd and one for dhcp_relay. In the repo, the files should reside in the directories of their respective dockers.

Contributor Author

Will break it into two separate files and place each one in its respective docker directory in the repo.

'''
This script monitors the teamd and dhcrelay processes in the teamd and dhcp_relay
docker containers respectively. Monit can only monitor a process by a unique name,
so it cannot monitor the teamd and dhcrelay processes directly: there are usually
multiple teamd and dhcrelay processes that execute the same command with different arguments.
The number of teamd processes is determined by the number of port channels in Config_DB, and
the number of dhcrelay processes is determined by the VLANs that have a non-empty list of DHCP servers.
As such, we let Monit run this script, which reads the port channels and the
VLANs with a non-empty list of DHCP servers from Config_DB and then checks whether a
Linux process exists for each port channel or VLAN. If the script fails to find
such a process, it writes an alert message to syslog.
'''

import os
import subprocess
import re
import sys
import syslog
import argparse

from swsssdk import ConfigDBConnector


def retrieve_portchannels():
    port_channels = []

    config_db = ConfigDBConnector()
    config_db.connect()
    port_channel_table = config_db.get_table('PORTCHANNEL')

    for key in port_channel_table.keys():
        port_channels.append(key)

    return port_channels

def check_teamd_process():
    port_channels = retrieve_portchannels()
    cmd = "sudo monit procmatch '/usr/bin/teamd -r -t '"
    cmd_res = subprocess.check_output(cmd, shell=True)

    for port_channel in port_channels:
        found_process = re.findall(port_channel, cmd_res)
        if len(found_process) == 0:
            syslog.syslog(syslog.LOG_ERR, "Teamd process with {} is not running.".format(port_channel))

def retrieve_vlans():
    vlans = []

    config_db = ConfigDBConnector()
    config_db.connect()
    vlan_table = config_db.get_table('VLAN')

    # Keep only the VLANs that have a non-empty list of DHCP servers.
    for vlan in vlan_table.keys():
        if 'dhcp_servers' in vlan_table[vlan] and len(vlan_table[vlan]['dhcp_servers']) != 0:
            vlans.append(vlan)

    return vlans

def check_dhcp_relay_process():
    vlans = retrieve_vlans()
    cmd = "sudo monit procmatch '/usr/sbin/dhcrelay -d -m discard'"
    cmd_res = subprocess.check_output(cmd, shell=True)

    for vlan in vlans:
        found_process = re.findall(vlan, cmd_res)
        if len(found_process) == 0:
            syslog.syslog(syslog.LOG_ERR, "dhcrelay process with {} is not running.".format(vlan))


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--container-name')
    args = parser.parse_args()
    # argparse leaves container_name as None when the option is omitted,
    # so test for a missing or empty value rather than comparing with ''.
    if not args.container_name:
        syslog.syslog(syslog.LOG_ERR, "container name is not specified. Exiting...")
        sys.exit(1)

    if args.container_name == 'teamd':
        check_teamd_process()
    elif args.container_name == 'dhcp_relay':
        check_dhcp_relay_process()
    else:
        syslog.syslog(syslog.LOG_ERR, "container name is invalid. Exiting...")
        sys.exit(2)

if __name__ == '__main__':
    main()
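
One thing that may be worth checking in the matching step: `re.findall(name, cmd_res)` treats the name as a regular expression and matches it anywhere in the output, so a name such as `PortChannel1` would also be found inside `PortChannel10`. The sketch below is not part of this PR; it only illustrates one possible way to factor the matching and the VLAN filtering into pure functions that can be tested without Config_DB or Monit. The helper names `vlans_with_dhcp_servers` and `missing_instances` and the sample command lines are hypothetical.

```python
import re


def vlans_with_dhcp_servers(vlan_table):
    """Return the VLAN names whose entry has a non-empty 'dhcp_servers' list.

    vlan_table is assumed to be a dict shaped like the result of
    config_db.get_table('VLAN'): {vlan_name: {field: value, ...}}.
    """
    return [name for name, entry in vlan_table.items()
            if entry.get('dhcp_servers')]


def missing_instances(expected_names, procmatch_output):
    """Return the expected names that do not appear as a whole word.

    re.escape() makes the name match literally, and the lookarounds keep
    'PortChannel1' from matching inside 'PortChannel10'.
    """
    missing = []
    for name in expected_names:
        pattern = r'(?<!\w){}(?!\w)'.format(re.escape(name))
        if re.search(pattern, procmatch_output) is None:
            missing.append(name)
    return missing


if __name__ == '__main__':
    # Hypothetical process listing, used only to exercise the helpers.
    sample_output = (
        "/usr/bin/teamd -r -t PortChannel10 -c {} -L /var/warmboot/teamd/ -d\n"
        "/usr/bin/teamd -r -t PortChannel2 -c {} -L /var/warmboot/teamd/ -d\n"
    )
    expected = ['PortChannel1', 'PortChannel2', 'PortChannel10']
    print(missing_instances(expected, sample_output))
```

Because the helpers return lists instead of logging directly, the caller can decide whether to write to syslog, exit with a non-zero status, or both.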
1 change: 1 addition & 0 deletions rules/docker-dhcp-relay.mk
@@ -26,3 +26,4 @@ $(DOCKER_DHCP_RELAY)_CONTAINER_NAME = dhcp_relay
$(DOCKER_DHCP_RELAY)_RUN_OPT += --privileged -t
$(DOCKER_DHCP_RELAY)_RUN_OPT += -v /etc/sonic:/etc/sonic:ro
$(DOCKER_DHCP_RELAY)_FILES += $(SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT)
$(DOCKER_DHCP_RELAY)_BASE_IMAGE_FILES += monit_dhcp_relay:/etc/monit/conf.d
1 change: 1 addition & 0 deletions rules/docker-teamd.mk
@@ -29,4 +29,5 @@ $(DOCKER_TEAMD)_RUN_OPT += -v /etc/sonic:/etc/sonic:ro
$(DOCKER_TEAMD)_RUN_OPT += -v /host/warmboot:/var/warmboot

$(DOCKER_TEAMD)_BASE_IMAGE_FILES += teamdctl:/usr/bin/teamdctl
$(DOCKER_TEAMD)_BASE_IMAGE_FILES += monit_teamd:/etc/monit/conf.d
$(DOCKER_TEAMD)_FILES += $(SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT)