[Monit] Monitor multiple processes with the same name but using different arguments. #4257

Open
wants to merge 9 commits into
base: master

Conversation

yozhao101
Contributor

- What I did
This script monitors the teamd and dhcrelay processes in the teamd and dhcp_relay
docker containers, respectively. Monit can only monitor processes with unique names, so it
cannot monitor teamd and dhcrelay directly: there are usually multiple teamd and dhcrelay
processes, each executing the same command with different arguments.

- How I did it
The number of teamd processes is determined by the number of port channels in Config_DB, and
the number of dhcrelay processes is determined by the VLANs that have a non-empty list of DHCP servers.
We therefore let Monit monitor this script, which reads the port channels and the
VLANs with a non-empty list of DHCP servers from Config_DB, then checks whether a Linux
process exists for each port channel or VLAN. If the script fails to find
such a process, it writes an alert message to syslog.
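
The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `find_missing`, `report_missing`, the alert wording, and the sample names are all hypothetical, and command lines are passed in as plain argument lists so the sketch does not depend on how processes are enumerated.

```python
def find_missing(expected_names, cmdlines, executable):
    """Return the expected names that no running `executable` process mentions.

    expected_names: port channel or VLAN names read from Config_DB.
    cmdlines: one argument list per running process.
    executable: full path of the daemon, e.g. "/usr/bin/teamd".
    """
    missing = []
    for name in expected_names:
        found = any(
            cmdline and cmdline[0] == executable and name in cmdline[1:]
            for cmdline in cmdlines
        )
        if not found:
            missing.append(name)
    return missing


def report_missing(missing, process_name):
    """Write one alert per missing process to syslog (hypothetical wording)."""
    import syslog  # standard library, Unix only

    for name in missing:
        syslog.syslog(
            syslog.LOG_ERR,
            "{} process for {} is not running".format(process_name, name),
        )


# Hypothetical data: two port channels configured, only one teamd running.
expected = ["PortChannel0001", "PortChannel0002"]
running = [["/usr/bin/teamd", "-r", "-t", "PortChannel0001"]]
missing = find_missing(expected, running, "/usr/bin/teamd")
```

Calling `report_missing(missing, "teamd")` would then log one line per port channel without a matching process.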

- How to verify it
Explicitly kill a teamd or dhcrelay process, then check whether an alert message
is written to syslog.

different arguments.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
name is valid or not.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
@yozhao101 yozhao101 requested a review from jleveque March 12, 2020 16:44
@@ -0,0 +1,88 @@
#!/usr/bin/python
Contributor

I think this file should be broken into two separate files, one for teamd and one for dhcp_relay. In the repo, the files should reside in the directories of their respective dockers.

Contributor Author

I will break it into two separate files and place each one in its docker's directory in the repo.

teamd and dhcrelay processes.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
check_teamd_processes.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Since Monit can only monitor the process with unique name, it is unable to do
this monitoring for dhcrelay processes. Usually there will be multiple dhcrelay
processes which executes a same commad but with different arguments. The number
of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
Contributor

s/non-empry/non-empty/

Contributor Author

Fixed.

#!/usr/bin/python
'''
This script is used to monitor dhcrelay processes in dhcp_relay docker container.
Since Monit can only monitor the process with unique name, it is unable to do
Contributor

s/the process with unique name/processes with unique names/

Contributor Author

Reworded.

'''
This script is used to monitor dhcrelay processes in dhcp_relay docker container.
Since Monit can only monitor the process with unique name, it is unable to do
this monitoring for dhcrelay processes. Usually there will be multiple dhcrelay
Contributor

s/Usually there will be multiple/There can exist multiple/

Contributor Author

Reworded.

This script is used to monitor dhcrelay processes in dhcp_relay docker container.
Since Monit can only monitor the process with unique name, it is unable to do
this monitoring for dhcrelay processes. Usually there will be multiple dhcrelay
processes which executes a same commad but with different arguments. The number
Contributor

s/commad/command/

Contributor Author

Fixed.

processes which executes a same commad but with different arguments. The number
of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
As such, we let Monit to monitor this script which will read number of vlans with
no-empty list of dhcp servers form Config_DB, then find whether there exist a
Contributor

s/no-empty/non-empty/

Contributor Author

Fixed.

processes which executes a same commad but with different arguments. The number
of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
As such, we let Monit to monitor this script which will read number of vlans with
no-empty list of dhcp servers form Config_DB, then find whether there exist a
Contributor

s/exist/exists/

Contributor Author

Fixed.

of dhcrelay processes is determined by Vlans which have non-empry list of dhcp servers.
As such, we let Monit to monitor this script which will read number of vlans with
no-empty list of dhcp servers form Config_DB, then find whether there exist a
process in Linux corresponding to a vlan. If this script fails to find such process,
Contributor

Remove extra space before "such"

Contributor Author

Removed extra space.


def check_teamd_processes():
    port_channels = retrieve_portchannels()
    cmd = "sudo monit procmatch '/usr/bin/teamd -r -t '"
Contributor

Rather than making a call to monit, I'd prefer if we use a Python library like psutil.

Contributor Author

@yozhao101 Mar 13, 2020

Good suggestion! I will do that. I also found that the psutil library is not installed by default in the host image.

Contributor Author

I now use the psutil library to check whether each teamd process is running. Please help me review.
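
For readers following the thread, a psutil-based check along the lines discussed here might look like the sketch below. It is not the code from this PR: `list_cmdlines` and `teamd_is_running` are hypothetical names, and the matcher assumes the port channel name directly follows `-t`, as the quoted `'/usr/bin/teamd -r -t '` prefix suggests.

```python
def list_cmdlines():
    """Collect the argument list of every visible process using psutil."""
    # Third-party dependency; imported here so the matcher below stays
    # usable (and testable) on systems where psutil is not installed.
    import psutil

    cmdlines = []
    for proc in psutil.process_iter(attrs=["cmdline"]):
        cmdline = proc.info["cmdline"]
        if cmdline:  # None when access is denied, empty for kernel threads
            cmdlines.append(cmdline)
    return cmdlines


def teamd_is_running(port_channel, cmdlines):
    """True if a process was started as '/usr/bin/teamd -r -t <port_channel> ...'."""
    prefix = ["/usr/bin/teamd", "-r", "-t", port_channel]
    return any(cmdline[:len(prefix)] == prefix for cmdline in cmdlines)
```

On a device this would be called as `teamd_is_running("PortChannel0001", list_cmdlines())`. Comparing whole argument tokens rather than substrings keeps a name such as `PortChannel1` from matching a process started for `PortChannel10`.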


def check_dhcrelay_processes():
    vlans = retrieve_vlans()
    cmd = "sudo monit procmatch '/usr/sbin/dhcrelay -d -m discard'"
Contributor

Rather than making a call to monit, I'd prefer if we use a Python library like psutil.

Contributor Author

I now use the psutil library to check whether each dhcrelay process is running. Please help me review.
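
For the dhcrelay check, the set of expected processes comes from the VLANs that have a non-empty list of DHCP servers. A minimal sketch of that filtering step follows; it assumes the VLAN table is available as a dict of VLAN name to fields with a `dhcp_servers` list (for example, as returned by a Config_DB table read), and `vlans_with_dhcp_servers` and the sample data are hypothetical.

```python
def vlans_with_dhcp_servers(vlan_table):
    """Return the names of VLANs whose 'dhcp_servers' list is non-empty."""
    return sorted(
        name
        for name, fields in vlan_table.items()
        if fields.get("dhcp_servers")  # missing key and empty list both skip
    )


# Hypothetical table contents: only Vlan1000 has DHCP servers configured.
sample_table = {
    "Vlan1000": {"vlanid": "1000", "dhcp_servers": ["192.0.2.1", "192.0.2.2"]},
    "Vlan2000": {"vlanid": "2000", "dhcp_servers": []},
    "Vlan3000": {"vlanid": "3000"},
}
expected_vlans = vlans_with_dhcp_servers(sample_table)
```

Each name in `expected_vlans` would then be checked against the running dhcrelay command lines.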

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
…relay

processes is running or not.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>

from swsssdk import ConfigDBConnector

def retrieve_vlans():
Collaborator

This approach is complicated. I suggest using supervisorctl to check instead.
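
As a rough illustration of this suggestion, the sketch below parses the text printed by `supervisorctl status` and lists programs that are not in the `RUNNING` state. The function names and the program names in the sample are hypothetical, and the thread does not say whether each teamd or dhcrelay instance is actually managed by supervisor, so treat that as an assumption.

```python
import subprocess


def read_supervisor_status():
    """Run `supervisorctl status` and return its stdout as text."""
    # No check=True: supervisorctl may exit non-zero when a program is down.
    result = subprocess.run(
        ["supervisorctl", "status"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout


def parse_supervisor_status(output):
    """Map each program name to its state (second column of each line)."""
    states = {}
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states


def not_running(states):
    """Return the program names whose state is anything other than RUNNING."""
    return sorted(name for name, state in states.items() if state != "RUNNING")


# Hypothetical output with made-up program names.
sample_output = (
    "relay-Vlan1000   RUNNING   pid 41, uptime 0:12:07\n"
    "relay-Vlan2000   EXITED    Mar 13 10:02 AM\n"
)
down = not_running(parse_supervisor_status(sample_output))
```

On a device, `not_running(parse_supervisor_status(read_supervisor_status()))` would give the programs to alert on.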

3 participants