[dockers] Update critical_processes file syntax #4831
… entries. One kind of entry is "program:xxx" which indicates a critical process. Another is "group:xxx" which indicates a group of critical processes managed by supervisord using the name "xxx". I also updated the logic to parse the file critical_processes in supervisor-proc-event-listener script. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
critical_processes file. Signed-off-by: Yong Zhao <yozhao@microsoft.com>
to handle the cases such as the process name is "group" or "program". Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Please fix conflicts. Looks like your repo was out-of-date. I recently renamed "docker-lldp-sv2" to "docker-lldp" and "docker-snmp-sv2" to "docker-snmp".
Fixed the conflicts.
@yozhao101: This will not cherry-pick cleanly to the 201911 branch due to the directory name changes and the absence of new containers, so you will also need to open a separate PR against that branch.
Yes, I will open a new PR against the 201911 branch.
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
retest broadcom please
Retest broadcom please
Retest broadcom please
Backport of #4831 to the 201911 branch
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
- Why I did it
Originally, each entry in the critical_processes file was either the name of a critical process or the name of a group. For example, the critical_processes file in the dhcp_relay container contained a single group name, `isc-dhcp-relay`. When testing the autorestart feature of each container, we need to get all the critical processes and test whether the container is restarted correctly if one of them is killed. With the old syntax, however, it is difficult to tell whether a name in the critical_processes file refers to a critical process or to a group. Changing the syntax separates individual processes from groups and also makes the file clearer to the user.

The critical_processes file now contains two kinds of entries. One is "program:xxx", which indicates a critical process. The other is "group:xxx", which indicates a group of critical processes managed by supervisord under the name "xxx". I also updated the logic that parses the critical_processes file in the supervisor-proc-event-listener script.
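
As an illustration of the new syntax, here is a minimal Python sketch of how entries of the form "program:xxx" and "group:xxx" could be parsed. It is not the code from the supervisor-proc-event-listener script: the function name `parse_critical_processes`, the decision to skip blank lines, and the use of `ValueError` for malformed entries are all assumptions made for this example. The sample entry `program:group` is hypothetical and only shows that splitting on the first colon still works when a process is itself named "group" or "program".

```python
def parse_critical_processes(lines):
    """Return (programs, groups) parsed from critical_processes entries.

    Hypothetical helper for illustration; raises ValueError on malformed lines.
    """
    programs, groups = [], []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # assumption of this sketch: blank lines are ignored
        # Split on the first colon only, so a name such as "group" or
        # "program" on the right-hand side is never mistaken for the kind.
        kind, sep, name = line.partition(":")
        kind, name = kind.strip(), name.strip()
        if not sep or not name:
            raise ValueError(
                f"line {lineno}: expected 'program:<name>' or 'group:<name>', got {line!r}"
            )
        if kind == "program":
            programs.append(name)
        elif kind == "group":
            groups.append(name)
        else:
            raise ValueError(f"line {lineno}: unknown entry kind {kind!r}")
    return programs, groups


# "group:isc-dhcp-relay" mirrors the dhcp_relay example above;
# "program:group" is a made-up edge case.
sample = ["group:isc-dhcp-relay\n", "program:group\n"]
programs, groups = parse_critical_processes(sample)
print(programs, groups)
```

An entry in the old syntax, such as a bare `isc-dhcp-relay`, is rejected by this sketch because it has no "program:" or "group:" prefix, which is the ambiguity the new syntax is meant to remove.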
- How I did it
- How to verify it
First, enable the autorestart feature of a container, for example `dhcp_relay`, by running the command `sudo config container feature autorestart dhcp_relay enabled` on the DUT. Then select a critical process from the output of `docker top dhcp_relay` and kill it with `sudo kill -SIGKILL <pid>`. The final step is to check whether the container is restarted correctly.

- Description for the changelog