[dhcp_relay] Use dhcprelayd to manage critical processes #17236

Merged
merged 1 commit into from
Nov 27, 2023

@yaqiangz yaqiangz commented Nov 20, 2023

Why I did it

Currently, the DHCPv4-related processes in the dhcp_relay container are managed by supervisord. Their configuration file is generated when the container starts, so any change to them requires restarting the dhcp_relay container.
When the IPv4 dhcp_server feature is enabled (HLD: sonic-net/SONiC#1282), DHCP packets need to be forwarded to the netdev docker0, using a command like the one below:

/usr/sbin/dhcrelay -d -m discard -a %h:%p %P --name-alias-map-file /tmp/port-name-alias-map.txt -id Vlan1000 -iu docker0 240.127.1.2

This means the dhcrelay/dhcpmon processes must be run with different arguments when the dhcp_server feature is enabled. To avoid restarting the container whenever the state of the dhcp_server feature changes, use dhcprelayd to manage the dhcrelay/dhcpmon processes.
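As an illustration of the "different arguments" point, the command above can be assembled from its variable parts. This is a minimal sketch, not the code in this PR: the function name `build_dhcrelay_cmd` and its parameters are hypothetical, and only the flags that appear in the command above are used.

```python
from typing import List


def build_dhcrelay_cmd(downstream_vlan: str,
                       upstream_iface: str,
                       server_ip: str,
                       alias_map_file: str = "/tmp/port-name-alias-map.txt") -> List[str]:
    """Build the dhcrelay argument list for the dhcp_server-enabled case.

    Hypothetical helper: the flags are copied from the example command in
    this PR description; nothing else about dhcrelay is assumed.
    """
    return [
        "/usr/sbin/dhcrelay",
        "-d",
        "-m", "discard",
        "-a", "%h:%p", "%P",
        "--name-alias-map-file", alias_map_file,
        "-id", downstream_vlan,   # downstream interface, e.g. Vlan1000
        "-iu", upstream_iface,    # upstream interface, docker0 in this scenario
        server_ip,                # address the packets are relayed to
    ]


if __name__ == "__main__":
    print(" ".join(build_dhcrelay_cmd("Vlan1000", "docker0", "240.127.1.2")))
```

Returning a list rather than a single string lets the caller pass it to a process launcher without shell quoting concerns.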

Work item tracking
  • Microsoft ADO (number only): 25904413

How I did it

  1. Modify the j2 template files in docker-dhcp-relay. Add dhcprelayd to the dhcp-relay group instead of isc-dhcp-relay-VlanXXX, which makes dhcprelayd a critical process.
  2. In dhcprelayd, subscribe to the FEATURE table to check whether the dhcp_server feature is enabled.
    2.1 If the dhcp_server feature is disabled, the original dhcp_relay functionality is needed, so dhcprelayd does nothing. Because the dhcrelay/dhcpmon configuration is generated in the supervisord configuration, those processes run automatically.
    2.2 If the dhcp_server feature is enabled, dhcprelayd stops the dhcpmon/dhcrelay processes started by supervisord and subscribes to the dhcp_server-related tables in config_db in order to start its own dhcpmon/dhcrelay processes.
    2.3 While dhcprelayd is running, it regularly checks the feature status (every 5 seconds by default) and can encounter the following 4 state changes of the dhcp_server feature:
    A) disabled -> enabled
    In this scenario, dhcprelayd subscribes to the dhcp_server-related tables, stops the dhcpmon/dhcrelay processes started by supervisord, and starts a new pair of dhcpmon/dhcrelay processes. After this, the dhcpmon/dhcrelay processes are managed entirely by dhcprelayd.
    B) enabled -> enabled
    In this scenario, dhcprelayd monitors db changes in the dhcp_server-related tables to determine whether to restart the dhcpmon/dhcrelay processes.
    C) enabled -> disabled
    In this scenario, dhcprelayd unsubscribes from the dhcp_server-related tables and kills the dhcpmon/dhcrelay processes it started itself. It then starts the dhcpmon/dhcrelay processes via supervisorctl.
    D) disabled -> disabled
    dhcprelayd checks whether the running status of the dhcrelay processes is consistent with the supervisord configuration file. If it is not, dhcprelayd kills itself, and the dhcp_relay container then stops because dhcprelayd is a critical process.
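
The four transitions in 2.3 can be summarized as a small decision function. This is only a sketch under stated assumptions, not the implementation in dhcprelayd.py: the function name, the action names, and the two boolean inputs `db_changed` and `matches_supervisord` are hypothetical, and case B is simplified to "restart when a relevant db change was seen".

```python
from typing import List


def plan_actions(prev_enabled: bool,
                 curr_enabled: bool,
                 db_changed: bool = False,
                 matches_supervisord: bool = True) -> List[str]:
    """Return the ordered actions for one polling cycle.

    All action names are illustrative labels, not real functions.
    """
    if not prev_enabled and curr_enabled:
        # A) disabled -> enabled: dhcprelayd takes over the processes.
        return ["subscribe_dhcp_server_tables",
                "stop_supervisord_processes",
                "start_own_processes"]
    if prev_enabled and curr_enabled:
        # B) enabled -> enabled: restart only if a watched table changed
        # (simplifying assumption for this sketch).
        return ["restart_own_processes"] if db_changed else []
    if prev_enabled and not curr_enabled:
        # C) enabled -> disabled: hand the processes back to supervisord.
        return ["unsubscribe_dhcp_server_tables",
                "kill_own_processes",
                "start_processes_via_supervisorctl"]
    # D) disabled -> disabled: exit if the running dhcrelay processes do not
    # match the supervisord configuration, so the container stops.
    return [] if matches_supervisord else ["exit_dhcprelayd"]


if __name__ == "__main__":
    for prev, curr in [(False, True), (True, True), (True, False), (False, False)]:
        print(prev, "->", curr, plan_actions(prev, curr))
```

Keeping the decision separate from the side effects makes each transition easy to unit test without starting or stopping real processes.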

How to verify it

  1. Unit tests passed.
=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.9.2, pytest-6.0.2, py-1.10.0, pluggy-0.13.0
rootdir: /sonic/src/sonic-dhcp-server, configfile: setup.cfg, testpaths: tests
plugins: pyfakefs-5.3.0, cov-2.10.1
collected 408 items                                                                                                                                                                                               

tests/test_dhcp_cfggen.py ..................                                                                                                                                                                [  4%]
tests/test_dhcp_db_monitor.py ............................................................................................................................................................................. [ 46%]
...................................................................                                                                                                                                         [ 63%]
tests/test_dhcp_lease.py .....                                                                                                                                                                              [ 64%]
tests/test_dhcprelayd.py ......................................................................................................                                                                             [ 89%]
tests/test_dhcpservd.py .......                                                                                                                                                                             [ 91%]
tests/test_utils.py ....................................                                                                                                                                                    [100%]

----------- coverage: platform linux, python 3.9.2-final-0 -----------
Name                                   Stmts   Miss Branch BrPart     Cover   Missing
-------------------------------------------------------------------------------------
dhcp_server/dhcprelayd/dhcprelayd.py     179     21     68      0    86.64%   94-133
dhcp_server/dhcpservd/dhcpservd.py        67      4     14      0    92.59%   91-98
dhcp_server/dhcpservd/dhcp_lease.py       95      3     30      4    94.40%   57->59, 59, 60->61, 61, 74->76, 112->134, 153
dhcp_server/dhcpservd/dhcp_cfggen.py     217      6     92      1    97.73%   201-205, 291->293, 293
dhcp_server/common/utils.py               76      1     42      1    98.31%   129->134, 134
-------------------------------------------------------------------------------------
TOTAL                                    899     35    334      6    95.54%

2 files skipped due to complete coverage.

Required test coverage of 80.0% reached. Total coverage: 95.54%
  2. Build the wheel and install it manually to test.
  3. Run the dhcp_relay tests in sonic-mgmt on the t0/m0/m0-2vlan topologies; all passed.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305


@yaqiangz yaqiangz force-pushed the master_dhcp_relay_critical_change branch 2 times, most recently from 2454b14 to 5f44566 Compare November 20, 2023 10:15
@yaqiangz yaqiangz force-pushed the master_dhcp_relay_critical_change branch 2 times, most recently from d2d5013 to c336233 Compare November 20, 2023 15:48
@yaqiangz yaqiangz marked this pull request as ready for review November 20, 2023 15:52
@yaqiangz yaqiangz force-pushed the master_dhcp_relay_critical_change branch from c336233 to 171adc2 Compare November 21, 2023 00:33
@yaqiangz

@kellyyeh Could you pls help to review this PR?

@yaqiangz

@yxieca Could you help to merge this PR?

@yxieca yxieca merged commit da80593 into sonic-net:master Nov 27, 2023
20 checks passed
yxieca pushed a commit that referenced this pull request Dec 4, 2023