Fixes the host service crash during fast-reboot #700

Open
wants to merge 2 commits into
base: master

Kalimuthu-Velappan
Contributor

@Kalimuthu-Velappan Kalimuthu-Velappan commented Oct 9, 2019

Change-Id: I634f60453dfc70f755381fdf0aec56457e1b45c7

- What I did
Fixes the host service crash during fast-reboot:

The fast-reboot command restarts the system by killing services quickly instead of shutting them down gracefully. As part of that, it stops the docker service before stopping the host services, so the host services that are still connected to the database hit connection errors (Redis at 127.0.0.1:6379 refuses the connection):

     Aug 13 06:00:43.877495 Leaf1 INFO caclmgrd[2403]: self.config_db.listen()
     Aug 13 06:00:43.877520 Leaf1 INFO caclmgrd[2403]: File "/usr/local/lib/python2.7/dist-packages/swsssdk/configdb.py", line 94, in listen

     Aug 13 06:00:43.877651 Leaf1 INFO hostcfgd[2463]: for item in self.pubsub.listen():
     Aug 13 06:00:43.877671 Leaf1 INFO hostcfgd[2463]: File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2501, in listen
     Aug 13 06:00:43.879178 Leaf1 INFO hostcfgd[2463]: raise ConnectionError(self._error_message(e))
     Aug 13 06:00:43.879197 Leaf1 INFO hostcfgd[2463]: redis.exceptions.ConnectionError: Error 111 connecting to 127.0.0.1:6379. Connection refused.
     Aug 13 06:00:43.879253 Leaf1 INFO caclmgrd[2403]: raise ConnectionError(self._error_message(e))
     Aug 13 06:00:43.879272 Leaf1 INFO caclmgrd[2403]: redis.exceptions.ConnectionError: Error 111 connecting to 127.0.0.1:6379. Connection refused.

- How I did it
Stop the following host services before stopping the docker service.

  root# systemctl stop hostcfgd.service caclmgrd.service
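
The ordering described above can be sketched as a small shell fragment. This is only an illustration, not the actual change made to the fast-reboot script: `stop_units`, `stop_for_fast_reboot`, and the `DRY_RUN` variable are hypothetical names introduced here, and the docker unit name `docker.service` is an assumption.

```shell
#!/bin/bash
# Sketch: stop the host services that hold database connections
# before the docker service goes down.
# stop_units, stop_for_fast_reboot and DRY_RUN are hypothetical names.

stop_units() {
    # In dry-run mode only print the command; otherwise run it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "systemctl stop $*"
    else
        systemctl stop "$@"
    fi
}

stop_for_fast_reboot() {
    # 1. Host services that listen on the database.
    stop_units hostcfgd.service caclmgrd.service
    # 2. Only then the docker service (unit name assumed).
    stop_units docker.service
}
```

Calling `DRY_RUN=1 stop_for_fast_reboot` prints the two `systemctl stop` commands in the order they would run, which makes the ordering easy to check without touching a live system.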

- How to verify it
Run the fast-reboot command multiple times.
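
One way to make this check less manual is to count the connection errors that hostcfgd and caclmgrd log, before and after each fast-reboot. The following is a rough sketch under stated assumptions: `count_conn_errors` is a hypothetical helper, and the default log path `/var/log/syslog` is assumed rather than taken from this PR.

```shell
#!/bin/bash
# Sketch: count Redis connection errors logged by hostcfgd or caclmgrd.
# count_conn_errors is a hypothetical helper; the default path is an assumption.

count_conn_errors() {
    local logfile="${1:-/var/log/syslog}"
    # grep -c prints the number of matching lines. It exits with status 1
    # when that number is 0, so the exit status is deliberately ignored.
    grep -cE '(hostcfgd|caclmgrd)\[[0-9]+\]: redis\.exceptions\.ConnectionError' "$logfile" || true
}
```

If the count does not grow across a fast-reboot, the services were stopped before the database became unreachable. Because the log may contain older entries, compare the count before and after the reboot rather than expecting zero.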


Kalimuthu-Velappan and others added 2 commits October 9, 2019 03:17
Change-Id: I634f60453dfc70f755381fdf0aec56457e1b45c7
Contributor Author

@Kalimuthu-Velappan Kalimuthu-Velappan left a comment

Updated the systemctl service

@Kalimuthu-Velappan Kalimuthu-Velappan marked this pull request as ready for review October 11, 2019 09:19
@lguohan lguohan requested a review from yxieca October 18, 2019 01:12
@yxieca
Contributor

yxieca commented Oct 18, 2019

@Kalimuthu-Velappan Just want to confirm: the issue is only error messages in the syslog, and there is no fast-reboot failure without this change, right?

@Kalimuthu-Velappan
Contributor Author

Yes, the issue is only error messages in the syslog; there is no fast-reboot failure without this change.

@yxieca
Contributor

yxieca commented Oct 22, 2019

retest this please

@yxieca
Contributor

yxieca commented Oct 23, 2019

retest this please

@jleveque
Contributor

Retest this please
