[doc] Monitoring and Auto-mitigating the unhealthy of docker containers in SONiC #564

Open
wants to merge 57 commits into
base: master
from

Commits on Feb 18, 2020

  1. [monitoring] Add a document to provide the details about the monitoring the running status of critical process and resource usage.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 18, 2020
    15c53ce

Commits on Feb 19, 2020

  1. [Monitoring] Add an item in the section of overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    689c5a7
  2. [Moniting] Add functional requirements.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    2b31fef
  3. [Monitoring] Add section of design overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    6a2c01a
  4. [Monitoring] add section of design overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    ac56da8
  5. [Monitoring] Add section of design overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    e294a9c
  6. [Monitoring] Add introduction for auto-restart feature in overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    9546882
  7. [Monitoring] Add the section of basic approach.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    8f157ec
  8. [Monitoring] Add paragraph in section of basic approach.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    752dad0
  9. [Monitoring] Add description in the section of feature overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    38d6cab
  10. [Monitoring] Delete some extra blank lines.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    df37188
  11. [Monitoring] Reword in the feature overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    6d04987
  12. [Monitoring] Add a section of use cases.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    9724d9e
  13. [Monitoring] Add section of Monitoring Critical Processes.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    fe17999
  14. [Moniting] Add a section about monitoring the critical process.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    5d3bdfa
  15. [Monitoring] Add a section of monitoring critical resources.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 19, 2020
    c948aa2

Commits on Feb 20, 2020

  1. [Monitoring] Add a section of auto-restart docker container.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 20, 2020
    c5c0191
  2. [Monitoring] Correct the hyper-link.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 20, 2020
    4023874
  3. [Monitoring] Correct the typo in the hyper-link.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 20, 2020
    7a84612
  4. [Monitoring] Correct a typo in the hyper-link.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 20, 2020
    9941852
  5. [Monitoring] Add a hyper-link for container feature table.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 20, 2020
    58c1f79
  6. [Monitoring] Reword the sentence in the section of feature overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 20, 2020
    da03448

Commits on Feb 21, 2020

  1. [Monitoring] Reword the sentences in the section of auto-restart feature.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 21, 2020
    9884fc2

Commits on Feb 24, 2020

  1. [Doc-Monitoring] Reword the title and the section of feature overview.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    e0f0d96
  2. [Doc-monitoring] Reworded the sentences and fixed the typo.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    1dc3a96
  3. [Doc-monitoring] Reword and correct the typos.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    0124b94
  4. [Doc-monitoring] Revised the functional requirement.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    0774344
  5. [Doc-monitoring] Reword the basic approach.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    a852c35
  6. [Doc-monitoring] Reworded basic approach and fix the typos.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    a5d094b
  7. [Doc-monitoring] Correct the typo of supervisord.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    93826e4
  8. [Doc-monitoring] When a process changes from running to exited, the event type should be PROCESS_STATE_EXITED.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    0e84f87
  9. [Doc-monitoring] Reword the mechanism of event listener to 'event listener' mechanism.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 24, 2020
    965fc61

Commits on Feb 25, 2020

  1. [Doc-monitoring] Correct a typo and remove the init_cfg.json in line 90 since the status of auto-restart feature in init_cfg.json is fixed and we should not change the content in this file.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    5c69e6e
  2. [Doc-monitoring] Reword the gives to provides in line 101.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    a040c34
  3. [Doc-monitoring] Reword the sentence "we emplyed 'event listener' mechanism" to "we employed the 'event listener' mechanism".
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    a28459a
  4. [Doc-monitoring] Reword the line 68 to we leveraged the 'event listener' mechanism ...
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    8b270be
  5. [Doc-monitoring] Add the proposed section for memory, cpu and disk alert.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    7710e1b
  6. [Doc-monitoring] Add a section for the new proposal resource alerting.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    6d73a9d
  7. [Doc-monitoring] Place the value of memory threshold in section 2.5.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    d4c4fd4
  8. [Doc-monitoring] Reorganize the sections 2.2.3 and 2.2.4.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    f36f5ef
  9. [Doc-monitoring] Reword the section 2.2.2.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    2edc3d4
  10. [Doc-monitoring] Reword in the section 2.2.4 Monitoring Critical Resource Usage.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Feb 25, 2020
    d041d78

Commits on Mar 8, 2020

  1. [Monitoring] Add a section to describe the relationship between auto-restart and warm re-boot. Add a paragraph to introduce how can we use Monit to monitor multiple processes with the same command.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 8, 2020
    f94d019
  2. [Monitoring] Add a word "same" in the last sentence of section 2.2.1

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 8, 2020
    02bd31c

Commits on Mar 9, 2020

  1. [Monitrong] Reword the section 1.3.3.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 9, 2020
    fa20bea
  2. [Monitoring] Correct a commection symbol.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 9, 2020
    e4d9a8d
  3. [Monitoring] Fix a error for connection symbol.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 9, 2020
    7c917c0

Commits on Mar 10, 2020

  1. [Monitoring] Swap the location of section 2.2.2 and section 2.2.3.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 10, 2020
    3fad48f
  2. [Monitoring] Correct a typo.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 10, 2020
    eb30432
  3. [Monitoring] Delete an extra space.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 10, 2020
    a84bfdf
  4. [Monitoring] Delete the file which is added mistakenly.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Mar 10, 2020
    8a908c2

Commits on Jul 22, 2021

  1. [memory_restart] Add the description of monitoring the critical process by Supervisord and high memory restart.
    
    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Jul 22, 2021
    7056a9f
  2. [memory_restart] Fix the format issue.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Jul 22, 2021
    9b30502
  3. [memory_restart] Fix the format issues.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Jul 22, 2021
    dc80bcb
  4. [memory_restart] Change the syntax of show and config commands.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Jul 22, 2021
    7ed89b7
  5. [mem_restart] Fix the typos.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Jul 22, 2021
    702e4d8
  6. [mem_restart] Fix the typos.

    Signed-off-by: Yong Zhao <yozhao@microsoft.com>
    yozhao101 committed Jul 22, 2021
    91c5d9b