
Monitoring

Amy Buck edited this page Nov 20, 2018 · 17 revisions

OpenSwitch OPX provides features to monitor and capture network traffic on the system. It also provides tools to collect port and VLAN statistics and port media information.

System alarms

System alarms alert you to conditions that might prevent normal operation of the switch. Alarms are ranked by their impact on the network. The severities below are listed from the most impact to the least impact:

  • Critical: a critical condition exists and requires immediate action. A critical alarm may be triggered if one or more hardware components have failed or have exceeded temperature thresholds.

  • Major: a major error occurred and requires escalation or notification. A major alarm may be triggered if an interface configuration has triggered a critical warning, such as a port-channel being down.

  • Minor: a minor error or non-critical condition occurred that, if left unchecked, might interrupt service or degrade performance. A minor alarm requires monitoring or maintenance.

  • Informational: an informational condition occurred that does not affect performance. Monitor an informational alarm until the condition changes.
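The ranking above can be sketched in code. This is only an illustration of the ordering, not an OPX API: the `Severity` enum, its numeric values, `sort_by_impact`, and the sample descriptions are all hypothetical names chosen for this example.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity levels; a lower value means a greater impact."""
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    INFORMATIONAL = 4


def sort_by_impact(alarms):
    """Return (severity, description) pairs ordered from most to least impact."""
    return sorted(alarms, key=lambda alarm: alarm[0])


# Made-up alarm descriptions, used only to show the ordering.
alarms = [
    (Severity.MINOR, "example minor condition"),
    (Severity.CRITICAL, "example hardware failure"),
    (Severity.INFORMATIONAL, "example informational condition"),
    (Severity.MAJOR, "example port-channel down"),
]

ordered = sort_by_impact(alarms)
for severity, description in ordered:
    print(f"{severity.name:<13} {description}")
```

Using an integer-backed enum keeps the comparison explicit, so the most severe alarms sort first without any custom comparison logic.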

After an alarm is raised, it is in one of these states:

  • Active: the alarm is current and has not yet been acknowledged or cleared

  • Cleared: the alarm is resolved and the device has returned to normal operation

Some alarms go directly from the active state to the cleared state and require little to no administrative effort. Alarms with a high severity should be acknowledged or investigated.
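As a rough sketch of the two states and the triage rule described above, the following models an alarm that moves from active to cleared and picks out the alarms that still need attention. Everything here is hypothetical and not part of OPX: the `State` and `Alarm` types, the `needs_attention` helper, and in particular the assumption that "high severity" means critical or major.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    CLEARED = "cleared"


# Assumption for this sketch: "high severity" means critical or major.
HIGH_SEVERITIES = {"critical", "major"}


@dataclass
class Alarm:
    description: str
    severity: str
    state: State = State.ACTIVE

    def clear(self):
        """Mark the alarm as resolved."""
        self.state = State.CLEARED


def needs_attention(alarms):
    """Return the active alarms whose severity is considered high."""
    return [
        alarm for alarm in alarms
        if alarm.state is State.ACTIVE and alarm.severity in HIGH_SEVERITIES
    ]


# Made-up alarms, used only to exercise the helper.
fan = Alarm("example fan failure", "critical")
link = Alarm("example link flap", "minor")
fan_before = needs_attention([fan, link])

fan.clear()
fan_after = needs_attention([fan, link])
```

Before `fan.clear()` runs, only the critical alarm is returned; afterwards the list is empty, because the minor alarm never met the severity threshold.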

Show alarms

$ opx-show-alms
2018-11-16 13:31:12.170129 Fan tray 1 absent
2018-11-16 13:34:09.012345 Temperature sensor NPU sensor warning
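If you want to post-process this output in a script, a minimal sketch follows. It assumes every line has exactly the shape shown in the sample above (a date, a time with microseconds, then free text); the actual output format may differ between releases. The `AlarmEntry` type and `parse_alarm_line` function are hypothetical helpers, and the sketch parses captured text rather than invoking the command.

```python
from datetime import datetime
from typing import NamedTuple


class AlarmEntry(NamedTuple):
    timestamp: datetime
    message: str


def parse_alarm_line(line):
    """Split one output line into a timestamp and the alarm message.

    Assumes the line starts with a date and a time, followed by the message.
    """
    date_part, time_part, message = line.strip().split(maxsplit=2)
    timestamp = datetime.strptime(
        f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S.%f"
    )
    return AlarmEntry(timestamp, message)


# The two sample lines shown above, captured as text.
sample_output = """\
2018-11-16 13:31:12.170129 Fan tray 1 absent
2018-11-16 13:34:09.012345 Temperature sensor NPU sensor warning
"""

entries = [
    parse_alarm_line(line)
    for line in sample_output.splitlines()
    if line.strip()
]

for entry in entries:
    print(entry.timestamp.isoformat(), "|", entry.message)
```

Parsing the timestamp into a `datetime` makes it straightforward to sort entries or filter them by time window.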