Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Alerts][Docs] Alert types doc update. Added refs to applications specific alerts groups. #91787

Merged
merged 15 commits into from
Feb 23, 2021
161 changes: 14 additions & 147 deletions docs/user/alerting/alert-types.asciidoc
Original file line number Diff line number Diff line change
@@ -1,159 +1,26 @@
[role="xpack"]
[[alert-types]]
== Standard stack alert types
== Alert types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about removing the word "type" from the headings. I don't feel that it adds any value. So the navigation would look like this:

  • Actions and connectors
  • Alerts
    Index threshold
    ES query

Copy link
Contributor Author

@YulNaumenko YulNaumenko Feb 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree 👍


{kib} supplies alert types in two ways: some are built into {kib} (these are known as stack alerts), while domain-specific alert types are registered by {kib} apps such as <<xpack-apm,*APM*>>, <<xpack-ml,*{ml-cap}*>>, <<metrics-app,*Metrics*>>, and <<uptime-app,*Uptime*>>.
{kib} supplies alert types in two ways: some are built into {kib} (these are known as stack alerts), while domain-specific alert types are registered by {kib} apps.
YulNaumenko marked this conversation as resolved.
Show resolved Hide resolved

This section covers stack alerts. For domain-specific alert types, refer to the documentation for that app.
=== Standard stack alert types

This section covers stack alerts.
YulNaumenko marked this conversation as resolved.
Show resolved Hide resolved
Users will need `all` access to the *Stack Alerts* feature to be able to create and edit any of the alerts listed below.
YulNaumenko marked this conversation as resolved.
Show resolved Hide resolved
See <<kibana-feature-privileges, feature privileges>> for more information on configuring roles that provide access to this feature.
YulNaumenko marked this conversation as resolved.
Show resolved Hide resolved

Currently {kib} provides two stack alerts: <<alert-type-index-threshold>> and <<alert-type-es-query>>.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Currently {kib} provides two stack alerts: <<alert-type-index-threshold>> and <<alert-type-es-query>>.
{kib} provides two stack alerts:
* <<alert-type-index-threshold>>
* <<alert-type-es-query>>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the geo alert be added to this list?

Copy link
Contributor Author

@YulNaumenko YulNaumenko Feb 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I spoke with Maps team and they want it to be separately, because they are planning more alerts types for Maps.
(cc @aaronjcaldwell )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was awkward when I was doing the docs from the geo side too. On one hand, Geo Containment is a stack alert and I expect the next one, Geo Proximity, will be one too. It's possible we'll do more beyond this but that's TBD. They're distinct from other stack alerts in that they're geo-specific and require a Gold+ license, however I'm aware that's not something we need to distinguish in the docs. I guess I'm flexible in where and how we list them. @gchaps you probably have the best bird's eye view on docs. Any recommendations here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this PR out and then review the placement of the geo alert at a later date.


[float]
[[alert-type-index-threshold]]
=== Index threshold

The index threshold alert type is designed to run an {es} query over indices, aggregating field values from documents, comparing them to threshold values, and scheduling actions to run when the thresholds are met.

[float]
==== Creating the alert

An index threshold alert can be created from the *Create* button in the <<alert-management, alert management UI>>. Fill in the <<defining-alerts-general-details, general alert details>>, then select *Index Threshold*.

[role="screenshot"]
image::images/alert-types-index-threshold-select.png[Choosing an index threshold alert type]

[float]
==== Defining the conditions

The index threshold has 5 clauses that define the condition to detect.

[role="screenshot"]
image::images/alert-types-index-threshold-conditions.png[Five clauses define the condition to detect]

Index:: This clause requires an *index or index pattern* and a *time field* that will be used for the *time window*.
When:: This clause specifies how the value to be compared to the threshold is calculated. The value is calculated by aggregating a numeric field a the *time window*. The aggregation options are: `count`, `average`, `sum`, `min`, and `max`. When using `count` the document count is used, and an aggregation field is not necessary.
Over/Grouped Over:: This clause lets you configure whether the aggregation is applied over all documents, or should be split into groups using a grouping field. If grouping is used, an <<alerting-concepts-alert-instances, alert instance>> will be created for each group when it exceeds the threshold. To limit the number of instances on high cardinality fields, you must specify the number of groups to check against the threshold. Only the *top* groups are checked.
Threshold:: This clause defines a threshold value and a comparison operator (one of `is above`, `is above or equals`, `is below`, `is below or equals`, or `is between`). The result of the aggregation is compared to this threshold.
Time window:: This clause determines how far back to search for documents, using the *time field* set in the *index* clause. Generally this value should be to a value higher than the *check every* value in the <<defining-alerts-general-details, general alert details>>, to avoid gaps in detection.

If data is available and all clauses have been defined, a preview chart will render the threshold value and display a line chart showing the value for the last 30 intervals. This can provide an indication of recent values and their proximity to the threshold, and help you tune the clauses.

[role="screenshot"]
image::images/alert-types-index-threshold-preview.png[Five clauses define the condition to detect]

[float]
=== Example

In this section, you will use the {kib} <<add-sample-data, weblog sample dataset>> to setup and tune the conditions on an index threshold alert. For this example, we want to detect when any of our top three sites have served more than 420,000 bytes over a 24 hour period.

From the <<alert-management, alert management UI>>, create a new alert, and fill in the <<defining-alerts-general-details, general alert details>>. This alert will be checked every 4 hours, and will not execute actions more than once per day. Choose the index threshold alert type.

[role="screenshot"]
image::images/alert-types-index-threshold-select.png[Choosing an index threshold alert type]

Click on each clause to open a control that helps you set the value:

[float]
==== Index clause
The index clause control will list and allow you to search for available indices. Choose *kibana_sample_data_logs*

[role="screenshot"]
image::images/alert-types-index-threshold-example-index.png[Choosing an index]

Once an index is selected, the list of time fields for that index will be available to select. Choose *@timestamp*.

[role="screenshot"]
image::images/alert-types-index-threshold-example-timefield.png[Choosing a time field]

[float]
==== When clause

We want to detect the number of bytes served during the time window, so we select `sum` as the aggregation, and `bytes` as the field to aggregate.

[role="screenshot"]
image::images/alert-types-index-threshold-example-aggregation.png[Choosing the aggregation]

[float]
==== Over/Grouped over clause

We want to alert on the three sites that have the most traffic, so we'll group the sum of bytes by the `host.keyword` field and take the top 3 values.

[role="screenshot"]
image::images/alert-types-index-threshold-example-grouping.png[Choosing the groups]

[float]
==== Threshold clause

We want to alert when any site exceeds 420,000 bytes over a 24 hour period, so we'll set the threshold to 420,000 and use the `is above` comparison.

[role="screenshot"]
image::images/alert-types-index-threshold-example-threshold.png[Setting the threshold]

[float]
==== Time window clause

Finally, set the time window to 24 hours to complete the alert configuration.

[role="screenshot"]
image::images/alert-types-index-threshold-example-window.png[Setting the time window]

The preview chart will render showing the 24 hour sum of bytes at 4 hours intervals (the *check every* interval) for the past 120 hours (the last 30 intervals).

[role="screenshot"]
image::images/alert-types-index-threshold-example-preview.png[Setting the time window]

[float]
==== Comparing time windows

You can interactively change the time window and observe the effect it has on the chart. Compare a 24 window to a 12 hour window. Notice the variability in the sum of bytes, due to different traffic levels during the day compared to at night. This variability would result in noisy alerts, so the 24 hour window is better. The preview chart can help you find the right values for your alert.

[role="screenshot"]
image::images/alert-types-index-threshold-example-comparison.png[Comparing two time windows]

[float]
[[alert-type-es-query]]
=== ES query

The ES query alert type is designed to run a user-configured {es} query over indices, compare the number of matches to a configured threshold, and schedule
actions to run when the threshold condition is met.

[float]
==== Creating the alert

An ES query alert can be created from the *Create* button in the <<alert-management, alert management UI>>. Fill in the <<defining-alerts-general-details, general alert details>>, then select *ES query*.

[role="screenshot"]
image::images/alert-types-es-query-select.png[Choosing an ES query alert type]

[float]
==== Defining the conditions

The ES query alert has 5 clauses that define the condition to detect.

[role="screenshot"]
image::images/alert-types-es-query-conditions.png[Four clauses define the condition to detect]

Index:: This clause requires an *index or index pattern* and a *time field* that will be used for the *time window*.
Size:: This clause specifies the number of documents to pass to the configured actions when the the threshold condition is met.
ES query:: This clause specifies the ES DSL query to execute. The number of documents that match this query will be evaulated against the threshold
condition. Aggregations are not supported at this time.
Threshold:: This clause defines a threshold value and a comparison operator (`is above`, `is above or equals`, `is below`, `is below or equals`, or `is between`). The number of documents that match the specified query is compared to this threshold.
Time window:: This clause determines how far back to search for documents, using the *time field* set in the *index* clause. Generally this value should be set to a value higher than the *check every* value in the <<defining-alerts-general-details, general alert details>>, to avoid gaps in detection.

[float]
==== Testing your query

Use the *Test query* feature to verify that your query DSL is valid.

When your query is valid:: Valid queries will be executed against the configured *index* using the configured *time window*. The number of documents that
match the query will be displayed.
=== Domain-specific alert types

[role="screenshot"]
image::images/alert-types-es-query-valid.png[Test ES query returns number of matches when valid]
For domain-specific alert types, refer to the documentation for that app.
YulNaumenko marked this conversation as resolved.
Show resolved Hide resolved
Currently we the next alerts grouped by the application:
YulNaumenko marked this conversation as resolved.
Show resolved Hide resolved

When your query is invalid:: An error message is shown if the query is invalid.
* <<{observability-guide}/create-alerts.html, Observability alerts>>
YulNaumenko marked this conversation as resolved.
Show resolved Hide resolved
* <<{security-guide}/prebuilt-rules.html, Security alerts>>
* <<geo-alerting, Maps alerts>>
* <<xpack-ml, ML alerts>>

[role="screenshot"]
image::images/alert-types-es-query-invalid.png[Test ES query shows error when invalid]
include::stack-alerts/index-threshold.asciidoc[]
YulNaumenko marked this conversation as resolved.
Show resolved Hide resolved
include::stack-alerts/es-query.asciidoc[]
45 changes: 45 additions & 0 deletions docs/user/alerting/stack-alerts/es-query.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
[role="xpack"]
[[alert-type-es-query]]
== ES query

The ES query alert type is designed to run a user-configured {es} query over indices, compare the number of matches to a configured threshold, and schedule
actions to run when the threshold condition is met.

[float]
=== Creating the alert

An ES query alert can be created from the *Create* button in the <<alert-management, alert management UI>>. Fill in the <<defining-alerts-general-details, general alert details>>, then select *ES query*.

[role="screenshot"]
image::user/alerting/images/alert-types-es-query-select.png[Choosing an ES query alert type]

[float]
=== Defining the conditions

The ES query alert has 5 clauses that define the condition to detect.

[role="screenshot"]
image::user/alerting/images/alert-types-es-query-conditions.png[Four clauses define the condition to detect]

Index:: This clause requires an *index or index pattern* and a *time field* that will be used for the *time window*.
Size:: This clause specifies the number of documents to pass to the configured actions when the the threshold condition is met.
ES query:: This clause specifies the ES DSL query to execute. The number of documents that match this query will be evaulated against the threshold
condition. Aggregations are not supported at this time.
Threshold:: This clause defines a threshold value and a comparison operator (`is above`, `is above or equals`, `is below`, `is below or equals`, or `is between`). The number of documents that match the specified query is compared to this threshold.
Time window:: This clause determines how far back to search for documents, using the *time field* set in the *index* clause. Generally this value should be set to a value higher than the *check every* value in the <<defining-alerts-general-details, general alert details>>, to avoid gaps in detection.

[float]
=== Testing your query

Use the *Test query* feature to verify that your query DSL is valid.

When your query is valid:: Valid queries will be executed against the configured *index* using the configured *time window*. The number of documents that
match the query will be displayed.

[role="screenshot"]
image::user/alerting/images/alert-types-es-query-valid.png[Test ES query returns number of matches when valid]

When your query is invalid:: An error message is shown if the query is invalid.

[role="screenshot"]
image::user/alerting/images/alert-types-es-query-invalid.png[Test ES query shows error when invalid]
101 changes: 101 additions & 0 deletions docs/user/alerting/stack-alerts/index-threshold.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
[role="xpack"]
[[alert-type-index-threshold]]
== Index threshold

The index threshold alert type is designed to run an {es} query over indices, aggregating field values from documents, comparing them to threshold values, and scheduling actions to run when the thresholds are met.

[float]
=== Creating the alert

An index threshold alert can be created from the *Create* button in the <<alert-management, alert management UI>>. Fill in the <<defining-alerts-general-details, general alert details>>, then select *Index Threshold*.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-select.png[Choosing an index threshold alert type]

[float]
=== Defining the conditions

The index threshold has 5 clauses that define the condition to detect.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-conditions.png[Five clauses define the condition to detect]

Index:: This clause requires an *index or index pattern* and a *time field* that will be used for the *time window*.
When:: This clause specifies how the value to be compared to the threshold is calculated. The value is calculated by aggregating a numeric field a the *time window*. The aggregation options are: `count`, `average`, `sum`, `min`, and `max`. When using `count` the document count is used, and an aggregation field is not necessary.
Over/Grouped Over:: This clause lets you configure whether the aggregation is applied over all documents, or should be split into groups using a grouping field. If grouping is used, an <<alerting-concepts-alert-instances, alert instance>> will be created for each group when it exceeds the threshold. To limit the number of instances on high cardinality fields, you must specify the number of groups to check against the threshold. Only the *top* groups are checked.
Threshold:: This clause defines a threshold value and a comparison operator (one of `is above`, `is above or equals`, `is below`, `is below or equals`, or `is between`). The result of the aggregation is compared to this threshold.
Time window:: This clause determines how far back to search for documents, using the *time field* set in the *index* clause. Generally this value should be to a value higher than the *check every* value in the <<defining-alerts-general-details, general alert details>>, to avoid gaps in detection.

If data is available and all clauses have been defined, a preview chart will render the threshold value and display a line chart showing the value for the last 30 intervals. This can provide an indication of recent values and their proximity to the threshold, and help you tune the clauses.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-preview.png[Five clauses define the condition to detect]

[float]
=== Example

In this section, you will use the {kib} <<add-sample-data, weblog sample dataset>> to setup and tune the conditions on an index threshold alert. For this example, we want to detect when any of our top three sites have served more than 420,000 bytes over a 24 hour period.

From the <<alert-management, alert management UI>>, create a new alert, and fill in the <<defining-alerts-general-details, general alert details>>. This alert will be checked every 4 hours, and will not execute actions more than once per day. Choose the index threshold alert type.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-select.png[Choosing an index threshold alert type]

Click on each clause to open a control that helps you set the value:

[float]
=== Index clause
The index clause control will list and allow you to search for available indices. Choose *kibana_sample_data_logs*

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-example-index.png[Choosing an index]

Once an index is selected, the list of time fields for that index will be available to select. Choose *@timestamp*.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-example-timefield.png[Choosing a time field]

[float]
=== When clause

We want to detect the number of bytes served during the time window, so we select `sum` as the aggregation, and `bytes` as the field to aggregate.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-example-aggregation.png[Choosing the aggregation]

[float]
=== Over/Grouped over clause

We want to alert on the three sites that have the most traffic, so we'll group the sum of bytes by the `host.keyword` field and take the top 3 values.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-example-grouping.png[Choosing the groups]

[float]
=== Threshold clause

We want to alert when any site exceeds 420,000 bytes over a 24 hour period, so we'll set the threshold to 420,000 and use the `is above` comparison.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-example-threshold.png[Setting the threshold]

[float]
=== Time window clause

Finally, set the time window to 24 hours to complete the alert configuration.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-example-window.png[Setting the time window]

The preview chart will render showing the 24 hour sum of bytes at 4 hours intervals (the *check every* interval) for the past 120 hours (the last 30 intervals).

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-example-preview.png[Setting the time window]

[float]
=== Comparing time windows

You can interactively change the time window and observe the effect it has on the chart. Compare a 24 window to a 12 hour window. Notice the variability in the sum of bytes, due to different traffic levels during the day compared to at night. This variability would result in noisy alerts, so the 24 hour window is better. The preview chart can help you find the right values for your alert.

[role="screenshot"]
image::user/alerting/images/alert-types-index-threshold-example-comparison.png[Comparing two time windows]