Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[APM] Transaction duration anomaly alerting integration #75719

Merged
merged 11 commits into from
Aug 28, 2020

Conversation

ogupte
Copy link
Contributor

@ogupte ogupte commented Aug 22, 2020

Closes #72636.

Adds alerting integration for APM transaction duration anomalies.

Screen Shot 2020-08-26 at 5 18 04 PM

@ogupte ogupte requested review from a team as code owners August 22, 2020 07:42
@botelastic botelastic bot added the Team:APM All issues that need APM UI Team support label Aug 22, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/apm-ui (Team:apm)

@formgeist

This comment has been minimized.

Comment on lines +36 to +40
const canReadAnomalies = !!(
isMlPluginEnabled &&
capabilities.ml.canAccessML &&
capabilities.ml.canGetJobs
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: looks like all three are booleans so there should be no reason to coerce with !!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the double negative because the logical and expression returns the type of the last operand (capabilities.ml.canGetJobs) which is: boolean | Readonly<{[x: string]: boolean;}>. Without the double NOT operator, canReadAnomalies fails the type assignment for the canReadAnomalies prop of the AlertIntegrations component.

onChange={(e) =>
setAlertParams(
'transactionType',
e.target.value as Params['transactionType']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to avoid as Params['transactionType']. Is it because e.target.value can be other types than string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably safely avoid the type assertion. It's a holdover from TransactionDurationAlertTrigger which this component was based on.

}
const alertParams = params as TypeOf<typeof paramsSchema>;
const mlClient = services.getLegacyScopedClusterClient(ml.mlClient);
const request = { params: 'DummyKibanaRequest' } as KibanaRequest;
Copy link
Member

@sorenlouv sorenlouv Aug 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "DummyKibanaRequest" some leftover boilerplate?

// to filter out legacy jobs we are filtering by the existence of `apm_ml_version` in `custom_settings`
// and checking that it is compatable.
const mlJobs = response.jobs.filter(
(job) => (job.custom_settings?.job_tags?.apm_ml_version ?? 0) >= 2
);
if (environment) {
if (environment && environment !== ENVIRONMENT_ALL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it a bug that we didn't take this into account previously?
also: this would go away if we got rid of "all" envs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't a bug before since the environment value was omitted (in the uiFilters object) when "All" was the selected environment filter. If we get rid of the "All" as a environment filter option from the alert creation overlay, then we can get rid of this. We could also omit this check if we do this check in the alert executor function up the call stack.

Comment on lines +26 to +32
// APM is using this service in anomaly alert, kibana alerting doesn't provide request object
// So we are adding a dummy request for now
// TODO: Remove this once kibana alerting provides request object
const hasMlCapabilities =
request.params !== 'DummyKibanaRequest'
? getHasMlCapabilities(request)
: (_caps: string[]) => Promise.resolve();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see now why you used "DummyKibanaRequest". This doesn't seem like the right approach. Either request should be optional or we should not use anomalyDetectorsProvider. Passing in a dummy request object is going to cause confusion down the line I think.

Copy link
Member

@jgowdyelastic jgowdyelastic Aug 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DummyKibanaRequest will not be needed after this PR is merged #74965
However a fake request object (e.g. {} as KibanaRequest) will still be needed until #64588 is fixed.
Making request optional will require more changes to undo our work arounds once this issue is resolved.

Copy link
Contributor Author

@ogupte ogupte Aug 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jgowdyelastic do i need to hold off on merging this PR until yours is merged first? LMK if i should make any changes to the DummyKibanaRequest pattern in the meantime.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that PR is a few days away from being merged. so i'm happy incorporate and update the changes in this PR if you merge first.

…on/environment_filter_values

- utilize getEnvironmentLabel in the alerting trigger components and added support for the 'All' label
{label}
</EuiHealth>
);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice with a dedicated component 👍

@formgeist
Copy link
Contributor

A possible bug: I noticed that the alerts popover doesn't close when the flyout is displayed.

Screenshot 2020-08-26 at 13 15 39

I understand that because the anomaly detection job is on the environment, we have to write out the service name in the condition, but I also find it not useful for the end-user. The alerts are only currently available within the selected service view, so unless we're thinking of adding this alert type to the general APM view (no service selected), I think we can do with hiding it.

Screenshot 2020-08-26 at 13 16 58

Example from duration threshold alert

Screenshot 2020-08-26 at 13 17 15

Existing duration anomaly alert

Copy link
Member

@sorenlouv sorenlouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this lgtm.
Would be good if we can find a better solution than the DummyKibanaRequest but not something blocking this PR

Copy link
Member

@jgowdyelastic jgowdyelastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ML changes LGTM

- default environment alert settings to the selected filter
… anomaly detection jobs

- Removes Service name and transaction type from the set of expressions in the alerting setup
@ogupte
Copy link
Contributor Author

ogupte commented Aug 27, 2020

@formgeist, i reorganized the alerting menu to better support different alerting types for specific metrics:
alerting-menu-reorg

@ogupte
Copy link
Contributor Author

ogupte commented Aug 27, 2020

@elasticmachine merge upstream

@formgeist
Copy link
Contributor

@formgeist, i reorganized the alerting menu to better support different alerting types for specific metrics:
alerting-menu-reorg

Looks good, I think this makes sense. Minor nit; would you remove the icons from the "Create..." options inside the nested menu? They don't really belong there any longer.

@ogupte
Copy link
Contributor Author

ogupte commented Aug 27, 2020

@formgeist you suggested removing the service name read-only expression in the transaction duration anomaly alert setup. I also did the same for the transaction type since it will always be either request or page-load depending on the service. LMK if you think we should change it back:
Screen Shot 2020-08-27 at 12 51 11 AM

@formgeist
Copy link
Contributor

@ogupte I had a think after our meeting yesterday, and this morning I put together some proposals for improvements. I don't want to unnecessarily block this PR, since what we have works. But in regards to the service name expression, I've suggested to keep it in order to allow for broader scoping of the alert. We discussed this yesterday briefly, but I've materialized the UX a bit more in my proposal #76068 Look forward to hear everyone's thoughts on these improvements.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

@kbn/optimizer bundle module count

id value diff baseline
apm 945 +2 943

async chunks size

id value diff baseline
apm 4.7MB +25.8KB 4.7MB

page load bundle size

id value diff baseline
apm 41.7KB +1.1KB 40.6KB

distributable file count

id value diff baseline
total 53244 +9 53235

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@ogupte ogupte merged commit b1cb8b8 into elastic:master Aug 28, 2020
ogupte added a commit to ogupte/kibana that referenced this pull request Aug 28, 2020
* Closes elastic#72636. Adds alerting integration for APM transaction duration anomalies.

* Code review feedback

* Display alert summary with the selected anomaly severity label instead of the anomaly score.

* - refactored ALL_OPTION and NOT_DEFINED_OPTION to be shared from common/environment_filter_values
- utilize getEnvironmentLabel in the alerting trigger components and added support for the 'All' label

* refactor get_all_environments to minimize exports and be more consistent and clean

* - Reorg the alerts menu for different alert types (threshold/anomaly)
- default environment alert settings to the selected filter

* - Filters default transaction type to only those supported in the APM anomaly detection jobs
- Removes Service name and transaction type from the set of expressions in the alerting setup

* - remove bell icon from alerts menu

* Adds target service back into the anomaly alert setup as a ready-only expression

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
ogupte added a commit that referenced this pull request Aug 28, 2020
#76224)

* [APM] Transaction duration anomaly alerting integration (#75719)

* Closes #72636. Adds alerting integration for APM transaction duration anomalies.

* Code review feedback

* Display alert summary with the selected anomaly severity label instead of the anomaly score.

* - refactored ALL_OPTION and NOT_DEFINED_OPTION to be shared from common/environment_filter_values
- utilize getEnvironmentLabel in the alerting trigger components and added support for the 'All' label

* refactor get_all_environments to minimize exports and be more consistent and clean

* - Reorg the alerts menu for different alert types (threshold/anomaly)
- default environment alert settings to the selected filter

* - Filters default transaction type to only those supported in the APM anomaly detection jobs
- Removes Service name and transaction type from the set of expressions in the alerting setup

* - remove bell icon from alerts menu

* Adds target service back into the anomaly alert setup as a ready-only expression

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* Fixes prettier linting

* Fixes prettier linting again

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
jloleysens added a commit to jloleysens/kibana that referenced this pull request Aug 31, 2020
…s-for-710

* 'master' of github.com:elastic/kibana: (43 commits)
  [APM] Chart units don't update when toggling the chart legends (elastic#74931)
  [ILM] Add support for frozen phase in UI (elastic#75968)
  Hides advanced json for count metric (elastic#74636)
  add client-side feature usage API (elastic#75486)
  [Maps] add drilldown support map embeddable (elastic#75598)
  [Enterprise Search] Request handler refactors/enhancements + update existing routes to use shared handler (elastic#76106)
  [Resolver] model `location.search` in redux (elastic#76140)
  [APM] Prevent imports of public in server code (elastic#75979)
  fix eslint issue
  skip flaky suite (elastic#76223)
  [APM] Transaction duration anomaly alerting integration (elastic#75719)
  [Transforms] Avoid using "Are you sure" (elastic#75932)
  [Security Solution][Exceptions] - Fix bug of alerts not updating after closure from exceptions modal (elastic#76145)
  [plugin-helpers] improve 3rd party KP plugin support (elastic#75019)
  [docs/getting-started] link to yarn v1 specifically (elastic#76169)
  [Security_Solution][Resolver] Resolver loading and error state (elastic#75600)
  Fixes App Search documentation links (elastic#76133)
  Fix alerts unable to create / update when the name has trailing whitepace(s) (elastic#76079)
  [Resolver] Fix useSelector usage (elastic#76129)
  [Enterprise Search] Migrate util and components from ent-search (elastic#76051)
  ...

# Conflicts:
#	x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/components/data_tier_allocation/node_allocation.tsx
#	x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/edit_policy.tsx
#	x-pack/plugins/index_lifecycle_management/public/application/services/policies/types.ts
#	x-pack/plugins/index_lifecycle_management/public/application/services/policies/warm_phase.ts
@cauemarcondes cauemarcondes self-assigned this Oct 13, 2020
@cauemarcondes
Copy link
Contributor

Tests ok

@cauemarcondes cauemarcondes added the apm:test-plan-done Pull request that was successfully tested during the test plan label Oct 13, 2020
@bmorelli25 bmorelli25 mentioned this pull request Oct 15, 2020
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
apm:test-plan-done Pull request that was successfully tested during the test plan Team:APM All issues that need APM UI Team support v7.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[APM] Anomaly detection based alerts
7 participants