Make action retries configurable #147876

Merged: 8 commits, Dec 23, 2022

@ersin-erdal ersin-erdal commented Dec 20, 2022

Resolves: #146222

This PR makes the maximum number of retries of an action configurable.

It follows the same pattern we used in the alerting plugin:
xpack.actions.run.maxAttempts as a global setting, and
xpack.actions.run.connectorTypeOverrides to override the global setting for specific connector types.
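
The lookup order described above (a per-connector-type override first, then the global value) can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: the `RunConfig` and `ConnectorTypeOverride` shapes, the `resolveMaxAttempts` helper, the `fallback` parameter, and the assumption that each override entry carries an `id` and an optional `maxAttempts` are all hypothetical. Only the two setting names come from this PR.

```typescript
// Hypothetical shapes mirroring xpack.actions.run; not taken from the Kibana source.
interface ConnectorTypeOverride {
  id: string; // connector type id, e.g. '.email' (assumed format)
  maxAttempts?: number;
}

interface RunConfig {
  maxAttempts?: number; // global setting
  connectorTypeOverrides?: ConnectorTypeOverride[];
}

// Returns the override for the connector type if one exists,
// otherwise the global value, otherwise the caller-supplied fallback.
function resolveMaxAttempts(
  run: RunConfig | undefined,
  connectorTypeId: string,
  fallback: number
): number {
  const override = run?.connectorTypeOverrides?.find((o) => o.id === connectorTypeId);
  return override?.maxAttempts ?? run?.maxAttempts ?? fallback;
}

// Example config: one attempt globally, five for the email connector type.
const run: RunConfig = {
  maxAttempts: 1,
  connectorTypeOverrides: [{ id: '.email', maxAttempts: 5 }],
};
```

With this example config, the email connector type would resolve to 5 attempts and any other connector type to 1.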

@ersin-erdal ersin-erdal added the Team:ResponseOps (Label for the ResponseOps team, formerly the Cases and Alerting teams), release_note:feature (Makes this part of the condensed release notes), and v8.7.0 labels Dec 20, 2022
@ersin-erdal ersin-erdal marked this pull request as ready for review December 21, 2022 01:58
@ersin-erdal ersin-erdal requested a review from a team as a code owner December 21, 2022 01:58
@elasticmachine

Pinging @elastic/response-ops (Team:ResponseOps)

@doakalexi doakalexi left a comment

LGTM! Verified locally, and works as expected

@kibana-ci

💚 Build Succeeded

Metrics [docs]

Unknown metric groups

ESLint disabled in files

| id | before | after | diff |
| --- | --- | --- | --- |
| osquery | 1 | 2 | +1 |

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| enterpriseSearch | 19 | 21 | +2 |
| fleet | 61 | 67 | +6 |
| osquery | 109 | 115 | +6 |
| securitySolution | 439 | 445 | +6 |
| total | | | +20 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| enterpriseSearch | 20 | 22 | +2 |
| fleet | 70 | 76 | +6 |
| osquery | 110 | 117 | +7 |
| securitySolution | 515 | 521 | +6 |
| total | | | +21 |

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@@ -199,6 +199,22 @@ Specifies the time allowed for requests to external resources. Requests that tak
+
For example, `20m`, `24h`, `7d`, `1w`. Default: `60s`.

`xpack.actions.run.maxAttempts` {ess-icon}::

This says that maxAttempts indicates the number of retries, but when verifying, it seems to mean the number of tries. When I set maxAttempts to 1, I see the action run only once and never retried, whereas I would expect to see it retried 1 time.

I set maxAttempts: 1 and forced an email error. After the first action run, I see this log:

[2022-12-21T13:22:25.464-05:00][ERROR][plugins.actions.email] Error: email action error
    at Object.executor (/Users/ying/Code/kibana/x-pack/plugins/stack_connectors/server/connector_types/stack/email/index.ts:230:9)
    at Object.wrapper [as executor] (/Users/ying/Code/kibana/node_modules/lodash/lodash.js:5255:19)
    at /Users/ying/Code/kibana/x-pack/plugins/actions/server/lib/action_executor.ts:154:38
[2022-12-21T13:22:25.464-05:00][WARN ][plugins.actions.email] action execution failure: .email:gmail: email: my gmail: an error occurred while running the action: email action error; retry: true

which says retry: true. Since maxAttempts is meant to be max retries, I would expect this to be retried once and the logs indicate it will be retried. However, the action task is deleted after this and the action is never retried.

Good catch, @ymao1! We should fix our docs to say "attempts" instead of "retries". Basically, this value sets how many times an action will run (attempts) before it is aborted.
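
The difference between counting attempts and counting retries that this thread turns on can be sketched as below. The `runWithMaxAttempts` helper and the synchronous loop are hypothetical and only illustrate the counting; they do not reflect how the actions plugin actually schedules runs.

```typescript
// Runs `action` until it succeeds or `maxAttempts` runs have been made.
// Returns how many times the action actually ran.
function runWithMaxAttempts(action: () => void, maxAttempts: number): number {
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts += 1;
    try {
      action();
      return attempts; // success: stop immediately
    } catch {
      // failure: loop again only if attempts remain
    }
  }
  return attempts;
}

// An action that always fails, loosely modeled on the forced email error above.
const alwaysFails = (): void => {
  throw new Error('email action error');
};

const runsWithOne = runWithMaxAttempts(alwaysFails, 1); // one run, zero retries
const runsWithThree = runWithMaxAttempts(alwaysFails, 3); // one run plus two retries
```

Under this reading, a value of 1 means the action runs once and is never retried, which matches the behavior observed during verification.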

@mikecote mikecote Dec 21, 2022

Agreed that the getRetry / retry: true stuff can be confusing 🙈 +1 to a follow-up if we didn't break it in this PR.

Yeah, I don't think this is caused by this PR. A follow-up issue is fine.

Replaced retry with attempt :)
I also checked the retry: true in the response: the action executor returns it when there is an error other than a validation error. I think it means "retryable", not "the last execution was a retry".
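
The reading of retry: true as "retryable" can be sketched as follows. The `Failure` and `ExecutionResult` types and the `toErrorResult` helper are hypothetical names used only for illustration; the sketch assumes nothing about the real executor beyond what this comment states, namely that the flag is set for errors other than validation errors.

```typescript
// Hypothetical failure description; `kind` separates validation errors from the rest.
type Failure = { kind: 'validation' | 'execution'; message: string };

interface ExecutionResult {
  status: 'error';
  message: string;
  retry: boolean; // read as "retryable", not "this run was a retry"
}

// Marks every failure except a validation failure as retryable.
function toErrorResult(failure: Failure): ExecutionResult {
  return {
    status: 'error',
    message: failure.message,
    retry: failure.kind !== 'validation',
  };
}

const executionFailure = toErrorResult({ kind: 'execution', message: 'email action error' });
const validationFailure = toErrorResult({ kind: 'validation', message: 'invalid params' });
```

In this sketch the flag describes the error only, so a result can carry retry: true even when no attempts remain and the action is never run again.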

@ymao1 ymao1 left a comment

LGTM

@ersin-erdal ersin-erdal merged commit ffb1dc3 into elastic:main Dec 23, 2022
@kibanamachine kibanamachine added the backport:skip (This commit does not require backporting) label Dec 23, 2022
@ersin-erdal ersin-erdal deleted the 146222-configurable-retry branch December 23, 2022 14:54
Development

Successfully merging this pull request may close these issues.

Configurable retries for running a connector
7 participants