feat(slo): health status #181351
Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)
/ci
data-test-subj="sloHealthCalloutInspectTransformButton"
color="warning"
fill
href={http?.basePath.prepend('/app/management/data/transform')}
Would it be possible to pre-filter the transform list? Possibly not, though. I guess we could contribute to the transform list page so that it filters using query params.
it's not supported at the moment :(
/ci
…slo_health.ts Co-authored-by: Shahzad <shahzad31comp@gmail.com>
queryFn: async ({ signal }) => {
  try {
    const response = await http.post<FetchSLOHealthResponse>(
      '/internal/observability/slos/_health',
@kdelemme We were discussing with @lucabelluccini about enhancing the Kibana diagnostic tool to pull from this API. One question that came up was the fact that it is a POST request. We can inject the appropriate 'kbn-xsrf' and 'elastic-api-version: 1' headers for this into the diagnostic tool.
What I am wondering about now is the list payload that is required for this endpoint. Looking at this file, how would we pass the list of SLOs? It looks like all requests in the kibana yml file are GET requests, no? Could we make the list prop optional so that, if it is not passed, the endpoint accepts all SLOs by default? What are your thoughts on this?
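For illustration, the POST request discussed above might be assembled like this by an external tool. This is a minimal sketch, not the diagnostic tool's actual code: `buildSloHealthRequest` and the `SloHealthListItem` shape are hypothetical, since the payload schema is not shown in this thread. Only the path, the `list` prop, and the two header names come from the discussion.

```typescript
// Hypothetical item shape: the real schema of each list entry is not shown here.
interface SloHealthListItem {
  sloId: string;
  sloInstanceId: string;
}

interface SloHealthRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Builds the URL and fetch-style options without sending anything.
function buildSloHealthRequest(
  kibanaBaseUrl: string,
  list: SloHealthListItem[]
): SloHealthRequest {
  // Strip trailing slashes so the path is not doubled up.
  const base = kibanaBaseUrl.replace(/\/+$/, '');
  return {
    url: `${base}/internal/observability/slos/_health`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        // Headers mentioned in the comment above; the kbn-xsrf value is arbitrary.
        'kbn-xsrf': 'true',
        'elastic-api-version': '1',
      },
      body: JSON.stringify({ list }),
    },
  };
}
```

The returned `url` and `init` could then be passed to `fetch(url, init)` along with whatever authentication the deployment requires.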
We have to limit the request to the provided list because fetching all SLOs at once would be too much in some cases, the same way we don't return all SLOs in the find API.
What we can do instead is call the transform stats endpoint directly from the diagnostic tool with the slo-* id. This returns all the SLO transform stats in one request (I think there is still a limit on this API, like 1000).
The diagnostic tool could then filter and transform the result to keep only the health.status part of it, or return the payload as is.
This API is already available and is a GET.
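The filtering step suggested above could look roughly like the following. This is a sketch under assumptions: the interfaces type only the subset of the transform stats response used here, `extractHealthStatuses` is a hypothetical helper, and the `'unknown'` fallback is an illustrative choice for entries that carry no health object.

```typescript
// Assumed subset of a transform stats response; other fields are ignored.
interface TransformStatsEntry {
  id: string;
  health?: { status: string };
}

interface TransformStatsResponse {
  count: number;
  transforms: TransformStatsEntry[];
}

// Keeps only the health.status of each SLO transform, keyed by transform id.
function extractHealthStatuses(
  response: TransformStatsResponse
): Record<string, string> {
  const statuses: Record<string, string> = {};
  for (const transform of response.transforms) {
    // Defensive filter in case the payload contains non-SLO transforms.
    if (!transform.id.startsWith('slo-')) continue;
    // 'unknown' is a hypothetical fallback, not a value returned by the API.
    statuses[transform.id] = transform.health?.status ?? 'unknown';
  }
  return statuses;
}
```

The function is pure, so the diagnostic tool could apply it to the JSON body of the GET response or skip it and return the payload as is.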
LGTM !!
💛 Build succeeded, but was flaky
Resolves #176088
🍒 Summary
This PR implements a new internal route for fetching the health and state of a list of SLO ids. The state can be one of the following: no_data, indexing, running or stale. The health is directly correlated to the health of the related transforms.
The state decision tree is as follows:
- summaryUpdatedAt > 48 hours: state = "stale"
- summaryUpdatedAt - latestSliTimestamp >= 10 minutes: state = "indexing"
- summaryUpdatedAt - latestSliTimestamp < 10 minutes: state = "running"
We display a warning on the SLO details page when one of the transforms is unhealthy, asking the user to investigate.
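The state decision tree above can be sketched as a small pure function. This is an illustration, not the code from this PR: `computeSloState` is a hypothetical name, timestamps are assumed to be epoch milliseconds, "summaryUpdatedAt > 48 hours" is read as "the summary was last updated more than 48 hours ago", and the `no_data` condition (a missing timestamp) is an assumption because the summary does not spell it out.

```typescript
type SloState = 'no_data' | 'indexing' | 'running' | 'stale';

const STALE_THRESHOLD_MS = 48 * 60 * 60 * 1000; // 48 hours
const LAG_THRESHOLD_MS = 10 * 60 * 1000; // 10 minutes

function computeSloState(
  summaryUpdatedAt: number | undefined,
  latestSliTimestamp: number | undefined,
  now: number
): SloState {
  // Assumption: a missing timestamp means no data has been produced yet.
  if (summaryUpdatedAt === undefined || latestSliTimestamp === undefined) {
    return 'no_data';
  }
  // Summary not refreshed for more than 48 hours.
  if (now - summaryUpdatedAt > STALE_THRESHOLD_MS) {
    return 'stale';
  }
  // Summary is at least 10 minutes ahead of the latest SLI document.
  if (summaryUpdatedAt - latestSliTimestamp >= LAG_THRESHOLD_MS) {
    return 'indexing';
  }
  return 'running';
}
```

Passing `now` as a parameter keeps the function deterministic, which makes the boundary cases (exactly 10 minutes, just over 48 hours) easy to check.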