[Task Manager] Throttle TM health logging #102783

Closed
chrisronline opened this issue Jun 21, 2021 · 5 comments · Fixed by #102804
Labels
bug Fixes for quality problems that affect the customer experience Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@chrisronline
Contributor

Relates to #101751

Some possible ideas:

  • Gate entire logging changes behind a config switch (off by default)
  • Log a message telling users they can switch config on when the status changes (from green -> warn/error)
  • While in a warn/error status, perhaps log at some throttled interval (every 1h?)
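The three ideas above could fit together roughly as follows. This is a minimal sketch, not the Task Manager implementation: `ThrottledHealthLogger`, its options, and the message wording are all hypothetical names chosen for illustration, and the one-hour interval is only the value floated in the list.

```typescript
// Hypothetical sketch: none of these names come from the Kibana codebase.
type HealthStatus = "ok" | "warn" | "error";

interface ThrottledHealthLoggerOptions {
  enabled: boolean; // config switch for detailed logging (off by default)
  throttleMs: number; // minimum gap between repeated warn/error logs
  log: (message: string) => void;
  now?: () => number; // injectable clock, defaults to Date.now
}

class ThrottledHealthLogger {
  private readonly opts: ThrottledHealthLoggerOptions;
  private readonly now: () => number;
  private lastStatus: HealthStatus = "ok";
  private lastLoggedAt: number | undefined = undefined;

  constructor(opts: ThrottledHealthLoggerOptions) {
    this.opts = opts;
    this.now = opts.now ?? (() => Date.now());
  }

  report(status: HealthStatus, details: string): void {
    const previous = this.lastStatus;
    this.lastStatus = status;

    if (status === "ok") {
      // Healthy again: reset the throttle window and stay quiet.
      this.lastLoggedAt = undefined;
      return;
    }

    const justDegraded = previous === "ok";

    if (!this.opts.enabled) {
      // Detailed logging is off: only hint once, when the status first degrades.
      if (justDegraded) {
        this.opts.log(
          `Task Manager health changed to ${status}; enable health logging in the config for details`
        );
      }
      return;
    }

    const t = this.now();
    const due =
      this.lastLoggedAt === undefined || t - this.lastLoggedAt >= this.opts.throttleMs;

    // Log on the first degraded report, then at most once per throttle window.
    if (justDegraded || due) {
      this.lastLoggedAt = t;
      this.opts.log(`[${status}] ${details}`);
    }
  }
}
```

With `throttleMs` set to `60 * 60 * 1000`, a deployment stuck in a warn state would produce one detailed message per hour instead of one per health check. In this sketch a move from warn to error counts as staying degraded, so it is throttled too.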
@chrisronline chrisronline added bug Fixes for quality problems that affect the customer experience Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Jun 21, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@tsullivan
Member

Chiming in to say I think this log message could include more useful context:
[warning][plugins][taskManager] Detected delay task start of 69.115s (which exceeds configured value of 60s)

Can you include the "task type" in the message? For example, I suspect my example log message applies to a report:execute task.
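For illustration, a message that carries the task type might be built like this. The helper name `formatDelayWarning` and the exact wording are assumptions made for this sketch, not what the linked PR produces.

```typescript
// Hypothetical helper: the name and message wording are illustrative only.
function formatDelayWarning(
  taskType: string,
  delaySeconds: number,
  thresholdSeconds: number
): string {
  return (
    `Detected delay task start of ${delaySeconds}s for task "${taskType}" ` +
    `(which exceeds configured value of ${thresholdSeconds}s)`
  );
}

// Example call using the values from the log line quoted above.
console.log(formatDelayWarning("report:execute", 69.115, 60));
```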

@chrisronline
Contributor Author

@tsullivan That's a good idea! I added it to the PR in 4321ec8

@spong
Member

spong commented Jun 24, 2021

Not sure if this is the right place to mention it, but I also noticed the new TM health logging and saw a misspelling in one of the Latest Monitored Stats fields.

capacity_requirments vs capacity_requirements (source)

Looks like this is new for 7.14 as of #100475, so we may still be able to fix it before release if we want.

@chrisronline
Contributor Author

@spong Thanks, I'll log a ticket for it!

@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
5 participants