[Task Manager] Add partitions to tasks and assigns those task partitions to Kibana nodes #188758

Merged
merged 6 commits into from
Jul 19, 2024

Conversation


@doakalexi doakalexi commented Jul 19, 2024

Resolves #187700
Resolves #187698

Summary

This is a feature-branch PR to main. It merges the following PRs, which have already been approved: #188001 and #188368.

doakalexi and others added 6 commits July 15, 2024 07:36
Resolves #187698

## Summary


This PR does the following:
- Adds a new `partition` field to the task manager index
- Assigns a partition to a task when it is created or updated, if it
does not already have one
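
As a rough illustration of the second bullet, the sketch below fills in a partition only when a task lacks one. Everything here is an assumption made for the example and is not taken from the PR's diff: the `TaskDoc` shape, the `withPartition` and `hashId` names, deriving the partition from the task id with an FNV-1a hash, and a partition count of 256 (inferred only from the partition map later in this description, which lists values from 0 to 255).

```typescript
// Hypothetical sketch: names, the hash, and the partition count are assumptions.
const MAX_PARTITIONS = 256; // assumed; the partition map in this PR lists values 0..255

interface TaskDoc {
  id: string;
  taskType: string;
  partition?: number;
}

// FNV-1a, used here only to get a stable non-negative number from the task id.
function hashId(id: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < id.length; i++) {
    hash ^= id.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Return the task unchanged if it already has a partition; otherwise add one.
function withPartition(task: TaskDoc): TaskDoc {
  if (task.partition !== undefined) {
    return task;
  }
  return { ...task, partition: hashId(task.id) % MAX_PARTITIONS };
}
```

Deriving the value from the id keeps the result stable across repeated updates, which is one reasonable design choice; a random assignment that is persisted once would satisfy the same bullet.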

### Checklist

- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios


### To verify

New tasks
- Create a rule and verify that the newly created tasks have the
`partition` field

Old tasks
- Check out main and create a new rule, then let it run
- Stop Kibana
- Check out this branch and restart Kibana
- Verify that the old tasks get updated with the `partition` field

For example, this is the query I use to look at the ES query rule task:
```
POST .kibana_task_manager*/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "task.taskType": {
              "value": "alerting:.es-query"
            }
          }
        }
      ]
    }
  }
}
```
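
To find old tasks that have not been updated yet, the same search can be narrowed to documents without a partition. The sketch below builds that request body. It assumes the new field is stored as `task.partition`, mirroring `task.taskType` in the query above, and `tasksMissingPartitionQuery` is a hypothetical helper name.

```typescript
// Assumption: the new field lives at `task.partition`, next to `task.taskType`.
function tasksMissingPartitionQuery(taskType: string) {
  return {
    query: {
      bool: {
        // Same term filter as the query above.
        filter: [{ term: { 'task.taskType': { value: taskType } } }],
        // Keep only documents where the partition field is absent.
        must_not: [{ exists: { field: 'task.partition' } }],
      },
    },
  };
}

// Print a body that can be pasted after `POST .kibana_task_manager*/_search`.
console.log(JSON.stringify(tasksMissingPartitionQuery('alerting:.es-query'), null, 2));
```

If the field path is right, this search should return no hits once every old task has been updated.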

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Resolves #187700

## Summary

This PR uses the discovery service to assign a subset of the partitions
to each Kibana node, so that only two Kibana nodes compete for the same
tasks.
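
One way to picture the assignment is a round-robin pass that hands each partition to the next two pods in the list. The sketch below is an illustration under assumptions, not the PR's code: the function name, its parameters, the default of 256 partitions, and the use of the pod list in the order given are all hypothetical. For five pods with the current pod first, it does produce the same values as the partition map listed in the local testing steps.

```typescript
// Hypothetical sketch of a round-robin assignment with two pods per partition.
function assignPodPartitions(
  podName: string,
  allPodNames: string[],
  partitionCount = 256, // assumed total number of partitions
  podsPerPartition = 2 // "only two Kibana nodes" per partition
): number[] {
  const index = allPodNames.indexOf(podName);
  if (index === -1) {
    return []; // unknown pod: nothing assigned
  }
  const mine: number[] = [];
  let cursor = 0; // index of the next pod to receive a partition
  for (let partition = 0; partition < partitionCount; partition++) {
    let assignedToMe = false;
    for (let k = 0; k < podsPerPartition; k++) {
      if (cursor === index) {
        assignedToMe = true;
      }
      cursor = (cursor + 1) % allPodNames.length;
    }
    if (assignedToMe) {
      mine.push(partition);
    }
  }
  return mine;
}

const partitions = assignPodPartitions('kibana-0', ['kibana-0', 'w', 'x', 'y', 'z']);
console.log(partitions.slice(0, 6)); // [0, 2, 5, 7, 10, 12]
```

With a single pod the cursor never moves off that pod, so it receives every partition.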

### Checklist

- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios


### To verify

This change applies only to the mget claim strategy, so add the following to `kibana.yml`:
```
xpack.task_manager.claim_strategy: 'unsafe_mget'
```
**Testing locally**

Old tasks
- Check out main and create a new rule, then let it run
- Stop Kibana
- Check out this branch and restart Kibana
- Verify that on the first run after restarting (when the task does not
yet have a partition), the rule runs. It might be helpful to create a
rule with a long interval and use run soon.

<details>
<summary>New tasks, but it might be easier to just test on
cloud</summary>

- Start Kibana
- Replace this
[line](https://github.com/elastic/kibana/pull/188368/files#diff-46ca6f79fdc2b69e1d6ddc2401eab6469f8dfb9521f93f90132de624a9693aa5R48)
with the following
```
return [this.podName, 'w', 'x', 'y', 'z'];
```
- Create a few rules and check their partition values using the example
query below:
```
POST .kibana_task_manager*/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "task.taskType": {
              "value": "alerting:.es-query"
            }
          }
        }
      ]
    }
  }
}
```
- Using the partition map that is expected to be generated for the
current Kibana node (shown below), verify that tasks whose partitions
are in the map run and tasks whose partitions are not in the map do not
run.

```
[
  0, 2, 5, 7, 10, 12, 15, 17, 20, 22, 25, 27, 30, 32, 35, 37, 40, 42, 45, 47, 50, 52, 55, 57, 60,
  62, 65, 67, 70, 72, 75, 77, 80, 82, 85, 87, 90, 92, 95, 97, 100, 102, 105, 107, 110, 112, 115,
  117, 120, 122, 125, 127, 130, 132, 135, 137, 140, 142, 145, 147, 150, 152, 155, 157, 160, 162,
  165, 167, 170, 172, 175, 177, 180, 182, 185, 187, 190, 192, 195, 197, 200, 202, 205, 207, 210,
  212, 215, 217, 220, 222, 225, 227, 230, 232, 235, 237, 240, 242, 245, 247, 250, 252, 255
]
```
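
The check described in the last step can be expressed as a small predicate. This is a sketch under assumptions: `TaskDoc` and `isClaimableByNode` are hypothetical names, and treating a task without a partition as claimable by any node is inferred from the "Old tasks" check above, where the rule still runs on the first run after the restart.

```typescript
interface TaskDoc {
  id: string;
  partition?: number;
}

// Assumption: a task with no partition stays claimable by every node.
function isClaimableByNode(task: TaskDoc, nodePartitions: ReadonlySet<number>): boolean {
  return task.partition === undefined || nodePartitions.has(task.partition);
}

const nodePartitions = new Set([0, 2, 5, 7, 10]); // first entries of the map above
const tasks: TaskDoc[] = [
  { id: 'in-map', partition: 5 },
  { id: 'not-in-map', partition: 6 },
  { id: 'no-partition' },
];

const claimable = tasks.filter((t) => isClaimableByNode(t, nodePartitions)).map((t) => t.id);
console.log(claimable); // ['in-map', 'no-partition']
```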
</details>

**Testing on cloud**

- The PR has been deployed to cloud, so you can create multiple rules
and verify that they all run. If for some reason they do not run, the
nodes are not picking up their assigned partitions correctly.
@doakalexi
Contributor Author

/ci

@elasticmachine
Contributor

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| taskManager | 60 | 62 | +2 |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| taskManager | 103 | 105 | +2 |

@doakalexi doakalexi changed the title from "Ro task partitioning" to "[Task Manager] Add partitions to tasks and assign those task partitions to Kibana nodes" Jul 19, 2024
@doakalexi doakalexi added the Team:ResponseOps, release_note:skip, and v8.16.0 labels Jul 19, 2024
@doakalexi doakalexi marked this pull request as ready for review July 19, 2024 16:13
@doakalexi doakalexi requested review from a team as code owners July 19, 2024 16:13
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@doakalexi doakalexi requested a review from mikecote July 19, 2024 16:14
Contributor

@TinaHeiligers TinaHeiligers left a comment


Core code changes (new mappings) are identical to those in #188001 and LGTM.

@doakalexi doakalexi merged commit 5adf5be into main Jul 19, 2024
42 of 43 checks passed
@doakalexi doakalexi deleted the ro-task-partitioning branch July 19, 2024 18:46
@kibanamachine kibanamachine added the backport:skip label Jul 19, 2024
@doakalexi doakalexi changed the title from "[Task Manager] Add partitions to tasks and assign those task partitions to Kibana nodes" to "[Task Manager] Add partitions to tasks and assigns those task partitions to Kibana nodes" Jul 19, 2024
Labels
backport:skip This commit does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

- [Task Manager] Assign task partitions to Kibana nodes
- [Task Manager] Task Partitioning
5 participants