-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(plugin): Add new task-batching plugin. #495
base: master
Are you sure you want to change the base?
Conversation
d7f70c1
to
9b076b1
Compare
fa28ad3
to
08c3af2
Compare
e980730
to
ddfde5b
Compare
query, params, err := sqlgenerator.PGsql. | ||
Select("count (*)"). | ||
From("task t"). | ||
Join("batch b on b.id = t.id_batch"). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id_batch
should be batch_id
to be coherent with the rest of the code, and it reads better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't agree more, but id_batch
is the column's name in the database, I have no power over it sadly. 🥲
https://github.com/ovh/utask/blob/master/sql/schema.sql#L44
pkg/batch/batch.go
Outdated
return taskIDs, nil | ||
} | ||
|
||
func conjMap(common, particular map[string]interface{}) (map[string]interface{}, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of this function does not clearly reflect its usage.
It does a merge/union of two map while checking that common keys are not overriden => btw, I'm skeptical of the check line 71/72. It seems legitimate to want to define a "default/common" value for all tasks, while still overriding it for some tasks, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function used to sit elsewhere (https://github.com/ovh/utask/blob/master/api/handler/batch.go#L80), I simply moved it and kept the exact same behaviour. Changing its name (something like mergeMaps
seems adequate) won't have any side effect, but I'm not sure changing its behaviour is 100% safe.
That being said, allowing each task to override the default value is a good idea and seems like a valid use case.
Added a task-batching plugin that creates X tasks of the same template and waits until all of them are in a final state. Moved batch-creation code to the engine package and made the POST batch handler use it. Exposed final tasks' states to be used by the batch plugin. Signed-off-by: Ruben Tordjman <ruben.tordjman.ext@ovhcloud.com>
Signed-off-by: Ruben Tordjman <ruben.tordjman.ext@ovhcloud.com>
Signed-off-by: Ruben Tordjman <ruben.tordjman.ext@ovhcloud.com>
…when the task belongs to a batch where other tasks are still running. Added a way of getting the amount of tasks still running in a batch (moved from the batch plugin). Signed-off-by: Ruben Tordjman <ruben.tordjman.ext@ovhcloud.com>
Simplified output formatting. Signed-off-by: Ruben Tordjman <ruben.tordjman.ext@ovhcloud.com>
Added a Readme. Signed-off-by: Ruben Tordjman <ruben.tordjman.ext@ovhcloud.com>
Signed-off-by: Ruben Tordjman <ruben.tordjman.ext@ovhcloud.com>
Increased the clarity of some variables, functions and the Readme. Signed-off-by: Ruben Tordjman <ruben.tordjman.ext@ovhcloud.com>
22d7a64
to
fc87e13
Compare
Signed-off-by: Ruben Tordjman <144785435+DrRebus@users.noreply.github.com>
fc87e13
to
63bd3e6
Compare
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
A new feature: a task batching plugin.
What is the current behavior? (You can also link to an open issue here)
The only way of batching tasks so far is to use a
subtask
plugin with theforeach
feature, which dynamically adds a new step persubtask
. This approach isn't efficient DB-wise, because everytime a step is updated, the whole step tree (a JSON formatted string) gets updated, triggering anUPDATE
in the database. TheseUPDATE
s create new bigger-and-bigger rows and delete the previous ones, bloating the DB and requiring a strongerVACUUM
.What is the new behavior (if this is a feature change)?
The task-batching plugin spawns child tasks as independent tasks that wake the parent once completed. Since the parent task no longer embed its children, no hard link exists between them, but they're linked via:
ParentTaskID
tag, the public ID of the parent task, allowing the child to wake its parent.BatchID
property, the public ID of the batch the task belongs two.Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
Nope.
Other information:
This plugin doesn't yet aggregate results of the children tasks. It'll come in a subsequent version.
Contrary to what's been discussed, when using sub batches (
sub_batch_size
> 0), children tasks don't always wake their parent when they're done. Instead, the last task in the sub batch will wake the parent. That way, we limit even more the amount of UPDATEs made. The downside is that we have to wait for the whole sub batch to be done before spawning more tasks in the batch.