Introducing Task Run Locking for Enhanced Concurrency Control in Gokart #353
LGTM
Skip completed tasks with `complete_check_at_run`
---------------------------
By setting `gokart.TaskOnKart.complete_check_at_run` to True, the existence of the cache can be rechecked at run() time.
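To make the quoted documentation line concrete, here is a minimal standard-library sketch of rechecking completion at run time. It does not use gokart itself: `SketchTask`, `skip_if_complete`, and the dict-based cache are hypothetical names chosen for illustration, and only the attribute name `complete_check_at_run` comes from the diff above.

```python
def skip_if_complete(run_func, complete_func):
    """Wrap run_func so it becomes a no-op when complete_func() is already True."""
    def wrapped():
        if complete_func():
            return None  # the cache already exists, so skip the work
        return run_func()
    return wrapped


class SketchTask:
    """Hypothetical stand-in for a task; not the gokart.TaskOnKart API."""

    # Mirrors the flag named in the documentation line above.
    complete_check_at_run = True

    def __init__(self, cache):
        self.cache = cache  # a shared dict standing in for the task cache
        self.run_count = 0
        if self.complete_check_at_run:
            # Recheck completion at the moment run() is called.
            self.run = skip_if_complete(self.run, self.complete)

    def complete(self):
        return "result" in self.cache

    def run(self):
        self.run_count += 1
        self.cache["result"] = 42
```

If two such tasks share one cache, the second call to `run()` returns without doing the work, because the first has already written the result.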
When should we set this to be false? I feel this can be always True :)
Agree with this solution #358 :)
Co-authored-by: Keisuke OGAKI <hikingko1@gmail.com>
…m:mski-iksm/gokart into feature/add_run_lock_to_gokart_taskonkart
@mski-iksm THX for adding brand new feature 👍
Introducing Task Run Locking for Enhanced Concurrency Control in Gokart
tl;dr
Summary
This pull request introduces significant updates aimed at enhancing the efficiency and reliability of running tasks on multiple workers in a Gokart/Luigi pipeline. Specifically, it adds new documentation on efficient multi-worker execution, updates task conflict prevention mechanisms, and integrates backoff strategies for handling task lock exceptions. These changes are designed to prevent redundant task executions and ensure more robust task locking in distributed environments.
Changes
- Documentation Addition: Added a new documentation file `efficient_run_on_multi_workers.rst` that guides users on how to improve efficiency when running similar Gokart pipelines on multiple workers. This includes strategies to skip completed tasks and suppress the execution of tasks already being run by another worker.
- Documentation Update: Updated the `index.rst` to include the new documentation in the User Guide section.
- Task Conflict Prevention Lock: Renamed `using_task_cache_collision_lock.rst` to `using_task_task_conflict_prevention_lock.rst` to better reflect the mechanism's purpose. The documentation within has also been updated to align with the new naming convention and clarify the prevention of task cache conflicts.
- Code Enhancements:
  - `gokart/build.py`: updated to include backoff strategies when encountering `TaskLockException`, allowing for automatic retrying with exponential backoff until a maximum number of tries or wait time is reached.
  - `task_lock.py` and `task_lock_wrappers.py`: updated to support the new locking mechanism during task execution (`run` method), ensuring that tasks are not executed redundantly across workers.
  - `wrap_run_with_lock.py`: facilitates wrapping the task's `run` method with a lock, preventing simultaneous execution of the same task by multiple workers.
  - `gokart/task.py`: updated to automatically apply run locking based on task configuration, enhancing task execution efficiency in distributed environments.
- Dependency Addition: Added `backoff` library to `pyproject.toml` and updated `poetry.lock` accordingly. This library is utilized to implement exponential backoff strategy when handling task lock exceptions.
Impact
Testing
Documentation