Ctxvars stimulus ID #6068

fjetter · 2022-04-05T13:28:10Z

This builds on #6046 (or at least a version of it). It moves set_stimulus_id (formerly Scheduler.stimulus_id) to the utils.py module and applies the contextvar to the worker as well.

I encountered some problems with the dataclasses we're using. I'm sure this can be ironed out but I didn't want to waste time.

inspect.iscoroutinefunction doesn't recognise cythonised async functions

This reverts commit 1ce45f0.

This reverts commit 08a6812.

This reverts commit ef4e29c.

github-actions · 2022-04-05T22:03:34Z

Unit Test Results

    12 files  -     6  |      12 suites - 6   |  6h 13m 29s ⏱️ - 3h 9m 57s
 2 724 tests  ±     0  |   2 616 ✔️ -    25  |   99 💤 +  18  |   9 ❌ +  7
16 260 runs   - 8 104  |  15 339 ✔️ - 7 804  |  877 💤 - 342  |  44 ❌ + 42

For more details on these failures, see this check.

Results for commit c780413. ± Comparison against base commit 64615ed.

sjperkins

I've been thinking about this PR. Here's an example call chain, with the stimulus_id in effect at each step shown on the right after the # comment.

Client._send_to_scheduler({"op": "update-graph-hlg", "stimulus_id": "CHLG"}) # CHLG
Scheduler.update_graph_hlg # CHLG
Scheduler.update_graph # CHLG
Scheduler.worker_send({"op": "compute-task"}) # CHLG
Worker.handle_compute_task # CHLG
Worker.execute # CHLG
Worker._handle_instruction # "task-finished"
Worker.batched_stream({"op": "task-finished"}) # "task-finished"
Scheduler.handle_task_finished # "task-finished"

I've been thinking about this in a Cluster vs a LocalCluster approach.

Within a Distributed Cluster, the STIMULUS_ID ContextVar will be unset and must be derived from a message sent by an external entity (Client, Scheduler, Worker).
Within a LocalCluster (Client, Scheduler, Worker in the same process), the STIMULUS_ID may already have been set.
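
The two cases can be reproduced with the standard-library contextvars module alone. This is only a sketch: STIMULUS_ID here is a stand-in for the variable in the PR, and resolve_stimulus_id and the message layout are hypothetical. Which source should win when both are present is exactly the open question; the sketch arbitrarily prefers the value already set in the context.

```python
from contextvars import ContextVar, copy_context

# Stand-in for the PR's context variable; no default, so it can be unset.
STIMULUS_ID = ContextVar("stimulus_id")


def resolve_stimulus_id(msg):
    """Hypothetical helper: use the id already set in this context,
    otherwise fall back to the id carried by the incoming message."""
    try:
        return STIMULUS_ID.get()
    except LookupError:
        return msg["stimulus_id"]


msg = {"op": "compute-task", "stimulus_id": "from-message"}

# Separate processes: nothing has set STIMULUS_ID, so the message is the only source.
distributed_case = copy_context().run(resolve_stimulus_id, msg)


def local_case_body():
    # Same process: an upstream caller has already set the variable.
    STIMULUS_ID.set("already-set")
    return resolve_stimulus_id(msg)


# Run in a copied context so the set() does not leak into the caller.
local_case = copy_context().run(local_case_body)
```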

I am considering how to write code to handle both cases and this is complicated by examples such as update_graph_hlg calling update_graph.

I also wonder if a Distributed ContextVar might be possible with the simple use of STIMULUS_ID.set(stimulus_id) at various handler boundaries and copy_context().run at thread boundaries.
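
A rough sketch of that idea, assuming a handler that receives the stimulus_id in its message. handle_message and record are hypothetical names, not functions from distributed; only ContextVar.set, ContextVar.reset and copy_context().run are real standard-library calls.

```python
import threading
from contextvars import ContextVar, copy_context

STIMULUS_ID = ContextVar("stimulus_id")  # stand-in for the variable in the PR


def record(log):
    # Deep in the call stack: read the id without it being passed as an argument.
    log.append(STIMULUS_ID.get("<unset>"))


def handle_message(msg, log):
    """Hypothetical handler: set the id at the boundary, then do the work."""
    token = STIMULUS_ID.set(msg["stimulus_id"])
    try:
        record(log)  # same thread: sees the id set above

        # Thread boundary: run the target inside a copy of the current
        # context so the id is carried over explicitly.
        ctx = copy_context()
        worker = threading.Thread(target=ctx.run, args=(record, log))
        worker.start()
        worker.join()
    finally:
        # Restore the previous state so the id does not leak past the handler.
        STIMULUS_ID.reset(token)


log = []
handle_message({"op": "update-graph-hlg", "stimulus_id": "CHLG"}, log)
```

Resetting the token in the finally clause keeps one handler's id from bleeding into the next message handled in the same context.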

I think there's a lot of complexity here (async, threads and inter-process communication) that I'm trying to get my head around. To improve my understanding I would like to model this with some minimal distributed.core.Server's in a test case.

A further thought on managing this complexity: default_stimulus_id could take kwargs such as override (always replace an existing STIMULUS_ID) and require_empty (require that STIMULUS_ID is not set before setting it).
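
One possible shape for those kwargs, written as a context manager. This is a hypothetical sketch: the signature, the error types and the nested update_graph_hlg / update_graph stand-ins are assumptions for illustration, not the helper in this PR.

```python
from contextlib import contextmanager
from contextvars import ContextVar

STIMULUS_ID = ContextVar("stimulus_id")  # stand-in for the variable in the PR


@contextmanager
def default_stimulus_id(stimulus_id, *, override=False, require_empty=False):
    """Hypothetical sketch; the real helper's signature may differ.

    override: always replace an existing STIMULUS_ID.
    require_empty: fail if STIMULUS_ID is already set.
    """
    if override and require_empty:
        raise ValueError("override and require_empty are mutually exclusive")
    current = STIMULUS_ID.get(None)
    if current is not None and require_empty:
        raise AssertionError(f"STIMULUS_ID already set to {current!r}")
    if current is None or override:
        token = STIMULUS_ID.set(stimulus_id)
        try:
            yield stimulus_id
        finally:
            STIMULUS_ID.reset(token)
    else:
        # An outer handler already set an id: keep it.
        yield current


def update_graph():
    # Stand-in for a handler that may be called directly or from another handler.
    with default_stimulus_id("update-graph") as sid:
        return sid


def update_graph_hlg():
    with default_stimulus_id("update-graph-hlg"):
        return update_graph()


nested = update_graph_hlg()  # inner call keeps the outer id
direct = update_graph()      # no outer id, so the default is used
```

With the default behaviour, a handler that calls another handler keeps the outer id, which covers the update_graph_hlg calling update_graph case; require_empty would turn an unexpected pre-existing id into a hard failure instead.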

sjperkins and others added 30 commits April 1, 2022 12:19

test_scheduler.py succeeds

c2a3ead

Working test_worker.py and test_client.py

5d10407

Support transition_log in http output

e30be1b

Rename assert_worker_story assert_story

308568e

If possible, defer to STIMULUS_ID when sending messages

b017a12

Support passing stimulus_id in Scheduler handlers

dd6efcb

Transmit stimulus_id's from client

40f87f7

Generate new stimulus_id on completion/failure of Worker.execute

52697de

Use decorator to manage stimulus injection

17f069a

Merge branch 'main' into stimulus-ids-contextvars

36e9ef7

Enable github tmate

ef4e29c

Target specific test case

08a6812

bump

1ce45f0

Explicitly specify sync/async stimulus_handler

222b24d

inspect.iscoroutinefunction doesn't recognise cythonised async functions

Assert with is_coroutine_function

55e5216

Document sync parameter

22e3300

Revert "bump"

9c42c3f

This reverts commit 1ce45f0.

Revert "Target specific test case"

a517fcb

This reverts commit 08a6812.

Revert "Enable github tmate"

e29ad73

This reverts commit ef4e29c.

comments

8458969

Template stimulus_id var in dashboard

caf9a1d

Pass stimulus_id to Client._decref

a57d9c1

stimulus_handler changes

b311bef

worker changes

1cf4032

Merge branch 'main' into stimulus-ids-contextvars

9b789f6

Use a contextmanager instead of a decorator

4433754

Remove default stimulus_id's throughout the scheduler

046b49a

RuntimeError -> AssertionError

9241c72

Move STIMULUS_ID ctxvar to utils

789080d

use ctx.run for gather_dep

93f8aa7

add reason kwarg

c780413

sjperkins self-requested a review April 6, 2022 09:54

sjperkins reviewed Apr 6, 2022

sjperkins mentioned this pull request Apr 6, 2022

Support Stimulus ID's in Scheduler with ContextVars #6046

Closed

3 tasks

fjetter closed this Apr 8, 2022
