
Separation of concerns between Shuffle and WorkerShuffle #7195

Closed
wants to merge 13 commits

Conversation

@fjetter (Member) commented on Oct 26, 2022

Builds on #7186

This PR refactors the Worker extension and the Shuffle class to have stricter separation of concerns.

Specifically,

Shuffle is responsible for all splitting, sending, flushing, receiving, etc., and is the sole owner of the associated resources (e.g. comms, buffers, threads, background tasks). It is entirely asynchronous and not intended to be interacted with directly. It is essentially agnostic of what a worker even is.

The extension, on the other hand, is the interface between the worker and the shuffle instance. It routes RPC calls to the shuffle and exposes synchronous methods to be called from the worker thread. It is also responsible for communicating with the scheduler. (A sketch of this layering follows below.)
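To make the intended split concrete, here is a minimal sketch of the layering. All names and signatures below are illustrative stand-ins, not the actual distributed API: the shuffle side is purely asynchronous and owns its state, while the extension bridges calls from the synchronous worker thread onto the event loop.

```python
import asyncio
from concurrent.futures import Future


class ShuffleSketch:
    """Async side: owns buffers/comms and never touches the worker directly."""

    def __init__(self) -> None:
        self.buffer: list[bytes] = []

    async def add_partition(self, data: bytes) -> None:
        # In the real class this would split the partition and send shards.
        self.buffer.append(data)


class WorkerExtensionSketch:
    """Sync/async bridge: routes worker-thread calls onto the event loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop, shuffle: ShuffleSketch) -> None:
        self.loop = loop
        self.shuffle = shuffle

    def add_partition(self, data: bytes) -> None:
        # Called from the worker thread; blocks until the async side is done.
        fut: Future = asyncio.run_coroutine_threadsafe(
            self.shuffle.add_partition(data), self.loop
        )
        fut.result()
```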


Additional changes include some fixes around concurrency. In particular, this should close #6277 (see test_error_offload and test_slow_offload in test_shuffle.py).
The gist is that the shuffle/extension did not wait for the offloaded threads to complete when flushing, i.e. it did not wait for in-flight receives to finish before a flush. A sketch of that failure mode follows.
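A toy reproduction of this failure mode (purely illustrative, not the actual Shuffle code): if receives are offloaded to background tasks, flush() has to wait for those tasks, otherwise it can report completion while data is still in flight and the late shards are dropped.

```python
import asyncio


class ReceiveBufferSketch:
    """Toy buffer: flush() must not run ahead of in-flight receives."""

    def __init__(self) -> None:
        self.shards: list[bytes] = []
        self._receive_tasks: set[asyncio.Task] = set()

    def receive(self, payload: bytes) -> None:
        # Offload the (potentially slow) processing and keep a handle to it.
        task = asyncio.get_running_loop().create_task(self._process(payload))
        self._receive_tasks.add(task)
        task.add_done_callback(self._receive_tasks.discard)

    async def _process(self, payload: bytes) -> None:
        await asyncio.sleep(0)  # stand-in for offloaded deserialization / disk I/O
        self.shards.append(payload)

    async def flush(self) -> list[bytes]:
        # The fix: wait for every in-flight receive before declaring the data
        # complete. Without this, a slow receive is silently lost.
        if self._receive_tasks:
            await asyncio.gather(*self._receive_tasks)
        return self.shards
```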

@fjetter (Member, Author) commented on Oct 26, 2022

Unfortunately, I haven't fixed test_bad_disk yet (#7185 (comment))

github-actions bot (Contributor) commented on Oct 26, 2022

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

15 files ±0    15 suites ±0    6h 45m 18s ⏱️ +38m 49s
3 196 tests +39    3 110 ✔️ +44    83 💤 ±0    3 ❌ −5
23 633 runs +277    22 723 ✔️ +290    901 💤 +2    9 ❌ −15

For more details on these failures, see this check.

Results for commit fd15b6f. ± Comparison against base commit 5dccad4.

♻️ This comment has been updated with latest results.

Comment on lines +500 to +534
```python
class ShuffleTestPool:
    def __init__(self, *args, **kwargs):
        self.shuffles = {}
        super().__init__(*args, **kwargs)

    def __call__(self, addr: str, *args: Any, **kwargs: Any) -> PooledRPCShuffle:
        return PooledRPCShuffle(self.shuffles[addr])

    async def fake_broadcast(self, msg):
        op = msg.pop("op").replace("shuffle_", "")
        out = {}
        for addr, s in self.shuffles.items():
            out[addr] = await getattr(s, op)()
        return out

    def new_shuffle(
        self, name, worker_for_mapping, schema, directory, loop, Shuffle=Shuffle
    ):
        s = Shuffle(
            column="_partition",
            worker_for=worker_for_mapping,
            # FIXME: Is output_workers redundant with worker_for?
            output_workers=set(worker_for_mapping.values()),
            schema=schema,
            directory=directory / name,
            id=ShuffleId(name),
            local_address=name,
            nthreads=2,
            rpc=self,
            loop=loop,
            broadcast=self.fake_broadcast,
        )
        self.shuffles[name] = s
        return s
```
fjetter (Member, Author):

This might be the most interesting part of this refactoring. I separated Shuffle and ShuffleWorkerExtension sufficiently to allow us to write tests without a scheduler or workers, which gives much more granular control over concurrency issues.

In particular, the offload tests below produce race conditions that would drop data.
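For illustration, a hypothetical test skeleton built on the pool above. Only new_shuffle and fake_broadcast come from the snippet; the "shuffle_flush" op (and therefore a Shuffle.flush method), as well as tmp_path/loop as stand-ins for the test's directory and event loop, are assumptions.

```python
# Hypothetical usage inside an async test (op/method names are assumptions):
pool = ShuffleTestPool()
shuffle_a = pool.new_shuffle("A", worker_for_mapping, schema, tmp_path, loop)
shuffle_b = pool.new_shuffle("B", worker_for_mapping, schema, tmp_path, loop)

# Each instance reaches its peers through the pool's PooledRPCShuffle stubs
# (rpc=self above), so races between sending, receiving and flushing can be
# provoked deterministically, without a scheduler or real workers.
out = await pool.fake_broadcast({"op": "shuffle_flush"})  # calls .flush() on both
```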

@fjetter changed the title from "Improve abstraction shuffle" to "Separation of concerns between Shuffle and WorkerShuffle" on Oct 26, 2022
Comment on lines +616 to +633
```python
dfs = []
rows_per_df = 10
n_input_partitions = 2
npartitions = 2
for ix in range(n_input_partitions):
    df = pd.DataFrame({"x": range(rows_per_df * ix, rows_per_df * (ix + 1))})
    df["_partition"] = df.x % npartitions
    dfs.append(df)

workers = ["A", "B"]

worker_for_mapping = {}
partitions_for_worker = defaultdict(list)

for part in range(npartitions):
    worker_for_mapping[part] = w = get_worker_for(part, workers, npartitions)
    partitions_for_worker[w].append(part)
schema = pa.Schema.from_pandas(dfs[0])
```
fjetter (Member, Author):

This is all boilerplate that I hope we can get rid of eventually; right now it's not a priority.

Comment on lines +92 to +93
```python
    rpc: Callable[[str], PooledRPCCall],
    broadcast: Callable,
```
fjetter (Member, Author):

I guess this is the most controversial part. It is very useful, though.
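In other words, Shuffle never reaches for a global Worker or Scheduler; whoever constructs it injects the communication callables. A minimal, self-contained sketch of why this makes the class easy to test (the class and method names here are invented for illustration, not part of distributed):

```python
import asyncio
from typing import Any, Callable


class ShuffleLike:
    """Toy class that only knows about the injected callables."""

    def __init__(self, rpc: Callable[[str], Any], broadcast: Callable[..., Any]) -> None:
        self.rpc = rpc
        self.broadcast = broadcast

    async def send(self, address: str, shard: bytes) -> None:
        # In production `rpc` would return a pooled RPC handle to a worker;
        # in tests it can return any object with the same method.
        await self.rpc(address).shuffle_receive(data=shard)


class FakePeer:
    def __init__(self) -> None:
        self.received: list[bytes] = []

    async def shuffle_receive(self, data: bytes) -> None:
        self.received.append(data)


async def main() -> None:
    peer = FakePeer()
    s = ShuffleLike(rpc=lambda addr: peer, broadcast=lambda msg: None)
    await s.send("worker-1", b"shard")
    assert peer.received == [b"shard"]


asyncio.run(main())
```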

@fjetter marked this pull request as ready for review on October 28, 2022, 14:59
@fjetter (Member, Author) commented on Oct 28, 2022

I fixed a couple of deadlocks in multicomm and multifile in the case of exceptions. I think I have already taken this PR too far; I will branch off any follow-up work.

This closes #7208
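The deadlock pattern this kind of fix targets can be sketched as follows (a simplified stand-in, not the actual MultiComm/MultiFile code): if the background writer task dies with an exception, put() and flush() must re-raise it instead of blocking on a queue that will never be drained again.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class WriteBufferSketch:
    """Toy write buffer: writer failures must surface to callers."""

    def __init__(self, write: Callable[[bytes], Awaitable[None]]) -> None:
        self._write = write
        self._queue: asyncio.Queue = asyncio.Queue()
        self.exception: Optional[BaseException] = None
        self._task = asyncio.get_running_loop().create_task(self._worker())

    async def _worker(self) -> None:
        while True:
            item = await self._queue.get()
            if item is None:  # sentinel enqueued by flush()
                return
            try:
                await self._write(item)
            except BaseException as exc:
                # Remember the error and stop; put()/flush() re-raise it so
                # callers don't wait forever on an abandoned queue.
                self.exception = exc
                return

    async def put(self, item: bytes) -> None:
        if self.exception is not None:
            raise self.exception
        await self._queue.put(item)

    async def flush(self) -> None:
        await self._queue.put(None)
        await self._task
        if self.exception is not None:
            raise self.exception
```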

@fjetter (Member, Author) commented on Nov 10, 2022

Closed by #7268.

@fjetter closed this on Nov 10, 2022
Development

Successfully merging this pull request may close these issues.

P2P shuffle returns incorrect results if errors occur receiving data (also asyncio tasks are leaked)