Rewrite test_reconnect to use subproc to kill scheduler reliably #6967

fjetter · 2022-08-29T09:10:07Z

I noticed that even on a successful run of this test, we get a lot of errors such as "AsyncioGroup already closed", since we clearly cripple the running scheduler by closing all of its internals manually. In particular, the client and worker disconnects can trigger attempts to schedule messages to deal with the loss.

Looking at the test output, we can see a lot of CancelledErrors with long tracebacks. I'm wondering whether we're simply hitting #6211, and #6847 is not responsible after all but merely changed the timing so that we hit this condition more reliably.

Let's see what CI thinks about this theory.
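To illustrate the general idea of killing a server from the outside instead of closing its internals by hand, here is a minimal, stdlib-only sketch. It is not the test from this PR: the tiny TCP server is only a stand-in for the scheduler, and the names `SERVER_CODE`, `start_server`, `stop_server`, `can_connect`, and `run_reconnect_demo` are hypothetical helpers made up for this example.

```python
import socket
import subprocess
import sys

# Stand-in for the scheduler (an assumption for this sketch): a tiny TCP
# server that accepts connections and keeps them open.
SERVER_CODE = """
import socket, sys
srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", int(sys.argv[1])))
srv.listen()
print("ready", flush=True)
conns = []
while True:
    conn, _ = srv.accept()
    conns.append(conn)
"""


def free_port():
    # Ask the OS for an unused port; there is a small race before reuse.
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def start_server(port):
    # Run the server in its own process and wait until it is listening.
    proc = subprocess.Popen(
        [sys.executable, "-c", SERVER_CODE, str(port)],
        stdout=subprocess.PIPE,
        text=True,
    )
    if proc.stdout.readline().strip() != "ready":
        stop_server(proc)
        raise RuntimeError("server did not start")
    return proc


def stop_server(proc):
    # Kill the whole process: no partial shutdown of internals, the OS
    # closes every socket the process owned.
    proc.kill()
    proc.wait()
    proc.stdout.close()


def can_connect(port):
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=1):
            return True
    except OSError:
        return False


def run_reconnect_demo():
    port = free_port()
    observed = []
    proc = start_server(port)
    try:
        observed.append(can_connect(port))  # server is up
        stop_server(proc)
        observed.append(can_connect(port))  # server is gone
        proc = start_server(port)           # "restart" on the same port
        observed.append(can_connect(port))  # reachable again
    finally:
        if proc.poll() is None:
            stop_server(proc)
    return observed
```

Because the process is killed as a whole, the server never runs in a half-closed state in which it could still react to client or worker disconnects.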

github-actions · 2022-08-29T10:11:05Z

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

15 files ±0 | 15 suites ±0 | 6h 35m 55s ⏱️ - 7m 18s
3 052 tests +1 | 2 965 ✔️ - 3 | 84 💤 +1 | 3 ❌ +3
22 577 runs +8 | 21 601 ✔️ +8 | 973 💤 - 3 | 3 ❌ +3

For more details on these failures, see this check.

Results for commit e5a6c2d. ± Comparison against base commit c083790.

♻️ This comment has been updated with latest results.

fjetter · 2022-08-29T10:18:59Z

🟢 🎉

gjoseph92

Thanks for fixing this. A couple of style nits; take them or leave them.

distributed/tests/test_client.py

Co-authored-by: Gabe Joseph <gjoseph92@gmail.com>

hendrikmakait

LGTM, thanks for taking care of fixing this!

Rewrite test_reconnect

8d9d42b

fjetter changed the title ~~WIP Rewrite test_reconnect to use subproc to kill scheduler reliably~~ Rewrite test_reconnect to use subproc to kill scheduler reliably Aug 29, 2022

fjetter marked this pull request as ready for review August 29, 2022 10:17

fjetter requested review from crusaderky, gjoseph92 and hendrikmakait August 29, 2022 10:19

gjoseph92 approved these changes Aug 29, 2022

distributed/tests/test_client.py (outdated, resolved)

distributed/tests/test_client.py (resolved)

Update distributed/tests/test_client.py

e5a6c2d

Co-authored-by: Gabe Joseph <gjoseph92@gmail.com>

hendrikmakait approved these changes Aug 30, 2022

fjetter merged commit 6a1b089 into dask:main Aug 30, 2022

fjetter deleted the rewrite_test_reconnect branch August 30, 2022 09:34

hendrikmakait mentioned this pull request Aug 31, 2022

Improve testing of {Scheduler|Worker}MetricCollector #6945

Merged

2 tasks

gjoseph92 added a commit to gjoseph92/distributed that referenced this pull request Oct 31, 2022

Rewrite test_reconnect to use subproc to kill scheduler reliably (das…

2f799ee

…k#6967) Co-authored-by: Gabe Joseph <gjoseph92@gmail.com>
