Commit
Rework PageStream connection state handling: (#7611)
* Make PS connection startup use async APIs. This allows for improved query cancellation when we start connections.
* Make PS connections have per-shard connection retry state. Previously they shared global backoff state, which is bad for quickly getting all connections started and/or back online.
* Make sure we clean up most connection state on failed connections. Previously, we could technically leak some resources that we'd otherwise clean up. Now, the resources are correctly cleaned up.
* pagestore_smgr.c now PANICs on unexpected response message types. Unexpected responses are likely a symptom of having a desynchronized view of the connection state. As a desynchronized connection state can cause corruption, we PANIC, as we don't know what data may have been written to buffers: the only solution is to fail fast & hope we didn't write wrong data.
* Catch errors in sync pagestream request handling. Previously, if a query was cancelled after a message was sent to the pageserver, but before the data was received, the backend could forget that it sent the synchronous request, and let others deal with the repercussions. This could then lead to incorrect responses, or errors such as "unexpected response from page server with tag 0x68".
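To illustrate the per-shard retry state described above, here is a minimal sketch in C. It is not the code from this commit: the names (`ShardRetryState`, `shard_connect_wait_us`, `shard_connect_failed`, `shard_connect_succeeded`) and the constants are hypothetical, chosen only to show why keeping backoff state per shard lets a healthy shard connect immediately while a failing one backs off.

```c
#include <assert.h>
#include <stdint.h>

/* All names and limits below are hypothetical, for illustration only. */
#define MAX_SHARDS             128
#define MIN_RECONNECT_DELAY_US 1000      /* first backoff step: 1 ms */
#define MAX_RECONNECT_DELAY_US 1000000   /* backoff cap: 1 s */

/* Backoff state kept separately for every shard, not shared globally. */
typedef struct ShardRetryState
{
    uint64_t last_attempt_us;  /* time of the last failed connect attempt */
    uint64_t delay_us;         /* current backoff delay; 0 = no recent failure */
} ShardRetryState;

static ShardRetryState shard_retry[MAX_SHARDS];

/* How long the caller should wait before trying to connect this shard. */
static uint64_t
shard_connect_wait_us(int shard_no, uint64_t now_us)
{
    assert(shard_no >= 0 && shard_no < MAX_SHARDS);

    ShardRetryState *st = &shard_retry[shard_no];
    uint64_t ready_at = st->last_attempt_us + st->delay_us;

    return now_us >= ready_at ? 0 : ready_at - now_us;
}

/* Record a failed attempt: start at the minimum delay, then double up to the cap. */
static void
shard_connect_failed(int shard_no, uint64_t now_us)
{
    assert(shard_no >= 0 && shard_no < MAX_SHARDS);

    ShardRetryState *st = &shard_retry[shard_no];

    st->last_attempt_us = now_us;
    if (st->delay_us == 0)
        st->delay_us = MIN_RECONNECT_DELAY_US;
    else
    {
        st->delay_us *= 2;
        if (st->delay_us > MAX_RECONNECT_DELAY_US)
            st->delay_us = MAX_RECONNECT_DELAY_US;
    }
}

/* Record a successful connection: clear this shard's backoff only. */
static void
shard_connect_succeeded(int shard_no)
{
    assert(shard_no >= 0 && shard_no < MAX_SHARDS);

    shard_retry[shard_no].last_attempt_us = 0;
    shard_retry[shard_no].delay_us = 0;
}
```

In this sketch a failure on shard 0 leaves the wait time for shard 1 at zero, which is the property the commit message attributes to per-shard state. With a single global backoff, one unreachable shard would delay reconnection to all the others.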
0e4f182
3202 tests run: 3062 passed, 0 failed, 140 skipped (full report)
Code coverage* (full report)
functions: 31.4% (6445 of 20541 functions)
lines: 48.3% (49839 of 103239 lines)
* collected from Rust tests only
0e4f182 at 2024-05-23T22:52:08.198Z