Release 2023-09-13 #5296

vipvap · 2023-09-13T08:07:39Z

Release 2023-09-13

Please merge this PR using 'Create a merge commit'!

## Problem

- https://github.com/neondatabase/neon/security/dependabot/28

## Summary of changes

Remove atty, and remove the `with_ansi` arg to scrubber's stdout logger.

The sequence that can lead to a deadlock:

1. A DELETE request gets all the way to `tenant.shutdown(progress, false).await.is_err()` while holding `TENANTS.read()`.
2. A POST request for tenant creation comes in and calls `tenant_map_insert`, which does `let mut guard = TENANTS.write().await;`.
3. Something that `tenant.shutdown()` needs to wait for needs a `TENANTS.read().await`. The only case identified in exhaustive manual scanning of the code base is this one: Imitate size access does `get_tenant().await`, which does `TENANTS.read().await` under the hood.

In the above case, (1) waits for (3), (3)'s read-lock request is queued behind (2)'s write-lock request, and (2) waits for (1). Deadlock.

I made a reproducer/proof-that-above-hypothesis-holds in #5281, but it's not ready for merge yet and we want the fix _now_.

fixes #5284
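The queueing behaviour behind the three-step sequence above can be illustrated with a small model. This is only a sketch, assuming the lock grants requests strictly in FIFO order (the fair, write-preferring policy that `tokio::sync::RwLock` documents). `FairRwLockModel`, `Request`, and the task ids are hypothetical names for illustration, not code from the repository, and nothing here actually blocks.

```rust
use std::collections::VecDeque;

/// A lock request tagged with the id of the requesting task (hypothetical ids).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Request {
    Read(u32),
    Write(u32),
}

/// Toy model of a FIFO-fair (write-preferring) read-write lock.
/// It only records who holds the lock and who is queued.
#[derive(Default)]
struct FairRwLockModel {
    active_readers: Vec<u32>,
    active_writer: Option<u32>,
    queue: VecDeque<Request>,
}

impl FairRwLockModel {
    /// Enqueue a request, then grant whatever can be granted.
    fn request(&mut self, req: Request) {
        self.queue.push_back(req);
        self.grant();
    }

    /// Grant strictly from the front of the queue and stop at the first
    /// request that cannot be satisfied. This is why a reader waits behind
    /// a queued writer even while another reader holds the lock.
    fn grant(&mut self) {
        while let Some(&head) = self.queue.front() {
            match head {
                Request::Read(id) if self.active_writer.is_none() => {
                    self.active_readers.push(id);
                }
                Request::Write(id)
                    if self.active_writer.is_none() && self.active_readers.is_empty() =>
                {
                    self.active_writer = Some(id);
                }
                _ => break,
            }
            self.queue.pop_front();
        }
    }

    fn holds_read(&self, id: u32) -> bool {
        self.active_readers.contains(&id)
    }
}

fn main() {
    let mut tenants = FairRwLockModel::default();
    tenants.request(Request::Read(1)); // (1) DELETE holds the read lock
    tenants.request(Request::Write(2)); // (2) POST queues for the write lock
    tenants.request(Request::Read(3)); // (3) size access queues for a read lock

    assert!(tenants.holds_read(1));
    assert!(!tenants.holds_read(3)); // stuck behind the queued writer
    println!("still queued: {:?}", tenants.queue);
}
```

In the model, request (3) would be granted immediately if readers were allowed to overtake the queued writer. Under FIFO granting it cannot proceed until (1) releases, and (1) is waiting for (3).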

This adds PostgreSQL 16 as a vendored postgresql version, and adapts the code to support this version.

The important changes to PostgreSQL 16 compared to the PostgreSQL 15 changeset include the addition of a neon_rmgr instead of altering Postgres's original WAL format.

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>

Motivation
==========

We observed two "indigestion" events on staging, each shortly after restarting `pageserver-0.eu-west-1.aws.neon.build`. It has ~8k tenants.

The indigestion manifests as `Timeline::get` calls failing with `exceeded evict iter limit`. The error is from `page_cache.rs`; it was unable to find a free page and hence failed with the error.

The indigestion events started occurring after we started deploying builds that contained the following commits:

```
[~/src/neon]: git log --oneline c0ed362..15eaf78 091da1a1c8b4f60ebf8
15eaf78 Disallow block_in_place and Handle::block_on (#5101)
a18d6d9 Make File opening in VirtualFile async-compatible (#5280)
76cc873 Use tokio locks in VirtualFile and turn with_file into macro (#5247)
```

The second and third commits are interesting. They add `.await` points to the VirtualFile code.

Background
==========

On the read path, which is the dominant user of page cache & VirtualFile during pageserver restart, `Timeline::get`, `page_cache`, and VirtualFile interact as follows:

1. `Timeline::get` tries to read from a layer.
2. This read goes through the page cache.
3. If we have a page miss (which is known to be common after restart), page_cache uses `find_victim` to find an empty slot, and once it has found a slot, it gives exclusive ownership of it to the caller through a `PageWriteGuard`.
4. The caller is supposed to fill the write guard with data from the underlying backing store, i.e., the layer `VirtualFile`.
5. So, we call into `VirtualFile::read_at` to fill the write guard.

The `find_victim` method finds an empty slot using a basic implementation of the clock page replacement algorithm. Slots that are currently in use (`PageReadGuard` / `PageWriteGuard`) cannot become victims. If there have been too many iterations, `find_victim` gives up with error `exceeded evict iter limit`.
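The `find_victim` behaviour described above can be sketched with a toy clock sweep. This is a minimal illustration, not the code in `page_cache.rs`: `ToyPageCache`, `Slot`, the `usage_count` field, the `release` helper, and the iteration limit of 20 are all assumptions made for the example.

```rust
/// One slot in this toy page cache.
struct Slot {
    usage_count: u8, // aged by one on each clock sweep that passes the slot
    pinned: bool,    // true while a (hypothetical) read/write guard is held
}

#[derive(Debug, PartialEq)]
enum FindVictimError {
    ExceededEvictIterLimit,
}

struct ToyPageCache {
    slots: Vec<Slot>,
    clock_hand: usize,
}

impl ToyPageCache {
    fn new(num_slots: usize) -> Self {
        assert!(num_slots > 0, "the model needs at least one slot");
        let slots = (0..num_slots)
            .map(|_| Slot { usage_count: 0, pinned: false })
            .collect();
        ToyPageCache { slots, clock_hand: 0 }
    }

    /// Clock sweep: skip pinned slots, age recently used ones, and take the
    /// first unpinned slot whose usage count has reached zero.
    /// Gives up after `max_iters` steps.
    fn find_victim(&mut self, max_iters: usize) -> Result<usize, FindVictimError> {
        for _ in 0..max_iters {
            let idx = self.clock_hand;
            self.clock_hand = (self.clock_hand + 1) % self.slots.len();
            let slot = &mut self.slots[idx];
            if slot.pinned {
                continue; // in use, cannot become a victim
            }
            if slot.usage_count > 0 {
                slot.usage_count -= 1; // give it a second chance
                continue;
            }
            slot.pinned = true; // the caller now owns the slot exclusively
            return Ok(idx);
        }
        Err(FindVictimError::ExceededEvictIterLimit)
    }

    /// Models dropping the guard after the slot has been filled.
    fn release(&mut self, idx: usize) {
        self.slots[idx].pinned = false;
        self.slots[idx].usage_count = 1;
    }
}

fn main() {
    let mut cache = ToyPageCache::new(2);
    let a = cache.find_victim(20).unwrap();
    let b = cache.find_victim(20).unwrap();
    assert_ne!(a, b);

    // Every slot is pinned, so the sweep can only spin until it hits the limit.
    assert_eq!(
        cache.find_victim(20),
        Err(FindVictimError::ExceededEvictIterLimit)
    );

    // Once a guard is released, the slot becomes evictable again.
    cache.release(a);
    assert_eq!(cache.find_victim(20), Ok(a));
}
```

The key property for the discussion is that a sweep over a cache in which every slot is pinned does no useful work: it consumes its full iteration budget and then fails.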
Root Cause For Indigestion
==========================

The second and third commit quoted in the "Motivation" section introduced `.await` points in the VirtualFile code. These enable tokio to preempt us and schedule another future __while__ we hold the `PageWriteGuard` and are calling `VirtualFile::read_at`. This was not possible before these commits, because there simply were no await points that weren't `Poll::Ready` immediately. With the offending commits, there is now actual usage of `tokio::sync::RwLock` to protect the VirtualFile file descriptor cache. And we __know__ from other experiments that, during the post-restart "rush", the VirtualFile fd cache __is__ too small, i.e., all slots are taken by _ongoing_ VirtualFile operations and cannot be victims. So, assume that VirtualFile's `find_victim_slot`'s `RwLock::write().await` calls _will_ yield control to the executor.

The above can lead to a pathological situation if we have N runnable tokio tasks, each wanting to do `Timeline::get`, but only M slots, N >> M. Suppose M of the N tasks win a `PageWriteGuard` and get preempted at some `.await` point inside `VirtualFile::read_at`. Now suppose tokio schedules the remaining N-M tasks for fairness, then schedules the first M tasks again. Each of the N-M tasks will run `find_victim()` until it hits the `exceeded evict iter limit`. Why? Because the first M tasks took all the slots and are still holding them tight through their `PageWriteGuard`.

The result is massive wastage of CPU time in `find_victim()`. The effort to find a page is futile, but each of the N-M tasks still attempts it. This delays the time when tokio gets around to schedule the first M tasks again. Eventually, tokio will schedule them, they will make progress, fill the `PageWriteGuard`, release it. But in the meantime, the N-M tasks have already bailed with error `exceeded evict iter limit`.
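The cost of the scenario above can be estimated with a toy model of one scheduling round. This is only a sketch under simplifying assumptions: tasks run one after another, the first M tasks keep their slot pinned for the whole round (standing in for being preempted at an `.await`), and each sweep starts at slot 0. The function name, the struct, and the numbers (1000 tasks, 10 slots, limit of 100) are invented for illustration.

```rust
/// Outcome of one scheduling round in this toy model.
#[derive(Debug, PartialEq)]
struct RoundStats {
    completed: usize,         // tasks that won a slot and eventually finish
    failed: usize,            // tasks that hit the iteration limit
    wasted_iterations: usize, // sweep steps spent by the failing tasks
}

/// Runs `n_tasks` tasks once each against `m_slots` slots.
/// A task that finds a free slot pins it and is considered "preempted"
/// until every other task has had its turn.
fn simulate_round(n_tasks: usize, m_slots: usize, max_iters: usize) -> RoundStats {
    assert!(m_slots > 0, "the model needs at least one slot");
    let mut pinned = vec![false; m_slots];
    let mut holders = Vec::new();
    let mut failed = 0;
    let mut wasted_iterations = 0;

    for _task in 0..n_tasks {
        // Sweep the slots looking for one that is not pinned.
        let mut found = None;
        let mut iters = 0;
        while iters < max_iters {
            let idx = iters % m_slots;
            iters += 1;
            if !pinned[idx] {
                found = Some(idx);
                break;
            }
        }
        match found {
            Some(idx) => {
                pinned[idx] = true; // guard held across the simulated .await
                holders.push(idx);
            }
            None => {
                failed += 1;
                wasted_iterations += iters; // the whole budget was futile
            }
        }
    }

    // Only now do the slot holders get scheduled again and release their slots.
    let completed = holders.len();
    for idx in holders {
        pinned[idx] = false;
    }

    RoundStats { completed, failed, wasted_iterations }
}

fn main() {
    let stats = simulate_round(1000, 10, 100);
    assert_eq!(
        stats,
        RoundStats { completed: 10, failed: 990, wasted_iterations: 99_000 }
    );
    println!("{:?}", stats);
}
```

In this model the wasted work is (N-M) times the iteration limit, which grows with the number of runnable tasks rather than with the amount of useful work available.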
Eventually, higher level mechanisms will retry for the N-M tasks, and this time, there won't be as many concurrent tasks wanting to do `Timeline::get`. So, it will shake out. But, it's a massive indigestion until then.

This PR
=======

This PR reverts the offending commits until we find a proper fix.

```
Revert "Use tokio locks in VirtualFile and turn with_file into macro (#5247)"

This reverts commit 76cc873.

Revert "Make File opening in VirtualFile async-compatible (#5280)"

This reverts commit a18d6d9.
```

## Problem

We don't have this instruction written anywhere but in internal Slack.

## Summary of changes

- Add `How to run a CI pipeline on Pull Requests from external contributors` section to `CONTRIBUTING.md`

---------

Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>

## Problem

`ci-run/pr-*` branches (and attached PRs) should be deleted automatically when their parent PRs get closed, but they are not.

## Summary of changes

- Fix if-condition

## Problem

If @github-actions creates a release PR, the CI pipeline is not triggered (but we have the `release-notify.yml` workflow that we expect to run on this event). I suspect this happened because @github-actions is not a repository member.

Ref #5283 (comment)

## Summary of changes

- Use `CI_ACCESS_TOKEN` to create a PR
- Use `gh` instead of `thomaseizinger/create-pull-request`
- Restrict permissions for GITHUB_TOKEN to `contents: write` only (required for `git push`)

- pagestore_smgr.c had unnecessary WALSync() (see #5287)
- Compute node dockerfile didn't build the neon_rmgr extension
- Add PostgreSQL 16 image to docker-compose tests
- Fix issue with high CPU usage in Safekeeper due to a bug in WALSender

Co-authored-by: Alexander Bayandin <alexander@neon.tech>

xfail test reproducing issue #4698

## Problem

There was a bug in `lfc_ensure_opened` which actually disables LFC.

## Summary of changes

Return true if the LFC file is opened normally.

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>

github-actions · 2023-09-13T08:33:49Z

2472 tests run: 2354 passed, 0 failed, 118 skipped (full report)

Flaky tests (6)

Postgres 16

test_partial_evict_tenant: release
test_get_tenant_size_with_multiple_branches: debug

Postgres 15

test_crafted_wal_end[last_wal_record_crossing_segment]: debug

Postgres 14

test_download_remote_layers_api[local_fs]: debug, release
test_get_tenant_size_with_multiple_branches: release

Code coverage (full report)

functions: 53.1% (7669 of 14453 functions)
lines: 81.0% (44789 of 55281 lines)

The comment gets automatically updated with the latest test results: 1697e7b at 2023-09-13T10:46:06.693Z :recycle:

jcsp and others added 10 commits September 12, 2023 10:11

s3_scrubber: remove atty dependency (#5171)

36c2618


approved-for-ci-run.yml: fix ci-run/pr-* branch deletion (#5278)

e1661c3


proxy http: reproduce issue with transactions in pool (#5293)

8556d94


vipvap requested review from a team as code owners September 13, 2023 08:07

vipvap requested review from bojanserafimov, petuhovskiy, koivunej and adi-griever and removed request for a team September 13, 2023 08:07

shanyp approved these changes Sep 13, 2023

shanyp merged commit 7d6ec16 into release Sep 13, 2023
76 checks passed

shanyp deleted the releases/2023-09-13 branch September 13, 2023 10:49
