
Epic: resolve the pageserver backpressure problems #2028

Closed
8 of 15 tasks
kelvich opened this issue Jul 5, 2022 · 16 comments
Labels: c/storage/pageserver, c/storage, H22023, t/Epic

Comments

@kelvich
Contributor

kelvich commented Jul 5, 2022

The current plan is:

  • tweak the backpressure settings (already on staging) (this was reverted)
  • tweak the backpressure settings again on staging
  • tweak the backpressure settings on production once the fix is confirmed on staging
  • bring back the patches from Konstantin

Originally we had this plan:

We have quite a lot of issues about backpressure and several attempts to fix it. It seems to me that the two most actionable things we can do are:

  • implement time-based backpressure
  • write down doc on how it works right now

List of backpressure-related tasks; if we can resolve half of them, we'll be happy:

@kelvich added the c/storage/pageserver, c/cloud/compute, and t/Epic labels Jul 5, 2022
@kelvich added this to the 2022/07 milestone Jul 5, 2022
@knizhnik
Contributor

knizhnik commented Jul 5, 2022

So my understanding of the current status is the following:

  1. A 100MB write_replication_lag is too much: it causes delays of several minutes and wait_for_lsn timeout expiration.
  2. Reducing write_replication_lag to 10MB mostly eliminates the large-delay problem (delays stay within a few seconds). But in some cases it shows a significant (~2x) slowdown of insertion compared with a 100MB lag. This is why I tried to implement throttling strategies other than stop-and-wait.
  3. @aome510 implemented an exhaustive test for backpressure which demonstrates a few things:
  • There is no big difference in write speed with different write_lag values.
  • Results depend greatly on the target system's IO subsystem; in particular, performance on EC2 servers is much higher than on a laptop.
  • Some pageserver operations (like flushing open layers) cause a write storm which delays all IO operations (including reads) by several seconds. Backpressure cannot protect us from such delays.

There is an alternative to backpressure which can reduce select query latency: calculating the last written LSN more precisely for a particular page (relation/chunk). I have a PR for it.

So based on the results of the test_wal_backpressure test, I have made the following conclusions:

  1. Reducing write_lag to 10MB is enough to avoid overly large delays.
  2. Write speed is not significantly affected by reducing write_lag, so changing the throttling algorithm does not seem critical right now. We can still use naive stop-and-wait (sketched below).
  3. More precise calculation of last_written_lsn (maintaining a global cache for it) is a good idea and helps to significantly increase performance.

Since the change of the default max_replication_write_lag to 10MB is already merged in main, I think the only thing we should do with backpressure in July is review and commit PR neondatabase/postgres#177.
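To make the naive stop-and-wait policy concrete, here is a minimal sketch (not the actual compute-side implementation); current_write_lag_bytes is a hypothetical callback standing in for "compute WAL flush LSN minus the last LSN the pageserver reported as received", which in the real system comes from the feedback messages:

```python
import time

# 10MB, the max_replication_write_lag value discussed above.
MAX_REPLICATION_WRITE_LAG = 10 * 1024 * 1024
POLL_INTERVAL_S = 0.01  # how long a writing backend sleeps before re-checking


def throttle_writes(current_write_lag_bytes):
    """Naive stop-and-wait backpressure: block the writer until the
    pageserver has caught up to within MAX_REPLICATION_WRITE_LAG bytes.

    current_write_lag_bytes: hypothetical zero-argument callable returning
    the current write lag in bytes.
    """
    while current_write_lag_bytes() > MAX_REPLICATION_WRITE_LAG:
        time.sleep(POLL_INTERVAL_S)


# Example: with a fake lag source the writer simply pauses here
# until the reported lag drops below the limit.
if __name__ == "__main__":
    fake_lag = iter([50 * 1024 * 1024, 20 * 1024 * 1024, 5 * 1024 * 1024])
    throttle_writes(lambda: next(fake_lag))
    print("lag below limit, writer may continue")
```

The trade-off described in this thread falls out directly: a 100MB limit lets a backend get minutes ahead of the pageserver before it is stopped (hence the wait_for_lsn timeouts), while 10MB keeps waits within seconds at the cost of pausing inserts more often.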

@hlinnaka
Contributor

hlinnaka commented Jul 5, 2022

As far as changing default max_replication_write_lag to 10MB is already merged in main, ...

Oh, can we close #1793 then?

@knizhnik
Contributor

knizhnik commented Jul 5, 2022

As far as changing default max_replication_write_lag to 10MB is already merged in main, ...

Oh, can we close #1793 then?

Oh sorry, it is not merged yet. Can you review #1793 so that I can merge it?

@stepashka changed the title from "Epic: pageserver backpressure" to "Epic: resolve the pageserver backpressure problems" Jul 18, 2022
@stepashka modified the milestones: 2022/07, 2022/08 Jul 25, 2022
@aome510
Contributor

aome510 commented Jul 25, 2022

To summarize what we have done with the current backpressure approach:

In short, the current status is quite "messy": we did have a patch to mitigate some of the backpressure issues, but we cannot update neon to use it because of some failing tests [1].

[1]: I'm not too sure about the status of the failing tests. Are they failing because of flaky tests unrelated to the new changes, or because of the new changes themselves? Maybe Konstantin can provide more insight on this.

@knizhnik
Contributor

Some more information from my side:

  1. Even without the last written LSN cache (pageserver#177), I have not seen large latencies with the Add wal backpressure performance tests #1919 and Intensive write workload blocks postgres instance #1763 tests with max_replication_write_lag=15MB and wal_log_hints=off. The second setting is important because, when enabled, autovacuum may produce a lot of WAL without obtaining an XID and so is not blocked by GC.
  2. I have never seen (even in CI results) errors like "could not read block...". The errors I saw in CI are mostly related to "flaky" tests like test_wal_acceptor; at least, I saw similar failures on other PRs not related to backpressure.
  3. My concern about reducing max_replication_write_lag was that it might slow down writers, but the results of Add wal backpressure performance tests #1919 and Intensive write workload blocks postgres instance #1763 do not confirm it. So there is no urgent need to choose another throttling policy to replace the current stop-and-wait.

Concerning the idea of time-based backpressure, I do not think it can radically reduce latencies compared with the current implementation:

  1. It assumes that WAL is produced at the compute node and replayed at the pageserver at the same rate, which is obviously not true.
  2. Time is not currently included in the feedback message. We could certainly add it to the protocol, or use our LSN->timestamp mapping... but I am afraid that would be too expensive.
  3. There may be a difference between the local clocks of different computers.
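To illustrate the idea being discussed, here is a rough sketch (my own, under the assumptions criticized above) that estimates the pageserver's lag in seconds from an LSN->timestamp history recorded on the compute, sidestepping clock skew by timestamping both ends on the same machine; as noted, maintaining and querying such a mapping may well be too expensive in practice:

```python
import bisect
import time

# (lsn, monotonic_seconds) pairs appended in LSN order as the compute flushes WAL.
# This history is an assumption of the sketch, not an existing Neon structure.
lsn_history: list[tuple[int, float]] = []


def record_flush(lsn: int) -> None:
    """Remember when the compute flushed WAL up to this LSN."""
    lsn_history.append((lsn, time.monotonic()))


def estimated_lag_seconds(pageserver_received_lsn: int) -> float:
    """Estimate how old the pageserver's received LSN is, in seconds,
    by looking up when the compute flushed that LSN."""
    if not lsn_history:
        return 0.0
    lsns = [lsn for lsn, _ in lsn_history]
    # Largest recorded LSN that the pageserver has already received;
    # fall back to the oldest entry if it is behind everything we remember.
    idx = max(bisect.bisect_right(lsns, pageserver_received_lsn) - 1, 0)
    _, flushed_at = lsn_history[idx]
    return time.monotonic() - flushed_at


def should_throttle(pageserver_received_lsn: int, max_lag_seconds: float = 2.0) -> bool:
    """Time-based variant of the backpressure check."""
    return estimated_lag_seconds(pageserver_received_lsn) > max_lag_seconds
```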

@aome510
Contributor

aome510 commented Jul 26, 2022

  1. Even without the last written LSN cache (pageserver#177), I have not seen large latencies with the Add wal backpressure performance tests #1919 and Intensive write workload blocks postgres instance #1763 tests with max_replication_write_lag=15MB and wal_log_hints=off. The second setting is important because, when enabled, autovacuum may produce a lot of WAL without obtaining an XID and so is not blocked by GC.

I did disable wal_log_hints in the tests but still got a similar maximum latency in https://github.com/neondatabase/neon/runs/7508653405?check_suite_focus=true:

test_pgbench_intensive_init_workload[neon_on-1000].read_latency_max: 10.495 s
test_pgbench_intensive_init_workload[neon_on-1000].read_latency_avg: 6.723 s
test_pgbench_intensive_init_workload[neon_on-1000].read_latency_stdev: 2.501 s

@knizhnik
Contributor

Yes, you are right.
Backpressure helps to minimize the delay of a single get_page_at_lsn call.
But if we have to fetch hundreds of pages (as in test_pgbench_intensive_init_workload performing select count(*) from foo), then the delay can be really large, and the larger the source table is, the larger the delay will be. No backpressure settings can give performance similar to vanilla if the table doesn't fit in memory.

In this particular case (test_pgbench_intensive_init_workload), the last written LSN cache should definitely help.
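As a rough illustration of the last-written-LSN cache idea (this sketch is mine, not the code from neondatabase/postgres#177, and the per-block granularity is an assumption): instead of requesting every page at the latest flush LSN, the compute remembers the last LSN at which each block was written and asks the pageserver for the page at that LSN, so a read only waits for the WAL that actually touched that page.

```python
class LastWrittenLsnCache:
    """Hypothetical per-block last-written-LSN cache."""

    def __init__(self, global_last_lsn: int = 0):
        # Conservative fallback for blocks we have never seen written.
        self.global_last_lsn = global_last_lsn
        # (relation_id, block_number) -> last LSN that modified the block.
        self._per_block: dict[tuple[int, int], int] = {}

    def record_write(self, relation: int, block: int, lsn: int) -> None:
        self.global_last_lsn = max(self.global_last_lsn, lsn)
        self._per_block[(relation, block)] = lsn

    def request_lsn(self, relation: int, block: int) -> int:
        # A get_page request only needs the page as of this LSN, which is
        # usually far older than the latest flush LSN, so it does not have
        # to wait for unrelated recent WAL to be ingested.
        return self._per_block.get((relation, block), self.global_last_lsn)
```

A sequential scan like select count(*) from foo then mostly asks for LSNs the pageserver ingested long ago, which is why the cache helps in this test even where backpressure alone cannot.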

@stepashka
Member

@kelvich mentioned that we may be OK with tweaking the backpressure settings to 10MB or 15MB, without immediate changes to the backpressure logic.
This is configured via the console, so we can change it and test it.
Everybody agreed.

@ololobus
Member

ololobus commented Aug 1, 2022

There are two backpressure settings, max_replication_write_lag and max_replication_flush_lag, currently set to 500MB and 10GB. Are we going to set both to 10MB?

@ololobus
Member

ololobus commented Aug 4, 2022

There are two backpressure settings, max_replication_write_lag and max_replication_flush_lag, currently set to 500MB and 10GB. Are we going to set both to 10MB?

@hlinnaka @knizhnik can you clarify?

@knizhnik
Contributor

knizhnik commented Aug 4, 2022

There are two backpressure settings, max_replication_write_lag and max_replication_flush_lag, currently set to 500MB and 10GB. Are we going to set both to 10MB?

@hlinnaka @knizhnik can you clarify?

No, just max_replication_write_lag
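For reference, both settings can be inspected from any compute connection; below is a minimal psycopg2 sketch (the DSN is a placeholder, and in the cloud setup the value is ultimately managed through the console rather than changed by hand):

```python
import psycopg2

# Placeholder DSN; point it at a compute endpoint.
conn = psycopg2.connect("host=localhost port=55432 dbname=postgres user=cloud_admin")

with conn.cursor() as cur:
    # Per the discussion, only max_replication_write_lag is being lowered
    # (to 10-15MB); max_replication_flush_lag keeps its current value.
    for guc in ("max_replication_write_lag", "max_replication_flush_lag"):
        cur.execute("SHOW " + guc)
        print(guc, "=", cur.fetchone()[0])

conn.close()
```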

@stepashka
Member

We've done everything we wanted for now. Most of the issues should be gone; the remaining ones we're leaving for later (the backlog).

@stepashka self-assigned this and unassigned knizhnik and kelvich Sep 5, 2022
@ololobus
Member

ololobus commented Sep 5, 2022

Just set max_replication_write_lag to 15 MB on prod. My new and old computes started well

@ololobus removed their assignment Sep 5, 2022
@shanyp modified the milestones: 2022/08, 2023/03 Dec 20, 2022
@stepashka removed their assignment Dec 27, 2022
@shanyp
Contributor

shanyp commented Jul 19, 2023

@kelvich is this something that we need to put more effort into?

@stepashka
Member

@shanyp to rescope this; there's still unfinished work, but we're in a different world now :)

@shanyp removed this from the 2023/03 milestone Oct 2, 2023
@jcsp
Contributor

jcsp commented Mar 11, 2024

Stale.

@jcsp closed this as completed Mar 11, 2024