
Add wal backpressure performance tests #1891

Closed · wants to merge 19 commits

Conversation

@aome510 (Contributor) commented Jun 6, 2022

Resolves #1889.

This PR adds new tests that measure WAL backpressure performance under different workloads.

Changes

  • Add new performance tests in test_wal_backpressure.py (a rough sketch of the test shape is shown below)
  • Allow the safekeeper's fsync setting to be configured when running tests
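For orientation only, here is a minimal sketch of what such a test can look like; it is not the code in this PR, and names like `pg_compare`, `safekeeper_fsync`, and `env.report` are illustrative assumptions rather than the actual fixtures.

```python
# Hypothetical sketch of one WAL-backpressure performance test.
# The fixture `pg_compare` and the `safekeeper_fsync` knob are assumptions.
import threading
import time

import pytest


@pytest.mark.parametrize("fsync", [False, True])
def test_heavy_write_workload(pg_compare, fsync):
    env = pg_compare(safekeeper_fsync=fsync)  # assumed way to toggle safekeeper fsync
    conn = env.connect()
    conn.autocommit = True
    cur = conn.cursor()
    cur.execute("CREATE TABLE t (key int PRIMARY KEY, value text)")

    def write_rows():
        # Writer thread: keep generating WAL so that backpressure can kick in.
        wconn = env.connect()
        wconn.autocommit = True
        wcur = wconn.cursor()
        for i in range(10):
            wcur.execute(
                "INSERT INTO t SELECT s, repeat('x', 100) FROM generate_series(%s, %s) s",
                (i * 100_000, (i + 1) * 100_000 - 1),
            )

    writer = threading.Thread(target=write_rows)
    writer.start()

    # Reader: periodically time a point read while the writer is running.
    latencies = []
    while writer.is_alive():
        t0 = time.time()
        cur.execute("SELECT * FROM t WHERE key = 0")
        cur.fetchall()
        latencies.append(time.time() - t0)
        time.sleep(1.0)
    writer.join()

    env.report("read_latency_max", max(latencies))  # assumed reporting helper
```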

@hlinnaka hlinnaka requested a review from knizhnik June 6, 2022 22:04
@aome510 (Contributor, Author) commented Jun 7, 2022

Preliminary results:

| test | run_duration (s) | read_latency_max (s) | read_latency_avg (s) | read_latency_stdev (s) |
| --- | --- | --- | --- | --- |
| heavy_write[vanilla] | 8.438 | 0.114 | 0.046 | 0.029 |
| heavy_write[neon, fsync=false] | 34.892 | 2.047 | 1.137 | 0.564 |
| heavy_write[neon, fsync=true] | 60.459 | 3.562 | 1.951 | 0.851 |
| pgbench_simple_update[vanilla] | 2.488 | 0.404 | 0.236 | 0.238 |
| pgbench_simple_update[neon, fsync=false] | 6.158 | 0.578 | 0.528 | 0.039 |
| pgbench_simple_update[neon, fsync=true] | 48.186 | 11.407 | 1.465 | 2.826 |
| other_table[vanilla] | 21.407 | 0.036 | 0.012 | 0.006 |
| other_table[neon, fsync=false] | 46.768 | 0.479 | 0.135 | 0.130 |
| other_table[neon, fsync=true] | 105.960 | 20.600 | 0.630 | 2.589 |

There is a noticeable performance drop when fsync is enabled in the safekeeper config (disabled by default for tests). For example, I cannot reproduce the issue in #1763 if fsync is disabled. My best guess is that fsync is the bottleneck here.

@knizhnik (Contributor) left a comment

First of all, I noticed that throttling is rarely triggered by this test (with the default parameters).
I never saw the "backpressure throttling" backend status (but I have to say that throttling does sometimes happen, because once I added the corresponding trace, I saw such messages in the log).

Maybe we should increase the write workload (produce more WAL) or adjust the default parameters for this test.

Second, I saw three warnings:

 Command '['/home/knizhnik/zenith/tmp_install/bin/pgbench', '-s1000', '-i', 'host=localhost port=5432 dbname=postgres options=-cstatement_timeout=120s']' returned non-zero exit status 1.

I wonder if it indicates some error...

@aome510 (Contributor, Author) commented Jun 7, 2022

Second, I saw three warnings:

 Command '['/home/knizhnik/zenith/tmp_install/bin/pgbench', '-s1000', '-i', 'host=localhost port=5432 dbname=postgres options=-cstatement_timeout=120s']' returned non-zero exit status 1.

I wonder if it indicates some error...

I saw that too. It's because the query timeout is set to 120s. That's one reason why I didn't increase the workload too much.

Edit: I only see that log when increasing the test's parameters. I guess the query times out because the test runs faster on my machine than on yours.

Maybe we should increase the write workload (produce more WAL) or adjust the default parameters for this test.

Yeah, we should adjust the default parameters to make throttling happen more often.

I never saw the "backpressure throttling" backend status (but I have to say that throttling does sometimes happen, because once I added the corresponding trace, I saw such messages in the log).

Can you share the modifications you made? It would be helpful to have them as a reference when making changes.

@petuhovskiy (Member) commented:

My best guess is that fsync is the bottleneck here.

It could be. Where do you run these tests? I suggest running some benchmarks to understand how fast your disk is. pg_test_fsync should be enough to get basic fsync performance, but you can also run fio and look at https://github.com/neondatabase/perftest-sk/blob/master/common/disk-perf.md for examples.
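As a minimal, hedged example of the first suggestion (pg_test_fsync ships with PostgreSQL; the two-seconds-per-test setting is just an arbitrary choice to keep the run short):

```python
# Quick fsync baseline; runnable wherever PostgreSQL's binaries are on PATH.
import subprocess

# -s 2 limits each sub-test to two seconds so the whole run stays short.
result = subprocess.run(
    ["pg_test_fsync", "-s", "2"], capture_output=True, text=True, check=True
)
print(result.stdout)
```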

@aome510 (Contributor, Author) commented Jun 7, 2022

It could be. Where do you run these tests?

I run them locally on my MacBook Air (M1) using a release build.

@knizhnik (Contributor) commented Jun 8, 2022

On my laptop the test did not complete in 20 minutes.

Also, printing the avg/stddev of the read request delay does not seem to be enough. It would be much more interesting to see a histogram, or at least the maximal value.

My best guess is that fsync is the bottleneck here.

I do not think so.
First of all, in the case of bulk load (COPY, i.e. pgbench -i), all changes are inserted inside one transaction, so fsync is rarely performed.
Performance is limited by disk write speed.

With a large number of small update transactions (pgbench -N) the situation is quite different. Please notice that to commit a transaction, the compute node has to wait until it is synced by all safekeepers. If a safekeeper is slow, then we get a low transaction rate, but there should not be a big replication lag. Assume that you run pgbench -c 10 -N. In this case there can be up to ten concurrently executing transactions, and the LSN lag between the compute node and the safekeeper should not be larger than 10 * transaction_size, where the transaction size in this case is very small (~100 bytes).

Actually, to trigger backpressure we should make the safekeepers work faster and the pageserver slower. That can be achieved by setting fsync=off for the safekeeper (disabling fsyncs at the pageserver is not currently supported; they are always performed).
In this case transactions will be committed faster and appended to the safekeeper log, but the pageserver cannot consume them at the same rate.

But if the safekeeper and pageserver are running on the same node (as in the case of local tests), then fsyncs performed by the safekeeper may significantly affect total system performance. On my laptop, intensive disk writes may cause delays of several seconds for disk operations (reads and writes). On macOS, fsync is very expensive. So here the picture can be even more complicated and obscure.

@knizhnik (Contributor) commented Jun 8, 2022

Can you share the modifications you made? It would be helpful to have them as a reference when making changes.

I just added corresponding logging to the replication-lag handling in ProcessInterrupts. These messages are printed to pg.log, which is not very convenient to monitor from tests. Maybe we should just periodically check the backend status (pg_stat_activity) for backpressure throttling and also output the current replication lag using the backpressure_lsns() and pg_current_wal_flush_lsn() functions.
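A rough polling sketch along these lines; the connection string and the exact place where the throttling status shows up in pg_stat_activity are assumptions, and backpressure_lsns() is the Neon SQL function mentioned above, whose result shape is not relied on here:

```python
# Count (assumed) throttled backends and print lag-related LSNs once a second.
import time

import psycopg2

conn = psycopg2.connect("host=localhost port=5432 dbname=postgres")  # placeholder endpoint
conn.autocommit = True
cur = conn.cursor()

for _ in range(60):
    # Assumption: the throttling status is visible as a wait event containing "backpressure".
    cur.execute(
        "SELECT count(*) FROM pg_stat_activity WHERE wait_event ILIKE %s",
        ("%backpressure%",),
    )
    throttled = cur.fetchone()[0]
    cur.execute("SELECT backpressure_lsns(), pg_current_wal_flush_lsn()")
    bp_lsns, flush_lsn = cur.fetchone()
    print(f"throttled backends: {throttled}, backpressure_lsns: {bp_lsns}, flush_lsn: {flush_lsn}")
    time.sleep(1.0)
```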

@knizhnik (Contributor) commented Jun 8, 2022

The size of the foo table is ~3MB. With shared_buffers=1MB (the default value) it doesn't fit in the compute's shared memory, so select count(*) from foo causes pages to be fetched from the pageserver. But with a large shared_buffers value (e.g. the 128MB used in staging and production), this query may not access the pageserver at all, so our test would not notice the delay caused by the replication lag.
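A small illustration of that point, assuming a placeholder connection string and the foo table from the test: when the table does not fit in shared_buffers, every run of the query has to go to the pageserver, so the timing reflects the replication lag.

```python
# Time `select count(*) from foo`; with shared_buffers ~1MB every run fetches
# pages from the pageserver, while 128MB (staging/prod) may serve it from cache.
import time

import psycopg2

conn = psycopg2.connect("host=localhost port=5432 dbname=postgres")  # placeholder
conn.autocommit = True
cur = conn.cursor()

cur.execute("SHOW shared_buffers")
print("shared_buffers =", cur.fetchone()[0])

t0 = time.time()
cur.execute("SELECT count(*) FROM foo")
rows = cur.fetchone()[0]
print(f"count(*) over {rows} rows took {time.time() - t0:.3f} s")
```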

@aome510 (Contributor, Author) commented Jun 8, 2022

On my laptop the test did not complete in 20 minutes.

Also, printing the avg/stddev of the read request delay does not seem to be enough. It would be much more interesting to see a histogram, or at least the maximal value.

Looks like you're running an old version of the tests. Can you update the branch and try again? I have removed the vacuum step when doing pgbench -s100 -i. Without that change, it ran for a really long time (>5 min on my laptop). I'm not sure why it's that slow compared to vanilla; a 4x slower read time doesn't seem to justify the scale, so maybe there are other factors.

I also added the maximum read latency as a reported value. Agreed that it would be much better to have some kind of visualization.

@aome510 (Contributor, Author) commented Jun 8, 2022

But if the safekeeper and pageserver are running on the same node (as in the case of local tests), then fsyncs performed by the safekeeper may significantly affect total system performance. On my laptop, intensive disk writes may cause delays of several seconds for disk operations (reads and writes). On macOS, fsync is very expensive. So here the picture can be even more complicated and obscure.

That makes sense and also explains the performance difference when running with fsync on/off. I'll revert the change (which enables fsync when running the tests) and do some other experiments.

@knizhnik (Contributor) commented Jun 8, 2022

I also noticed a significant fsync on/off difference for pgbench -i.
Looks like I was wrong: 128kB (the maximal size of a walsender message) is not large enough to eliminate the overhead of fsyncing. Since the copy is done inside one transaction, the sync replication wait cannot slow down the walsender. So in this case the bottleneck causing replication lag is the safekeeper, not the pageserver.

test_measure_read_latency_heavy_write_workload[vanilla-10-1].run_duration: 13.895 s
test_measure_read_latency_heavy_write_workload[vanilla-10-1].read_latency_max: 0.306 s
test_measure_read_latency_heavy_write_workload[vanilla-10-1].read_latency_avg: 0.139 s
test_measure_read_latency_heavy_write_workload[vanilla-10-1].read_latency_stdev: 0.112 s
test_measure_read_latency_heavy_write_workload[neon_fsync_off-10-1].run_duration: 36.132 s
test_measure_read_latency_heavy_write_workload[neon_fsync_off-10-1].read_latency_max: 1.774 s
test_measure_read_latency_heavy_write_workload[neon_fsync_off-10-1].read_latency_avg: 1.096 s
test_measure_read_latency_heavy_write_workload[neon_fsync_off-10-1].read_latency_stdev: 0.485 s
test_measure_read_latency_heavy_write_workload[neon_fsync_on-10-1].run_duration: 61.670 s
test_measure_read_latency_heavy_write_workload[neon_fsync_on-10-1].read_latency_max: 4.779 s
test_measure_read_latency_heavy_write_workload[neon_fsync_on-10-1].read_latency_avg: 2.171 s
test_measure_read_latency_heavy_write_workload[neon_fsync_on-10-1].read_latency_stdev: 1.352 s
test_measure_read_latency_pgbench_simple_update_workload[vanilla-1000-10].run_duration: 12.488 s
test_measure_read_latency_pgbench_simple_update_workload[vanilla-1000-10].read_latency_max: 6.474 s
test_measure_read_latency_pgbench_simple_update_workload[vanilla-1000-10].read_latency_avg: 1.490 s
test_measure_read_latency_pgbench_simple_update_workload[vanilla-1000-10].read_latency_stdev: 2.793 s
test_measure_read_latency_pgbench_simple_update_workload[neon_fsync_off-1000-10].run_duration: 7.886 s
test_measure_read_latency_pgbench_simple_update_workload[neon_fsync_off-1000-10].read_latency_max: 0.642 s
test_measure_read_latency_pgbench_simple_update_workload[neon_fsync_off-1000-10].read_latency_avg: 0.569 s
test_measure_read_latency_pgbench_simple_update_workload[neon_fsync_off-1000-10].read_latency_stdev: 0.049 s
test_measure_read_latency_pgbench_simple_update_workload[neon_fsync_on-1000-10].run_duration: 26.541 s
test_measure_read_latency_pgbench_simple_update_workload[neon_fsync_on-1000-10].read_latency_max: 4.353 s
test_measure_read_latency_pgbench_simple_update_workload[neon_fsync_on-1000-10].read_latency_avg: 1.988 s
test_measure_read_latency_pgbench_simple_update_workload[neon_fsync_on-1000-10].read_latency_stdev: 1.658 s
test_measure_read_latency_other_table[vanilla-100].run_duration: 130.723 s
test_measure_read_latency_other_table[vanilla-100].read_latency_max: 1.199 s
test_measure_read_latency_other_table[vanilla-100].read_latency_avg: 0.067 s
test_measure_read_latency_other_table[vanilla-100].read_latency_stdev: 0.164 s
test_measure_read_latency_other_table[neon_fsync_off-100].run_duration: 115.215 s
test_measure_read_latency_other_table[neon_fsync_off-100].read_latency_max: 8.140 s
test_measure_read_latency_other_table[neon_fsync_off-100].read_latency_avg: 0.377 s
test_measure_read_latency_other_table[neon_fsync_off-100].read_latency_stdev: 1.081 s
test_measure_read_latency_other_table[neon_fsync_on-100].run_duration: 242.563 s
test_measure_read_latency_other_table[neon_fsync_on-100].read_latency_max: 40.526 s
test_measure_read_latency_other_table[neon_fsync_on-100].read_latency_avg: 0.890 s
test_measure_read_latency_other_table[neon_fsync_on-100].read_latency_stdev: 3.844 s

@aome510 changed the title from "Add wal backpressure performance tests (WIP)" to "Add wal backpressure performance tests" on Jun 8, 2022
@aome510 marked this pull request as ready for review June 8, 2022 21:38
@aome510 (Contributor, Author) commented Jun 8, 2022

It looks like fsync in CI is quite fast compared to my local machine, so I increased the tests' default parameters to get more interesting results.

New result (from https://app.circleci.com/pipelines/github/neondatabase/neon/7033/workflows/36dcb168-3213-45c5-9820-68434997d4a9/jobs/72231)

test_heavy_write_workload[vanilla-10-5].run_duration: 10.666 s
test_heavy_write_workload[vanilla-10-5].read_latency_max: 0.123 s
test_heavy_write_workload[vanilla-10-5].read_latency_avg: 0.061 s
test_heavy_write_workload[vanilla-10-5].read_latency_stdev: 0.039 s
test_heavy_write_workload[neon_fsync_off-10-5].lsn_write_lag_max: 4296 kB
test_heavy_write_workload[neon_fsync_off-10-5].lsn_write_lag_avg: 1862 kB
test_heavy_write_workload[neon_fsync_off-10-5].lsn_write_lag_stdev: 824 kB
test_heavy_write_workload[neon_fsync_off-10-5].run_duration: 34.765 s
test_heavy_write_workload[neon_fsync_off-10-5].read_latency_max: 8.415 s
test_heavy_write_workload[neon_fsync_off-10-5].read_latency_avg: 3.313 s
test_heavy_write_workload[neon_fsync_off-10-5].read_latency_stdev: 2.660 s
test_heavy_write_workload[neon_fsync_on-10-5].lsn_write_lag_max: 3296 kB
test_heavy_write_workload[neon_fsync_on-10-5].lsn_write_lag_avg: 1729 kB
test_heavy_write_workload[neon_fsync_on-10-5].lsn_write_lag_stdev: 734 kB
test_heavy_write_workload[neon_fsync_on-10-5].run_duration: 36.363 s
test_heavy_write_workload[neon_fsync_on-10-5].read_latency_max: 10.264 s
test_heavy_write_workload[neon_fsync_on-10-5].read_latency_avg: 3.497 s
test_heavy_write_workload[neon_fsync_on-10-5].read_latency_stdev: 3.366 s
test_pgbench_simple_update_workload[vanilla-1000-100].run_duration: 19.652 s
test_pgbench_simple_update_workload[vanilla-1000-100].read_latency_max: 12.901 s
test_pgbench_simple_update_workload[vanilla-1000-100].read_latency_avg: 3.909 s
test_pgbench_simple_update_workload[vanilla-1000-100].read_latency_stdev: 6.084 s
test_pgbench_simple_update_workload[neon_fsync_off-1000-100].lsn_write_lag_max: 19431 kB
test_pgbench_simple_update_workload[neon_fsync_off-1000-100].lsn_write_lag_avg: 1871 kB
test_pgbench_simple_update_workload[neon_fsync_off-1000-100].lsn_write_lag_stdev: 4491 kB
test_pgbench_simple_update_workload[neon_fsync_off-1000-100].run_duration: 99.212 s
test_pgbench_simple_update_workload[neon_fsync_off-1000-100].read_latency_max: 44.542 s
test_pgbench_simple_update_workload[neon_fsync_off-1000-100].read_latency_avg: 18.813 s
test_pgbench_simple_update_workload[neon_fsync_off-1000-100].read_latency_stdev: 17.174 s
test_pgbench_simple_update_workload[neon_fsync_on-1000-100].lsn_write_lag_max: 19802 kB
test_pgbench_simple_update_workload[neon_fsync_on-1000-100].lsn_write_lag_avg: 2446 kB
test_pgbench_simple_update_workload[neon_fsync_on-1000-100].lsn_write_lag_stdev: 4973 kB
test_pgbench_simple_update_workload[neon_fsync_on-1000-100].run_duration: 154.156 s
test_pgbench_simple_update_workload[neon_fsync_on-1000-100].read_latency_max: 55.226 s
test_pgbench_simple_update_workload[neon_fsync_on-1000-100].read_latency_avg: 24.500 s
test_pgbench_simple_update_workload[neon_fsync_on-1000-100].read_latency_stdev: 23.025 s
test_pgbench_intensive_init_workload[vanilla-1000].run_duration: 122.360 s
test_pgbench_intensive_init_workload[vanilla-1000].read_latency_max: 0.020 s
test_pgbench_intensive_init_workload[vanilla-1000].read_latency_avg: 0.007 s
test_pgbench_intensive_init_workload[vanilla-1000].read_latency_stdev: 0.002 s
test_pgbench_intensive_init_workload[neon_fsync_off-1000].lsn_write_lag_max: 15460 kB
test_pgbench_intensive_init_workload[neon_fsync_off-1000].lsn_write_lag_avg: 2287 kB
test_pgbench_intensive_init_workload[neon_fsync_off-1000].lsn_write_lag_stdev: 3236 kB
test_pgbench_intensive_init_workload[neon_fsync_off-1000].run_duration: 193.829 s
test_pgbench_intensive_init_workload[neon_fsync_off-1000].read_latency_max: 2.508 s
test_pgbench_intensive_init_workload[neon_fsync_off-1000].read_latency_avg: 0.323 s
test_pgbench_intensive_init_workload[neon_fsync_off-1000].read_latency_stdev: 0.468 s
test_pgbench_intensive_init_workload[neon_fsync_on-1000].lsn_write_lag_max: 15467 kB
test_pgbench_intensive_init_workload[neon_fsync_on-1000].lsn_write_lag_avg: 12968 kB
test_pgbench_intensive_init_workload[neon_fsync_on-1000].lsn_write_lag_stdev: 3889 kB
test_pgbench_intensive_init_workload[neon_fsync_on-1000].run_duration: 123.336 s
test_pgbench_intensive_init_workload[neon_fsync_on-1000].read_latency_max: 19.415 s
test_pgbench_intensive_init_workload[neon_fsync_on-1000].read_latency_avg: 9.231 s
test_pgbench_intensive_init_workload[neon_fsync_on-1000].read_latency_stdev: 8.565 s

Since the copy is done inside one transaction, the sync replication wait cannot slow down the walsender. So in this case the bottleneck causing replication lag is the safekeeper, not the pageserver.

Yeah, it looks like the only place where there is a significant difference between fsync=true and fsync=false is test_pgbench_intensive_init_workload. In the other two, the performance with fsync on is still worse than with fsync off, but not by much.

@aome510 (Contributor, Author) commented Jun 8, 2022

Important note: the new tests in this PR (with default parameters) make the benchmark workflow 1.5 times slower, from ~20 min to ~30 min.

@aome510 aome510 force-pushed the thang/add-wal-backpressure-tests branch from d5917b1 to d48141a Compare June 8, 2022 22:27
@knizhnik (Contributor) commented:

test_measure_read_latency_heavy_write_workload[vanilla-1].run_duration: 1,503.690 s
test_measure_read_latency_heavy_write_workload[vanilla-1].read_latency_avg: 0.047 s
test_measure_read_latency_heavy_write_workload[vanilla-1].read_latency_stdev: 0.192 s
test_measure_read_latency_heavy_write_workload[neon-1].run_duration: 1,350.481 s
test_measure_read_latency_heavy_write_workload[neon-1].read_latency_avg: 6.367 s
test_measure_read_latency_heavy_write_workload[neon-1].read_latency_stdev: 3.747 s
test_measure_read_latency_pgbench_simple_update_workload[vanilla-100000-10].run_duration: 122.362 s
test_measure_read_latency_pgbench_simple_update_workload[vanilla-100000-10].read_latency_avg: 0.026 s
test_measure_read_latency_pgbench_simple_update_workload[vanilla-100000-10].read_latency_stdev: 0.110 s
test_measure_read_latency_pgbench_simple_update_workload[neon-100000-10].run_duration: 127.898 s
test_measure_read_latency_pgbench_simple_update_workload[neon-100000-10].read_latency_avg: 0.020 s
test_measure_read_latency_pgbench_simple_update_workload[neon-100000-10].read_latency_stdev: 0.010 s
test_measure_read_latency_other_table[vanilla-1000].run_duration: 143.355 s
test_measure_read_latency_other_table[vanilla-1000].read_latency_avg: 0.088 s
test_measure_read_latency_other_table[vanilla-1000].read_latency_stdev: 0.206 s
test_measure_read_latency_other_table[neon-1000].run_duration: 138.211 s
test_measure_read_latency_other_table[neon-1000].read_latency_avg: 1.302 s
test_measure_read_latency_other_table[neon-1000].read_latency_stdev: 2.775 s

@knizhnik (Contributor) commented:

The latest version of test_wal_backpressure from the thang/add-wal-backpressure-tests branch does not show large read delays, but 2 tests failed with an error:

FAILED test_runner/performance/test_wal_backpressure.py::test_pgbench_simple_update_workload[neon_off_500MB-10000-10] - psycopg2.DatabaseError: no commit timestamp...
FAILED test_runner/performance/test_wal_backpressure.py::test_pgbench_simple_update_workload[neon_on_500MB-10000-10] - psycopg2.DatabaseError: no commit timestamps...

I wonder whether find_lsn_for_timestamp should bail out if it doesn't find any timestamp (maybe because the database is empty)?

@aome510 (Contributor, Author) commented Jun 10, 2022

The latest version of test_wal_backpressure from the thang/add-wal-backpressure-tests branch does not show large read delays, but 2 tests failed with an error:

FAILED test_runner/performance/test_wal_backpressure.py::test_pgbench_simple_update_workload[neon_off_500MB-10000-10] - psycopg2.DatabaseError: no commit timestamp...
FAILED test_runner/performance/test_wal_backpressure.py::test_pgbench_simple_update_workload[neon_on_500MB-10000-10] - psycopg2.DatabaseError: no commit timestamps...

I wonder whether find_lsn_for_timestamp should bail out if it doesn't find any timestamp (maybe because the database is empty)?

In this PR, I tuned down the parameters a bit and only made some structural changes.

Now, most of the changes to the tests' behaviour will be in #1919. I've changed the CI workflows in that branch to get a better feedback loop. I have also applied Heikki's suggestions and observed the write lag approaching the limit quickly. The tests in that branch are quite broken at the moment, so I will need to look into them first.

Regarding the psycopg2.DatabaseError: no commit timestamps..., I haven't seen that before. I'll try to reproduce it locally.

@aome510 (Contributor, Author) commented Jun 10, 2022

@knizhnik I think the tests work well now on #1919. You can try running them on that branch.

An example result from https://app.circleci.com/pipelines/github/neondatabase/neon/7096/workflows/7160f1d4-5a97-40d7-9ebb-6d8b5b07c3b0/jobs/72866:

test_heavy_write_workload[vanilla-10-5-5].run_duration: 31.004 s
test_heavy_write_workload[vanilla-10-5-5].read_latency_max: 0.014 s
test_heavy_write_workload[vanilla-10-5-5].read_latency_avg: 0.005 s
test_heavy_write_workload[vanilla-10-5-5].read_latency_stdev: 0.002 s
test_heavy_write_workload[neon_off_15MB-10-5-5].run_duration: 186.006 s
test_heavy_write_workload[neon_off_15MB-10-5-5].lsn_write_lag_max: 16936 kB
test_heavy_write_workload[neon_off_15MB-10-5-5].lsn_write_lag_avg: 14737 kB
test_heavy_write_workload[neon_off_15MB-10-5-5].lsn_write_lag_stdev: 3102 kB
test_heavy_write_workload[neon_off_15MB-10-5-5].read_latency_max: 12.382 s
test_heavy_write_workload[neon_off_15MB-10-5-5].read_latency_avg: 4.346 s
test_heavy_write_workload[neon_off_15MB-10-5-5].read_latency_stdev: 3.121 s
test_heavy_write_workload[neon_off_500MB-10-5-5].run_duration: 183.251 s
test_heavy_write_workload[neon_off_500MB-10-5-5].lsn_write_lag_max: 466755 kB
test_heavy_write_workload[neon_off_500MB-10-5-5].lsn_write_lag_avg: 137283 kB
test_heavy_write_workload[neon_off_500MB-10-5-5].lsn_write_lag_stdev: 97640 kB
test_heavy_write_workload[neon_off_500MB-10-5-5].read_latency_max: 21.312 s
test_heavy_write_workload[neon_off_500MB-10-5-5].read_latency_avg: 15.360 s
test_heavy_write_workload[neon_off_500MB-10-5-5].read_latency_stdev: 5.996 s
test_heavy_write_workload[neon_on_15MB-10-5-5].run_duration: 208.458 s
test_heavy_write_workload[neon_on_15MB-10-5-5].lsn_write_lag_max: 17541 kB
test_heavy_write_workload[neon_on_15MB-10-5-5].lsn_write_lag_avg: 14406 kB
test_heavy_write_workload[neon_on_15MB-10-5-5].lsn_write_lag_stdev: 3418 kB
test_heavy_write_workload[neon_on_15MB-10-5-5].read_latency_max: 15.585 s
test_heavy_write_workload[neon_on_15MB-10-5-5].read_latency_avg: 6.489 s
test_heavy_write_workload[neon_on_15MB-10-5-5].read_latency_stdev: 4.034 s
test_heavy_write_workload[neon_on_500MB-10-5-5].run_duration: 205.617 s
test_heavy_write_workload[neon_on_500MB-10-5-5].lsn_write_lag_max: 513389 kB
test_heavy_write_workload[neon_on_500MB-10-5-5].lsn_write_lag_avg: 187969 kB
test_heavy_write_workload[neon_on_500MB-10-5-5].lsn_write_lag_stdev: 145143 kB
test_heavy_write_workload[neon_on_500MB-10-5-5].read_latency_max: 23.683 s
test_heavy_write_workload[neon_on_500MB-10-5-5].read_latency_avg: 16.557 s
test_heavy_write_workload[neon_on_500MB-10-5-5].read_latency_stdev: 6.815 s
test_pgbench_simple_update_workload[vanilla-45-100].run_duration: 65.209 s
test_pgbench_simple_update_workload[vanilla-45-100].read_latency_max: 13.288 s
test_pgbench_simple_update_workload[vanilla-45-100].read_latency_avg: 0.347 s
test_pgbench_simple_update_workload[vanilla-45-100].read_latency_stdev: 1.927 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].run_duration: 131.969 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].lsn_write_lag_max: 25241 kB
test_pgbench_simple_update_workload[neon_off_15MB-45-100].lsn_write_lag_avg: 1618 kB
test_pgbench_simple_update_workload[neon_off_15MB-45-100].lsn_write_lag_stdev: 4360 kB
test_pgbench_simple_update_workload[neon_off_15MB-45-100].read_latency_max: 35.336 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].read_latency_avg: 1.701 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].read_latency_stdev: 6.336 s
test_pgbench_simple_update_workload[neon_off_500MB-45-100].run_duration: 139.366 s
test_pgbench_simple_update_workload[neon_off_500MB-45-100].lsn_write_lag_max: 90112 kB
test_pgbench_simple_update_workload[neon_off_500MB-45-100].lsn_write_lag_avg: 1106 kB
test_pgbench_simple_update_workload[neon_off_500MB-45-100].lsn_write_lag_stdev: 7601 kB
test_pgbench_simple_update_workload[neon_off_500MB-45-100].read_latency_max: 30.133 s
test_pgbench_simple_update_workload[neon_off_500MB-45-100].read_latency_avg: 1.377 s
test_pgbench_simple_update_workload[neon_off_500MB-45-100].read_latency_stdev: 5.386 s
test_pgbench_simple_update_workload[neon_on_15MB-45-100].run_duration: 183.948 s
test_pgbench_simple_update_workload[neon_on_15MB-45-100].lsn_write_lag_max: 18951 kB
test_pgbench_simple_update_workload[neon_on_15MB-45-100].lsn_write_lag_avg: 1523 kB
test_pgbench_simple_update_workload[neon_on_15MB-45-100].lsn_write_lag_stdev: 3703 kB
test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_max: 49.567 s
test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_avg: 2.419 s
test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_stdev: 9.258 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].run_duration: 156.513 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].read_latency_max: 35.752 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].read_latency_avg: 1.854 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].read_latency_stdev: 7.063 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].lsn_write_lag_max: 512040 kB
test_pgbench_simple_update_workload[neon_on_500MB-45-100].lsn_write_lag_avg: 16648 kB
test_pgbench_simple_update_workload[neon_on_500MB-45-100].lsn_write_lag_stdev: 76113 kB
test_pgbench_intensive_init_workload[vanilla-1000].run_duration: 119.182 s
test_pgbench_intensive_init_workload[vanilla-1000].read_latency_max: 0.106 s
test_pgbench_intensive_init_workload[vanilla-1000].read_latency_avg: 0.013 s
test_pgbench_intensive_init_workload[vanilla-1000].read_latency_stdev: 0.016 s
test_pgbench_intensive_init_workload[neon_off_15MB-1000].run_duration: 66.136 s
test_pgbench_intensive_init_workload[neon_off_15MB-1000].lsn_write_lag_max: 5734 kB
test_pgbench_intensive_init_workload[neon_off_15MB-1000].lsn_write_lag_avg: 1687 kB
test_pgbench_intensive_init_workload[neon_off_15MB-1000].lsn_write_lag_stdev: 1614 kB
test_pgbench_intensive_init_workload[neon_off_15MB-1000].read_latency_max: 2.846 s
test_pgbench_intensive_init_workload[neon_off_15MB-1000].read_latency_avg: 0.865 s
test_pgbench_intensive_init_workload[neon_off_15MB-1000].read_latency_stdev: 0.488 s
test_pgbench_intensive_init_workload[neon_off_500MB-1000].run_duration: 62.151 s
test_pgbench_intensive_init_workload[neon_off_500MB-1000].read_latency_max: 16.530 s
test_pgbench_intensive_init_workload[neon_off_500MB-1000].read_latency_avg: 1.507 s
test_pgbench_intensive_init_workload[neon_off_500MB-1000].read_latency_stdev: 3.165 s
test_pgbench_intensive_init_workload[neon_off_500MB-1000].lsn_write_lag_max: 455565 kB
test_pgbench_intensive_init_workload[neon_off_500MB-1000].lsn_write_lag_avg: 70809 kB
test_pgbench_intensive_init_workload[neon_off_500MB-1000].lsn_write_lag_stdev: 132950 kB
test_pgbench_intensive_init_workload[neon_on_15MB-1000].run_duration: 122.887 s
test_pgbench_intensive_init_workload[neon_on_15MB-1000].lsn_write_lag_max: 15465 kB
test_pgbench_intensive_init_workload[neon_on_15MB-1000].lsn_write_lag_avg: 12467 kB
test_pgbench_intensive_init_workload[neon_on_15MB-1000].lsn_write_lag_stdev: 4096 kB
test_pgbench_intensive_init_workload[neon_on_15MB-1000].read_latency_max: 20.566 s
test_pgbench_intensive_init_workload[neon_on_15MB-1000].read_latency_avg: 7.853 s
test_pgbench_intensive_init_workload[neon_on_15MB-1000].read_latency_stdev: 7.159 s
test_pgbench_intensive_init_workload[neon_on_500MB-1000].run_duration: 115.814 s
test_pgbench_intensive_init_workload[neon_on_500MB-1000].lsn_write_lag_max: 512105 kB
test_pgbench_intensive_init_workload[neon_on_500MB-1000].lsn_write_lag_avg: 492815 kB
test_pgbench_intensive_init_workload[neon_on_500MB-1000].lsn_write_lag_stdev: 64286 kB
test_pgbench_intensive_init_workload[neon_on_500MB-1000].read_latency_max: 115.978 s
test_pgbench_intensive_init_workload[neon_on_500MB-1000].read_latency_avg: 58.409 s
test_pgbench_intensive_init_workload[neon_on_500MB-1000].read_latency_stdev: 81.415 s

Most of the read queries have been changed to something simple (e.g., reading a single row by primary key). So, the big read latency in some tests is probably caused by the write lag.

@knizhnik (Contributor) commented:

test_wal_backpressure on the #1919 branch now consumes more than 200GB of disk space, so I can't run it locally.
So we now have a lot of tests for backpressure. But what's next?
These tests prove the well-known fact that a 500MB write lag is too much and may cause long delays (minutes).
With a 15MB write lag the delays are much smaller, except for this one:

test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_max: 49.567 s

But it is hard to say whether such a delay is actually caused by wait_lsn, and what the pageserver is doing during these 50 seconds.
Maybe the node was just overloaded with work.

My original assumption was that a small (10MB) max_replication_write_lag would eliminate huge delays (>10 sec).
It's a pity if that is not true.

The other problem, which made me look for a replacement for the current stop-and-wait throttling algorithm, was that with a small max_replication_write_lag the write speed was about 3 times lower.
These tests do not confirm that: only in one case is there a noticeable difference:

test_pgbench_simple_update_workload[neon_on_15MB-45-100].run_duration: 183.948 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].run_duration: 156.513 s

But even in this case it is not large enough. So based on the results of these tests, there is no strong need to replace the current simple stop-and-wait throttling algorithm.

What else we should try is the cache for the last-written LSN (neondatabase/postgres#167).
It is an alternative (to backpressure) way to fix the read latency problem.

@knizhnik (Contributor) commented:

PR neondatabase/postgres#167 cannot help reduce the read latency in test_pgbench_simple_update_workload, simply because the reader accesses the updated table.

Actually, the results of this particular test on my laptop look really confusing:

test_pgbench_simple_update_workload[vanilla-45-100].run_duration: 87.705 s
test_pgbench_simple_update_workload[vanilla-45-100].read_latency_max: 33.317 s
test_pgbench_simple_update_workload[vanilla-45-100].read_latency_avg: 0.752 s
test_pgbench_simple_update_workload[vanilla-45-100].read_latency_stdev: 4.714 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].run_duration: 178.398 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].read_latency_max: 56.333 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].read_latency_avg: 2.029 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].read_latency_stdev: 8.736 s
test_pgbench_simple_update_workload[neon_off_15MB-45-100].lsn_write_lag_max: 16322 kB
test_pgbench_simple_update_workload[neon_off_15MB-45-100].lsn_write_lag_avg: 1055 kB
test_pgbench_simple_update_workload[neon_off_15MB-45-100].lsn_write_lag_stdev: 3006 kB
test_pgbench_simple_update_workload[neon_off_500MB-45-100].run_duration: 164.120 s
test_pgbench_simple_update_workload[neon_off_500MB-45-100].read_latency_max: 38.964 s
test_pgbench_simple_update_workload[neon_off_500MB-45-100].read_latency_avg: 1.783 s
test_pgbench_simple_update_workload[neon_off_500MB-45-100].read_latency_stdev: 6.944 s
test_pgbench_simple_update_workload[neon_off_500MB-45-100].lsn_write_lag_max: 65924 kB
test_pgbench_simple_update_workload[neon_off_500MB-45-100].lsn_write_lag_avg: 1294 kB
test_pgbench_simple_update_workload[neon_off_500MB-45-100].lsn_write_lag_stdev: 6521 kB
test_pgbench_simple_update_workload[neon_on_15MB-45-100].lsn_write_lag_max: 15809 kB
test_pgbench_simple_update_workload[neon_on_15MB-45-100].lsn_write_lag_avg: 2040 kB
test_pgbench_simple_update_workload[neon_on_15MB-45-100].lsn_write_lag_stdev: 4684 kB
test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_max: 107.208 s
test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_avg: 50.435 s
test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_stdev: 53.878 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].lsn_write_lag_max: 512448 kB
test_pgbench_simple_update_workload[neon_on_500MB-45-100].lsn_write_lag_avg: 76596 kB
test_pgbench_simple_update_workload[neon_on_500MB-45-100].lsn_write_lag_stdev: 178892 kB
test_pgbench_simple_update_workload[neon_on_500MB-45-100].read_latency_max: 128.584 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].read_latency_avg: 55.517 s
test_pgbench_simple_update_workload[neon_on_500MB-45-100].read_latency_stdev: 66.059 s

The average and maximal delays with max_replication_write_lag=15MB are even larger than with max_replication_write_lag=500MB, although the lag in the latter case is really much larger.
So it looks like the lag limit by itself is not enough to guarantee small latency, at least on my laptop.

@knizhnik (Contributor) commented:

I understand the reasons for the large read latency in test_pgbench_simple_update_workload and the other tests, and why backpressure doesn't help much:

  1. By default wal_log_hints=on, so hint-bit records are inserted into the WAL without an XID (by vacuum or select). Backpressure is not applied to backends that have no XID, so it does not help in this case.
  2. This test performs the "SELECT * from pgbench_accounts where aid = 1" query in a separate thread, but this query is also executed BEFORE the index is constructed, so it requires a sequential scan. This is why read_latency_max is high.

With wal_log_hints=on:

test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_avg: 50.435 s

With wal_log_hints=off:

test_pgbench_simple_update_workload[neon_on_15MB-45-100].read_latency_avg: 2.340 s

@aome510 (Contributor, Author) commented Jun 13, 2022

  1. By default wal_log_hints=on, so hint-bit records are inserted into the WAL without an XID (by vacuum or select). Backpressure is not applied to backends that have no XID, so it does not help in this case.

Is wal_log_hints=on enabled in prod? Should we disable wal_log_hints for the WAL backpressure tests?

  2. This test performs the "SELECT * from pgbench_accounts where aid = 1" query in a separate thread, but this query is also executed BEFORE the index is constructed, so it requires a sequential scan. This is why read_latency_max is high.

Good catch! I have updated the tests in #1919; the max read latency is reduced significantly on my local machine.

@aome510 (Contributor, Author) commented Jun 13, 2022

So we now have a lot of tests for backpressure. But what's next?
These tests prove the well-known fact that a 500MB write lag is too much and may cause long delays (minutes).

I'm not so sure. In fact, with just the new WAL backpressure tests, the total test runtime increases by ~40 min, which is extremely slow. It can probably be helped with parallelism.

Edit: if a 500MB write lag is too much, we can remove those tests. However, given that 500MB is the default setting in prod, we should probably wait.

@hlinnaka (Contributor) commented:

@hlinnaka I tried your test with the required relcache improvements on my laptop. Its execution takes a long time (it is actually based on an old version of test_wal_backpressure), but the latency is not so horrible, although a >10 second delay does not seem acceptable. It means that even 15MB is too large a value for max_replication_write_lag.

Hmm, I'm not sure I'm ready to jump to that conclusion. What is the speed at which the pageserver can process the incoming WAL? 10 MB/s? 100 MB/s? I remember we did some quick measurements a long time ago and got about 30 MB/s. If we assume conservatively that it's actually 10 MB/s, then a 15 MB max_replication_write_lag should give at most a 1.5 second delay. So why are we seeing read latency > 10 seconds for some queries?

Is the pageserver getting stuck on a lock sometimes, so that it pauses processing any WAL for 5-10 seconds? Or does the query need to wait for multiple pages, so that the overall response time is > 10 seconds even though the wait for each page is only 1.5 s? Or what's going on?

It would be useful to print out the current last_received_lsn once a second during the test. You could then see the speed at which the pageserver is processing the WAL. Is it constant, or are there pauses?
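A sketch of that kind of once-a-second sampling, using only stock PostgreSQL functions on the compute side; the connection string is a placeholder, and comparing the same series against the pageserver's last_received_lsn (obtained however the test exposes it, e.g. the pageserver API) would give the consumption rate and lag.

```python
# Sample the compute's flush LSN once a second and print the WAL production rate.
import time

import psycopg2

conn = psycopg2.connect("host=localhost port=5432 dbname=postgres")  # placeholder
conn.autocommit = True
cur = conn.cursor()

cur.execute("SELECT pg_current_wal_flush_lsn()")
prev = cur.fetchone()[0]
for _ in range(30):
    time.sleep(1.0)
    cur.execute(
        "SELECT pg_current_wal_flush_lsn(), pg_wal_lsn_diff(pg_current_wal_flush_lsn(), %s)",
        (prev,),
    )
    prev, produced_bytes = cur.fetchone()
    print(f"flush_lsn={prev}  produced {float(produced_bytes) / 1e6:.1f} MB in the last second")
```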

@knizhnik (Contributor) commented:

Is wal_log_hints=on enabled in prod?

Yes

In fact, with just the new WAL backpressure tests, the total test runtime increases by ~40 min, which is extremely slow.

I do not think we should run tests with a 500MB max_replication_write_lag.

However, given that 500MB is the default setting in prod, we should probably wait.

IMHO we should change the prod settings ASAP.

Hmm, I'm not sure I'm ready to jump to that conclusion. What is the speed at which the pageserver can process the incoming WAL?

It greatly depends on the workload (i.e. FPIs from COPY vs. individual insert records, especially ones updating indexes) and on disk I/O speed. But 10 MB/sec looks like a good estimate of the WAL consumption speed.

So why are are we seeing read latency > 10 seconds for some queries?

We get 10 MB/sec if there is no other activity.
If the following are also active:

  • autovacuum
  • checkpointing (writing frozen layers)
  • flushing layers
  • compaction

then the speed can decrease several times.
I tried to find out which of the activities above has the most impact on performance.
But it looks like on my laptop the main showstopper is the SSD itself: under intensive writes, I/O operations (fsyncs, writes and even reads) are sometimes blocked for dozens of seconds. On server hardware this effect is less noticeable.

Is the pageserver getting stuck on a lock sometimes, so that it pauses processing any WAL for 5-10 seconds? Or does the query need to wait for multiple pages, so that the overall response time is > 10 seconds even though the wait for each page is only 1.5 s? Or what's going on?

I have never seen the pageserver blocked on locks. Most frequently I see threads blocked in write/fsync.

Also please notice that originally I saw the huge-delay problem with new client connections and the first execution of some query.
In both cases the client has to read much more than just one page: it actually reads a substantial part of the system catalog, which is hundreds or maybe even thousands of pages. This is why the total time can be much larger than 1 second. But with a 10MB max_replication_write_lag, or with the cached last-written LSN patch, I rarely see delays larger than a few seconds.

It would be useful to print out the current last_received_lsn once a second during the test. You could then see the speed at which the pageserver is processing the WAL. Is it constant, or are there pauses?

The backpressure tests print the lag now; combining it with the current flush LSN, we can monitor the speed at which the pageserver consumes WAL.

@aome510 (Contributor, Author) commented Jun 13, 2022

In fact, with just the new WAL backpressure tests, the total test runtime increases by ~40 min, which is extremely slow.

I do not think we should run tests with a 500MB max_replication_write_lag.

However, given that 500MB is the default setting in prod, we should probably wait.

IMHO we should change the prod settings ASAP.

Got it. I will disable it in the tests later.

It would be useful to print out the current last_received_lsn once a second during the test. You could then see the speed at which the pageserver is processing the WAL. Is it constant, or are there pauses?

The backpressure tests print the lag now; combining it with the current flush LSN, we can monitor the speed at which the pageserver consumes WAL.

I have added code to print lsn_process_speed and lsn_produce_speed in #1919. I also disabled wal_log_hints in the latest commit to see if that makes any difference.

@aome510 (Contributor, Author) commented Jun 14, 2022

If we assume conservatively that it's actually 10 MB/s, then a 15 MB max_replication_write_lag should give at most a 1.5 second delay. So why are we seeing read latency > 10 seconds for some queries?

This is only true if the difference between the produce speed and the process speed is consistently 10 MB/s. IIUC, since a read request requires the latest LSN, theoretically there can be a situation where the produce speed equals the process speed and the lag is 14 MB. If that happens, the read request will never finish because the pageserver keeps falling behind.

@knizhnik (Contributor) commented:

IIUC, since a read request requires the latest LSN, theoretically there can be a situation where the produce speed equals the process speed and the lag is 14 MB. If that happens, the read request will never finish because the pageserver keeps falling behind.

A read request may require the latest LSN, but that is the latest LSN at the moment of issuing the request. If some WAL is produced after it, we should not wait for it. So the situation you have described is not possible. Achilles won't overtake the tortoise: the walreceiver may never catch up with the walsender without some kind of synchronization. But we are solving a different problem: read_page_at_lsn doesn't need to catch up; it just waits for a particular LSN. And even if the request has the "latest" flag, we still wait only for the last_record_lsn as of the moment of receiving the request.

@knizhnik knizhnik closed this Jun 14, 2022
@aome510 (Contributor, Author) commented Jun 14, 2022

IIUC, since a read request requires the latest LSN, theoretically there can be a situation where the produce speed equals the process speed and the lag is 14 MB. If that happens, the read request will never finish because the pageserver keeps falling behind.

A read request may require the latest LSN, but that is the latest LSN at the moment of issuing the request. If some WAL is produced after it, we should not wait for it. So the situation you have described is not possible. Achilles won't overtake the tortoise: the walreceiver may never catch up with the walsender without some kind of synchronization. But we are solving a different problem: read_page_at_lsn doesn't need to catch up; it just waits for a particular LSN. And even if the request has the "latest" flag, we still wait only for the last_record_lsn as of the moment of receiving the request.

Yeah, you're right. The requested page stream only waits for at most req.lsn (the LSN at the time of making the read request). And the wait once the page service receives such a page request is not long. What I don't understand is why, for requests with big read latency, there is a delay between the time the client starts the request and the time the page service receives the corresponding page stream message.

Not sure if the delay is because the page service processes page stream messages sequentially (doesn't it?).

@aome510 aome510 reopened this Jun 14, 2022
@aome510 (Contributor, Author) commented Jun 15, 2022

@knizhnik I think I understand the problem now. It's true that a single get_page_at_lsn request is fast, but when the page service receives multiple get_page_at_lsn requests, the total runtime is slow because the page service handles them sequentially and needs to wait for the latest LSN specified by each request. During the wait, the latest LSN for the next request advances as PG produces and the pageserver processes more WAL. For example, assume 10 MB/s WAL processing and a 15 MB lag -> 15/10 = 1.5 s maximum wait time per request. A select count(*) from t with 100,000 rows touches around 375 blocks, so in the worst case the total wait time is about (375 * 1.5) s.
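A back-of-the-envelope version of that estimate; the 10 MB/s processing speed, 15 MB lag, and 375-block table size are the assumed figures from the discussion above:

```python
# Worst-case wait estimate: every get_page_at_lsn request waits for the full lag
# to be consumed before the next page can be served.
wal_processing_mb_per_s = 10   # assumed pageserver WAL processing speed
max_write_lag_mb = 15          # assumed steady write lag (max_replication_write_lag)
blocks = 375                   # ~100,000 small rows at roughly 267 rows per 8 kB page

wait_per_request_s = max_write_lag_mb / wal_processing_mb_per_s   # 1.5 s
worst_case_total_s = blocks * wait_per_request_s                  # 562.5 s
print(f"per-request wait ~{wait_per_request_s} s, worst case ~{worst_case_total_s} s")
```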

Please correct me if I misunderstand something here.

In the case of test_heavy_write_workload, SELECT * from t0 where key = 0 should involve a single get_page_at_lsn request. The reason for the large latency (~4-5 s) is that after establishing a new connection, handling a read request requires PG to also read other relations. Looking at the logs, these seem to include pg_* tables such as pg_authid_rolname_index, pg_authid_oid_index, etc. This scales up the number of get_page_at_lsn requests for each read-latency record. I have tried to address this problem by reusing a PG cursor when recording read latencies in 041dced (see the sketch below).
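A hedged sketch of the cursor-reuse idea (not the actual diff in 041dced; the connection string and query are placeholders): measuring each latency on a fresh connection drags the catalog reads into every sample, whereas reusing one connection and cursor pays that cost only once.

```python
import time

import psycopg2

connstr = "host=localhost port=5432 dbname=postgres"  # placeholder


def read_latencies_reconnecting(n: int):
    # New connection per sample: each sample also pays for the catalog reads.
    latencies = []
    for _ in range(n):
        conn = psycopg2.connect(connstr)
        cur = conn.cursor()
        t0 = time.time()
        cur.execute("SELECT * FROM t0 WHERE key = 0")
        cur.fetchall()
        latencies.append(time.time() - t0)
        conn.close()
    return latencies


def read_latencies_reusing(n: int):
    # One connection and cursor reused: catalog pages are read (and cached) once.
    conn = psycopg2.connect(connstr)
    cur = conn.cursor()
    latencies = []
    for _ in range(n):
        t0 = time.time()
        cur.execute("SELECT * FROM t0 WHERE key = 0")
        cur.fetchall()
        latencies.append(time.time() - t0)
    conn.close()
    return latencies
```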

Compared to the previous commit, the average read latency for this test is reduced by 10x. However, the read latency and runtime for the test_pgbench_intensive_init_workload test increased significantly. I'll need to look into it a bit more deeply to find the cause. My quick take on the slow read latency is that the select count(*) from t query involves reading multiple blocks (mentioned above), hence issuing multiple get_page_at_lsn requests. But it's not clear to me why it's much slower now.

@knizhnik (Contributor) commented:

Please correct me if I misunderstand something here.

Yes, your understanding is correct.
PR #1919 addresses this problem, because an updated relation does not increase the last-written LSN for all other tables, so we can access catalog relations with an old LSN and do not have to wait each time for the pageserver to catch up. In that PR the cache is very small (just 4 entries), so if many tables (and their indexes) are updated, the cache will not help. As an alternative, we could think about a much larger cache, maybe not for the whole relation but for some relation chunk (e.g. 1MB) or even individual pages. But such a large cache would have to be shared and so would require synchronized access, and it could become a bottleneck.

@aome510 (Contributor, Author) commented Jun 17, 2022

However, the read latency and runtime for the test_pgbench_intensive_init_workload test increased significantly.

Actually, after rerunning the test, there is not much of a difference between the two runs. I guess the previous anomaly happened because CircleCI was under heavy load.

@knizhnik if the tests' behaviour defined in #1919 looks good to you, I can merge it back into this branch and we can start reviewing this PR and merging it into main.

@aome510 (Contributor, Author) commented Jun 17, 2022

Or we can rebase #1919 and start from there.

@knizhnik (Contributor) commented:

@knizhnik if the tests' behaviour defined in #1919 looks good to you, I can merge it back into this branch and we can start reviewing this PR and merging it into main.

Maybe we should just merge #1919 into main instead?

@aome510 (Contributor, Author) commented Jun 17, 2022

Closing in favour of #1919.


Successfully merging this pull request may close these issues: Add tests for #1763