fix(test_remote_storage_upload_queue_retries): became flakier since #6960 (#6999)

This PR increases the `wait_until` timeouts.
These waits are where the test became flakier as of #6960,
most likely because that PR doubles the work done in
`churn_while_failpoints_active_thread`.

Slack context:
https://neondb.slack.com/archives/C033RQ5SPDH/p1709554455962959?thread_ts=1709286362.850549&cid=C033RQ5SPDH
problame authored Mar 4, 2024
1 parent e938bb8 commit f0be940
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions test_runner/regress/test_remote_storage.py
@@ -329,14 +329,15 @@ def churn_while_failpoints_active(result):
churn_while_failpoints_active_thread.start()

# wait for churn thread's data to get stuck in the upload queue
-wait_until(10, 0.5, lambda: assert_gt(get_queued_count(file_kind="layer", op_kind="upload"), 0))
-wait_until(10, 0.5, lambda: assert_ge(get_queued_count(file_kind="index", op_kind="upload"), 2))
-wait_until(10, 0.5, lambda: assert_gt(get_queued_count(file_kind="layer", op_kind="delete"), 0))
+# Exponential back-off in upload queue, so, gracious timeouts.
+
+wait_until(30, 1, lambda: assert_gt(get_queued_count(file_kind="layer", op_kind="upload"), 0))
+wait_until(30, 1, lambda: assert_ge(get_queued_count(file_kind="index", op_kind="upload"), 2))
+wait_until(30, 1, lambda: assert_gt(get_queued_count(file_kind="layer", op_kind="delete"), 0))

# unblock churn operations
configure_storage_sync_failpoints("off")

# ... and wait for them to finish. Exponential back-off in upload queue, so, gracious timeouts.
wait_until(30, 1, lambda: assert_eq(get_queued_count(file_kind="layer", op_kind="upload"), 0))
wait_until(30, 1, lambda: assert_eq(get_queued_count(file_kind="index", op_kind="upload"), 0))
wait_until(30, 1, lambda: assert_eq(get_queued_count(file_kind="layer", op_kind="delete"), 0))
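
For readers unfamiliar with the helper, the change from `wait_until(10, 0.5, ...)` to `wait_until(30, 1, ...)` can be read as raising the polling budget from roughly 5 seconds to roughly 30 seconds. The sketch below is an illustration only: the parameter names, the exception type, and the `sleep_budget` helper are assumptions inferred from the call sites in this diff, and the real helper in the test fixtures may behave differently.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(number_of_iterations: int, interval: float, func: Callable[[], T]) -> T:
    """Hypothetical sketch: call `func` until it stops raising.

    Sleeps `interval` seconds after each failed attempt and gives up
    after `number_of_iterations` attempts.
    """
    last_exception: Optional[Exception] = None
    for _ in range(number_of_iterations):
        try:
            return func()
        except Exception as e:  # an assertion inside `func` failed; retry
            last_exception = e
            time.sleep(interval)
    raise RuntimeError(
        f"condition not met after {number_of_iterations} attempts"
    ) from last_exception


def sleep_budget(number_of_iterations: int, interval: float) -> float:
    """Upper bound on time spent sleeping (illustrative helper, not in the repo)."""
    return number_of_iterations * interval


if __name__ == "__main__":
    print(sleep_budget(10, 0.5))  # budget before this commit
    print(sleep_budget(30, 1))    # budget after this commit
```

Under these assumptions, a queue that only shows the expected entries after several back-off rounds has about six times longer to reach the asserted state before the test fails.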

1 comment on commit f0be940

@github-actions

2561 tests run: 2427 passed, 1 failed, 133 skipped (full report)


Failures on Postgres 14

  • test_basebackup_with_high_slru_count[github-actions-selfhosted-vectored-10-13-30]: release
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_basebackup_with_high_slru_count[release-pg14-github-actions-selfhosted-vectored-10-13-30]"
Flaky tests (1)

Postgres 16

  • test_multi_attach: debug

Code coverage* (full report)

  • functions: 28.7% (6934 of 24172 functions)
  • lines: 47.2% (42526 of 90097 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
f0be940 at 2024-03-04T15:35:54.914Z :recycle: