Add test for proper handling of connection failure to avoid 'cannot wait on socket event without a socket' error (#8231)

## Problem

See neondatabase/cloud#14289 and PR #8210.

## Summary of changes

Add a test for the connection-failure handling fixed in #8210.

## Checklist before requesting a review

- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? If so, did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Do not forget to reformat commit message to not include the above
checklist

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
knizhnik and Konstantin Knizhnik committed Jul 2, 2024
1 parent 891cb5a commit 4a0c2ae
Showing 2 changed files with 24 additions and 5 deletions.
5 changes: 0 additions & 5 deletions pgxn/neon/libpagestore.c
@@ -427,11 +427,6 @@ pageserver_connect(shardno_t shard_no, int elevel)
     values[n_pgsql_params] = NULL;

     shard->conn = PQconnectStartParams(keywords, values, 1);
-    if (!shard->conn)
-    {
-        neon_shard_log(shard_no, elevel, "Failed to connect to pageserver: out of memory");
-        return false;
-    }
     if (PQstatus(shard->conn) == CONNECTION_BAD)
     {
         char *msg = pchomp(PQerrorMessage(shard->conn));
24 changes: 24 additions & 0 deletions test_runner/regress/test_pageserver_reconnect.py
@@ -2,6 +2,7 @@
 import time
 from contextlib import closing

+import psycopg2.errors
 from fixtures.log_helper import log
 from fixtures.neon_fixtures import NeonEnv, PgBin

@@ -40,3 +41,26 @@ def run_pgbench(connstr: str):
                 c.execute("select pg_reload_conf()")

     thread.join()
+
+
+# Test handling errors during page server reconnect
+def test_pageserver_reconnect_failure(neon_simple_env: NeonEnv):
+    env = neon_simple_env
+    env.neon_cli.create_branch("test_pageserver_reconnect")
+    endpoint = env.endpoints.create_start("test_pageserver_reconnect")
+
+    con = endpoint.connect()
+    cur = con.cursor()
+
+    cur.execute("set statement_timeout='2s'")
+    cur.execute("SELECT setting FROM pg_settings WHERE name='neon.pageserver_connstring'")
+    connstring = cur.fetchall()[0][0]
+    cur.execute(
+        f"alter system set neon.pageserver_connstring='{connstring}?some_invalid_param=xyz'"
+    )
+    cur.execute("select pg_reload_conf()")
+    try:
+        cur.execute("select count(*) from pg_class")
+    except psycopg2.errors.QueryCanceled:
+        log.info("Connection to PS failed")
+    assert not endpoint.log_contains("ERROR: cannot wait on socket event without a socket.*")
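
The final assertion in the new test depends on the fixture method `endpoint.log_contains`, whose implementation is not part of this diff. As a rough sketch of the idea only, assuming a plain-text log file and a hypothetical helper named `log_has_match` (not the fixture's real code), a regex scan over the log could look like this. The log lines in `demo` are made up for illustration.

```python
import re
import tempfile
from pathlib import Path
from typing import Tuple


def log_has_match(log_path: Path, pattern: str) -> bool:
    """Hypothetical helper: return True if any log line matches the regex."""
    regex = re.compile(pattern)
    with log_path.open("r", encoding="utf-8", errors="replace") as f:
        return any(regex.search(line) for line in f)


def demo() -> Tuple[bool, bool]:
    # Write a throwaway log with illustrative (made-up) lines.
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "compute.log"
        log_path.write_text(
            "LOG: reloading configuration\n"
            "ERROR: canceling statement due to statement timeout\n",
            encoding="utf-8",
        )
        # The pattern the test asserts must NOT appear in the endpoint log.
        bad = log_has_match(
            log_path, "ERROR: cannot wait on socket event without a socket.*"
        )
        # A pattern that is present, to show the positive case.
        timeout = log_has_match(log_path, "ERROR: canceling statement")
        return bad, timeout
```

In this sketch the first result is the one that mirrors the test: the check passes when the "cannot wait on socket event" pattern is absent, regardless of whether the query itself timed out.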

1 comment on commit 4a0c2ae

@github-actions

3094 tests run: 2969 passed, 1 failed, 124 skipped (full report)


Failures on Postgres 14

  • test_sharding_autosplit[github-actions-selfhosted]: release
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_sharding_autosplit[release-pg14-github-actions-selfhosted]"
Flaky tests (2)

Postgres 16

  • test_secondary_background_downloads: debug

Postgres 14

Code coverage* (full report)

  • functions: 32.7% (6934 of 21213 functions)
  • lines: 50.0% (54314 of 108576 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
4a0c2ae at 2024-07-02T20:15:33.114Z :recycle: