storcon: reduce "connection refused" period during upgrades (storcon deployments cause cplane operation failures (connection refused\nrequest must not be retried))
#8034
Labels
c/storage/controller (Component: Storage Controller)
c/storage/pageserver (Component: storage: pageserver)
t/bug (Issue Type: Bug)
triaged (bugs that were already triaged)
Context: https://neondb.slack.com/archives/C06K38EB05D/p1718209960490099?thread_ts=1718184799.253779&cid=C06K38EB05D
Problem
In prodlike cloudbench, we have observed that a storcon deployment can cause cplane to get "connection refused" errors when it tries to talk to storcon, 44s (!) after storcon logs that it is up again.
Analysis
@ololobus :
Impact
When a cplane client makes a POST request and gets "connection refused", it does not retry the request, because it does not assume idempotency.
Example cplane log message
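To illustrate the retry point made under Impact: a refused TCP connection means the connection was never established, so no request bytes reached the server, and retrying that specific failure should be safe even for a non-idempotent POST. The following is only a minimal sketch of that idea, not cplane's actual client code. The name `call_with_connect_retry`, its parameters, and the backoff values are all hypothetical, and it assumes the underlying client surfaces a refused connection as Python's `ConnectionRefusedError`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_connect_retry(
    send: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying only when the connection was refused.

    Hypothetical helper: a refused connection means the request was never
    sent, so a retry cannot apply a non-idempotent request twice. Any other
    error (timeout, reset mid-request, HTTP error) is NOT retried, because
    the server may already have processed the request.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return send()
        except ConnectionRefusedError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the hypothetical defaults above, the total wait before giving up is 0.5 + 1 + 2 + 4 = 7.5s, so covering a window like the 44s observed here would need more attempts or a larger delay. Reducing the refused-connection window on the storcon side, as the title proposes, remains the primary fix.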
Related