Restic repository management fixes #1367

skriss · 2019-04-12T22:38:29Z

This does not need to make it into the first alpha, so look at other open PRs first.

Currently, if repo creation fails when Velero creates a new restic repo as part of taking a backup, Velero waits for an hour before reporting the failure, because creation may "eventually" succeed. This is bad UX, though, and if creation failed the first time, it is unlikely to succeed the next time without user intervention. So I changed it so the backup fails fast if repo creation fails the first time.
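A minimal sketch of the fail-fast idea, not the actual Velero code: the caller waits on a channel for the new repo's phase, returns an error as soon as the repo reports NotReady, and gives up after a bounded wait. All names here (`repoPhase`, `waitForRepo`, `shortWait`, the error values) are hypothetical, and the short timeout is only there to keep the demo quick; the commits in this PR cap the real wait at one minute.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// repoPhase is a hypothetical stand-in for a repository's status phase.
type repoPhase string

const (
	phaseReady    repoPhase = "Ready"
	phaseNotReady repoPhase = "NotReady"
)

// Hypothetical sentinel errors used only by this sketch.
var (
	errRepoNotReady = errors.New("restic repository failed to initialize")
	errWaitTimeout  = errors.New("timed out waiting for restic repository to become ready")
)

// shortWait is an assumed demo value, not the timeout used by Velero.
var shortWait = 50 * time.Millisecond

// waitForRepo blocks until a phase arrives on the channel or the timeout
// elapses. Any phase other than Ready is treated as a failure, so the
// caller fails fast instead of waiting for an eventual success.
func waitForRepo(phases <-chan repoPhase, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case p := <-phases:
		if p == phaseReady {
			return nil
		}
		return errRepoNotReady
	case <-timer.C:
		return errWaitTimeout
	}
}

func main() {
	// Repo creation failed: the error is reported immediately.
	failed := make(chan repoPhase, 1)
	failed <- phaseNotReady
	fmt.Println("failed repo:", waitForRepo(failed, shortWait))

	// Nothing is ever reported: the wait is bounded by the timeout.
	silent := make(chan repoPhase)
	fmt.Println("silent repo:", waitForRepo(silent, shortWait))
}
```

Returning distinct errors for "became NotReady" and "timed out" lets the backup report why it failed rather than a generic timeout.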

I also lowered the resync interval for the restic repo controller, so repos that become "not ready" are retried more often - makes it more likely to get back to a good state after getting into a bad state.

One or two other small fixes in their own commits.

Testing looks good so far, but would like to do some more.

Signed-off-by: Steve Kriss <krisss@vmware.com>

carlisia

Missing changelog but 👍


nrb · 2019-04-17T15:32:09Z

> I also lowered the resync interval for the restic repo controller, so repos that become "not ready" are retried more often - makes it more likely to get back to a good state after getting into a bad state.

Is this a separate state from the one mentioned initially (that would take an hour to retry)?

skriss · 2019-04-17T15:40:29Z

Yeah - existing restic repositories are periodically checked for integrity, and if they fail that for any reason, they'll become NotReady. That's a different case than a new repo that fails to initialize for some reason.
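As a rough illustration of this second case (not the controller's real code), each resync pass can be pictured as re-running an integrity check on every known repo and setting its phase from the result, so a NotReady repo recovers on the next pass once the underlying problem is fixed. The `repo` type, `checkFn`, `resyncOnce`, and `errCorrupt` are hypothetical names; in this PR the pass would run on the controller's resync period, lowered to 5 minutes.

```go
package main

import (
	"errors"
	"fmt"
)

// repoPhase is a hypothetical stand-in for a repository's status phase.
type repoPhase string

const (
	phaseReady    repoPhase = "Ready"
	phaseNotReady repoPhase = "NotReady"
)

// repo is a simplified, hypothetical view of an existing restic repository.
type repo struct {
	name  string
	phase repoPhase
}

// checkFn is an assumed integrity check; it returns an error when the
// repository fails the check for any reason.
type checkFn func(name string) error

// errCorrupt is a hypothetical error used by the demo check.
var errCorrupt = errors.New("integrity check failed")

// resyncOnce runs one resync pass: repos that fail the check become
// NotReady, and repos that pass (including previously NotReady ones)
// become Ready again.
func resyncOnce(repos []*repo, check checkFn) {
	for _, r := range repos {
		if err := check(r.name); err != nil {
			r.phase = phaseNotReady
			continue
		}
		r.phase = phaseReady
	}
}

func main() {
	repos := []*repo{{name: "ns-a", phase: phaseReady}, {name: "ns-b", phase: phaseReady}}

	// First pass: ns-b fails its check and becomes NotReady.
	broken := map[string]bool{"ns-b": true}
	check := func(name string) error {
		if broken[name] {
			return errCorrupt
		}
		return nil
	}
	resyncOnce(repos, check)
	fmt.Println(repos[0].name, repos[0].phase, repos[1].name, repos[1].phase)

	// The problem is fixed; the next pass brings ns-b back to Ready.
	delete(broken, "ns-b")
	resyncOnce(repos, check)
	fmt.Println(repos[0].name, repos[0].phase, repos[1].name, repos[1].phase)
}
```

A shorter resync period simply means this pass runs more often, which shortens the time a repo spends NotReady after the problem is resolved.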

skriss added 6 commits April 12, 2019 14:43

restic: change log statement from error to debug

7251c8c

Signed-off-by: Steve Kriss <krisss@vmware.com>

repo ensurer: return error if new repo becomes NotReady

f879670

Signed-off-by: Steve Kriss <krisss@vmware.com>

repo ensurer: wait at most one minute for repo to become ready

8d61cb0

Signed-off-by: Steve Kriss <krisss@vmware.com>

repo ensurer: rename readyChans to repoChans

0328a70

Signed-off-by: Steve Kriss <krisss@vmware.com>

restic repo controller: lower resync period to 5min

5d06bd4

Signed-off-by: Steve Kriss <krisss@vmware.com>

repo ensurer: fix channel lifecycles

44acdcb

Signed-off-by: Steve Kriss <krisss@vmware.com>

skriss changed the title ~~Restic repo fixes~~ Restic repository management fixes Apr 12, 2019

skriss requested review from nrb and carlisia April 12, 2019 22:38

carlisia approved these changes Apr 17, 2019


changelog

c475108

Signed-off-by: Steve Kriss <krisss@vmware.com>

nrb merged commit 0750b2c into vmware-tanzu:master Apr 17, 2019

skriss deleted the restic-repo-fixes branch April 17, 2019 16:40
