-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Update Syncing condition to include all errors #260
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
9124c33
to
f974e29
Compare
f974e29
to
b689d33
Compare
d32e568
to
2baf2c5
Compare
dac0ce9
to
3c50011
Compare
/retest |
64d517f
to
52bc32c
Compare
/retest |
Extracted the duration period cleanup: #273 |
Extracted flakey bug fix: #274 |
52bc32c
to
705661c
Compare
Extracted periodic status updates: #279 |
Extracted part of the logic to combine sync errors: #281 The logic to preserve the non-applier errors from the updater will be done in a subsequent PR. |
Extracted thread-safety improvements to #301 |
705661c
to
e851d04
Compare
- Update the Syncing condition to reflect errors from all phases with the current commit. This prevents hiding/dropping errors, especially non-blocking errors. - Add/clarify some comments - Set ErrorSummary and ErrorSources to nil when empty (no errors) Change-Id: Ica2dbae2b196a0fa0e2a4fbdc21d1775a27f4b0d
e851d04
to
e7a4194
Compare
Extracted the hydration e2e changes: #310 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really like the refactoring. Left a few questions/comments but overall LGTM
pkg/parse/opts.go
Outdated
@@ -96,11 +89,3 @@ func (o *opts) k8sClient() client.Client { | |||
func (o *opts) discoveryClient() discovery.ServerResourcer { | |||
return o.discoveryInterface | |||
} | |||
|
|||
func (o *opts) SetReconciling(value bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙏
@@ -41,7 +41,9 @@ type Resources struct { | |||
objectSet map[core.ID]*unstructured.Unstructured | |||
} | |||
|
|||
// Update performs an atomic update on the resource declaration set. | |||
// Update performs an atomic update on the resource declaration set, converts | |||
// the objects to Unstructured and validating that not all namespaces are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and validating
-> and validates
@@ -429,35 +446,62 @@ func setSyncStatusFields(syncStatus *v1beta1.Status, newStatus syncStatus, denom | |||
} | |||
|
|||
func setSyncStatusErrors(syncStatus *v1beta1.Status, cse []v1beta1.ConfigSyncError, denominator int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice if this was usable for RenderingStatus
and SourceStatus
to minimize duplication
@@ -142,66 +142,74 @@ func Run(ctx context.Context, p Parser) { | |||
} | |||
} | |||
|
|||
// run executes the parts of the pipeline managed by the reconciler. | |||
// | |||
// Phases: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this! Would you mind adding a bit more context for each phase?
} | ||
if state.needToSetSourceStatus(newSourceStatus) { | ||
if err := p.setSourceStatus(ctx, newSourceStatus); err != nil { | ||
parseStatus := sourceStatus{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm a bit confused as to what a sourceStatus is supposed to represent. Looks like here it's related to issues with parsing (i.e. marshalling?) but above it's related to issues reading some config files. Is it supposed to represent any error with reading from the filesystem at any stage of the pipeline?
reposync.SetSyncing(&rs, continueSyncing, "Source", "Source", newStatus.commit, errorSource, rs.Status.Source.ErrorSummary, newStatus.lastUpdate) | ||
// Syncing is stopped/prevented if there are any errors from the same commit. | ||
// TODO: Handle non-blocking errors | ||
syncing := (syncErrorCount == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, syncing
only depends on the source error. Now, it is changed to include all errors, including the former render or sync errors with the same commit.
It affects how the notification sends out alerts.
The current proposal is to leverage the Syncing condition to determine the sync status and send out alerts.
If the Syncing condition is false
, it indicates the current apply loop is done. If there're no errors, then we send out success
notification. Otherwise, it is failed
.
In this case, if there're errors from the same commit, even if it might be fixed with the new loop, there will be a failure notification.
We can include previous errors with the same commit in the ErrorSummary to surface the errors earlier, but can we keep syncing
only depend on the errors from the current phase (source, render, or sync)?
@sdowell FYI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can include previous errors with the same commit in the ErrorSummary to surface the errors earlier, but can we keep syncing only depend on the errors from the current phase (source, render, or sync)?
It's not super clear what the Syncing condition was/is supposed to represent.
Internally, the phases are effectively Fetch/Read -> Render/Read -> Parse -> Update (Validate/Apply/Watch/Reconcile).
It's confusing to have the status.sync
field represent just the errors from the Updater (which wraps the Applier) and then have the Syncing conditions which is set to True by the previous source and rendering phases.
This PR effectively assumes that since the source and rendering phases were setting the Syncing condition to True that the Syncing condition must include the entire pipeline, and not just the Update/Apply phase.
If that is NOT what we want, then we need to document the proposed change and implement it consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it's worth noting that the inputs to a sync/apply are not just Git. They also can include Helm values in the spec and other config values, like Git dir, URL, cluster name, etc. So it doesn't make sense to have a unique pipeline execution depend on just a commit. Even just a generation would be technically insufficient, because the previous run may fail, perhaps due to some ephemeral network issue, triggering a retry. Today we don't indicate what generation or attempt/retry the Syncing condition represents, which makes it hard to use for triggering discrete alerts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Syncing condition represents the current/latest fetch -> render -> parse -> update pipeline, while status.source
, status.rendering
, and status.sync
represent the statuses of the last processed commit in each phase.
It is possible that the Syncing condition only includes status of partial pipeline while the loop is still in progress.
For example,
- The first pipeline failed at commit A in the apply phase,
status.sync
will have errors for commit A. - The second pipeline failed to fetch commit A,
status.source
will have errors for commit A, andstatus.sync
still contains the former apply errors. - The current pipeline succeeded in fetching and rendering commit A, source errors from the second pipeline will get cleared from
status.source
. Then theparseAndUpdate
function will callp.setSourceStatus
, which calls thesetSourceStatusWithRetries
function. ThesetSourceStatusWithRetries
function should set Syncing condition totrue
because it will continue applying the parsed objects next. However, this PR sets the syncing condition tofalse
because of the sync errors from the first pipeline.
curious if this is still WIP or if it's even relevant. If it is consider closing the PR and keeping the private branch or converting to draft. |
with the current commit. This prevents hiding/dropping errors,
especially non-blocking errors.