Resource health checking and wait #2718

Open
phillebaba opened this issue Jul 12, 2024 · 1 comment

@phillebaba
Member

Is your feature request related to a problem? Please describe.

Today the main way to health check a deployment is to use the wait feature in an action. This requires users to know and specify the ready status condition for each specific resource they are waiting for. Under the hood, the wait is done by shelling out to kubectl wait. This is overly verbose and should be solved in a simpler way.

Describe the solution you'd like

  • Given a Zarf package with a health condition set.
  • When the package is deployed.
  • Then health checking is run immediately after the deployment and before any actions.

My suggestion is to copy what Flux does, since its approach has worked well over the years. This would add two new fields to the components in a Zarf package: wait and healthChecks.

When wait is true, we would wait for all the resources applied to the cluster. This is the preferred method, and the one we would want users to try first.

kind: ZarfPackageConfig
metadata:
  name: argocd
  description: Example showcasing installing ArgoCD
components:
  - name: argocd-helm-chart
    required: true
    charts:
      - name: argo-cd
        version: 5.54.0
        namespace: argocd
        url: https://argoproj.github.io/argo-helm
        releaseName: argocd-baseline
        valuesFiles:
          - baseline/values.yaml
    images:
      - docker.io/library/redis:7.0.15-alpine
      - quay.io/argoproj/argocd:v2.9.6
      # Cosign artifacts for images - argocd - argocd-helm-chart
      - quay.io/argoproj/argocd:sha256-2dafd800fb617ba5b16ae429e388ca140f66f88171463d23d158b372bb2fae08.sig
      - quay.io/argoproj/argocd:sha256-2dafd800fb617ba5b16ae429e388ca140f66f88171463d23d158b372bb2fae08.att
    wait: true

The healthChecks list is an alternative for those who only want to health check specific resources. They may want to do so because not all of their resources are kstatus compatible, or because they are only interested in specific resources.

kind: ZarfPackageConfig
metadata:
  name: argocd
  description: Example showcasing installing ArgoCD
components:
  - name: argocd-helm-chart
    required: true
    charts:
      - name: argo-cd
        version: 5.54.0
        namespace: argocd
        url: https://argoproj.github.io/argo-helm
        releaseName: argocd-baseline
        valuesFiles:
          - baseline/values.yaml
    images:
      - docker.io/library/redis:7.0.15-alpine
      - quay.io/argoproj/argocd:v2.9.6
      # Cosign artifacts for images - argocd - argocd-helm-chart
      - quay.io/argoproj/argocd:sha256-2dafd800fb617ba5b16ae429e388ca140f66f88171463d23d158b372bb2fae08.sig
      - quay.io/argoproj/argocd:sha256-2dafd800fb617ba5b16ae429e388ca140f66f88171463d23d158b372bb2fae08.att
    healthChecks:
      - apiVersion: apps/v1
        kind: Deployment
        name: argocd
        namespace: argocd

In the documentation we would push users to first try using wait because it is much simpler and requires less knowledge of the package.
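How the two fields could interact can be sketched in Go. This is only an illustration of the proposal, not an agreed schema: the Component and ObjectRef types and the objectsToCheck helper are hypothetical. The precedence rule (wait wins over healthChecks when both are set) is an assumption borrowed from how Flux treats the same pair of fields.

```go
package main

import "fmt"

// ObjectRef identifies a single resource to health check.
// Hypothetical type mirroring the healthChecks entries in the YAML above.
type ObjectRef struct {
	APIVersion string
	Kind       string
	Name       string
	Namespace  string
}

// Component holds only the two proposed fields plus a name.
// Hypothetical, not the real Zarf component type.
type Component struct {
	Name         string
	Wait         bool
	HealthChecks []ObjectRef
}

// objectsToCheck decides which resources get health checked.
// Assumption: when Wait is true, every applied resource is checked and
// HealthChecks is ignored; otherwise only the explicit list is checked.
func objectsToCheck(c Component, applied []ObjectRef) []ObjectRef {
	if c.Wait {
		return applied
	}
	return c.HealthChecks
}

func main() {
	// Assumed resources; the real names depend on the chart.
	applied := []ObjectRef{
		{APIVersion: "apps/v1", Kind: "Deployment", Name: "argocd", Namespace: "argocd"},
		{APIVersion: "v1", Kind: "Service", Name: "argocd", Namespace: "argocd"},
	}

	withWait := Component{Name: "argocd-helm-chart", Wait: true}
	explicit := Component{Name: "argocd-helm-chart", HealthChecks: applied[:1]}

	fmt.Println("wait: true checks", len(objectsToCheck(withWait, applied)), "resources")
	fmt.Println("healthChecks checks", len(objectsToCheck(explicit, applied)), "resources")
}
```

Keeping the decision in one place like this is also what would let the underlying health checking method change later without touching the user-facing configuration.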

Describe alternatives you've considered

There are not really many alternatives to kstatus out there. This solution would allow us to switch to another health checking method in the future, if one appears, without users having to change their configuration.

Additional context

Relates to #2203

@phillebaba phillebaba added the enhancement ✨ New feature or request label Jul 12, 2024
@phillebaba phillebaba self-assigned this Jul 12, 2024
@phillebaba phillebaba added this to the v0.37.0 milestone Jul 12, 2024
@salaxander salaxander modified the milestones: v0.37.0, v0.38.0 Aug 8, 2024
@AustinAbro321
Contributor

One way to find out which objects were deployed, for the wait key, is to use the release object that Helm returns from an install or upgrade. It is currently ignored (note the discarded first return value in the code below), and it should give us the runtime objects for everything deployed.

if errors.Is(histErr, driver.ErrReleaseNotFound) {
	// No prior release, try to install it.
	spinner.Updatef("Attempting chart installation")
	// The returned release is discarded here.
	_, err = h.installChart(postRender)
} else if histErr == nil && len(releases) > 0 {
	// Otherwise, there is a prior release so upgrade it.
	spinner.Updatef("Attempting chart upgrade")
	lastRelease := releases[len(releases)-1]
	// The returned release is discarded here as well.
	_, err = h.upgradeChart(lastRelease, postRender)
}

@salaxander salaxander modified the milestones: v0.38.0, v0.39.0 Aug 15, 2024