Application dependencies #7437

Open
jessesuen opened this issue Oct 14, 2021 · 64 comments · May be fixed by #15280

Comments

@jessesuen
Member

jessesuen commented Oct 14, 2021

Summary

I was speaking with @JasonMorgan from Buoyant today about a missing feature in Argo CD for blocking application syncs based on required dependencies on other applications. The use case is:

  1. I need to deploy apps A and B
  2. B must not be deployed before A (because A has a mutating webhook which must be in place before B starts)
  3. I want to sync them all at the same time and don't want to think about clicking sync in some correct order

This is especially important for the bootstrapping use case where you're recreating a cluster from git, and you need to create many apps after a bunch of system-level add-ons are fully available. e.g. linkerd must be in place before any applications come up, because linkerd's mutating webhook needs to inject sidecars into application pods starting up.

The use case is very compelling and I'm convinced we should prioritize this. I think this feature, combined with ApplicationSets will really start to complete our bootstrapping story.

Motivation

Please give examples of your use case, e.g. when would you use this.

During cluster bootstrapping, cluster addons (especially ones with mutating webhooks) need to be in place before application pods can come up.

Proposal

How do you think this should be implemented?

It turns out @jannfis already started some work on this, and the spec changes are close to what we need: #3892

Given the age of the original PR, I'm filing an issue in case we abandon #3892 for a new attempt, and tentatively targeting it for the next milestone in case someone wants to pick this up.

@jessesuen jessesuen added the enhancement New feature or request label Oct 14, 2021
@jessesuen jessesuen added this to the v2.3 milestone Oct 14, 2021
@jannfis
Member

jannfis commented Oct 30, 2021

I'm glad to see this gaining traction again. From previous discussions, we thought that the sync retry feature would solve this problem in a more declarative way (e.g. reconcile as long as necessary, hoping for dependencies to have finished reconciling in a certain time frame).

I think we could build upon the existing PoC code, however I think we should consider some more things than are currently implemented in the PoC:

  • Application dependency specification should allow for label selector as well as single, named Applications
  • Ability to optionally restrict dependencies to the same destination cluster
  • Force sync should override/ignore any unmet dependencies when syncing manually
  • Dependencies should be visualized in the UI, similar to how we visualize ownerReferences

And probably some more things I have somewhere in the back of my mind from when I came up with the PoC.
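
A minimal sketch of how the first three wishlist items could fit together, assuming a simplified in-memory model. The `DependencyRule` fields (`names`, `match_labels`, `same_destination_only`) are hypothetical names chosen for illustration and are not taken from the PoC or any Argo CD spec:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class App:
    """Simplified stand-in for an Application (not the real CRD)."""
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    destination: str = "https://kubernetes.default.svc"
    health: str = "Missing"


@dataclass
class DependencyRule:
    """Hypothetical dependency spec: named apps and/or a label selector."""
    names: List[str] = field(default_factory=list)
    match_labels: Dict[str, str] = field(default_factory=dict)
    same_destination_only: bool = False


def select_dependencies(app: App, rule: DependencyRule, all_apps: List[App]) -> List[App]:
    """Return the apps that `app` depends on according to `rule`."""
    selected = []
    for other in all_apps:
        if other.name == app.name:
            continue  # an app never depends on itself
        by_name = other.name in rule.names
        by_label = bool(rule.match_labels) and all(
            other.labels.get(key) == value for key, value in rule.match_labels.items()
        )
        if not (by_name or by_label):
            continue
        if rule.same_destination_only and other.destination != app.destination:
            continue  # optionally ignore apps on other clusters
        selected.append(other)
    return selected


def may_sync(app: App, rule: DependencyRule, all_apps: List[App], force: bool = False) -> bool:
    """A forced sync ignores unmet dependencies; otherwise all must be Healthy."""
    if force:
        return True
    return all(dep.health == "Healthy" for dep in select_dependencies(app, rule, all_apps))
```

Treating named apps and the label selector as a union is a design assumption here; an intersection would be just as easy to express.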

@jessesuen
Member Author

I'm glad to see this gaining traction again. From previous discussions, we thought that the sync retry feature would solve this problem in a more declarative way (e.g. reconcile as long as necessary, hoping for dependencies to have finished reconciling in a certain time frame).

Yes, what I now realize is that retries don't help because in the problematic scenario (mutating webhooks), nothing actually "fails" per se and so there is nothing to retry. The dependent application silently succeeds even though it didn't get injected properly.

I think we could build upon the existing PoC code, however, I think we should consider some more things than are currently implemented in the PoC:

I love your ideas on making this even more powerful with labels and force sync. But for MVP, we can keep this quite simple, not very far removed from your PoC. The way I think this feature should work is:

  1. Application B depends on A. Both applications are created, but neither is deployed (have a Missing health status).
  2. User clicks sync on B
  3. B now has an operation in a Running state (because we don't have a Pending state), but stays in Running indefinitely because A is not healthy (NOTE: we would also keep it in Running if A did not exist).
  4. User eventually clicks on sync on A
  5. As soon as A is Healthy, B would actually go through with the operation.
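
The five steps above can be sketched as a small gate that is re-evaluated while the operation is Running. This is only an illustration under assumed, simplified types (`App`, `depends_on` and the message strings are hypothetical); it is not the PoC's code:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class App:
    """Hypothetical, simplified view of an Application (not the real CRD)."""
    name: str
    health: str = "Missing"  # e.g. "Missing", "Progressing", "Healthy"
    depends_on: List[str] = field(default_factory=list)


def blocking_dependencies(app: App, apps: Dict[str, App]) -> List[str]:
    """Return names of dependencies that do not exist or are not Healthy."""
    blocked = []
    for dep_name in app.depends_on:
        dep = apps.get(dep_name)
        if dep is None or dep.health != "Healthy":
            blocked.append(dep_name)
    return blocked


def operation_message(app: App, apps: Dict[str, App]) -> str:
    """Describe what a requested sync on `app` would do right now."""
    blocked = blocking_dependencies(app, apps)
    if blocked:
        # The operation stays Running; the message says why nothing happens yet.
        return "Running: waiting for " + ", ".join(blocked)
    return "Running: syncing"


if __name__ == "__main__":
    apps = {"A": App("A"), "B": App("B", depends_on=["A"])}
    print(operation_message(apps["B"], apps))  # Running: waiting for A
    apps["A"].health = "Healthy"
    print(operation_message(apps["B"], apps))  # Running: syncing
```

A dependency that does not exist is treated the same as an unhealthy one, matching the note in step 3.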

I took a look at your work, and I believe you implemented it just like how I described it.

Dependencies should be visualized in the UI, similar to how we visualize ownerReferences

I think this is more than we need, a simple message in the operation would be sufficient to understand what's going on.

@Lavanya-Anbalagan

This is a blocker for us and forces us to put a lot of effort into coordinating dependent applications. Can we get an update on this?

@flaviomoringa

Have the exact same issue with installing Kyverno and then some policies.
Also referenced here:
#8358
#7978

@hhannani

hhannani commented Mar 28, 2022

Hi team, is there a way to use dependencies between yaml files within the same Application?

@DotNetRockStar

bump; same issues.

@rafilkmp3

bump; same issues.

@christianh814
Member

Just adding my "bump" here. This is mainly because I would also like this with ApplicationSets as I stated in issue #221

@wmgroot
Contributor

wmgroot commented Apr 22, 2022

I've opened a PR showing a possible implementation path (which needs some work).
This is against the old repo, but I'd like to get feedback on the direction before investing more effort into migrating it to this repo.
wmgroot/applicationset#1

If the dependency work is close to completion, I believe it could replace the user defined rollout stages in my PR.

@qxmips

qxmips commented Jun 1, 2022

same here

@nneram

nneram commented Jun 9, 2022

We would love to see this feature as well ! 👍🏻

@alexmt alexmt modified the milestones: v2.4, v2.5 Jun 21, 2022
@rumstead
Member

rumstead commented Jun 22, 2022

Adding my "bump".

EDIT:
Use cases:

  1. Namespaces/Namespace quotas (cluster bootstrap)
  2. Vault (mutating webhook)
  3. Service mesh (Consul with a mutating webhook)
  4. Capsule (multi-tenancy enabler)
  5. Business applications

@chenele

chenele commented Jun 28, 2022

Adding my bump

@crenshaw-dev
Member

Thanks for the +1s! If you leave a comment, please add info about your use case so it can be considered when writing the feature. Otherwise adding a thumbs-up to the issue is sufficient to move it up the priorities list. :-)

@imusmanmalik

imusmanmalik commented Jul 7, 2022

+1 would love to see this feature as well

Also have this requirement of Apps based on Apps and so on... same use-case Application B depends on A.

@dgsardina

+1

My use case is cluster bootstrap: we have istiod and istio-ingressgateway deployed as independent applications, but the latter fails to sync because the former's mutating webhook was not ready when it was deployed.

@RobCannon

My use cases are:
I have an Application that references a folder-based chart that has our Certificate declarations. That Application will fail unless the Application that installs the cert-manager helm chart has succeeded (even if I install the CRDs first). I would also like to make the Applications that deploy our app services dependent on the certificates Application.

I can use sync waves and App of App hierarchies to get everything to deploy in the right order when I bootstrap a cluster, but just having a property on the Application that says it is dependent on one or more other Applications seems MUCH easier to manage. Let ArgoCD figure out the order based on the dependency info!
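
As a rough illustration of "figure out the order based on the dependency info": a sketch that groups applications into batches so that each app only appears after everything it depends on. The app names mirror the example above; the `depends_on` mapping is an assumed input format, not an existing Argo CD field:

```python
from typing import Dict, List, Set


def sync_batches(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group apps into batches; an app's dependencies always sit in earlier batches."""
    remaining = {app: set(deps) for app, deps in depends_on.items()}
    # Apps that only appear as a dependency have no dependencies of their own.
    for deps in depends_on.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    batches: List[List[str]] = []
    done: Set[str] = set()
    while remaining:
        # Everything whose dependencies are all satisfied can go in parallel.
        ready = sorted(app for app, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle among: " + ", ".join(sorted(remaining)))
        batches.append(ready)
        done.update(ready)
        for app in ready:
            del remaining[app]
    return batches


if __name__ == "__main__":
    graph = {
        "certificates": {"cert-manager"},
        "app-services": {"certificates"},
    }
    print(sync_batches(graph))
    # [['cert-manager'], ['certificates'], ['app-services']]
```

Apps within one batch have no ordering constraint between them, so they could be synced at the same time.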

@RobCannon

RobCannon commented Jul 23, 2022

It looks like this is being tracked on the roadmap in this issue. Please go upvote!
#3517

@day0hero

I would really like to see this feature added! We are using jobs with sync-waves/hooks to get this functionality. While it works, it can be cumbersome to implement/debug especially when you're putting these hooks in across 10+ applications. Having the ability to clearly define the dependencies between the applications would be awesome!

Just as an example of our deployment scenario (there are other components to this, but the flow is similar):

  1. deploy cloud storage (openshift data foundation)
  2. kubernetes job that waits for the storage to become available
  3. deploy dependent resources (quay, objectstoreuser (for s3 integration))
  4. kubernetes job that waits for the user and secret to get auto-generated
  5. deploy remaining applications
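
The "job that waits" steps in a flow like this usually boil down to a poll-with-timeout loop. A minimal sketch, assuming a caller-supplied `check` function (hypothetical; in the scenario above it would ask the cluster whether the storage or the generated secret is available):

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it returns True or `timeout_s` has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False  # the caller decides whether to fail the job
        sleep(interval_s)
```

`sleep` and `clock` are parameters only so the loop can be exercised without real waiting; a wait job would exit non-zero when this returns `False`.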

@jaxels10

We need this for deploying certain applications before others, such as kyverno with kyverno policies, but also having Ceph fully reconciled before letting applications use its storage classes. This is the number one missing feature keeping us from using Argo and instead using Flux. If this was implemented I am sure we would make the switch.

@sambonbonne

@jaxels10 maybe you already considered this option, but why not use sync waves if you "just" want to be sure of the apply order?

See the documentation for more information.

If your applications are centralized in one repository, with the apps of apps pattern, you can use sync waves to ensure apply order.

@fvogl

fvogl commented Aug 9, 2023

@sambonbonne unfortunately sync-waves don't work for app-of-apps in case of updates. The sync order is working for the initial deployment of the apps and also while deleting them (Argo takes them out in the descending order). For updates though the order is random and basically most of the changes are applied at the same time. I would love to see the sync-waves working.
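
To make the ordering concrete, here is a sketch that groups child Application manifests by the `argocd.argoproj.io/sync-wave` annotation, lowest wave first for creation and highest first for deletion, as described above. It assumes manifests are already parsed into dicts and that a missing annotation means wave 0; it does not model updates, which is exactly the case reported as unordered:

```python
from collections import defaultdict
from typing import Dict, List

SYNC_WAVE = "argocd.argoproj.io/sync-wave"


def wave_of(manifest: dict) -> int:
    """Read the sync-wave annotation; treat a missing annotation as wave 0."""
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    return int(annotations.get(SYNC_WAVE, "0"))


def waves(manifests: List[dict], reverse: bool = False) -> List[List[str]]:
    """Return resource names grouped by wave (ascending, or descending if reverse)."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for manifest in manifests:
        grouped[wave_of(manifest)].append(manifest["metadata"]["name"])
    return [sorted(grouped[w]) for w in sorted(grouped, reverse=reverse)]


if __name__ == "__main__":
    children = [
        {"metadata": {"name": "kube-prometheus-stack"}},
        {"metadata": {"name": "kube-prometheus-stack-crds",
                      "annotations": {SYNC_WAVE: "-1"}}},
    ]
    print(waves(children))                # [['kube-prometheus-stack-crds'], ['kube-prometheus-stack']]
    print(waves(children, reverse=True))  # [['kube-prometheus-stack'], ['kube-prometheus-stack-crds']]
```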

@purduemike

the solution here is to make sure all apps are truly independent and will retry themselves until all the definitions they rely on are in memory.

I tend to agree with @shanproofpoint. We should try to make sure apps are independent. My use case is to ensure our DB schema changes are live before starting App B. This can easily be done in code: App B just needs to check the schema version in the DB before turning its health check green.
The problem with sync-waves between apps is: how should an app behave if its updates don't finish before the next sync? I feel like this can get really complex really quickly. So, shooting for app independence is key.
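
A sketch of what such an in-app check might look like, assuming a readiness endpoint that returns an HTTP-style status and a hypothetical `read_schema_version` callable that queries the database (both names are illustrative):

```python
from typing import Callable, Tuple


def readiness(
    required_version: int,
    read_schema_version: Callable[[], int],
) -> Tuple[int, str]:
    """Return a (status_code, body) pair for a readiness endpoint."""
    try:
        current = read_schema_version()
    except Exception as exc:  # database unreachable, version table missing, ...
        return 503, f"schema version unknown: {exc}"
    if current < required_version:
        # Stay not-ready until the migration owned by the other app has landed.
        return 503, f"waiting for schema {required_version}, found {current}"
    return 200, "ok"
```

With this approach the app can be deployed at any time; it simply reports not-ready until the schema catches up.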

@jannfis jannfis linked a pull request Aug 29, 2023 that will close this issue
13 tasks
@jannfis
Member

jannfis commented Aug 29, 2023

I took a new stab at implementing this. I deviated a little from the previous approach, but I think it's pretty usable already: #15280

@leoluz
Collaborator

leoluz commented Sep 25, 2023

This is a highly voted proposal and while I think the main use-case (mutation webhook) makes some sense, I am also concerned about how this feature could be promoting anti-patterns when it comes to micro-service designs.

The first example that comes to mind is the distributed monolith. Ideally (in the perfect world :) ) an application should be resilient enough to allow it to be deployed even if its dependencies aren't satisfied. A simple example is one service that depends on Prometheus infra to expose metrics. It doesn't really matter if Prometheus is available on the cluster or not. The core functionality of this service should still be available and once Prometheus infra is up it will start scraping metrics without requiring the application to restart. If someone configures this service in Argo CD with a dependency on Prometheus, it will block new syncs if Prometheus is unavailable (maybe even if it is Degraded?) while it shouldn't. This is a very simplistic example, but I am pretty sure there are many more ways this feature could be misused, which would make support much harder for Argo CD admins.

If the dependency graph is complex with many apps and levels involved, how users would be able to visualize the dependency tree to understand what is causing their application to remain out-of-sync?
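
To illustrate the kind of answer a user would need here, a sketch that walks a dependency graph and reports the deepest unhealthy dependencies, i.e. the ones actually holding things up. The `depends_on` and `health` mappings are assumed inputs for illustration, not an Argo CD API:

```python
from typing import Dict, List, Set


def root_causes(app: str, depends_on: Dict[str, List[str]], health: Dict[str, str]) -> List[str]:
    """Return unhealthy dependencies of `app` that have no unhealthy dependencies themselves."""
    causes: Set[str] = set()
    seen: Set[str] = set()
    stack = list(depends_on.get(app, ()))
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # also guards against cycles
        seen.add(name)
        if health.get(name) == "Healthy":
            continue  # a healthy dependency blocks nothing
        unhealthy_children = [
            dep for dep in depends_on.get(name, ()) if health.get(dep) != "Healthy"
        ]
        if unhealthy_children:
            stack.extend(unhealthy_children)  # keep digging towards the root
        else:
            causes.add(name)
    return sorted(causes)


if __name__ == "__main__":
    depends_on = {"shop": ["api"], "api": ["db", "mesh"]}
    health = {"api": "Missing", "db": "Healthy", "mesh": "Progressing"}
    print(root_causes("shop", depends_on, health))  # ['mesh']
```

An app missing from `health` is treated as unhealthy, so a dependency that was never created shows up as a root cause too.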

@jessesuen @jannfis

@jannfis
Member

jannfis commented Sep 25, 2023

If the dependency graph is complex with many apps and levels involved, how users would be able to visualize the dependency tree to understand what is causing their application to remain out-of-sync?

In the most recent incarnation, if the sync is blocked by a dependency's state, it will be noted in the Application's .status field. So far, there are no plans on visualization, but the information is readily available in the Application CRs. The wait state will also be reflected in an Application's conditions, so the information is easily accessible from the UI.

@leoluz
Collaborator

leoluz commented Sep 25, 2023

The wait state will also be reflected in an Application's conditions, so the information is easily accessible from the UI

I am sorry but as far as I know the Application's status fields are not exposed in the UI. Am I wrong? It requires kubectl access in the cluster where the Applications are synced. Anyhow, let's put ourselves in the user's shoes: As a devops, I pushed a change in git and my application remains out of sync. Even if I click the sync button nothing happens. There is no place in Argo CD UI to tell me why the application is not syncing. I have to call support. Argo CD admin must look in the gigantic Application's status field to dig where the error is.

We are having many different support issues where the answer is in the resource's status field but users just don't look at it. The direction that we are going is to surface important status fields data in Argo CD UI to make it more user friendly.

@jannfis
Member

jannfis commented Sep 25, 2023

@leoluz While waiting for any dependencies, it will look in the UI right now as follows:

image

and

image

So no direct cluster access required. Obviously, this information could be surfaced a little better. I'm open to suggestions, but I believe for an MVP, this might be good enough.

@shinebayar-g

shinebayar-g commented Oct 16, 2023

I can use sync waves and App of App hierarchies to get everything to deploy in the right order when I bootstrap a cluster

Excuse me, how do you do this? I am using App of Apps pattern and added argocd.argoproj.io/sync-wave: '-1' to the CRDs application. But kube-prometheus-stack still started syncing before even CRDs are installed.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: '-1'
  name: kube-prometheus-stack-crds
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io

Edit: Found this really nice blog post that explains it. https://codefresh.io/blog/argo-cd-application-dependencies/

@aiceball

aiceball commented Nov 1, 2023

@jannfis
am I correct in understanding that your PR: #15280 would function for any application deployment strategies?

i.e. it would cover all of the following cases:

  • manual deployment of multiple apps
  • app of apps
  • applicationsets
  • app of applicationsets

@jannfis
Member

jannfis commented Nov 13, 2023

@aiceball Yes, the dependency mechanism would be rather independent of the pattern you use to create/maintain your applications.

@zs-dima

zs-dima commented Dec 9, 2023

What about dependsOn for ApplicationSet elements?

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: my-applications
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          # Infrastructure
          - name: cert-manager
            path: infrastructure/networking/cert-manager
          - name: traefik
            path: infrastructure/networking/traefik
            dependsOn:
              - cert-manager
          - name: rancher
            path: infrastructure/system/rancher
            dependsOn:
              - traefik
          # Apps
          - name: n8n
            path: apps/n8n
            dependsOn:
              - traefik
  template:
    metadata:
      name: '{{name}}'
    spec:
      project: default
      source:
        repoURL: 'https://github.com/${GITHUB_USER}/${GITHUB_REPO}.git'
        targetRevision: HEAD
        path: '{{path}}'
      destination:
        server: 'https://kubernetes.default.svc'
        namespace: '{{name}}-system'

FluxCD has dependencies:
https://fluxcd.io/flux/components/kustomize/kustomizations/#dependencies

Even Docker Compose and Docker Swarm have depends_on:
https://docs.docker.com/compose/compose-file/compose-file-v3/#depends_on
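
For illustration, the ordering implied by the `dependsOn` entries in the list above can be derived with Python's standard `graphlib` module (Python 3.9+). This is only a sketch of the idea: `dependsOn` is the field proposed in this comment, not something ApplicationSet supports, and the elements are copied by hand rather than parsed from YAML:

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Names and dependsOn values copied from the proposed generator elements.
ELEMENTS = [
    {"name": "cert-manager"},
    {"name": "traefik", "dependsOn": ["cert-manager"]},
    {"name": "rancher", "dependsOn": ["traefik"]},
    {"name": "n8n", "dependsOn": ["traefik"]},
]


def sync_order(elements: List[dict]) -> List[str]:
    """Return element names so that every element follows its dependencies."""
    names = {element["name"] for element in elements}
    graph: Dict[str, Set[str]] = {}
    for element in elements:
        deps = element.get("dependsOn", [])
        unknown = [dep for dep in deps if dep not in names]
        if unknown:
            raise ValueError(f"{element['name']} depends on unknown element(s): {unknown}")
        graph[element["name"]] = set(deps)
    # static_order() raises graphlib.CycleError (a ValueError) on cycles.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(sync_order(ELEMENTS))
```

For these elements, cert-manager comes first and traefik second; rancher and n8n both follow traefik, in no guaranteed order relative to each other.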

@christianh814
Member

@zs-dima There's already a way to do that with progressive syncs

https://argo-cd.readthedocs.io/en/stable/operator-manual/applicationset/Progressive-Syncs/

kurktchiev added a commit to back-stack/everything-as-code that referenced this issue Mar 11, 2024
Signed-off-by: Boris Kurktchiev <kurktchiev@gmail.com>
@vvatlin

vvatlin commented Apr 15, 2024

It's still impossible to guarantee orders between apps. Sync waves don't work.

@nneram

nneram commented Apr 15, 2024

Hi @vvatlin, I can confirm that it's working, at least in the version I use, v2.8.4. I have an app of apps pattern with 11 applications and still growing, with nearly 7 waves. All you need is here: https://argo-cd.readthedocs.io/en/stable/operator-manual/cluster-bootstrapping/#app-of-apps-pattern. However, you need to add health assessment since v1.8 (#3781). Otherwise, it will not work.

For more information: https://argo-cd.readthedocs.io/en/stable/operator-manual/upgrading/1.7-1.8/.
I think you also have ApplicationSets, but I didn't look in that way.

They are working solutions but it would be easier with dependencies. I agree with that.

@christianh814
Member

It's still impossible to guarantee orders between apps. Sync waves don't work.

hey @vvatlin , I wrote a blog about getting Syncwaves working with App of Apps

@vvatlin

vvatlin commented Apr 16, 2024

I have app of apps and Health assessment also. And my child apps still synchronize randomly. argocd 2.10.7

@chanakya-svt

Hi @vvatlin, I can confirm that it's working, at least in the version I use, v2.8.4. I have an app of apps pattern with 11 applications and still growing, with nearly 7 waves. All you need is here: https://argo-cd.readthedocs.io/en/stable/operator-manual/cluster-bootstrapping/#app-of-apps-pattern. However, you need to add health assessment since v1.8 (#3781). Otherwise, it will not work.

For more information: https://argo-cd.readthedocs.io/en/stable/operator-manual/upgrading/1.7-1.8/. I think you also have ApplicationSets, but I didn't look in that way.

They are working solutions but it would be easier with dependencies. I agree with that.

Hi @vvatlin, with the setup that's working for you, are you using ServerSideApply/ServerSideDiff in the ApplicationSet?
