Default Timeout Settings on the Helm Chart #901

Open
td4b opened this issue May 27, 2023 · 0 comments
td4b commented May 27, 2023

Is your feature request related to a problem? Please describe.
Cluster communication timeouts are too tight by default when bootstrapping the Vault cluster with the Helm chart on Kubernetes, e.g. when running vault operator init.
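
For reference, the call that times out is the standard init against the first server pod; something like the following (the pod name vault-0 and namespace vault are from my setup and may differ):

# run init against the first Vault server pod
kubectl exec -ti vault-0 -n vault -- vault operator init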

Describe the solution you'd like
It seems that upgrading EKS from 1.21 to 1.24 adds additional network latency within the cluster.
https://support.hashicorp.com/hc/en-us/articles/8552873602451-Vault-on-Kubernetes-and-context-deadline-exceeded-errors

What is interesting is that this isn't set by default in the Helm chart, and it probably should be, to account for the increased latency between versions.

Adding the setting below fixed the issue entirely: when Vault is initialized and unsealed, the keys are output to the CLI without hitting the timeout.

set {
  name  = "server.extraEnvironmentVars.VAULT_CLIENT_TIMEOUT"
  value = "300s"
}

I am thinking that increasing the timeout helps account for the added network latency in Kubernetes/EKS.

Full chart settings that worked:

resource "helm_release" "vault" {
  name       = "vault"
  repository = "https://helm.releases.hashicorp.com"
  chart      = "vault"
  namespace  = "vault"

  set {
    name  = "server.ha.enabled"
    value = "true"
  }
  set {
    name  = "server.ha.raft.enabled"
    value = "true"
  }
  set {
    name  = "server.ha.raft.setNodeId"
    value = "true"
  }
  set {
    name = "server.extraEnvironmentVars.VAULT_CLIENT_TIMEOUT"
    value = "300s"
  }
  set {
    name  = "server.ha.raft.config"
    value = <<EOT
    ui = true
    listener "tcp" {
      tls_disable = 1
      address = "[::]:8200"
      cluster_address = "[::]:8201"
    }

    storage "raft" {
      path    = "/vault/data"
    }

    service_registration "kubernetes" {}

EOT
  }
}
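
For anyone not using Terraform, the same override can presumably be passed straight to Helm; a rough equivalent of the settings above would be:

helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault \
  --namespace vault \
  --set server.ha.enabled=true \
  --set server.ha.raft.enabled=true \
  --set server.ha.raft.setNodeId=true \
  --set server.extraEnvironmentVars.VAULT_CLIENT_TIMEOUT=300s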

Thanks.

Describe alternatives you've considered
n/a

Additional context
It took me a while to find the root cause; the Go error message "context deadline exceeded" is what eventually led me to look at ways to increase the timeout value.
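
If anyone wants to double-check that the variable actually reached the pod, or try the longer timeout ad hoc before changing the chart, something like this should work (pod name assumed to be vault-0):

# confirm the env var is set on the running server pod
kubectl exec -ti vault-0 -n vault -- env | grep VAULT_CLIENT_TIMEOUT

# or try a longer client timeout for a single init call without touching the chart
kubectl exec -ti vault-0 -n vault -- sh -c 'VAULT_CLIENT_TIMEOUT=300s vault operator init'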

Note this is just a suggestion or breadcrumb for others.

td4b added the enhancement label May 27, 2023