give-redis-liveness-more-time #853

Merged
merged 1 commit into from
Mar 27, 2024

@aprilrieger aprilrieger commented Mar 27, 2024

Story

ams-prod went down today. When coming back up, Redis kept trying to start but needed more time to complete its initial DB load. After the liveness delay was increased on production, the load took approximately 112 seconds to complete:
DB loaded from append only file: 112.674 seconds

Update the values file with the corresponding subchart values needed to set initialDelay, so that fcrepo has the time it needs to start up (currently approx. 112.xx seconds).

Redis deployment subchart is:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "fcrepo.fullname" . }}
  labels:
    {{- include "fcrepo.labels" . | nindent 4 }}
spec:
  replicas: 1
  selector:
    matchLabels:
      {{- include "fcrepo.selectorLabels" . | nindent 6 }}
  strategy:
    type: Recreate
  template:
    metadata:
    {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
    {{- end }}
      labels:
        {{- include "fcrepo.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      restartPolicy: Always
      serviceAccountName: {{ include "fcrepo.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      volumes:
        - name: data
          {{- if and .Values.storage.enabled .Values.storage.existingClaim }}
          persistentVolumeClaim:
            claimName: {{ .Values.storage.existingClaim }}
          {{- else if .Values.storage.enabled }}
          persistentVolumeClaim:
            claimName: {{ template "fcrepo.fullname" . }}
          {{ else }}
          emptyDir: {}
          {{ end }}
      initContainers:
        - name: "remove-lost-found"
          image: "busybox:1.32.0"
          command:
            - rm
            - -fr
            - /data/lost+found
          volumeMounts:
            - name: data
              mountPath: /data
        - name: "init-db"
          image: {{ .Values.postgresql.image.repository | default "bitnami/postgresql" }}:{{ .Values.postgresql.image.tag | default "latest" }}
          command:
            - sh
            - -c
            - >-
              PGPASSWORD=$DATABASE_PASSWORD PGHOST=$DATABASE_HOST PGUSER=$DATABASE_USER; export PGPASSWORD PGHOST PGUSER;
              psql -tc "SELECT 1 FROM pg_database WHERE datname = '$DATABASE_NAME'" | grep -q 1 ||
              createdb -e -w $DATABASE_NAME
          envFrom:
            - configMapRef:
                name: {{ include "fcrepo.fullname" . }}-env
            - secretRef:
                name: {{ template "fcrepo.fullname" . }}
            {{- if .Values.fcrepoSecretName }}
            - secretRef:
                name: {{ include .Values.fcrepoSecretName . }}
            {{- end }}

      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          envFrom:
            - configMapRef:
                name: {{ include "fcrepo.fullname" . }}-env
            - secretRef:
                name: {{ template "fcrepo.fullname" . }}
            {{- if .Values.fcrepoSecretName }}
            - secretRef:
                name: {{ include .Values.fcrepoSecretName . }}
            {{- end }}
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          volumeMounts:
            - name: data
              mountPath: /data
          livenessProbe:
            httpGet:
              path: /rest
              port: http
            initialDelaySeconds: {{ default 15 .Values.healthCheck.initialDelay }}  # <--- mapped value we are updating in values file
          readinessProbe:
            httpGet:
              path: /rest
              port: http
            initialDelaySeconds: {{ default 15 .Values.healthCheck.initialDelay }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
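
As a rough sketch of the values-file change described above: the template reads `.Values.healthCheck.initialDelay` and falls back to 15 seconds when it is unset, so the parent chart's values file would set that key under the subchart's name. The top-level key `fcrepo` and the value `120` below are assumptions for illustration only. The real key depends on the dependency name or alias declared in Chart.yaml, and the number only needs to exceed the observed load time of roughly 112 seconds.

```yaml
# Parent chart values file (sketch; key name and number are assumptions)
fcrepo:                # assumed subchart name/alias as declared in Chart.yaml
  healthCheck:
    initialDelay: 120  # hypothetical: a little above the ~112 s observed load time
```

Because the template feeds the same value into both livenessProbe and readinessProbe, raising it also delays the first readiness check, so the pod will not be marked Ready before the delay has elapsed.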

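For context on why the default 15-second delay was likely too short: the template sets only initialDelaySeconds, so the other probe fields fall back to the Kubernetes defaults (periodSeconds: 10, timeoutSeconds: 1, failureThreshold: 3). With those defaults the kubelet restarts the container after three consecutive failed liveness checks, which is roughly 35 to 45 seconds after the container starts and well short of a ~112-second startup. The fragment below sketches what the rendered probe might look like with an assumed delay of 120; the commented fields are Kubernetes defaults spelled out for reference, not values the chart sets.

```yaml
# Rendered liveness probe (sketch; 120 is a hypothetical value)
livenessProbe:
  httpGet:
    path: /rest
    port: http
  initialDelaySeconds: 120  # no liveness checks run before this many seconds
  # Kubernetes defaults that apply because the template leaves them unset:
  # periodSeconds: 10       # time between checks
  # timeoutSeconds: 1       # how long each check may take
  # failureThreshold: 3     # consecutive failures before the container is restarted
```
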
@bkiahstroud bkiahstroud merged commit c27625b into develop Mar 27, 2024
6 checks passed
@bkiahstroud bkiahstroud deleted the give-redis-liveness-more-time branch March 27, 2024 22:27