feat: helm option to mount shared_fs checkpoints to master (#8741)
tybritten authored Jan 24, 2024
1 parent 9f06d35 commit 43c074e
Showing 3 changed files with 21 additions and 1 deletion.
9 changes: 9 additions & 0 deletions docs/release-notes/master-mount.rst
@@ -0,0 +1,9 @@
:orphan:

**Improvements**

- Helm: Add support for downloading checkpoints when using ``shared_fs``. A new ``mountToServer``
  value under ``checkpointStorage`` defaults to ``false``, which preserves the current behavior.
  When it is set to ``true`` and the storage type is ``shared_fs``, the chart mounts the host path
  on the server (master) pod, so that ``checkpoint.download()`` works with ``shared_fs`` in
  Determined ``0.27.0`` and later.
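
As an illustration of the release note above, a minimal values override that turns the option on might look like the following sketch. The file name is hypothetical, and `/checkpoints` is just the chart's default `hostPath` shown in `values.yaml`; substitute the directory your cluster nodes actually share.

```yaml
# values-override.yaml (hypothetical file name, not part of this commit)
checkpointStorage:
  # The master mount is only rendered when the type is shared_fs.
  type: shared_fs
  # Host directory that holds checkpoints; assumed to exist on the node.
  hostPath: /checkpoints
  # Opt in to mounting the checkpoint directory on the master pod.
  mountToServer: true
```

A file like this would typically be passed to Helm with the `-f` flag when installing or upgrading the release. Leaving `mountToServer` at its default of `false` keeps the previous behavior.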
10 changes: 10 additions & 0 deletions helm/charts/determined/templates/master-deployment.yaml
@@ -51,6 +51,10 @@ spec:
- name: master-config
mountPath: /etc/determined/
readOnly: true
{{- if and (.Values.checkpointStorage.mountToServer) (eq .Values.checkpointStorage.type "shared_fs") }}
- name: checkpoint-storage
mountPath: /determined_shared_fs
{{ end }}
{{- if .Values.tlsSecret }}
- name: tls-secret
mountPath: {{ include "determined.secretPath" . }}
@@ -113,6 +117,12 @@ spec:
- name: master-config
secret:
secretName: determined-master-config-{{ .Release.Name }}
{{- if and (.Values.checkpointStorage.mountToServer) ( eq .Values.checkpointStorage.type "shared_fs") }}
- name: checkpoint-storage
hostPath:
path: {{ .Values.checkpointStorage.hostPath }}
type: Directory
{{ end }}
{{- if .Values.tlsSecret }}
- name: tls-secret
secret:
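
To make the effect of the two template hunks concrete, here is a rough sketch of the fragments they would contribute to the rendered master Deployment when `mountToServer` is `true`, the type is `shared_fs`, and `hostPath` is `/checkpoints`. The surrounding keys, the indentation, and the `<release-name>` placeholder are assumptions for illustration, not actual `helm template` output.

```yaml
# Sketch only: the container-level and pod-level fragments are shown side by side.
volumeMounts:
  - name: master-config
    mountPath: /etc/determined/
    readOnly: true
  # Added by the first hunk: the master sees checkpoints under this path.
  - name: checkpoint-storage
    mountPath: /determined_shared_fs
volumes:
  - name: master-config
    secret:
      secretName: determined-master-config-<release-name>
  # Added by the second hunk: backed by the configured host directory.
  - name: checkpoint-storage
    hostPath:
      path: /checkpoints
      type: Directory
```

Because the volume uses `type: Directory`, Kubernetes expects the path to already exist on the node where the master pod is scheduled; the pod will not start there otherwise.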
3 changes: 2 additions & 1 deletion helm/charts/determined/values.yaml
@@ -196,7 +196,8 @@ checkpointStorage:
# system.
type: shared_fs
hostPath: /checkpoints

# By default, shared_fs is not mounted to the server pod. Change this to true to enable checkpoint downloads from the server.
mountToServer: false
# For storing in GCS.
# type: gcs
# bucket: <bucket_name>