Skip to content

Commit

Permalink
fix: allow experiments to configure k8s sidecars (#8854)
Browse files Browse the repository at this point in the history
This turns out to be my regression from Jan 2021 when expconf first
landed.

Also, we might as well resurrect the detailed explanations that used to
be in extensions.py, and which were still cited by the go files, even
though the python code was deleted a long time ago.
  • Loading branch information
rb-determined-ai committed Feb 20, 2024
1 parent d07ec40 commit ac8c440
Show file tree
Hide file tree
Showing 11 changed files with 220 additions and 42 deletions.
16 changes: 15 additions & 1 deletion master/pkg/schemas/extensions/checks.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,18 @@
// See determined/common/schemas/extensions.py for the explanation of this and other extensions.
// checks is a simple extension that returns a specific error if a subschema fails to match.
//
// The keys of the "checks" dictionary are the user-facing messages, and the values are the
// subschemas that must match.
//
// Example:
//
// "checks": {
// "you must specify an entrypoint that references the trial class":{
// ... (schema which allows Native API or requires that entrypoint is set) ...
// },
// "you requested a bayesian search but hyperband is way better": {
// ... (schema which checks if you try searcher.name=baysian) ...
// }
// }

// This file is a tutorial for implementing extensions for the santhosh-tekuri/jsonschema package.
//
Expand Down
12 changes: 11 additions & 1 deletion master/pkg/schemas/extensions/compare_properties.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,14 @@
// See determined/common/schemas/extensions.py for the explanation of this and other extensions.
// compareProperties allows a schema to compare values in the instance against each other.
// Amazingly, json-schema does not have a built-in way to do this.
//
// Example: ensuring that hyperparmeter minval is less than maxval:
//
// "compareProperties": {
// "type": "a<b",
// "a": "minval",
// "b": "maxval"
// }

// See ./checks.go for notes on implementing extensions for the santhosh-tekuri/jsonschema package.

package extensions
Expand Down
14 changes: 13 additions & 1 deletion master/pkg/schemas/extensions/disallow_properties.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,16 @@
// See determined/common/schemas/extensions.py for the explanation of this and other extensions.
// disallowProperties is for restricting which properties are allowed in an object with
// per-property, such as when we allow a k8s pod spec with some fields disallowed.
//
// Example: The "pod_spec" property of the environment config:
//
// "pod_spec": {
// "type": "object",
// "disallowProperties": {
// "name": "pod Name is not a configurable option",
// "name_space": "pod NameSpace is not a configurable option"
// }
// }

// See ./checks.go for notes on implementing extensions for the santhosh-tekuri/jsonschema package.

package extensions
Expand Down
30 changes: 29 additions & 1 deletion master/pkg/schemas/extensions/eventually.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,32 @@
// See determined/common/schemas/extensions.py for the explanation of this and other extensions.
// eventually allows for two-step validation, by only enforcing the specified subschemas
// during the completeness validation phase. This is a requirement specific to Determined.
//
// One use case is when it is necessary to enforce a `oneOf` on two fields that are
// `eventuallyRequired`. If the `oneOf` is evaluated during the sanity validation phase, it will
// always fail, if for example, the user is using cluster default values, but if validation
// for this subschema is held off until completeness validation, it will validate correctly.
//
// Example: eventually require one of connection string and account url to be specified:
//
// "eventually": {
// "checks": {
// "Exactly one of connection_string or account_url must be set": {
// "oneOf": [
// {
// "eventuallyRequired": [
// "connection_string"
// ]
// },
// {
// "eventuallyRequired": [
// "account_url"
// ]
// }
// ]
// }
// }
// }

// See ./checks.go for notes on implementing extensions for the santhosh-tekuri/jsonschema package.

package extensions
Expand Down
9 changes: 8 additions & 1 deletion master/pkg/schemas/extensions/eventually_required.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,11 @@
// See determined/common/schemas/extensions.py for the explanation of this and other extensions.
// eventuallyRequred allows for two-step validation. This is a requirement specific to Determined
// because there are fields required (checkpoint_storage) but which may not be present in what the
// user actually submits, since a cluster default may be present.
//
// eventuallyRequired behaves identically to required, only when building the validator, it is
// possible to not include the eventuallyRequired extension; making it possible to *not* enforce
// eventuallyRequired at specific times.

// See ./checks.go for notes on implementing extensions for the santhosh-tekuri/jsonschema package.

package extensions
Expand Down
17 changes: 16 additions & 1 deletion master/pkg/schemas/extensions/optional_ref.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,19 @@
// See determined/common/schemas/extensions.py for the explanation of this and other extensions.
// optionalRef behaves like $ref, except that it also allows the value to be null.
//
// This is logically equivalent to an anyOf with a {"type": "null"} element, but it has better
// error messages.
//
// Example: The "internal" property of the experiment config may be a literal null:
//
// "internal": {
// "type": [
// "object",
// "null"
// ],
// "optionalRef": "http://determined.ai/schemas/expconf/v0/internal.json",
// "default": null
// }

// See ./checks.go for notes on implementing extensions for the santhosh-tekuri/jsonschema package.

package extensions
Expand Down
47 changes: 46 additions & 1 deletion master/pkg/schemas/extensions/union.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,45 @@
// See determined/common/schemas/extensions.py for the explanation of this and other extensions.
// union is for custom error messages with union types. The built-in oneOf keyword has the same
// validation behavior but awful error handling. If you had the following invalid hyperparameter:
//
// hyperparameters:
// - learning_rate:
// type: double
// min: 0.001
// max: 0.005
//
// would you return an error saying:
//
// "your double hparam has invalid fields 'min' and 'max' but needs 'minval' and 'maxval'",
//
// or would you say:
//
// "your int hparam has type=double but needs type=int and 'minval' and 'maxval'"?
//
// Obviously you want the first option, because we treat the "type" key as special, and we can
// uniquely identify which subschema should match against the provided data based on the "type".
//
// The union extension provides this exact behavior.
//
// Example: The "additionalProperties" schema for the hyperparameters dict:
//
// "union": {
// "items": [
// {
// "unionKey": "const:type=int",
// "$ref": ...
// },
// {
// "unionKey": "const:type=double",
// "$ref": ...
// },
// ...
// ]
// }
//
// When the oneOf validation logic is not met, the error chosen is based on the first unionKey to
// evaluate to true. In this case, the "const:" means a certain key ("type") must match a certain
// value ("int" or "double") for that subschema's error message to be chosen.

// See ./checks.go for notes on implementing extensions for the santhosh-tekuri/jsonschema package.

package extensions
Expand Down Expand Up @@ -134,6 +175,10 @@ func UnionExtension() jsonschema.Extension {
}
}

// evaluateUnionKey: unionKey is part of the union extension. It allows for concisely describing
// when an instance of data "should" match a given portion of a subschema of a union type, even when
// it doesn't fully match. unionKey allows us to select the correct error message to show to the
// user from the union type.
func evaluateUnionKey(key JSON, instance JSON) bool {
switch tKey := key.(type) {
case string:
Expand Down
40 changes: 25 additions & 15 deletions master/pkg/schemas/zgen_schemas.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 7 additions & 5 deletions schemas/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
- See `schemas/expconf` for logics that are shared across languages.
- See `schemas/test_cases` for test cases that are shared across languages.
- See `master/pkg/schemas/expconf` for go struct definitions.
- See `harness/determined/common/schemas/expconf` for python class definitions.

- We generate code that contains the definitions and utility functions of
structs. See `schemas/gen.py`.
Expand All @@ -24,7 +23,7 @@

- Null handling:
- Anything with a null value is treated as not present.
- Reason: there is no pythonic or golangic way to represent values which were
- Reason: there is no golangic way to represent values which were
provided in the configuration as literal nulls, rather than values which
were not provided at all. In theory, you could have singleton
"NotProvided" pointers, which you would check for every time that you
Expand All @@ -48,8 +47,12 @@
- `union`: Excellent error messages when validating union types
- `optionalRef`: like `$ref`, but only enforced for non-null values
- `eventually`: Defer validation of inner clause till completeness validation phase
- The canonical implementations (with thorough comments) may be found in
`harness/determined/common/schemas/extensions.py`.
- The implementations of the extensions (with thorough comments) may be found
in `master/pkg/schemas/extensions/*.go`.
- The old python implementations were easier to read and understand, but
they were not ever used and so were removed in 5d9266a0. Looking at
`harness/determined/common/schemas/extensions.py` in `5d9266a0~` may be
instructive.

- Migration and Versioning:
- Migration logics are implemented in the master. See `pkg/schemas/expconf/parse.go`.
Expand All @@ -73,4 +76,3 @@
- update any uses of that object throughout the codebase to reflect
the new structure.
- ensure that `make gen && go test ./pkg/schemas/expconf` passes
- Currently, no python or JS changes are needed.
40 changes: 25 additions & 15 deletions schemas/expconf/v0/environment.json
Original file line number Diff line number Diff line change
Expand Up @@ -105,21 +105,31 @@
],
"default": null,
"items": {
"type": "object",
"disallowProperties": {
"image": "container Image is not configurable, set it in the experiment config",
"command": "container Command is not configurable",
"args": "container Args are not configurable",
"working_dir": "container WorkingDir is not configurable",
"ports": "container Ports are not configurable",
"liveness_probe": "container LivenessProbe is not configurable",
"readiness_probe": "container ReadinessProbe is not configurable",
"startup_probe": "container StartupProbe is not configurable",
"lifecycle": "container Lifecycle is not configurable",
"termination_message_path": "container TerminationMessagePath is not configurable",
"termination_message_policy": "container TerminationMessagePolicy is not configurable",
"image_pull_policy": "container ImagePullPolicy is not configurable, set it in the experiment config",
"security_context": "container SecurityContext is not configurable, set it in the experiment config"
"$comment": "if container name is 'determined-container', reject most edits to the container spec (which is ours to configure)",
"if": {
"properties": {
"name": {
"const": "determined-container"
}
}
},
"then": {
"type": "object",
"disallowProperties": {
"image": "container Image is not configurable, set it in the experiment config",
"command": "container Command is not configurable",
"args": "container Args are not configurable",
"working_dir": "container WorkingDir is not configurable",
"ports": "container Ports are not configurable",
"liveness_probe": "container LivenessProbe is not configurable",
"readiness_probe": "container ReadinessProbe is not configurable",
"startup_probe": "container StartupProbe is not configurable",
"lifecycle": "container Lifecycle is not configurable",
"termination_message_path": "container TerminationMessagePath is not configurable",
"termination_message_policy": "container TerminationMessagePolicy is not configurable",
"image_pull_policy": "container ImagePullPolicy is not configurable, set it in the experiment config",
"security_context": "container SecurityContext is not configurable, set it in the experiment config"
}
}
}
}
Expand Down
25 changes: 25 additions & 0 deletions schemas/test_cases/v0/environment.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
# This is a regression test: the migration to expconf accidentally prevented
# all changes to sidecar container specs, which was previously allowed.

- name: podspec can configure sidecar
sane_as:
- http://determined.ai/schemas/expconf/v0/environment.json
case:
pod_spec:
spec:
containers:
- name: my-sidecar
security_context: whatever

- name: podspec cannot configure determined container
sanity_errors:
http://determined.ai/schemas/expconf/v0/environment.json:
- "<config>.pod_spec.spec.containers\\[1\\]: .* not configurable"
case:
pod_spec:
spec:
containers:
- name: my-sidecar
security_context: whatever
- name: determined-container
security_context: this should fail

0 comments on commit ac8c440

Please sign in to comment.