Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-2876: Core library proposal for CEL #3132

Merged
merged 13 commits into from
Feb 3, 2022

Conversation

jpbetz
Copy link
Contributor

@jpbetz jpbetz commented Jan 13, 2022

  • One-line PR description: Propose what core library of functions we will make available to CEL for Beta

Proposed additions are:

  • isSorted for lists with comparable elements. This is useful for ensuring that a list is kept in-order.
  • sum, min and max functions for lists of summable/comparable elements. This is the
    core set of aggregate functions for lists, with CEL they can also be used on scalars by using defining list literals
    inline , e.g. [self.val1, self.val2].max()
  • indexOf / lastIndexOf for lists (overloading the existing string functions), this can be useful for
  • validating partial order (i.e. the tasks of a workflow)

Also:

  • Explain mechanism to make post-alpha library updates (which will need to be done with great care given that they will make CRDs incompatible with older Kubernetes versions)
  • Update KEP to reflect that numerical comparisons will be fully supported in Beta
  • Update KEP to reflect that int-or-string will be typesafe in Beta
  • Other comments:

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 13, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 13, 2022
@liggitt liggitt added this to the v1.24 milestone Jan 18, 2022
jpbetz and others added 2 commits January 18, 2022 12:27
Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
@jpbetz
Copy link
Contributor Author

jpbetz commented Jan 19, 2022

Thanks for the review @liggitt. Feedback addressed. This is now down to a fairly short list of proposed changes.

Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
- Regular Expressions
- Some Standard Definitions
- We add a function and later need to change it.
- We don't add a function that is essential for use cases and later need to add it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall from xpath that you could add functions with a namespace associated with them, namespace:doThing( as I recall. This allowed versioning of functions and using it here would allow us to know whether an expression could be enforced by the server during creation of the CRD. Is that a thing in CEL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, unfortunately. There is no namespacing. All functions are declared in a global scope. Functions are uniquely identified by the combination of name + arguments, so multiple (overloaded) functions of the same name may coexist and the compiler will bind the correct one based on arguments. (The 1st argument can be the receiver, so foo(x) and x.foo() are equivalent callsites).

cc @TristonianJones for any insights on how to handle versioning/function library evolution.

Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
@jpbetz jpbetz force-pushed the cel-to-beta branch 2 times, most recently from 3967cbf to 887130c Compare February 2, 2022 23:08
@liggitt
Copy link
Member

liggitt commented Feb 3, 2022

lgtm. looks like toc needs updating, and over to @deads2k

@kikisdeliveryservice kikisdeliveryservice changed the title Core library proposal for CEL KEP-2876: Core library proposal for CEL Feb 3, 2022
Any changes to the CEL function library will make it possible for CRD authors to create CRDs that are incompatible with
all previous Kubernetes releases that supported CEL. Because of this incompatibility, changing the function library
will need to carefully considered and kept to a minimum. All changes will need to be limited to function additions.
We will not add functions to do things that can already be accomplished with existing functions. Improving ease-of-use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to explore the limits on this. Is it possible to take a URL (scheme://host:) and parse out the host if we allow

  1. dns names
  2. ipv4
  3. ipv6
  4. ipv4 mapped in ipv6

I'm interested in how difficult this is to do with existing functions and if it is very difficult, would we really hold this line? I see it messed up hand-written parsing code very frequently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's intractable with what we have in CEL now due to the lack of regex Find.

Options:

  1. Add regex find and findAll - Very unlikely that people will get the regex exactly right
  2. Add URL parsing functions. This should probably be: string -> URL conversion + getScheme, getUserinfo, getHost, getPath, getQuery, getFragment

I'm currently in favor of adding both. We made an explicit decision to not add regex replace (shouldn't really be needed for validation), but I hadn't thought carefully about match vs. find.

Any changes to the CEL function library will make it possible for CRD authors to create CRDs that are incompatible with
all previous Kubernetes releases that supported CEL. Because of this incompatibility, changing the function library
will need to carefully considered and kept to a minimum. All changes will need to be limited to function additions.
We will not add functions to do things that can already be accomplished with existing functions. Improving ease-of-use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another one that stands out is the proper parsing of image pull specs. I bet you could get very close with clever regexes and splits, but would avoid providing a built-in? Grammar here for reference: https://github.com/distribution/distribution/blob/v2.7.1/reference/reference.go#L4-L24

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we use that in k8s already.

I'm a bit concerned with keeping this stable since the format is fairly involved.

Looks like the absolute minimum would be:

  • string -> image type conversion
  • getName
  • hasTag/getTag
  • hasDigest/getDigest

But the name has a host and a port, and the digest has an algorithm, and the tag can be a semver and someone will want to compare those.

@deads2k deads2k added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Feb 3, 2022
@deads2k
Copy link
Contributor

deads2k commented Feb 3, 2022

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 3, 2022
Any changes to the CEL function library will make it possible for CRD authors to create CRDs that are incompatible with
all previous Kubernetes releases that supported CEL. Because of this incompatibility, changing the function library
will need to carefully considered and kept to a minimum. All changes will need to be limited to function additions.
We will not add functions to do things that can already be reasonably accomplished with existing functions Improving ease-of-use
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We will not add functions to do things that can already be reasonably accomplished with existing functions Improving ease-of-use
We will not add functions to do things that can already be reasonably accomplished with existing functions. Improving ease-of-use

@k8s-ci-robot k8s-ci-robot merged commit 3f4ece6 into kubernetes:master Feb 3, 2022
- Add `indexOf` / `lastIndexOf` support for lists (overloading the existing string functions), this can be useful for
validating partial order (i.e. the tasks of a workflow)
- Add URL parsing: `url(string) URL`, `string(URL) string` (conversion), `getScheme(URL) string`, `getUserInfo(URL) string`,
`getHost(URL) string`, `getPort(URL) string`, `getPath(URL) string`, `getQuery(URL) string`, `getFragment(URL) string`. This is
Copy link
Member

@liggitt liggitt Feb 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume url(string) errors and prevents further evaluation if the parse fails?

clarify whether getHost includes the port or not (whether it is URL#Hostname() or URL.Host)

user info has further embedded structure: username or username:<password>. are we expecting callers to split that apart themselves? is this intending to return URL.UserInfo.String()? if so, does that return escaped or unescaped versions of those fields?

in general, need to specify whether escaped or unescaped versions of the pieces are returned

@liggitt
Copy link
Member

liggitt commented Feb 3, 2022

lgtm as well, a couple follow-up clarifications to address at your convenience

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants