Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SinkBinding doesn't work with KServices that uses BYO revision names #9544

Open
rhuss opened this issue Sep 23, 2020 · 31 comments
Open

SinkBinding doesn't work with KServices that uses BYO revision names #9544

rhuss opened this issue Sep 23, 2020 · 31 comments
Labels
area/API API objects and controllers triage/accepted Issues which should be fixed (post-triage)

Comments

@rhuss
Copy link
Contributor

rhuss commented Sep 23, 2020

Describe the bug

When using a SinkBinding with a Knative Service as subject that uses BYO revision names (i.e. setting the revision name from the client), then no K_SINK environment variable is injected to the KService.

Expected behavior

K_SINK should be injected to the Pod Template as a container env var regardless whether a Knative Service is configured for BYO revision names or not.

To Reproduce

kn service create random --image rhuss/random:1.0
kn broker create default
kn source binding create my-sb --subject Service:serving.knative.dev/v1:random --sink broker:default
kn service describe random --verbose | grep K_SINK

This will get no result as kn used BYO revision names by default. If switching over to server side generated names, then a K_SINK env var is injected.

# Switch over to server side generated names
kn service update random --revision-name ""
kn service describe random --verbose | grep K_SINK

        Env:    K_SINK=http://broker-ingress.knative-eventing.svc.cluster.local/default/default

Knative release version

Eventing 0.17.3 (running on minikube)

@rhuss
Copy link
Contributor Author

rhuss commented Sep 23, 2020

@evankanderson
Copy link
Member

This is probably a serving, rather than eventing, issue -- it would apply equally to a hypothetical OsbBinding which set the VCAP_SERVICES environment variable on a Knative Service which was in the "BYO revision" mode.

I wonder if the best solution would be to clear spec.template.metadata.name on the resource before applying the patch, or adding a default to clear that field into the patch if the patch doesn't explicitly manage that field.

@mattmoor @dprotaso for thoughts.

@evankanderson evankanderson transferred this issue from knative/eventing Sep 23, 2020
@mattmoor
Copy link
Member

Ultimately, different PodSpecables present different restrictions. For some cases they can handle it gracefully, and for others they should probably fail gracefully.

I'd characterize BYO name as one of our niche restrictions.

@github-actions
Copy link

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2021
@rhuss
Copy link
Contributor Author

rhuss commented Mar 1, 2021

I'd characterize BYO name as one of our niche restrictions.

Until 0.20, BYO name was the default mode for kn (don't remember why this was the case), we changed that now to server-side naming in 0.21. However, since we still support the older kn version and it is still easy to switch to BYO revision names with kn, maybe this should be supported by the binding reconciliation loop? E.g. one could easily detect BYO mode and, if enabled, just set the revision name to a new random name.

As an alternative, I would suggest deprecated BYO revision name and let it fade out. I really don't see any convincing argument for BYO names, especially if it can not be ruled out, that only a single client is used for manipulating a KService.

I'm reopening this issue, just to track this open question.

@cardil
Copy link

cardil commented Mar 24, 2021

I hit that one with kn 0.19 and serving 0.21. Maybe do reopen as you said @rhuss ?

/reopen
/remove-lifecycle stale

@knative-prow-robot
Copy link
Contributor

@cardil: Reopened this issue.

In response to this:

I hit that one with kn 0.19 and serving 0.21. Maybe do reopen as you said @rhuss ?

/reopen
/remove-lifecycle stale

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 24, 2021
@markusthoemmes
Copy link
Contributor

Is there anything left to be done here? The above sounds like this is working as designed and we'd at most document the behavior better and/or improve visibility of the issue?

@rhuss
Copy link
Contributor Author

rhuss commented Jun 1, 2021

Well, I would say it depends: If BYO revisions is a fully supported feature by Knative Serving (even when it is a "niche feature") I would expect it to either work with SinkBindings or would have a documentation that BYO revision names are not supported for SinkBindings.

An alternative would be to remove BYO revision names as I think this feature is not very well tested, too ? (see this bug :)

@github-actions
Copy link

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 31, 2021
@cardil
Copy link

cardil commented Sep 27, 2021

/remove-lifecycle stale

@knative-prow-robot knative-prow-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 27, 2021
@markusthoemmes
Copy link
Contributor

My understanding of SinkBinding is, that it also carries a reconciler which could really change the Knative Service at any point, kind of breaking the immutability requirement of revisions if we were to somehow bypass the name check that lead to the initial report here.

IMO we have 2 options then:

  1. Keep things the way they are and clearly document that SinkBinding is not supported for BYO revision names.
  2. Drop BYO revision names altogether.

@markusthoemmes
Copy link
Contributor

/area api

@knative-prow-robot knative-prow-robot added the area/API API objects and controllers label Sep 27, 2021
@rhuss
Copy link
Contributor Author

rhuss commented Sep 27, 2021

+1 for dropping BYO revision names as they have more problems than they solve (and the client these days default to server-generated revision names anyway and don't leverage client-side revision names). See also the release notes of https://github.com/knative/client/releases/tag/v0.21.0 for some more arguments against BYO revision names.

@duglin
Copy link

duglin commented Sep 27, 2021

I don't use BYO revision names much, but that's probably because I don't play with traffic splitting very often - beyond trivial demos. However, if we drop BYO support will that make it harder for people to setup and manage their traffic splitting? In particular, I often find it really annoying to work with systems that require me to create something, discover what I just created, and then use that info in a follow-on command. Not only because it should have let me pick a user-friendly name to begin with so I can reference it via scripts easily, but it also means that I have to (potentially) have my system in an undesirable state until I can execute that last step of that process. And if I have to jump through some hoops I normally would not have jumped through just to make things work, then that's not the best UX. So... what's the UX implications here if we drop BYO revision names for non-trivial traffic splitting?

@rhuss
Copy link
Contributor Author

rhuss commented Sep 27, 2021

Actually, BYO as used by kn does not help you in this situation at all as the client chooses a random name for the revision on every update (which is even harder to predict). BYO names have tons of drawbacks, some of them are listed in these release notes https://github.com/knative/client/releases/tag/v0.21.0 . Another disadvantage is that you really depend on the state of your cluster when you e.g. choose a name that must not conflict with any other existing revision's name. So a script might work on one cluster but not on the other, depending on the cluster's history of revisions.

Said that there is good news as we exactly tackle this UX problem with the latest proposal for an updated UI for traffic split as described in this feature track

This proposal will allow you to create traffic splits without looking up revisions names for the most common use cases.

@rhuss
Copy link
Contributor Author

rhuss commented Sep 27, 2021

I fail to see any advantage for BYO revision names, even for traffic splits it is not practical (as explained above).

@cardil
Copy link

cardil commented Sep 30, 2021

I see users might like to use BYO revision name to equal their application version. In that way, they will know what they actually traffic split. It's easy for us, to use default revision numbered version (or even random one from before) as we just do examples and simple demos.

Imagine yourself being in some company, and releasing and deploying new SSO service, that other services and users are constantly using. Your service version is changing from 2.3.3-1 to 2.4.0-2, so you're bringing new features, but that involved numerous dependencies needed to bump. Despite testing this on testing and stage environments, you are not 100% sure it will work, as experience taught you that production data and load can cause many unforeseen problems. That's why your team decided to use Knative to traffic split every new release of such crucial services (any in fact 😉).

So, you have your images: registry.acme.com/edge/sso:2.3.3-1 and registry.acme.com/edge/sso:2.4.0-2. Now, you are running 2.3.3-1 version, and you have 100% of traffic routed to revision named 2.3.3-1. You start the deployment of 2.4.0-2 version, and in the same kubectl apply you are setting 5% of load to hit new version. 🤞

Of course, to do that, you needed to always use BYO revision and name it after your build version. That's tedious.

The traffic split FT @rhuss do not solve this, as it only is a CLI feature. In fact, it may make it harder (the --traffic 100% and --traffic 0%) to know for sure, what am I doing, actually. Imagine someone gave you a new terminal session to production k8s. Will you invoke --traffic 100% and --traffic 0% options without first checking what actually the state is?!?

I image that a company/team that takes traffic splitting seriously would invest in some wrapper that will perform deploys automatically. 5%, check logs for X min, 10%, check logs, 20%, check logs, ... Such thing could be written in anything, terraform code, an operator, or a web app. It's unrealistic to assume a human operator would use kn to deploy production services, and do the traffic splitting, log crunching loop.

We could do a better job in generating revision names, it would save some confusion. We could:

  1. Look at well-known image labels to read the revision name, or version, build number
  2. As a fall back, look at image tag, and use it as version number
  3. If none of the above are successful, use current ordered number revision names
  4. Use dynamic naming for revision name, like current, previous, n-2 in traffic splitting - both on CLI and YAML
  5. Allow users to add aliases to revisions, to be used in traffic splitting, apart from dynamic ones.

@markusthoemmes
Copy link
Contributor

We could do a better job in generating revision names, it would save some confusion. We could:

It wouldn't help though. The problem in this case is that we're trying to update an immutable revision after it being created. If we adopt the naming strategy you suggest, the same issue would be around as the name of the revision with the same label (or image tag or whatever) would stay the same and thus collide in the same ways.

I agree with the description on where BYO names are useful tho.

@cardil
Copy link

cardil commented Oct 1, 2021

One additional idea is that system components could have more permission than user does - meaning we might decide that SB can change a thing that is immutable for user.

Or maybe introduce a mutant revision, and just shift the revision name to it - the underlying revision names could be sequential, and we just apply a proper revision name via label.

@rhuss
Copy link
Contributor Author

rhuss commented Oct 11, 2021

IMO you can't use an image version/tag as a revision name reliably as every Configuration change would need a new revision name, even when you don't change the image (but maybe only a label or env var). A revision name != an application version.

That has to be provided by the user. IMO it is far better to leverage the tag: parts of traffic split to give you a more indicative name. I still don't think it is practical for the client to provide the revision names without evaluating the cluster state before (e.g. if this revision name already exists before using it). This only works if you have full control and don't try any apply kind of operations. In any practical use case that I have encountered for client-side naming, there was already a random portion to somewhat reduce the risk of a naming collision (but still can happen). There is no benefit of server-side naming generation, that can guarantee to create a new unique revision name.

btw, with the CLI proposal mentioned above btw, you are still able to fully specify the traffic split without the help of a CLI to 'fill up to 100%' support.

@github-actions
Copy link

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 16, 2022
@rhuss
Copy link
Contributor Author

rhuss commented Jan 18, 2022

/remove-lifecycle stale

While I think it might not be necessary to fix SinkBindings to run with BYO revision names, I also think we should document the restriction somewhere, ideally at the place where the SinkBinding itself is documented.

@knative-prow-robot knative-prow-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 18, 2022
@github-actions
Copy link

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 19, 2022
@abrennan89
Copy link

@rhuss if this is still relevant can you open a docs bug to add a note about this please?

@cardil
Copy link

cardil commented Oct 5, 2022

/remove-lifecycle stale
/reopen

@abrennan89 Couldn't we just reuse this issue as doc one?

@knative-prow knative-prow bot reopened this Oct 5, 2022
@knative-prow
Copy link

knative-prow bot commented Oct 5, 2022

@cardil: Reopened this issue.

In response to this:

/remove-lifecycle stale
/reopen

@abrennan89 Couldn't we just reuse this issue as doc one?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow knative-prow bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 5, 2022
@abrennan89
Copy link

@cardil if someone for serving is writing the docs then yeah that's fine, otherwise please close it and open an issue in the docs repo that provides specific details about what needs to change.

@github-actions
Copy link

github-actions bot commented Jan 5, 2023

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 5, 2023
@knative-prow-robot
Copy link
Contributor

This issue or pull request is stale because it has been open for 90 days with no activity.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close

/lifecycle stale

@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 6, 2023
@rhuss
Copy link
Contributor Author

rhuss commented Feb 14, 2023

/remove-lifecycle stale

@ReToCode ReToCode added the triage/accepted Issues which should be fixed (post-triage) label Feb 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/API API objects and controllers triage/accepted Issues which should be fixed (post-triage)
Projects
None yet
Development

No branches or pull requests

9 participants