Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow setting name of ports in generated services #113

Merged
merged 3 commits into from
Mar 21, 2019
Merged

Allow setting name of ports in generated services #113

merged 3 commits into from
Mar 21, 2019

Conversation

tanordheim
Copy link
Contributor

@tanordheim tanordheim commented Mar 21, 2019

I'm taking Flagger for a spin (loving it), but I'm having some issues with our GRPC services that no longer work as expected when Flagger is managing the VirtualService/Service instances surrounding the services. This is namely due to the name of the ports in the generated Service objects hardcoded to be http, where in GRPCs case they should be set to http2 or grpc (or http2-something or grpc-something in order for Envoy to configure itself as a HTTP2 proxy instead of a HTTP1-proxy.

I did a quick change in the CanaryService API type and the router.KubernetesRouter in order to be able to specify a portName in addition to a port in my canary service definition; this solves my issue, but it might not be the way you guys would prefer to solve this - so I'm mostly opening up this PR to have a place to discuss the appropriate implementation. I couldn't find any tests that verified the Service creation (only retrieval/mapping) so this change is not covered by any tests.

Fix: #115

@stefanprodan
Copy link
Member

@tanordheim this looks ok to me, can you confirm that the metrics checks (success rate and request duration) are working with grpc? Also can you please open an issue for Kubernetes routing tests.

Thanks

@tanordheim
Copy link
Contributor Author

I havent tested them extensively, all I've been able to do so far is just to have the deployments go out and have Flagger transition them into production using my canary settings. Let me try to produce some fake errors during transitioning in my GRPC services here to ensure that rollbacks happen as expected.

@stefanprodan
Copy link
Member

stefanprodan commented Mar 21, 2019

The request duration should work. The success rate query is using this filter response_code!~"5.*" and unless Istio does some mapping this will not match https://github.com/grpc/grpc/blob/master/doc/statuscodes.md

@tanordheim
Copy link
Contributor Author

These are the internal GRPC status code; HTTP2 is still the transport underneath, which should mean that the error rates are correctly tracked by Istio by default. I'm doing some tests now to verify that though, give me 30 minutes :)

@tanordheim
Copy link
Contributor Author

Yeah, the response code things were problematic; it's tracked in Istio, but GRPC uses status code 200 for all responses; even request returning a GRPC error.

This is likely something you'd have to instrument on your own though (we're using https://github.com/grpc-ecosystem/go-grpc-prometheus) and change the metric you're observing to be a GRPC specific metric.

@tanordheim tanordheim changed the title WIP: Allow setting name of ports in generated services Allow setting name of ports in generated services Mar 21, 2019
@stefanprodan
Copy link
Member

Flagger supports custom promql queries, so you could embedded the GRPC query in the canary spec.

@tanordheim
Copy link
Contributor Author

Yeah, I know, I just meant that it's maybe outside the scope of this PR to handle :)

I can write some suggestions to the docs if you want to, though.

@stefanprodan
Copy link
Member

Yes a GRPC note in the docs would be great. Thanks

@tanordheim
Copy link
Contributor Author

I added an example custom query with your pre-existing custom query example.

Copy link
Member

@stefanprodan stefanprodan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks @tanordheim

@stefanprodan stefanprodan merged commit 438c553 into fluxcd:master Mar 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants