-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rate limiting wasmplugin controller: using GatewayAPI topology (DAG) #447
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #447 +/- ##
==========================================
+ Coverage 80.20% 80.79% +0.58%
==========================================
Files 64 70 +6
Lines 4492 4956 +464
==========================================
+ Hits 3603 4004 +401
- Misses 600 642 +42
- Partials 289 310 +21
Flags with carried forward coverage won't be shown. Click here to find out more.
|
19be6e1
to
c5f75dc
Compare
27bb4a1
to
08f327e
Compare
policies := utils.Map(rlpList.Items, func(p kuadrantv1beta2.RateLimitPolicy) kuadrantgatewayapi.Policy { return &p }) | ||
|
||
t, err := kuadrantgatewayapi.NewTopology( | ||
kuadrantgatewayapi.WithGateways([]*gatewayapiv1.Gateway{gw}), | ||
kuadrantgatewayapi.WithRoutes(utils.Map(routeList.Items, ptr.To)), | ||
kuadrantgatewayapi.WithPolicies(policies), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
policies := utils.Map(rlpList.Items, func(p kuadrantv1beta2.RateLimitPolicy) kuadrantgatewayapi.Policy { return &p }) | |
t, err := kuadrantgatewayapi.NewTopology( | |
kuadrantgatewayapi.WithGateways([]*gatewayapiv1.Gateway{gw}), | |
kuadrantgatewayapi.WithRoutes(utils.Map(routeList.Items, ptr.To)), | |
kuadrantgatewayapi.WithPolicies(policies), | |
policies := utils.Map(rlpList.Items, func(p kuadrantv1beta2.RateLimitPolicy) kuadrantgatewayapi.Policy { return &p }) | |
routes := utils.Map(routeList.Items, func(r gatewayapiv1.HTTPRoute) *gatewayapiv1.HTTPRoute {return &r}) | |
t, err := kuadrantgatewayapi.NewTopology( | |
kuadrantgatewayapi.WithGateways([]*gatewayapiv1.Gateway{gw}), | |
kuadrantgatewayapi.WithRoutes(routes), | |
kuadrantgatewayapi.WithPolicies(policies), |
nit: Goland IDE is highlighting this for me but it compiles and runs so not sure on this one, it might be just me .. 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make run-lint
(golangci-lint) does not complain. It compiles.
you sure it is valid lint issue? "cannot use generic function without instantiation" does not sound correct for this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, unsure of this one either. Might be just Goland IDE specific issue for me. Since it compiles, I think it can be ignored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy to change it to routes := utils.Map(routeList.Items, func(r gatewayapiv1.HTTPRoute) *gatewayapiv1.HTTPRoute {return &r})
if you find this annoying. Let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should fix it:
policies := utils.Map(rlpList.Items, func(p kuadrantv1beta2.RateLimitPolicy) kuadrantgatewayapi.Policy { return &p }) | |
t, err := kuadrantgatewayapi.NewTopology( | |
kuadrantgatewayapi.WithGateways([]*gatewayapiv1.Gateway{gw}), | |
kuadrantgatewayapi.WithRoutes(utils.Map(routeList.Items, ptr.To)), | |
kuadrantgatewayapi.WithPolicies(policies), | |
policies := utils.Map(rlpList.Items, func(p kuadrantv1beta2.RateLimitPolicy) kuadrantgatewayapi.Policy { return &p }) | |
t, err := kuadrantgatewayapi.NewTopology( | |
kuadrantgatewayapi.WithGateways([]*gatewayapiv1.Gateway{gw}), | |
kuadrantgatewayapi.WithRoutes(utils.Map(routeList.Items, ptr.To[gatewayapiv1.HTTPRoute])), | |
kuadrantgatewayapi.WithPolicies(policies), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me anyway! Awesome work! 👍
parents map[NodeID]*internalNode | ||
children map[NodeID]*internalNode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use orderedmap
here, so we have consistent responses from funcs like Parents()
, Children()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happy to add them, but it is not needed right now, is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a basic data structure that can be used to build indexes and elaborate logic on top. Because we already had problems with that in the past, I'm personally more inclined every day to ensuring consistent ordering on those lists. But up to you.
d.nodes[n.id] = n | ||
d.populateIndexes(node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a mutex and wrap this as a critical section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am aware that the dag package is not thread safe. But, as it is used now, it does not need to be thread-safe. Locks can be added when (if) a single instance is accessed by multiple threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a minor effort worth adding IMO.
parentInternalNode.children[childInternalNode.id] = childInternalNode | ||
childInternalNode.parents[parentInternalNode.id] = parentInternalNode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nodes []*node | ||
} | ||
|
||
// build a mutable simple graph representation out of DAG only for validating purposes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that graph and also dag. They have algorithms to find paths and subgraphs, to walk the graphs and other graph topology generic functionality. We do not need that, yet, and IMO the implementation is simple enough to do it ourselves. Additionally, we want functionality like "give me all the HTTPRoutes" and I could not find that. I implemented that using indexes on top of the graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preventing cycles is one of the built-in features provided by both of those libs, plus they are both thread-safe by design.
Today we may not need all they offer, agreed. For example, we probably don't care about shortest paths between nodes yet. Still, as long as a lib is efficient enough for what we do need, saves us the effort to have to maintain those functionalities ourselves (simple as they might be), and opens up for more applications in the future, I personally think it's worth considering it.
As for "give me all the HTTPRoutes", isn't that the kind of thing we can get from traversing the graph? If you prefer keeping indexed copies of type-specific nodes – say, for efficiency –, I suppose we could also wrap the DAG within another struct that stores additionally all the indexes we need, maybe using the hashes/ids of the vertices to map to value?
Either way, I leave this as a suggestion for the future. In theory, if we want to, i guess replacing this underlying implementation should be possible to do transparently.
return GatewayNode{ | ||
Gateway: gNode.Gateway, | ||
attachedPolicies: gNode.attachedPolicies, | ||
routes: routes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Routes of the gateways added lazily only when reading the gateways of a topology. Maybe add a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I do not follow. When calling Gateways()
, the returned object has all the routes for each gateway in the topology. It is not loaded lazily. There is no callback from the user perspective.
@guicassolato ready for a new review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried this PR with the verification steps from #218. Except for the order of the rules in the WasmPlugin resource which seemed completely random (possibly a preexisting issue), it all works fine.
The kuadrant-operator-controller-manager
keeps crashing but I don't think it's necessarily related to this PR. Rather, due to some DNSHealthCheckProbe
issue spotted in the logs. The same does NOT happen in main
though, so maybe rebasing will fix it?
Other than that, I've left a few comments from before – some reinforced in my latest review. The most important one regards thread-safety of the new data structures.
99f5e4d
to
4ca61d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for rebasing, @eguzki!
I don't want to block this any further, so approving it and leaving a summary of things I think we could improve in the future:
Fair points, before merging, I will: Leave a comment to highlight that the Check this out as it should not matter the order of the policies being returned by the DAG. The controller sorts that list of policies afterwards based on policy creation timestamp EDIT: found issue with the rules order as reported. Fixed in a46f05f Create an issue for that |
Comments addressed @guicassolato |
@@ -30,7 +31,16 @@ func WasmRules(rlp *kuadrantv1beta2.RateLimitPolicy, route *gatewayapiv1.HTTPRou | |||
return rules | |||
} | |||
|
|||
for limitName := range rlp.Spec.Limits { | |||
// Sort RLP limits for consistent comparison with existing wasmplugin objects | |||
limitNames := make([]string, 0, len(rlp.Spec.Limits)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mitigates but does not solve 100%, I'm afraid. Ceteris paribus 🤓, a consistent wasmplugin config also depends on the order of the policies which are indexed for a gateway using a regular Golang map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order of the policies is already being sorted here, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I mentioned already
The controller sorts that list of policies afterwards based on policy creation timestamp
What
Dedicated WASMPlugin controller for the rate limiting policy. The controller implementation is based on a new approach using GatewayAPI Topology data structure Directed Acyclic Graph to represent the GatewayAPI resources.
Supersedes #317
Controller main object type
The type of the main object being reconciled is the
Gateway
. There is one-to-one mapping of gateway and wasmplugin. Another approach would be having the WASMPlugin as the main object type. In that case, it would be needed one way to get the gateway name from{name, namespace}
of the wasmplugin. The event mappers would be more complex. Worth exploring anyway.Watched resources and event mappers
The controller watches all resources that affect the configuration (spec) of the rate limit Wasmplugin. I.e. HTTPRoutes, Gateways and RateLimitPolicies. On events of those resources, the event mapper walks the Gateway API topology, via targetRef and parentRefs up to the peak of the topology, as far as kuadrant is concerned today, i.e. the Gateway. In other words, if you change a rate limit policy, it will generate events on all gateways affected. If you change an HTTPRroute, it will generate events on all gateways involved. Interesting to note that if you change the parentRef of a HTTProute, it will generate events on the gateways that were targeted and also on the new gateways currently being targeted. That allows doing cleaning for those gateways where a RLP is no longer affecting.
The GatewayAPI Topology
On gateway reconciliation event, the controller builds what it is named "The GatewayAPI Topology". It reads ALL HTTPRoutes targeting the given gateway as well as ALL rate limit policies in the cluster. With all that info, the controller creates a new instance of the Gateway API topology as well as a set of indexes (another "view" of the topology) that can be queried to provide information to build the desired spec of the rate limiting wasmplugin. Interesting to note here that a field indexer is being added to the manager to index all httproute parenting a gateway. Basically you can use this manager index (not a gateway api topology index), to query "give me all the HTTPRoutes parenting Gateway A".
The current implementation of the "GatewayAPI Topology" does not use a k8s client to query Gateway API and kuadrant policies. Instead, fetching resources must be done externally and then the GatewayAPI topology receives a list of resources. It could be done otherwise and have generic function with a k8s client to build topology for a given policy of one kind (for example RLP or KAP). The main reason for the current implementation is to decouple GatewayAPI topology and the cluster. Meaning that I can build GatewayAPI topology from a set of resources filtered with some criteria. The use case here is that I want to build a GatewayAPI topology, not for the entire cluster, but only focused on one gateway and all policies and routes involved.
The implementation assumes "cheap" operations of reading ALL the ratelimitpolicies and all routes parenting a given gateway. That assumption comes from the controller runtime k8s client that holds a cache. At least for
list
andget
operations. That cache is being kept up to date in the background when there are updates/deletes in the objects triggered by kuadrant controllers or any other third party controllers or any other k8s API server user. Check for this comment with a verification of this "cheap" operation assumption. Indeed, the resource lists are cached across reconciliation events and controllers.The topology implementation is based on a pure Directed Acyclic Graph data structure implementation that is totally decoupled from the GatewayAPI relations. On top of that, there is an implementation of the GatewayAPI topology which builds a graph from Gateways and HTTPRoutes. In the future, it can be extended for additional GatewayAPI API types. On top of the GatewayAPI topology, a number of indexes are created which are tailored for the needs of the rate limiting wasmplugin controller. Thus:
Additionally:
github.com/kuadrant/kuadrant-operator/pkg/library/gatewayapi
package with all tools and helpers related to GatewayAPI resources and not directly related to any kuadrant controller logic. This package would be a good candidate to be shared and useful for any third party controller dealing with GatewayAPI resources. This package includes the GatewayAPI topology implementation.common
togithub.com/kuadrant/kuadrant-operator/pkg/library/gatewayapi
package that IMO are generic enough and decoupled from kuadrant controller's implementation needs. This change caused (small) changes on lots of files 😞pkg/library/reconcilers
packageTODO: Integration tests:
ratelimitpolicy_controller_test.go
torate_limiting_wasmplugin_controller_test.go