Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add live k8s API collector #34

Merged
merged 17 commits into from
Jun 6, 2023
Merged

Add live k8s API collector #34

merged 17 commits into from
Jun 6, 2023

Conversation

jt-dd
Copy link
Contributor

@jt-dd jt-dd commented Jun 5, 2023

This collector connect directly using the default config path to find the configuration (using controller-runtime lib):

  • --kubeconfig flag pointing at a file
  • KUBECONFIG environment variable pointing at a file
  • In-cluster config if running in cluster
  • $HOME/.kube/config if exists.

Ref: https://github.com/kubernetes-sigs/controller-runtime/blob/v0.15.0/pkg/client/config/config.go#L76

It uses the configuration to set 3 variables needed for crawling the API:

  • PageSize: Number of entry being retrieving by each call on the API (same for all Kubernetes entry types)
  • PageBufferSize: Number of pages to buffer
  • RateLimitPerSecond: Rate limiting per second across all calls (same for all Kubernetes entry types) against the Kubernetes API

Note: for the rate limiting, a simple leaky bucket has been implemented (using ratelimit lib from Uber). The rate limiting affect simultaneously all the calls to the Kub API regardless of the type being collected.

@jt-dd jt-dd requested a review from edznux-dd June 5, 2023 16:42
@jt-dd jt-dd marked this pull request as ready for review June 5, 2023 16:42
Copy link
Contributor

@edznux-dd edznux-dd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP review, gotta go :D

pkg/collector/collector.go Show resolved Hide resolved
pkg/collector/k8s_api_test.go Show resolved Hide resolved
}

if cfg.Collector.Live.PageSize == nil {
cfg.Collector.Live.PageSize = globals.Ptr(K8sAPIDefaultPageSize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you use * or & or in the type instead of doing a generic in here?
For some types (like bool and all, where you don't have a variable, I see why, but i'm not sure I understand here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can not take address of a constant, so using for me was the most elegant way. But I guess I can also do the tricks, but for me it was less elegant:

PageSize := K8sAPIDefaultPageSize
cfg.Collector.Live.PageSize = &PageSize

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yep, makes sense didn't see the constant.
I would have used a var instead of const so it has a pointer available. 😅

Also tbh, this is config related, and we are already using viper so we should not use custom things to set default imo:
something like viper.SetDefault("k8s.collector.PageSize", 5000) would replace this function entirely. You would just need to call viper.GetInt("k8s.collector.PageSize")

In any case, K8sAPIDefaultPageSize should be renamed K8sAPIDefaultPageSizeDefault to make it clear that it's customizable later on and that this value isn't the "final" one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K8sAPIDefaultPageSizeDefault has 2 default which does not make sense. If you prefer I can change it to K8sAPIPageSizeDefault but all the current variables are to DefaultBlabla.
Also the SetDefault() seems to be a pain:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah woops, can't read, guess it shows that it needs to be prefix/suffix and not in the middle 😅
DefaultK8sAPIPageSize sounds fine to me! Having the default in the middle breaks the convention we have in other variables of DefaultBlabla as you mention yourself :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the issue mentionned still valid (from 2015, and marked as fixed by spf13/viper#195 (comment))?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It did not brake the convention because to me it made more sense to have PackageDefaultValue. The package was never mentioned in the other values. Do we want to name the "package" in those constant ? I find it easier to follow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should "never" put the package's name in the name of the var. Otherwise you need to do "package.PackageVar" which is redundant (and also create more work if you want to update / move the package, but that's 🤷).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

speaking of package, If possible, I think having the whole k8s_api related in a subpackage of collector would be nicer to use (if there's no circular deps with the rest of the project) :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we can not due to the CollectorClient interface and ClientFactory() method. This is why I added the name of the "subpackage" (if you prefer).

pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api_test.go Outdated Show resolved Hide resolved
}

if cfg.Collector.Live.PageSize == nil {
cfg.Collector.Live.PageSize = globals.Ptr(K8sAPIDefaultPageSize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yep, makes sense didn't see the constant.
I would have used a var instead of const so it has a pointer available. 😅

Also tbh, this is config related, and we are already using viper so we should not use custom things to set default imo:
something like viper.SetDefault("k8s.collector.PageSize", 5000) would replace this function entirely. You would just need to call viper.GetInt("k8s.collector.PageSize")

In any case, K8sAPIDefaultPageSize should be renamed K8sAPIDefaultPageSizeDefault to make it clear that it's customizable later on and that this value isn't the "final" one.

pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api_test.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api_test.go Show resolved Hide resolved
pkg/collector/k8s_api_test.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api_test.go Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
@jt-dd
Copy link
Contributor Author

jt-dd commented Jun 6, 2023

This whole thing can be replaced by
assert.Equal(cfg.Collector.Live, tt.args,value) I believe
The pretty print and all will be done automatically, example go playground: https://goplay.tools/snippet/n_yR-edgBMq

=== RUN   TestSomething
    prog_test.go:19: 
        	Error Trace:	/tmp/sandbox3381284915/prog_test.go:19
        	Error:      	Not equal: 
        	            	expected: main.testlol{FieldA:"Hellolola", FieldB:123}
        	            	actual  : main.testlol{FieldA:"Hellolol", FieldB:123}
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1,3 +1,3 @@
        	            	 (main.testlol) {
        	            	- FieldA: (string) (len=9) "Hellolola",
        	            	+ FieldA: (string) (len=8) "Hellolol",
        	            	  FieldB: (int) 123
        	Test:       	TestSomething
        	Messages:   	The two words should be the same.
--- FAIL: TestSomething (0.00s)
FAIL

I did not know that trick. It is really nice.

@jt-dd
Copy link
Contributor Author

jt-dd commented Jun 6, 2023

This should be a switch based on the config instead of being hardcoded

Yes implemented.

@jt-dd jt-dd requested review from d0g0x01 and edznux-dd June 6, 2023 10:58
@d0g0x01
Copy link
Contributor

d0g0x01 commented Jun 6, 2023

There are a couple of features missing from the spec https://datadoghq.atlassian.net/browse/ASENG-362 - maybe it would be worth adding these as a subsequent PR?

When this is complete do you want to switch the integration tests over to use this collector? you'll need to update the code at test/system/setup_test.go

if cfg.Collector.Type != config.CollectorTypeK8sAPI {
return fmt.Errorf("invalid collector type in config: %s", cfg.Collector.Type)
}
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should you not do then attach the config to the collector instance? also you dont use the logger?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should this be a private method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure to understand what you mean by attaching the config to the collector instance. The config is attached to the collector instance.
Same for the logger, it is being used by the collector instance.

Yes this should be private.

pkg/collector/k8s_api.go Outdated Show resolved Hide resolved
type k8sAPICollector struct {
clientset kubernetes.Interface
log *log.KubehoundLogger
rl ratelimit.Limiter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find!

return nil, err
}

kubeConfig, err := ctrl.GetConfig()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to specify a target cluster in the configuration. I think currently this will default to the current kubectx which might not be what the user wnats/expects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This support an env variable KUBECONFIG (and other standard way) . But I can try to path it over through the config file in another PR.


// checkNamespaceExists checks if a namespace exists
func (c *k8sAPICollector) checkNamespaceExists(ctx context.Context, namespace string) error {
_, err := c.clientset.CoreV1().Namespaces().Get(ctx, namespace, metav1.GetOptions{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be worth getting all the namespaces once then caching for quick lookups as should not be that much data

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*no need to use the formal cache provider here. just a hashmap in memory attached to the collector would be fine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this, maybe I should do the caching in a different PR since it is purely optimisation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

}

// streamPodsNamespace streams the pod objects corresponding to a cluster namespace.
func (c *k8sAPICollector) streamPodsNamespace(ctx context.Context, namespace string, ingestor PodIngestor) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed in the original desing meeting that it might help ensure better consistency if we pull all the objects namespace by namespace for pods/roles/rolebindings. What do you think of tweaking the logic slightly to be

  1. pull all namespaces
  2. cache namespaces in the collector instance
  3. call streamXXXNamespace for each namespace in order

You have most of the pieces to do it already so should not be too tricky. @edznux-dd any thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is "why" we though about the WithOption() pattern in the Stream* functions.
Caching the list of namespace seems fine to me, I don't see it grow that significantly for it to cause any issue.

Doing the iteration in the Steam* function may make it a bit difficult to parallelize more in the future? (multiple namespace concurrently, so non concurently). I don't know how the API/Storage of k8s handles the load balancing/sharding of this kind of resources, and I don't know if it matters :D (I don't think that could cause hot spots to appears if we do purely sequential reads, but I don't know)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the actual gain of iterating manually over the namespace vs the "automated way" ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thoughts I think this might be some premature optimization. Suggest we leave as is for now and revisit if we see any issues materialize

const (
DefaultK8sAPIPageSize int64 = 5000
DefaultK8sAPIPageBufferSize int32 = 10
DefaultK8sAPIRateLimitPerSecond int = 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is compute ok with this internally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not ping them, but those a the default value (expect the rate limiting). But 100 req/sec seems pretty low to me (I used to try to keep under this value why I was pentesting API). But I will ask them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think thats fine for me then

@@ -0,0 +1,5 @@
collector:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth editing the default config to use this (configs/etc/kubehound.yaml)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed

@jt-dd jt-dd requested a review from d0g0x01 June 6, 2023 13:45
Copy link
Contributor

@d0g0x01 d0g0x01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jt-dd jt-dd merged commit 3204a58 into main Jun 6, 2023
@jt-dd jt-dd deleted the jt-dd/k8sAPILocalCollector branch June 6, 2023 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants