Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cgroup cleanups #2959

Merged
merged 8 commits into from
Oct 21, 2021
Merged

cgroup cleanups #2959

merged 8 commits into from
Oct 21, 2021

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Oct 13, 2021

This is a preparation for bumping opencontainers/runc to 1.1.

Mostly code simplification with some minor improvements along the way.

Please review commit by commit, and see individual commit messages for more details.

@k8s-ci-robot
Copy link
Collaborator

Hi @kolyshkin. Thanks for your PR.

I'm waiting for a google member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Creatone
Copy link
Collaborator

/ok-to-test

@kolyshkin kolyshkin changed the title [WIP] Cgroups cleanups cgroup cleanups Oct 14, 2021
@kolyshkin
Copy link
Contributor Author

No longer a WIP

Copy link
Collaborator

@Creatone Creatone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only small nit. Otherwise LGTM

@kolyshkin
Copy link
Contributor Author

@Creatone thank you for your review! Please advise where should I put the new imports (see my comments above).

@Creatone
Copy link
Collaborator

@kolyshkin The second group should be for third-party packages. The last group should have only project packages.
Someone missed it in these packages :/ Could you sort that in the correct way?

@kolyshkin
Copy link
Contributor Author

Could you sort that in the correct way?

Sure, but this patchset is already big enough to review, and I don't want it to make it even bigger. Besides, other files also suffer from bad imports grouping. Thus, I am fixing this in #2962.

Will rebase this one once #2962 will be merged.

@eero-t
Copy link
Contributor

eero-t commented Oct 19, 2021

Will rebase this one once #2962 will be merged.

This is really huge PR. IMHO it would be better to split it to couple of PRs with loosely related commits, e.g. ones depending on import changes could be a separate PR from one(s) that do not have them. Formatting changes would indeed be nice as separate commit for review.

@kolyshkin
Copy link
Contributor Author

Draft pending #2962 merge.

@kolyshkin kolyshkin marked this pull request as draft October 20, 2021 03:25
@bobbypage
Copy link
Collaborator

@kolyshkin #2962 was merged so this can be rebased and put out of draft mode?

This field is only used
 - to check if there are any mounts, i.e. len() == 0;
 - to iterate over the list of mounts in watcher.

In both cases, we can reuse MountPoints without any degradation.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
cgroupPaths was basically a copy of cgroupSubsystems.MountPoints.
This is because it was produced from the elements of the latter
by calling common.MakeCgroupPaths() with the second argument of "/".
Note that path.Join(something, "/") will return something.

Since we have two copies of the same info, let's remove one and reuse
the other.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
As this structure only have 1 element now, it does not make sense to
have it as a structure.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Replace it with map[string]string, which it is. This is a continuation
of the previous commit, separated out for review clarity.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Instead of creating an intermediate map to remove specific mounts based
on MetricSet, amend the supportedSubsystems map to have information
about which subsystem reports which metrics.

Based on that info, the check whether the subsystem is needed or not
becomes much simpler.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It is equivalent to GetCgroupSubsystems(nil) which is easy to use,
and there's only one user. Remove the function, patch the user.

While at it, document the includeMetrics argument of
GetCgroupSubsystems.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In case of cgroup v2, there are no separate per-subsystem mount points,
and so it does not make sense to have a subsystem to mountpoint map.

Yet for the unification of code between v1 and v2, we still use this
map, although a single element is enough.

For cgroup v2, the key is "", the value is the mount point (or the path
to a specific cgroup).

This commit
 - simplifes GetCgroupSubsystems for cgroup v2 (trivial) case;
 - modifies GetCgroupPath() methods to use "" as the key for cgroup v2;
 - fixes NewCgroupManager() to use "" as the key for cgroup v2 case.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Contributor Author

Rebased on top of the current master; no longer a draft.

@kolyshkin kolyshkin marked this pull request as ready for review October 21, 2021 02:30
@kolyshkin
Copy link
Contributor Author

@bobbypage needs /retest from you I guess.

@bobbypage
Copy link
Collaborator

/retest

@bobbypage
Copy link
Collaborator

Looks like some tests failed in prow run?

https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/google_cadvisor/2959/pull-cadvisor-e2e/1451016418337230848/build-log.txt

W1021 02:48:18.054] , err: exit status 2
I1021 02:48:24.812] >> running tests
I1021 02:48:25.297] ?   	github.com/google/cadvisor	[no test files]
I1021 02:48:30.917] ok  	github.com/google/cadvisor/accelerators	0.041s
I1021 02:48:30.917] ?   	github.com/google/cadvisor/cache	[no test files]
I1021 02:48:31.070] ok  	github.com/google/cadvisor/cache/memory	0.055s
I1021 02:48:31.071] ok  	github.com/google/cadvisor/client	0.118s
I1021 02:48:31.071] ?   	github.com/google/cadvisor/client/clientexample	[no test files]
I1021 02:48:31.071] ok  	github.com/google/cadvisor/client/v2	0.078s
I1021 02:48:32.097] ok  	github.com/google/cadvisor/collector	0.122s
I1021 02:48:32.097] ok  	github.com/google/cadvisor/container	0.042s
I1021 02:48:32.098] ok  	github.com/google/cadvisor/container/common	0.060s
W1021 02:48:37.134] I1021 02:48:37.134091    6753 ssh.go:113] Running the command ssh, with args: [-o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR -i /workspace/.ssh/google_compute_engine prow@35.226.46.97 -- sudo ls /var/lib/cloud/instance/boot-finished]
W1021 02:48:38.055] I1021 02:48:38.055035    6753 ssh.go:113] Running the command ssh, with args: [-o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR -i /workspace/.ssh/google_compute_engine prow@34.132.87.66 -- sudo ls /var/lib/cloud/instance/boot-finished]
I1021 02:48:39.076] ok  	github.com/google/cadvisor/container/containerd	0.117s
I1021 02:48:39.076] ?   	github.com/google/cadvisor/container/containerd/install	[no test files]
I1021 02:48:39.076] ok  	github.com/google/cadvisor/container/crio	0.110s
I1021 02:48:39.077] ?   	github.com/google/cadvisor/container/crio/install	[no test files]
I1021 02:48:40.491] ok  	github.com/google/cadvisor/container/docker	0.263s
I1021 02:48:40.492] ?   	github.com/google/cadvisor/container/docker/install	[no test files]
I1021 02:48:40.493] ?   	github.com/google/cadvisor/container/docker/utils	[no test files]
I1021 02:48:40.493] ok  	github.com/google/cadvisor/container/libcontainer	0.169s
I1021 02:48:40.493] ok  	github.com/google/cadvisor/container/raw	0.097s
I1021 02:48:40.493] ?   	github.com/google/cadvisor/container/systemd	[no test files]
I1021 02:48:40.493] ?   	github.com/google/cadvisor/container/systemd/install	[no test files]
I1021 02:48:40.494] ?   	github.com/google/cadvisor/container/testing	[no test files]
I1021 02:48:40.494] ok  	github.com/google/cadvisor/devicemapper	0.064s
I1021 02:48:40.494] ?   	github.com/google/cadvisor/devicemapper/fake	[no test files]
I1021 02:48:40.494] ok  	github.com/google/cadvisor/events	0.072s
I1021 02:48:40.494] ok  	github.com/google/cadvisor/fs	0.582s
I1021 02:48:40.494] ok  	github.com/google/cadvisor/info/v1	0.082s
I1021 02:48:40.495] ?   	github.com/google/cadvisor/info/v1/test	[no test files]
I1021 02:48:40.495] ok  	github.com/google/cadvisor/info/v2	0.112s
I1021 02:48:40.495] ?   	github.com/google/cadvisor/integration/framework	[no test files]
I1021 02:48:40.495] ?   	github.com/google/cadvisor/integration/runner	[no test files]
I1021 02:48:40.495] ok  	github.com/google/cadvisor/integration/tests/api	0.132s
I1021 02:48:40.495] ok  	github.com/google/cadvisor/integration/tests/healthz	0.121s
I1021 02:48:40.496] ok  	github.com/google/cadvisor/machine	0.228s
I1021 02:48:45.659] ok  	github.com/google/cadvisor/manager	0.488s
I1021 02:48:47.195] ok  	github.com/google/cadvisor/metrics	0.359s
I1021 02:48:47.195] ?   	github.com/google/cadvisor/nvm	[no test files]
I1021 02:48:47.195] ok  	github.com/google/cadvisor/perf	0.131s
I1021 02:48:47.196] ok  	github.com/google/cadvisor/resctrl	0.417s
I1021 02:48:47.196] ?   	github.com/google/cadvisor/stats	[no test files]
I1021 02:48:47.196] ?   	github.com/google/cadvisor/storage	[no test files]
I1021 02:48:47.197] ok  	github.com/google/cadvisor/summary	0.071s
I1021 02:48:47.197] ok  	github.com/google/cadvisor/utils	0.169s
I1021 02:48:47.198] ?   	github.com/google/cadvisor/utils/cloudinfo	[no test files]
I1021 02:48:47.198] ?   	github.com/google/cadvisor/utils/cloudinfo/aws	[no test files]
I1021 02:48:47.199] ?   	github.com/google/cadvisor/utils/cloudinfo/azure	[no test files]
I1021 02:48:47.199] ?   	github.com/google/cadvisor/utils/cloudinfo/gce	[no test files]
I1021 02:48:47.200] ?   	github.com/google/cadvisor/utils/container	[no test files]
I1021 02:48:47.200] ?   	github.com/google/cadvisor/utils/cpuload	[no test files]
I1021 02:48:47.200] ?   	github.com/google/cadvisor/utils/cpuload/netlink	[no test files]
I1021 02:48:47.201] ?   	github.com/google/cadvisor/utils/cpuload/netlink/example	[no test files]
I1021 02:48:47.201] ok  	github.com/google/cadvisor/utils/oomparser	0.203s
I1021 02:48:47.202] ?   	github.com/google/cadvisor/utils/oomparser/oomexample	[no test files]
I1021 02:48:47.202] ok  	github.com/google/cadvisor/utils/sysfs	0.353s
I1021 02:48:47.202] ?   	github.com/google/cadvisor/utils/sysfs/fakesysfs	[no test files]
I1021 02:48:47.203] ok  	github.com/google/cadvisor/utils/sysinfo	0.117s
I1021 02:48:47.203] ok  	github.com/google/cadvisor/validate	0.140s
I1021 02:48:47.204] ?   	github.com/google/cadvisor/version	[no test files]
I1021 02:48:47.204] ?   	github.com/google/cadvisor/watcher	[no test files]
I1021 02:48:47.205] ?   	github.com/google/cadvisor/zfs	[no test files]
W1021 02:48:52.828] # github.com/google/cadvisor/cmd/internal/container/mesos [github.com/google/cadvisor/cmd/internal/container/mesos.test]
W1021 02:48:52.829] internal/container/mesos/factory_test.go:23:2: imported and not used: "github.com/google/cadvisor/container/libcontainer" as containerlibcontainer
I1021 02:48:56.155] ok  	github.com/google/cadvisor/cmd	0.522s
I1021 02:48:56.156] ok  	github.com/google/cadvisor/cmd/internal/api	0.101s
I1021 02:48:56.156] ?   	github.com/google/cadvisor/cmd/internal/container/install	[no test files]
I1021 02:48:56.157] FAIL	github.com/google/cadvisor/cmd/internal/container/mesos [build failed]
I1021 02:48:56.157] ?   	github.com/google/cadvisor/cmd/internal/container/mesos/install	[no test files]
I1021 02:48:56.158] ?   	github.com/google/cadvisor/cmd/internal/healthz	[no test files]
I1021 02:48:56.158] ?   	github.com/google/cadvisor/cmd/internal/http	[no test files]
I1021 02:48:56.159] ?   	github.com/google/cadvisor/cmd/internal/http/mux	[no test files]
I1021 02:48:56.159] ?   	github.com/google/cadvisor/cmd/internal/pages	[no test files]
I1021 02:48:56.160] ?   	github.com/google/cadvisor/cmd/internal/pages/static	[no test files]
I1021 02:48:56.160] ?   	github.com/google/cadvisor/cmd/internal/storage/bigquery	[no test files]
I1021 02:48:56.161] ?   	github.com/google/cadvisor/cmd/internal/storage/bigquery/client	[no test files]
I1021 02:48:56.161] ?   	github.com/google/cadvisor/cmd/internal/storage/bigquery/client/example	[no test files]
I1021 02:48:56.162] ?   	github.com/google/cadvisor/cmd/internal/storage/elasticsearch	[no test files]
I1021 02:48:56.162] ?   	github.com/google/cadvisor/cmd/internal/storage/influxdb	[no test files]
I1021 02:48:56.163] ?   	github.com/google/cadvisor/cmd/internal/storage/kafka	[no test files]
I1021 02:48:56.163] ?   	github.com/google/cadvisor/cmd/internal/storage/redis	[no test files]
I1021 02:48:56.164] ?   	github.com/google/cadvisor/cmd/internal/storage/statsd	[no test files]
I1021 02:48:56.164] ?   	github.com/google/cadvisor/cmd/internal/storage/statsd/client	[no test files]
I1021 02:48:56.165] ?   	github.com/google/cadvisor/cmd/internal/storage/stdout	[no test files]
I1021 02:48:56.165] ?   	github.com/google/cadvisor/cmd/internal/storage/test	[no test files]
I1021 02:48:56.166] FAIL
W1021 02:48:56.267] make: *** [Makefile:33: test] Error 2

The code to initialize perf collector is almost the same between cgroup
v1 and v2, the only difference is how we get cgroup path. Since now
GetCgroupPath is v2 ready, we can use it, and thus unify these two
cases.

This commit is best reviewed with --ignore-space-change.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Contributor Author

Looks like some tests failed in prow run?

Yup, lost a trivial hunk due to rebase, fixed now.

GHA CI is not catching it for a reason fixed in #2969.

Apparently, make docker-test also skips the tests under ./cmd -- looking into it.

Copy link
Collaborator

@Creatone Creatone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Creatone Creatone merged commit 3beb265 into google:master Oct 21, 2021
@kolyshkin
Copy link
Contributor Author

Apparently, make docker-test also skips the tests under ./cmd -- looking into it.

See #2972

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants