cAdvisor metrics are missing metadata information #867

Closed
etungsten opened this issue Mar 23, 2020 · 6 comments
Labels
status/needs-triage Pending triage or re-evaluation · type/bug Something isn't working
etungsten (Contributor) commented Mar 23, 2020

Image I'm using:
bottlerocket-aws-k8s-1.15-x86_64-0.3.1-8a0c0b3 with a v1.15 EKS control plane

What I expected to happen:
cAdvisor metrics to have associated metadata available (e.g. container_name, pod, namespace).
For example, container_fs_usage_bytes for an AL2-based EKS node:

container_fs_usage_bytes{container="kube-proxy",container_name="kube-proxy",device="/dev/nvme0n1p1",id="/kubepods/burstable/pod69bed0f7-1b10-4da6-bf06-2193e6e6f2aa/955dc369c1eec8f5f00e74198f76d0b70983ed3649b9055aac5dd4e9ed9c2c66",image="602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy@sha256:d3a6122f63202665aa50f3c08644ef504dbe56c76a1e0ab05f8e296328f3a6b4",name="k8s_kube-proxy_kube-proxy-p978c_kube-system_69bed0f7-1b10-4da6-bf06-2193e6e6f2aa_0",namespace="kube-system",pod="kube-proxy-p978c",pod_name="kube-proxy-p978c"} 12288 1584990960301

What actually happened:
cAdvisor metrics are all missing metadata information on a Bottlerocket node.

container_fs_usage_bytes{container="",container_name="",device="/dev/nvme0n1p10",id="/",image="",name="",namespace="",pod="",pod_name=""} 438272 1584990076136
...
container_cpu_usage_seconds_total{container="",container_name="",cpu="total",id="/",image="",name="",namespace="",pod="",pod_name=""} 4199.250219446 1584990076136

How to reproduce the problem:

  1. Launch Bottlerocket nodes in your cluster.
  2. Start an HTTP proxy to the Kubernetes API server and access the metrics under http://localhost:8001/api/v1/nodes/$NODE_NAME/proxy/metrics/cadvisor
    Substitute $NODE_NAME with the name of a Bottlerocket node (see the sketch below).
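A minimal sketch of these two steps using kubectl, assuming kubectl is already configured against the cluster; the grep filter on the node's OS image is only an illustrative way to find a Bottlerocket node:

# Start a local HTTP proxy to the Kubernetes API server (listens on localhost:8001 by default).
kubectl proxy &

# Pick a Bottlerocket node by matching its OS image column (illustrative assumption).
NODE_NAME=$(kubectl get nodes -o wide | grep -i bottlerocket | awk '{print $1}' | head -n 1)

# Fetch the node's cAdvisor metrics through the API server proxy and inspect one of the affected metrics.
curl -s "http://localhost:8001/api/v1/nodes/$NODE_NAME/proxy/metrics/cadvisor" | grep container_fs_usage_bytes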
etungsten (Contributor, Author) commented
With #868, the majority of the metrics have regained metadata information. However, a lot of filesystem- and IO-related metrics are still missing metadata (e.g. container_fs_limit_bytes, container_fs_io_time_seconds_total).

@bcressey bcressey added this to the v0.3.2 milestone Mar 24, 2020
@etungsten etungsten modified the milestones: v0.3.2, v0.4.0 Apr 7, 2020
@iliana iliana removed this from the v0.4.0 milestone Jun 22, 2020
@gregdek gregdek added this to the techdebt milestone Apr 1, 2021
@bcressey bcressey self-assigned this Apr 8, 2022
baasumo commented Apr 14, 2022

Just to expand on the above comment, our team noticed that all of these metrics:

"container_fs_inodes_free"
"container_fs_inodes_total"
"container_fs_io_current"
"container_fs_io_time_seconds_total"
"container_fs_io_time_weighted_seconds_total"
"container_fs_limit_bytes"
"container_fs_read_seconds_total"
"container_fs_reads_merged_total"
"container_fs_sector_reads_total"
"container_fs_sector_writes_total"
"container_fs_usage_bytes"
"container_fs_write_seconds_total"
"container_fs_writes_merged_total"

are lacking metadata, and some of them are used for autoscaling and other behavior in some of our workloads.

zmrow (Contributor) commented Apr 14, 2022

@baasumo Thanks for following up on this!

bcressey (Contributor) commented
Support for these metrics is still not in cAdvisor upstream, per google/cadvisor#2785 (comment).

ghost commented May 23, 2022

Is there a way we can get labels and annotations from the pod running these containers?

@stmcginnis stmcginnis added status/needs-triage Pending triage or re-evaluation and removed priority/p2 labels Dec 1, 2022
stmcginnis (Contributor) commented
It appears we were able to improve this slightly with the referenced issue above, but from what I can tell the full fix for this is actually something outside of Bottlerocket. I'm going to close this since we don't have anything concrete identified in Bottlerocket, but feel free to reopen if there is anything else that can be done from this end.
