Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure kubernetes caches don't expire if they are being read #10946

Merged
merged 6 commits into from
Mar 4, 2019

Conversation

jsoriano
Copy link
Member

@jsoriano jsoriano commented Feb 26, 2019

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix #10658

@jsoriano jsoriano added module review Metricbeat Metricbeat containers Related to containers use case Team:Integrations Label for the Integrations team labels Feb 26, 2019
@jsoriano jsoriano self-assigned this Feb 26, 2019
@jsoriano jsoriano force-pushed the kubernetes-metrics-cache-cleanup branch 2 times, most recently from 1902871 to c36d589 Compare February 26, 2019 17:44
@jsoriano jsoriano force-pushed the kubernetes-metrics-cache-cleanup branch from c36d589 to 1f26a33 Compare February 26, 2019 17:53
Copy link
Contributor

@sayden sayden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's a WIP. Just left some comments if they help.

metricbeat/module/kubernetes/util/metrics_cache.go Outdated Show resolved Hide resolved
metricbeat/module/kubernetes/util/metrics_cache.go Outdated Show resolved Hide resolved
@jsoriano jsoriano marked this pull request as ready for review February 27, 2019 15:17
@jsoriano jsoriano requested a review from a team as a code owner February 27, 2019 15:17
@jsoriano jsoriano added needs_backport PR is waiting to be backported to other branches. and removed backport labels Feb 27, 2019
@jsoriano
Copy link
Member Author

jsoriano commented Feb 27, 2019

I have been testing this with master and 6.6.1 and it seems to work.
I have also fixed some issues on tests with the initial version to avoid flakyness and to pass the race detector.

As caches are now, #10658 can still happen if period of container metricset is greater than 120 seconds, fixing this would require to further refactor the global cache. I leave this for future improvements (#10965) as I don't think many people will have these periods. A quick "fix" would be to increase the default timeout.

@andrewkroh
Copy link
Member

andrewkroh commented Feb 28, 2019

It only quickly peaked at the cache, but it looks similar to libbeat/common/cache.go (godoc) which expires elements based on their last access time, has a goroutine to periodically do cleanup, and allows injecting a clock for testing (but injecting the clock isn't exported so you'd need to expose this).

@jsoriano
Copy link
Member Author

@andrewkroh oh, this actually looks like what we'd need here, I will give a try to this implementation, thanks!

@jsoriano jsoriano force-pushed the kubernetes-metrics-cache-cleanup branch from 8481def to 304bec5 Compare February 28, 2019 19:07

// Check it expired
assert.Equal(t, 0.0, test.Get("foo"))
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test removed because libbeat cache implementation is used now and it has its own tests for this.

@jsoriano jsoriano merged commit 106df3d into elastic:master Mar 4, 2019
@jsoriano jsoriano deleted the kubernetes-metrics-cache-cleanup branch March 4, 2019 12:48
jsoriano added a commit to jsoriano/beats that referenced this pull request Mar 4, 2019
…#10946)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix elastic#10658

(cherry picked from commit 106df3d)
@jsoriano jsoriano removed the needs_backport PR is waiting to be backported to other branches. label Mar 4, 2019
jsoriano added a commit to jsoriano/beats that referenced this pull request Mar 4, 2019
…#10946)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix elastic#10658

(cherry picked from commit 106df3d)
jsoriano added a commit to jsoriano/beats that referenced this pull request Mar 4, 2019
…#10946)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix elastic#10658

(cherry picked from commit 106df3d)
jsoriano added a commit to jsoriano/beats that referenced this pull request Mar 4, 2019
…#10946)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix elastic#10658

(cherry picked from commit 106df3d)
jsoriano added a commit that referenced this pull request Mar 5, 2019
…#11057)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix #10658

(cherry picked from commit 106df3d)
jsoriano added a commit that referenced this pull request Mar 5, 2019
…#11058)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix #10658

(cherry picked from commit 106df3d)
jsoriano added a commit that referenced this pull request Mar 5, 2019
…#11059)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix #10658

(cherry picked from commit 106df3d)
jsoriano added a commit that referenced this pull request Mar 14, 2019
…#11060)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix #10658

(cherry picked from commit 106df3d)
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…#10946) (elastic#11059)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix elastic#10658

(cherry picked from commit db4b4c2)
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…#10946) (elastic#11060)

Some metrics in metricbeat kubernetes module are cached during a time,
if they are not updated they are removed. But it is usual to have pods or
containers that are not updated during more time that the expiration cache.
Current implementation was not renovating expiration times for cache
entries so all were eventually removed if updates for them are not received.
Replace it with the cache implementation available in libbeat, but keeping
the existing interface.

Also, use slashes instead of dashes to generate unique container uids.
Dashes can be used by kubernetes names, what could lead to ambiguous
keys for the caches.

Fix elastic#10658

(cherry picked from commit db4b4c2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug containers Related to containers use case Metricbeat Metricbeat module review Team:Integrations Label for the Integrations team v6.5.5 v6.6.2 v6.7.0 v7.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants