Write rates of endpoints still high on 3.6.1 #16583

Closed
smarterclayton opened this issue Sep 27, 2017 · 17 comments

Comments

@smarterclayton
Contributor

smarterclayton commented Sep 27, 2017

API server write rates from 3.6.1 on a heavily loaded cluster.

{resource="endpoints",verb="PUT"} | 35.0125
{resource="pods",verb="PUT"} | 19.45416666666667
{resource="pods",verb="DELETE"} | 16.987499999999997
{resource="events",verb="POST"} | 16.841666666666665
{resource="pods",verb="POST"} | 16.104166666666664
{resource="nodes",verb="PATCH"} | 15.579166666666667
{resource="localsubjectaccessreviews",verb="POST"} | 8.433333333333334
{resource="builds",verb="PATCH"} | 8.108333333333333
{resource="daemonsets",verb="PUT"} | 6.304166666666666
{resource="replicasets",verb="POST"} | 4.279166666666667
{resource="subjectaccessreviews",verb="POST"} | 1.7291666666666667
{resource="tokenreviews",verb="POST"} | 0.9333333333333333
{resource="replicationcontrollers",verb="PUT"} | 0.4666666666666667
{resource="deploymentconfigs",verb="PUT"} | 0.3541666666666667
{resource="deploymentconfigs",verb="POST"} | 0.2583333333333333

@derekwaynecarr @openshift/sig-platform-management
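To put the figures above in proportion, here is a minimal sketch that parses the listed rates and totals them per resource. The values are copied from the comment; the unit (presumably requests per second) is an assumption, since the comment does not state it, and the helper names `parse` and `totals_by_resource` are only illustrative.

```python
import re
from collections import defaultdict

# Rates copied verbatim from the comment above. The unit is assumed to be
# requests per second; the original comment does not say.
RAW = """\
{resource="endpoints",verb="PUT"} | 35.0125
{resource="pods",verb="PUT"} | 19.45416666666667
{resource="pods",verb="DELETE"} | 16.987499999999997
{resource="events",verb="POST"} | 16.841666666666665
{resource="pods",verb="POST"} | 16.104166666666664
{resource="nodes",verb="PATCH"} | 15.579166666666667
{resource="localsubjectaccessreviews",verb="POST"} | 8.433333333333334
{resource="builds",verb="PATCH"} | 8.108333333333333
{resource="daemonsets",verb="PUT"} | 6.304166666666666
{resource="replicasets",verb="POST"} | 4.279166666666667
{resource="subjectaccessreviews",verb="POST"} | 1.7291666666666667
{resource="tokenreviews",verb="POST"} | 0.9333333333333333
{resource="replicationcontrollers",verb="PUT"} | 0.4666666666666667
{resource="deploymentconfigs",verb="PUT"} | 0.3541666666666667
{resource="deploymentconfigs",verb="POST"} | 0.2583333333333333
"""

LINE = re.compile(r'\{resource="([^"]+)",verb="([^"]+)"\}\s*\|\s*([0-9.]+)')


def parse(raw):
    """Turn each '{resource=..,verb=..} | rate' line into (resource, verb, rate)."""
    rows = []
    for line in raw.splitlines():
        match = LINE.fullmatch(line.strip())
        if match:
            rows.append((match.group(1), match.group(2), float(match.group(3))))
    return rows


def totals_by_resource(rows):
    """Sum the rates of all verbs for each resource."""
    totals = defaultdict(float)
    for resource, _verb, rate in rows:
        totals[resource] += rate
    return dict(totals)


rows = parse(RAW)
total = sum(rate for _resource, _verb, rate in rows)
by_resource = totals_by_resource(rows)
endpoints_share = by_resource["endpoints"] / total

if __name__ == "__main__":
    for resource, rate in sorted(by_resource.items(), key=lambda kv: -kv[1]):
        print(f"{resource:28s} {rate:8.3f}")
    print(f"endpoints share of listed writes: {endpoints_share:.1%}")
```

On these numbers, endpoints PUT is the largest single (resource, verb) pair at roughly 23% of the listed writes, while pods account for more writes overall once PUT, DELETE and POST are combined.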

@liggitt
Contributor

liggitt commented Sep 27, 2017

do we have large numbers of endpoints with no subsets? if so, kubernetes/kubernetes#50583 might be relevant

@smarterclayton
Contributor Author

Hrm, doubt it but maybe.

@smarterclayton
Contributor Author

I don't like seeing 16 pod deletes a second. That's probably bad.

@smarterclayton
Contributor Author

@openshift/sig-master

@smarterclayton
Contributor Author

smarterclayton commented Sep 27, 2017

Here's write volume to etcd before and after the 3.6.1 upgrade.

[Image: before_after_volume]

@smarterclayton
Contributor Author

The jump was at 2:15 GMT on Sept 26th. It may be unrelated to the update?

@liggitt
Contributor

liggitt commented Oct 12, 2017

what is the server version?

@smarterclayton
Contributor Author

this was 3.6.1 official

@liggitt
Contributor

liggitt commented Oct 12, 2017

the endpoint update fix merged to ose on 2017-09-15 08:22, and is only in these tags:

v3.6.173.0.35-1
v3.6.173.0.36-1
v3.6.173.0.37-1
v3.6.173.0.38-1
v3.6.173.0.39-1
v3.6.173.0.40-1
v3.6.173.0.41-1
v3.6.173.0.42-1
v3.6.173.0.43-1
v3.6.173.0.44-1
v3.6.173.0.45-1
v3.6.173.0.46-1
v3.6.173.0.47-1
v3.6.173.0.48-1
v3.6.173.0.49-1

@liggitt
Contributor

liggitt commented Oct 12, 2017

what sha or tag is 3.6.1 official?

@liggitt
Contributor

liggitt commented Oct 12, 2017

looks like 3.6.1 is 3.6.173.0.21, which does not include the endpoints fixes yet

@smarterclayton
Contributor Author

smarterclayton commented Oct 12, 2017 via email

@smarterclayton
Contributor Author

The pod churn came from one single daemonset: https://bugzilla.redhat.com/show_bug.cgi?id=1501514. Moving that to the bug.

@smarterclayton
Contributor Author

With the daemonset fixed, we're still doing 30 writes/s from endpoints, and pod traffic is <1 write/s.

@liggitt
Contributor

liggitt commented Oct 16, 2017

what environment and version are you seeing this on?

@liggitt
Contributor

liggitt commented Oct 16, 2017

need at least 3.6.173.0.48 for the controller and resourceVersion no-op fixes @joelsmith picked
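The version thresholds mentioned in this thread can be checked mechanically. The sketch below assumes that the build tags compare numerically component by component and that the trailing `-1` is a release suffix that can be ignored; `parse_version` and `has_fix` are hypothetical helper names, not part of any OpenShift tooling.

```python
import re

# Accepts tags such as "v3.6.173.0.48-1" or bare versions such as "3.6.173.0.21".
# Assumption: the "-N" suffix is a release number and does not affect ordering.
TAG = re.compile(r"^v?(\d+(?:\.\d+)*)(?:-\d+)?$")


def parse_version(tag):
    """Convert a tag into a tuple of integers suitable for comparison."""
    match = TAG.match(tag.strip())
    if not match:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    return tuple(int(part) for part in match.group(1).split("."))


# Thresholds taken from the comments in this thread.
FIRST_ENDPOINT_FIX = parse_version("v3.6.173.0.35-1")  # earliest tag listed with the endpoint update fix
MIN_NOOP_FIXES = parse_version("3.6.173.0.48")          # controller and resourceVersion no-op fixes


def has_fix(tag, minimum):
    """Return True if `tag` is at or above the `minimum` version tuple."""
    return parse_version(tag) >= minimum


if __name__ == "__main__":
    for tag in ("3.6.173.0.21", "v3.6.173.0.35-1", "v3.6.173.0.48-1"):
        print(tag,
              "endpoint fix:", has_fix(tag, FIRST_ENDPOINT_FIX),
              "no-op fixes:", has_fix(tag, MIN_NOOP_FIXES))
```

Note that the comparison is only meaningful for full build versions within the 3.6 stream; a marketing version such as 3.6.1 has to be mapped to its build (3.6.173.0.21 per the comment above) before checking.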

@liggitt
Contributor

liggitt commented Oct 16, 2017

if we're seeing high levels of endpoint write API calls but low volumes of updates when watching endpoints, then the writes are likely no-op writes, and kubernetes/kubernetes#50583 is a candidate fix
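A rough illustration of how no-op writes can arise and how a semantic comparison avoids them. This is not the upstream Go change; it is a Python sketch that assumes the mismatch is between an empty and an absent `subsets` list (as the earlier question about endpoints with no subsets suggests) and that the server drops empty fields when storing. `CountingStore`, `normalize`, and `sync` are hypothetical names.

```python
def normalize(value):
    """Collapse None, empty dicts, and empty lists to one canonical form (None)."""
    if value is None:
        return None
    if isinstance(value, dict):
        cleaned = {key: normalize(item) for key, item in value.items()}
        cleaned = {key: item for key, item in cleaned.items() if item is not None}
        return cleaned or None
    if isinstance(value, (list, tuple)):
        items = [normalize(item) for item in value]
        return items or None
    return value


def strictly_equal(a, b):
    """Plain structural equality: an empty list differs from a missing field."""
    return a == b


def semantically_equal(a, b):
    """Equality after treating empty and absent values as the same thing."""
    return normalize(a) == normalize(b)


class CountingStore:
    """Hypothetical stand-in for the API server that counts the PUTs it receives.

    Assumption: empty fields are dropped on storage, so what is read back
    never contains an empty `subsets` list.
    """

    def __init__(self, obj):
        self.obj = normalize(obj) or {}
        self.writes = 0

    def put(self, obj):
        self.obj = normalize(obj) or {}
        self.writes += 1


def sync(store, desired, equal):
    """Write `desired` only if it differs from the stored object under `equal`."""
    if equal(store.obj, desired):
        return False
    store.put(desired)
    return True


def run(equal, rounds=5):
    """Resync the same endpoints object `rounds` times and return the write count."""
    store = CountingStore({"metadata": {"name": "svc"}})
    desired = {"metadata": {"name": "svc"}, "subsets": []}
    for _ in range(rounds):
        sync(store, desired, equal)
    return store.writes


if __name__ == "__main__":
    print("strict comparison writes:  ", run(strictly_equal))
    print("semantic comparison writes:", run(semantically_equal))
```

Under these assumptions the strict comparison issues a write on every resync even though nothing a watcher would notice has changed, which matches the symptom described above: many write calls, few observed updates.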

openshift-merge-robot added a commit that referenced this issue Oct 17, 2017
Automatic merge from submit-queue.

UPSTREAM: 50583: Make endpoints controller update based on semantic equality

Pick of #16889 for 3.6
Fixes #16583
openshift-merge-robot added a commit that referenced this issue Oct 17, 2017
Automatic merge from submit-queue.

UPSTREAM: 50583: Make endpoints controller update based on semantic equality

Fixes #16583
This issue was closed.