
Bucket drain events using semaphore #49

Merged: 5 commits into keikoproj:master on Mar 9, 2020

Conversation

@eytan-avisror (Collaborator) commented Mar 5, 2020:

Fixes #48

The limit is currently set to 32 concurrent terminations; once a drain completes, its semaphore slot is released.

Example:
80 instances are terminated > 32 begin draining while the others are blocked (a heartbeat is sent for them) > as each drain completes, its semaphore slot is released > the next batch starts draining.

This causes a delay in handling lifecycle events of up to (TerminatingInstances / 32 * MaxDrainTime); see the sketch below.
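
For illustration, here is a minimal, self-contained sketch of the gating described above using golang.org/x/sync/semaphore; drainNode, the simulated sleep, and the node names are placeholders, not the actual lifecycle-manager code.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"sync"
	"time"

	"golang.org/x/sync/semaphore"
)

const maxDrainConcurrency = 32 // default proposed in this PR

// drainNode stands in for the real "kubectl drain" call; it just sleeps.
func drainNode(node string) {
	time.Sleep(2 * time.Second)
	log.Printf("drained %s", node)
}

func main() {
	sem := semaphore.NewWeighted(maxDrainConcurrency)
	ctx := context.Background()

	// Simulate 80 terminating instances, as in the example above.
	nodes := make([]string, 80)
	for i := range nodes {
		nodes[i] = fmt.Sprintf("node-%d", i)
	}

	var wg sync.WaitGroup
	for _, n := range nodes {
		// Blocks once 32 drains are in flight; in lifecycle-manager the
		// lifecycle-hook heartbeat keeps blocked events alive meanwhile.
		if err := sem.Acquire(ctx, 1); err != nil {
			log.Fatal(err)
		}
		wg.Add(1)
		go func(node string) {
			defer wg.Done()
			defer sem.Release(1) // free a slot so the next event can start draining
			drainNode(node)
		}(n)
	}
	wg.Wait()
}
```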

TBD:

  • Make the value controllable by a flag, with a default of 32

Testing:

  • Terminate 100 instances running pods with a preStop hook that sleeps 90 seconds; evaluate delays
  • Verify that memory utilization with the default value does not exceed recommended values

@eytan-avisror requested a review from a team as a code owner on March 5, 2020 18:50
@codecov bot commented Mar 5, 2020:

Codecov Report

Merging #49 into master will decrease coverage by 0.01%.
The diff coverage is 71.42%.


@@            Coverage Diff             @@
##           master      #49      +/-   ##
==========================================
- Coverage   73.06%   73.05%   -0.02%     
==========================================
  Files          12       12              
  Lines         995     1002       +7     
==========================================
+ Hits          727      732       +5     
- Misses        206      207       +1     
- Partials       62       63       +1     


@eytan-avisror (Collaborator, Author) commented Mar 6, 2020:

Test

  • Run 100 nodes with a deployment of 100 pod replicas, each with a pre-terminate step of sleep 170s
  • Terminate 97 nodes

Results

Batching:
[Screenshot: batching graph, Screen Shot 2020-03-06 at 12 10 03 PM]

The gradually increasing delay in processing time (blue) is clearly visible; the gaps are the 170s of sleep time, as expected.

Memory consumption:
[Screenshot: memory consumption graph, Screen Shot 2020-03-06 at 12 08 34 PM]

Memory consumption seems to hover around 500 MB per batch of 32 concurrent kubectl drain processes (the default value).

I have also bumped up the examples to request 2048Mi.

Edge cases to consider

In scenarios where the max-drain-timeout is set to a large value, drains are taking a very long time (e.g. due to PDBs or scale-up), and a large number of nodes is terminating, it might take a long time to start deregistration for the last batches. However, this scenario is highly unlikely, since it would mean all 32 kubectl threads are being delayed; it can also be mitigated by setting a lower timeout or allowing more parallel drains (and more memory).
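
For a rough sense of scale, applying the (TerminatingInstances / 32 * MaxDrainTime) estimate above to this test: 97 terminating nodes at the default concurrency of 32 form 4 batches, so with ~170s per drain the last batch starts roughly 3 x 170s ≈ 510s after the first. A larger drain timeout or node count grows that tail proportionally, which is what the concurrency flag is there to tune.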

@kevdowney (Contributor) left a comment:

LG

@@ -101,6 +104,7 @@ func init() {
serveCmd.Flags().StringVar(&queueName, "queue-name", "", "the name of the SQS queue to consume lifecycle hooks from")
serveCmd.Flags().StringVar(&kubectlLocalPath, "kubectl-path", "/usr/local/bin/kubectl", "the path to kubectl binary")
serveCmd.Flags().StringVar(&logLevel, "log-level", "info", "the logging level (info, warning, debug)")
serveCmd.Flags().Int64Var(&maxDrainConcurrency, "max-drain-concurrency", 32, "maximum number of node drains to process in parallel")
Contributor commented:

You might just want the unsigned int (uint) instead; then you just convert to int64 for the semaphore: semaphore.NewWeighted(int64(maxDrainConcurrency)).

@eytan-avisror (Collaborator, Author) commented Mar 6, 2020:

What is the difference with using int and converting? uint simply doesn't accept negative numbers; it would also be possible to use int and convert to int64, no?

@eytan-avisror (Collaborator, Author) commented:

I guess my point is: if we know we only use this variable as an int64, why do we need an extra conversion?
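
For reference, here is a minimal standalone sketch of the two options discussed here, using spf13/pflag directly rather than the cobra command from the diff above; the flag name matches the PR, everything else is illustrative.

```go
package main

import (
	"fmt"

	flag "github.com/spf13/pflag"
	"golang.org/x/sync/semaphore"
)

func main() {
	// Merged approach: register the flag as int64 so no conversion is needed
	// when constructing the semaphore.
	var maxDrainConcurrency int64
	flag.Int64Var(&maxDrainConcurrency, "max-drain-concurrency", 32,
		"maximum number of node drains to process in parallel")

	// Suggested alternative: a uint rejects negative values at parse time,
	// at the cost of one conversion at the call site:
	//   var maxDrainConcurrency uint
	//   flag.UintVar(&maxDrainConcurrency, "max-drain-concurrency", 32, "...")
	//   sem := semaphore.NewWeighted(int64(maxDrainConcurrency))

	flag.Parse()
	sem := semaphore.NewWeighted(maxDrainConcurrency)
	fmt.Printf("created %T with capacity %d\n", sem, maxDrainConcurrency)
}
```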

@eytan-avisror merged commit eabed53 into keikoproj:master on Mar 9, 2020
Development

Successfully merging this pull request may close these issues:

Massive downscales can have scalability issues with kubectl (#48)

3 participants