
PR Idea: Document a minimal cost deployment #717

Closed
consideRatio opened this issue Jun 11, 2018 · 9 comments
Labels
pr-idea Pull Request ideas should have concrete and seemingly viable changes.

Comments

@consideRatio
Member

consideRatio commented Jun 11, 2018

@UsDAnDreS and I have both attempted to minimize the amount of money spent by reducing the number of nodes and the CPU cores available on them during the initial cluster setup. It is not obvious how to do this, and I think it would be good to share some experience, as well as update the guide to be less resource-greedy.

I aim to look into how the z2jh guide can allow for a cheaper default setup and will elaborate more in due time. My current setup has a single one-CPU node (n1-standard-1), and I have configured Google Cloud to scale up more nodes when required by the singleuser-server pods.
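
For reference, a setup along those lines can be created roughly like this (cluster name, zone, and the autoscaling bounds are placeholders, not my exact values):

```bash
# Rough sketch only; cluster name, zone, and the autoscaling bounds are placeholders.
# Start with a single n1-standard-1 node and let GKE add nodes when the
# singleuser-server pods need more capacity.
gcloud container clusters create jupyterhub-cluster \
  --zone europe-west1-b \
  --machine-type n1-standard-1 \
  --num-nodes 1 \
  --enable-autoscaling --min-nodes 1 --max-nodes 4
```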

@betatim
Member

betatim commented Jun 12, 2018

What is your experience with the user experience when you need to scale up? For mybinder.org it takes quite a while to spawn a new node, which makes me think you don't want to scale "too often". However, for clusters that are empty a lot (teaching, for example), it is attractive to have one very small node with the hub on it and then scale up as soon as someone logs in.

@UsDAnDreS

UsDAnDreS commented Jun 12, 2018

@betatim Yes, I need it for weekly workshops (with under 100 potential users), hence a weekly scale-up dynamic with emptiness in between. I'm still at the stage of figuring out how to configure the environment on the server, and will start thinking about scaling right after that, but any advice on optimization and efficient use is much appreciated (as dollars are ticking away from that Google Cloud free trial, haha).

@choldgraf
Member

I'm +1 on language that makes scaling down/up easier for people in general (and maybe some "user stories" from different usage scenarios where scaling up/down would help)

@minrk
Member

minrk commented Jun 19, 2018

I'm supporting a workshop of 50 students right now, and a traditional JupyterHub deployment doesn't suffer nearly as much on a scale-up event as Binder for two reasons:

  1. there aren't as many concurrent spawns
  2. there's only one image to pull, not several, so multiple pending spawns on one new node don't impede each other much

I haven't had a single spawn failure and it scales up from 0 to 3 nodes each morning as students show up.

My setup is similar to @consideRatio's: a 4-CPU node with the hub, proxy, prometheus, etc., and a 0-N autoscaling pool of 8-CPU nodes for users (I started with 16-CPU nodes, but ran into the GKE hard limit of 16 users per node when using persistent volumes #732). That way it gets a pretty good chance of scaling all the way back to 0 at the end of the day without my help.
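
Roughly like this in gcloud terms (cluster/pool names, zone, and the autoscaling bounds are illustrative, not my exact commands):

```bash
# Illustrative only; names, zone, and autoscaling bounds are placeholders.
# Core pool: a single n1-standard-4 node for the hub, proxy, prometheus, etc.
gcloud container node-pools create core-pool \
  --cluster workshop-cluster --zone us-central1-a \
  --machine-type n1-standard-4 \
  --num-nodes 1

# User pool: 0-N autoscaling n1-standard-8 nodes, so it can drop back to 0 overnight.
gcloud container node-pools create user-pool \
  --cluster workshop-cluster --zone us-central1-a \
  --machine-type n1-standard-8 \
  --num-nodes 0 \
  --enable-autoscaling --min-nodes 0 --max-nodes 5
```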

@minrk
Member

minrk commented Jun 21, 2018

This is continuing to work well, and I'm very happy! Completely unattended scale-down to just the one node at the end of the day, then back up to five in the morning:

[screenshot from 2018-06-21 09:45: cluster node count over the day]

(and a couple of people, I'm assuming instructors preparing for the next day, showing up around midnight, but still fitting on the first node.)

@consideRatio
Member Author

@minrk If you have 5 nodes and 5 users, one on each node, then the cluster won't scale down, right?

I'm amazed that this works so well when the nodes are massive; you have these super-sized nodes, right?

@minrk
Member

minrk commented Jul 1, 2018

My user nodes aren't huge; they are n1-standard-8. The limiting factor for my nodes is Google Cloud's very low persistent-volume-per-node limit, so I can't have more than 16 users per node. As a result, I have ~12 users per node.

The scale-down works very well for me because these are all students in a class in the same timezone, so they all go home at the end of the day and it scales back to 0.

Another thing that helps me scale down is that while I assign all non-user pods to one pool, I don't actually assign users strictly to the user pool. This is perhaps not the most prudent, but it allows my first few users to run without allocating a node in the user pool. That means that when one or two users show up after everybody has stopped for the day, instead of scaling up a user node, they stay on the always-on node with the hub.
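
In chart-config terms this corresponds to a preferred rather than required node affinity for user pods; a minimal sketch using the scheduling option added in later z2jh versions (not necessarily what my deployment sets):

```bash
# Minimal sketch, not my exact config; assumes the matchNodePurpose option
# available in later z2jh versions. "prefer" lets the first user pods fall back
# to the always-on core node instead of forcing a user-pool scale-up.
cat > config.yaml <<EOF
scheduling:
  userPods:
    nodeAffinity:
      matchNodePurpose: prefer  # "require" would pin users strictly to the user pool
EOF
```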

Reliable scale-down when you have more constant activity (like Binder) is going to be harder until we get an actually reliable pod-packing scheduler running. As it is now, we have to fake packing by cordoning a node and hoping it drains before the next scale-up event.
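
The manual workaround looks roughly like this (the node name is a made-up example):

```bash
# "Fake packing" by hand: stop new user pods from landing on a node, then wait
# and hope its remaining pods stop before the next scale-up event.
kubectl cordon gke-jhub-user-pool-1a2b3c4d-xyz1          # node name is made up
kubectl get pods --all-namespaces -o wide | grep gke-jhub-user-pool-1a2b3c4d-xyz1
```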

@consideRatio
Member Author

consideRatio commented Jul 1, 2018

@minrk Ah, excellent! I'm very happy to have spent a lot of time this week implementing a lot of what, it seems, you will appreciate.

Zero-to-jupyterhub-k8s 0.7+ easy setup (a rough sketch follows at the end of this comment):

  • Two node pools, labeled "hub.jupyter.org/node-purpose": "core" and "user" respectively, with a NoSchedule taint on the user pool that the user pods tolerate. This prevents kube-dns or one of the core pods from happening to schedule there and causing issues, as they otherwise can if, for example, you run an upgrade and the core node lacks the resources to handle two simultaneous hubs / proxies etc., so they end up on a user node.
    • The "core" node-pool requires only a single 1-core node
    • The "user" node-pool: autoscaling, with 4- or 8-core machines perhaps, and perhaps preemptible nodes as well.
  • Evictable placeholder user pods (with low PodPriority) create a configurable amount of headroom
  • A continuous image puller pulls images whenever the placeholders trigger a scale-up
  • A custom scheduler packs the user pods tightly

Btw: Google has increased, or will increase, that PVC restriction to 128 or similar, as far as I recall.
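
Put together, a deployment along these lines could look roughly like the sketch below; the pool names, zone, machine types, chart version, and the replica / autoscaling numbers are placeholders rather than a verified recipe.

```bash
# Rough sketch of the setup described above; names, zone, chart version, and
# the numbers are placeholders, not a verified recipe.

# Node pools labeled by purpose, with a NoSchedule taint on the user pool that
# only the user pods tolerate.
gcloud container node-pools create core-pool \
  --cluster jhub-cluster --zone europe-west1-b \
  --machine-type n1-standard-1 --num-nodes 1 \
  --node-labels hub.jupyter.org/node-purpose=core

gcloud container node-pools create user-pool \
  --cluster jhub-cluster --zone europe-west1-b \
  --machine-type n1-standard-4 --preemptible \
  --num-nodes 0 --enable-autoscaling --min-nodes 0 --max-nodes 10 \
  --node-labels hub.jupyter.org/node-purpose=user \
  --node-taints hub.jupyter.org_dedicated=user:NoSchedule

# config.yaml enabling the 0.7+ features listed above: pod priority, placeholder
# pods for headroom, the continuous image puller, and the pod-packing user scheduler.
cat > config.yaml <<EOF
scheduling:
  podPriority:
    enabled: true
  userPlaceholder:
    enabled: true
    replicas: 2
  userScheduler:
    enabled: true
  userPods:
    nodeAffinity:
      matchNodePurpose: require
prePuller:
  continuous:
    enabled: true
EOF

helm upgrade --install jhub jupyterhub/jupyterhub \
  --version 0.7.0 --namespace jhub --values config.yaml
```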

@consideRatio changed the title from "Reduce cost of the default z2jh cluster setup" to "PR: Document a minimal cost deployment" on Sep 11, 2019
@consideRatio changed the title from "PR: Document a minimal cost deployment" to "PR Idea: Document a minimal cost deployment" on Sep 11, 2019
@consideRatio added the pr-idea label on Sep 11, 2019
@consideRatio
Member Author

Instead of documenting a minimalistic deployment, we are now documenting general optimizations for any deployment, and I think that is a better approach. Closing this issue!
