
Add adaptive example of a Monte Carlo estimate #19

Open · wants to merge 2 commits into master
Conversation

@willirath (Member) commented Oct 12, 2018

This adds a notebook with a Monte-Carlo estimate of the number pi. It also shows how to tune Adaptive to smoothly auto-scale the cluster to approximately match a given execution time of 20 seconds over a huge range of precisions.
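The core of the estimator can be sketched in plain NumPy (a minimal sketch; the function name matches the diff below, but the body here is an assumption — the actual notebook distributes the sampling with Dask):

```python
import numpy as np

def calc_pi_mc(size, seed=0):
    """Monte Carlo estimate of pi: draw `size` points uniformly in the
    unit square and count the fraction inside the quarter circle."""
    rng = np.random.default_rng(seed)
    xy = rng.random((size, 2))
    inside = (xy ** 2).sum(axis=1) < 1.0  # x^2 + y^2 < 1
    return 4.0 * inside.mean()

estimate = calc_pi_mc(1_000_000)
```

With 10^6 samples the estimate typically lands within ~0.005 of pi; the notebook applies the same idea to much larger sample counts via `dask.array`.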

@willirath willirath changed the title Add adaptive example of a Monte Carlo estimate [WIP] Add adaptive example of a Monte Carlo estimate Oct 30, 2018
@willirath (Member Author) commented:

Counting pods from the list of pods is not what we want. I'll switch to len() of the list of workers instead. Or is there any public API call for: "How big is the cluster atm?"

@guillaumeeb (Member) left a comment:

So you're showing two things here: how to compute a pi approximation using Dask, and how it scales easily on the Pangeo cloud platform. You should explain both.

This is really interesting, thanks for that! I don't know if this is completely relevant as a Pangeo example (FWIW I think it is), but I strongly feel that the pi computation and scaling at a lower level could be of interest for dask-examples cc @mrocklin.

Or is there any public API call for: "How big is the cluster atm?"

len(cluster.pods()) looks fine. You could maybe use cluster.scheduler.workers or other scheduler methods, but I'm not sure it would be much better.
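For reference, the scheduler-side view mentioned here is also available on a plain `distributed` LocalCluster; a small sketch (assuming an in-process scheduler, so the attribute is directly inspectable):

```python
from dask.distributed import LocalCluster

# In-process cluster so cluster.scheduler is a real Scheduler object
cluster = LocalCluster(n_workers=2, threads_per_worker=1, processes=False)

# Scheduler-side view of cluster size: one entry per connected worker
n_workers = len(cluster.scheduler.workers)

cluster.close()
```

On a KubeCluster, `len(cluster.pods())` counts pods (including ones still starting up), while `cluster.scheduler.workers` counts workers that have actually connected — which is why the two can disagree.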

" alt=\"PI monte-carlo estimate\">\n",
" \n",
"Using [Dask's adaptivity](http://docs.dask.org/en/latest/setup/adaptive.html), we'll show that it is possible to scale the available resources to meet almost identical wall times irrespective of the acutal work load:\n",
"\n",
Member comment:

Typo in "actual".

You should maybe also introduce the Monte Carlo pi-estimation mechanism?

"metadata": {},
"outputs": [],
"source": [
"# check Adaptive? for help on adapt's kwargs.\n",
Member comment:
Could you prepare the cell with the Adaptive? call?

"import numpy as np\n",
"from time import time\n",
"\n",
"def calc_pi_mc(size):\n",
Member comment:

It would be good to first describe the pi estimation step by step; that would be very interesting. Then, at the end, put it all inside a function to perform the scalability analysis.

@willirath (Member Author) commented:

Thanks @guillaumeeb for this review!

I just committed a more verbose version that now (briefly) outlines the method and has more details on why it matters that we can scale the cluster to meet a desired target duration over a wide range of problem sizes.

@willirath willirath changed the title [WIP] Add adaptive example of a Monte Carlo estimate Add adaptive example of a Monte Carlo estimate Nov 6, 2018
@guillaumeeb (Member) left a comment:

This looks good, a really nice dask example! A few small comments.

"\n",
"## Actual timings\n",
"\n",
"Aming for a duration of 20 seconds per calculation, this is what we actually get:\n",
Member comment:

Typo in "Aiming".

Maybe put the results at the end? That would give the user an idea of what can be achieved, even if they probably won't scale all the way up.

"source": [
"## Tuning adaptivity\n",
"\n",
"The following tunes a Dask cluster to use anywhere between 1 and 400 workers and to scale its size so that any computation is finished within 20 seconds. On Pangeo, time scales for starting / stopping workers are of the order of a few seconds, so we set a startup cost to 5 seconds (instead of the default value of 1 second) and increase possible scale-down times by setting the relevant interval to 2 seconds and the number of times a worker needs to be considered expendable before it is actually killed to `10`. We also reduce the default factor that is applied to adapt the cluster to a more modest `1.2`.\n",
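The settings described in that cell might map onto `adapt()` kwargs roughly as follows (a sketch against the `distributed.deploy.Adaptive` API, shown on a LocalCluster; the startup-cost and scale-factor knobs mentioned above belonged to the 2018-era Adaptive signature and are omitted here, as they may not exist in current releases):

```python
from dask.distributed import LocalCluster

cluster = LocalCluster(n_workers=1, processes=False)

# Kwargs mirroring the notebook text; exact names can differ by version
adaptive = cluster.adapt(
    minimum=1,
    maximum=400,            # anywhere between 1 and 400 workers
    target_duration="20s",  # aim to finish queued work in ~20 seconds
    interval="2s",          # re-evaluate scaling every 2 seconds
    wait_count=10,          # checks a worker must look expendable before removal
)

cluster.close()
```

On Pangeo the same call would go against a KubeCluster instead of a LocalCluster; the adaptive kwargs are the part being tuned.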
Member comment:

time scales for starting / stopping workers are of the order of a few seconds

Not sure this is true for Cloud deployments, is it?

Why only a 1.2 scale factor? I'm not sure this is used in most cases.

"source": [
"## The actual calculations\n",
"\n",
"We loop over volumes of 50 GB, 100 GB, 200 GB, ..., 3200 GB of double-precision random numbers and estimate $\\pi$ as described above."
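That doubling schedule works out to seven problem sizes; a sketch of the loop (the GB convention and the inner step are assumptions, not the notebook's code):

```python
GB = 1_000_000_000       # bytes; the notebook may use 2**30 instead
FLOAT64_BYTES = 8        # one double-precision random number

# 50, 100, 200, 400, 800, 1600, 3200 GB
sizes_gb = [50 * 2 ** n for n in range(7)]

for gb in sizes_gb:
    n_samples = gb * GB // FLOAT64_BYTES
    # ...estimate pi from n_samples random points, e.g. with dask.array,
    # while the adaptive cluster resizes to hold the wall time near 20 s...
```

At the top end, 3200 GB of float64s is 4 × 10^11 samples, which is why adaptivity over a wide worker range matters here.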
Member comment:

Maybe run it first with a small size, just so the user can validate and try the method.

And then loop, but maybe not up to 3200 GB?
