
Add adaptive example of a Monte Carlo estimate #19

Open · wants to merge 2 commits into master
Conversation

@willirath (Member) commented Oct 12, 2018

This adds a notebook with a Monte-Carlo estimate of the number pi. It also shows how to tune Adaptive to smoothly auto-scale the cluster to approximately match a given execution time of 20 seconds over a huge range of precisions.
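The core of the estimator can be sketched in plain NumPy (a minimal sketch; the function name matches the diff below, but the body here is an assumption — the actual notebook distributes the sampling with Dask):

```python
import numpy as np

def calc_pi_mc(size, seed=0):
    """Monte Carlo estimate of pi: draw `size` points uniformly in the
    unit square and count the fraction inside the quarter circle."""
    rng = np.random.default_rng(seed)
    xy = rng.random((size, 2))
    inside = (xy ** 2).sum(axis=1) < 1.0  # x^2 + y^2 < 1
    return 4.0 * inside.mean()

estimate = calc_pi_mc(1_000_000)
```

With 10^6 samples the estimate typically lands within ~0.005 of pi; the notebook applies the same idea to much larger sample counts via `dask.array`.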

@willirath willirath changed the title Add adaptive example of a Monte Carlo estimate [WIP] Add adaptive example of a Monte Carlo estimate Oct 30, 2018
@willirath (Member Author) commented:

Counting pods from the list of pods is not what we want. I'll switch to len() of the list of workers instead. Or is there any public API call for: "How big is the cluster atm?"

@guillaumeeb (Member) left a comment:

So you're showing two things here: how to compute a pi approximation using Dask, and how it scales easily on the Pangeo cloud platform. You should explain both.

This is really interesting, thanks for that! I don't know if this is completely relevant as a Pangeo example (FWIW I think it is), but I strongly feel that the pi computation and scaling at a lower level could be of interest for dask-examples cc @mrocklin.

Or is there any public API call for: "How big is the cluster atm?"

len(cluster.pods()) looks fine. You could maybe use cluster.scheduler.workers or other scheduler methods, but I'm not sure it would be much better.
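For reference, the scheduler-side view mentioned here is also available on a plain `distributed` LocalCluster; a small sketch (assuming an in-process scheduler, so the attribute is directly inspectable):

```python
from dask.distributed import LocalCluster

# In-process cluster so cluster.scheduler is a real Scheduler object
cluster = LocalCluster(n_workers=2, threads_per_worker=1, processes=False)

# Scheduler-side view of cluster size: one entry per connected worker
n_workers = len(cluster.scheduler.workers)

cluster.close()
```

On a KubeCluster, `len(cluster.pods())` counts pods (including ones still starting up), while `cluster.scheduler.workers` counts workers that have actually connected — which is why the two can disagree.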

" alt=\"PI monte-carlo estimate\">\n",
" \n",
"Using [Dask's adaptivity](http://docs.dask.org/en/latest/setup/adaptive.html), we'll show that it is possible to scale the available resources to meet almost identical wall times irrespective of the acutal work load:\n",
"\n",
Member comment:

Typo in "actual".

You should maybe also introduce the Monte Carlo pi-estimation mechanism?

"metadata": {},
"outputs": [],
"source": [
"# check Adaptive? for help on adapt's kwargs.\n",
Member comment:
Could you prepare the cell with the Adaptive? call?

"import numpy as np\n",
"from time import time\n",
"\n",
"def calc_pi_mc(size):\n",
Member comment:

It would be good to first describe the pi estimation step by step; that would be very interesting. Then, at the end, put it all inside a function to perform the scalability analysis.

@willirath (Member Author) commented:

Thanks @guillaumeeb for this review!

I just committed a more verbose version that now (briefly) outlines the method and has more details on why it matters that we can scale the cluster to meet a desired target duration over a wide range of problem sizes.

@willirath willirath changed the title [WIP] Add adaptive example of a Monte Carlo estimate Add adaptive example of a Monte Carlo estimate Nov 6, 2018
@guillaumeeb (Member) left a comment:

This looks good, a really nice dask example! A few small comments.

"\n",
"## Actual timings\n",
"\n",
"Aming for a duration of 20 seconds per calculation, this is what we actually get:\n",
Member comment:

Typo in "Aiming".

Maybe put the results at the end? That would give the user an idea of what can be achieved, even if they probably won't scale all the way up.

"source": [
"## Tuning adaptivity\n",
"\n",
"The following tunes a Dask cluster to use anywhere between 1 and 400 workers and to scale its size so that any computation is finished within 20 seconds. On Pangeo, time scales for starting / stopping workers are of the order of a few seconds, so we set a startup cost to 5 seconds (instead of the default value of 1 second) and increase possible scale-down times by setting the relevant interval to 2 seconds and the number of times a worker needs to be considered expendable before it is actually killed to `10`. We also reduce the default factor that is applied to adapt the cluster to a more modest `1.2`.\n",
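The settings described in that cell might map onto `adapt()` kwargs roughly as follows (a sketch against the `distributed.deploy.Adaptive` API, shown on a LocalCluster; the startup-cost and scale-factor knobs mentioned above belonged to the 2018-era Adaptive signature and are omitted here, as they may not exist in current releases):

```python
from dask.distributed import LocalCluster

cluster = LocalCluster(n_workers=1, processes=False)

# Kwargs mirroring the notebook text; exact names can differ by version
adaptive = cluster.adapt(
    minimum=1,
    maximum=400,            # anywhere between 1 and 400 workers
    target_duration="20s",  # aim to finish queued work in ~20 seconds
    interval="2s",          # re-evaluate scaling every 2 seconds
    wait_count=10,          # checks a worker must look expendable before removal
)

cluster.close()
```

On Pangeo the same call would go against a KubeCluster instead of a LocalCluster; the adaptive kwargs are the part being tuned.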
Member comment:

time scales for starting / stopping workers are of the order of a few seconds

Not sure this is true for Cloud deployments, is it?

Why only a 1.2 scale factor? I'm not sure this is used in most cases.

"source": [
"## The actual calculations\n",
"\n",
"We loop over volumes of 50 GB, 100 GB, 200 GB, ..., 3200 GB of double-precision random numbers and estimate $\\pi$ as described above."
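That doubling schedule works out to seven problem sizes; a sketch of the loop (the GB convention and the inner step are assumptions, not the notebook's code):

```python
GB = 1_000_000_000       # bytes; the notebook may use 2**30 instead
FLOAT64_BYTES = 8        # one double-precision random number

# 50, 100, 200, 400, 800, 1600, 3200 GB
sizes_gb = [50 * 2 ** n for n in range(7)]

for gb in sizes_gb:
    n_samples = gb * GB // FLOAT64_BYTES
    # ...estimate pi from n_samples random points, e.g. with dask.array,
    # while the adaptive cluster resizes to hold the wall time near 20 s...
```

At the top end, 3200 GB of float64s is 4 × 10^11 samples, which is why adaptivity over a wide worker range matters here.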
Member comment:

Maybe run it first with a small size, just so the user can validate and try the method.

And then loop, but maybe not up to 3200 GB?
