Include n_tune, n_draws and t_sampling in SamplerReport #3827

michaelosthege · 2020-03-06T18:11:40Z

This PR adds three new (optional) properties to SamplerReport:

n_tune: Number of tune iterations.
n_draws: Number of draw iterations.
t_sampling: Number of seconds that the sampling procedure took. (Includes parallelization overhead.)
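
As a rough illustration of what "optional" means here, the following minimal sketch shows three read-only properties that stay None until sampling fills them in. The class name `SamplerReportSketch` is hypothetical and this is not the code in pymc3/backends/report.py; only the attribute names are taken from the diff in this PR.

```python
from typing import Optional


class SamplerReportSketch:
    """Illustrative stand-in for SamplerReport; not the PyMC3 class."""

    def __init__(self) -> None:
        # All three values stay None until sampling has filled them in.
        self._n_tune: Optional[int] = None
        self._n_draws: Optional[int] = None
        self._t_sampling: Optional[float] = None

    @property
    def n_tune(self) -> Optional[int]:
        """Number of tune iterations (not necessarily kept in the trace)."""
        return self._n_tune

    @property
    def n_draws(self) -> Optional[int]:
        """Number of draw iterations."""
        return self._n_draws

    @property
    def t_sampling(self) -> Optional[float]:
        """Wall-clock seconds the sampling procedure took."""
        return self._t_sampling


# Hypothetical values, set the way a sampling routine might set them.
report = SamplerReportSketch()
report._n_tune, report._n_draws, report._t_sampling = 500, 1000, 18.2
print(report.n_tune, report.n_draws, report.t_sampling)
```

Code that reads these properties should check for None first, since a report that never went through sampling carries no values.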

While n_draws may be retrieved from the trace, information about n_tune is lost unless discard_tuned_samples=False. However, ArviZ currently does not look at the "tune" sampler stat, making it very inconvenient to keep tuning iterations and use ArviZ at the same time.

The t_sampling (wall-clock) time is not kept automatically, but it is super useful for comparing sampler efficiency. We could also use it directly in the benchmarks.
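
One way to record such a wall-clock time is sketched below. The helper `timed` and the placeholder workload `fake_sampling` are hypothetical, and the use of `time.perf_counter` is an assumption; the PR may measure time differently.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds)."""
    t_start = time.perf_counter()
    result = func(*args, **kwargs)
    # Stop the timer right after the call so later bookkeeping is not counted.
    t_sampling = time.perf_counter() - t_start
    return result, t_sampling


def fake_sampling(n):
    # Hypothetical placeholder workload standing in for the sampler.
    return sum(i * i for i in range(n))


result, t_sampling = timed(fake_sampling, 10_000)
print(f"took {t_sampling:.3f} seconds")
```

Because the measurement wraps the whole call, it includes any overhead of the wrapped function, in the same spirit as t_sampling including parallelization overhead.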

I've included it in the _log_summary at INFO level:

What do you think?

codecov · 2020-03-07T09:44:47Z

Codecov Report

Merging #3827 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #3827      +/-   ##
==========================================
+ Coverage   90.77%   90.79%   +0.01%     
==========================================
  Files         135      135              
  Lines       21074    21113      +39     
==========================================
+ Hits        19130    19169      +39     
  Misses       1944     1944

Impacted Files	Coverage Δ
pymc3/backends/report.py	`92.96% <100.00%> (+0.79%)`	⬆️
pymc3/sampling.py	`85.16% <100.00%> (+0.33%)`	⬆️
pymc3/tests/test_sampling.py	`99.62% <100.00%> (+<0.01%)`	⬆️

ColCarroll

I like this! I think it will be especially valuable if we implement some alternate stopping logic: like, if we make it easy to sample 1000 effective samples, this logging will be useful.

Left a few thoughts, but you're also welcome to merge and fix later.

pymc3/sampling.py

ColCarroll · 2020-03-08T15:39:15Z

pymc3/backends/report.py

@@ -151,7 +175,8 @@ def _add_warnings(self, warnings, chain=None):
        warn_list.extend(warnings)

    def _log_summary(self):
-
+        if self._n_tune is not None and self._n_draws is not None and self._t_sampling is not None:
+            logger.info(f'Sampling {self.n_tune} tune and {self.n_draws} draw iterations took {self.t_sampling:.0f} seconds.')


Can this include the number of chains and total number of draws? That might clarify how we count: For 1000 draws in 4 chains, the progressbar will go to 4000, so a message like "Sampling 500 tune and 1000 draws in 4 chains (2000 plus 4000 draws total) took 18 seconds." might be helpful.

What do you think?

Also, I might use {self.n_tune:,d} and {self.n_draws:,d} to get commas in the number, but that's a very weak desire.

michaelosthege · 2020-03-08

I'll implement your suggestions tomorrow - thanks!

As a German, I read 10,000 as float(10), but maybe we can go with the pythonic neutral ground of 10_000? (According to SI, one should use a thin space U+202F, but in monospaced fonts that's somewhat pointless.)
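
For reference, both separators are available in Python's format specification, so the choice is purely cosmetic. The sketch below compares them and builds a summary line in the spirit of the suggestion above; the function name `summary_message` and the exact wording are illustrative assumptions, not the message that was merged.

```python
def summary_message(n_chains: int, n_tune: int, n_draws: int, t_sampling: float) -> str:
    """Build a log line with per-chain counts and totals across chains."""
    total_tune = n_chains * n_tune
    total_draws = n_chains * n_draws
    return (
        f"Sampling {n_chains} chains for {n_tune:_d} tune and {n_draws:_d} draw iterations "
        f"({total_tune:_d} + {total_draws:_d} draws total) "
        f"took {t_sampling:.0f} seconds."
    )


print(f"{10000:,d}")  # comma as thousands separator: 10,000
print(f"{10000:_d}")  # underscore as thousands separator: 10_000

msg = summary_message(n_chains=4, n_tune=500, n_draws=1000, t_sampling=18.2)
print(msg)
```

The underscore form mirrors Python's own numeric literal syntax, which avoids the locale ambiguity between decimal and thousands separators.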

michaelosthege · 2020-03-10

This is with the update I just pushed:

5 chains sequentially, interrupted during tuning of the first:

Interrupted during the second chain:

Not interrupted:

ColCarroll · 2020-03-10

Haha! I thought there was a locale-safe version of this. This looks great to me -- feel free to merge.

michaelosthege · 2020-03-11T10:18:54Z

Finally. This took embarrassingly many commits to go green.

include n_tune, n_draws and t_sampling in SamplerReport

e1844b7

michaelosthege added the enhancements label Mar 6, 2020

michaelosthege requested a review from ColCarroll March 6, 2020 18:15

ColCarroll reviewed Mar 8, 2020

michaelosthege and others added 10 commits March 10, 2020 16:37

count tune/draw samples instead of trusting parameters (because of KeyboardInterrupt)

d028ba1

move log info to sampling so number of chains can be included

ae9670f

add test for SamplerReport n_tune and n_draws + clarify that n_tune are not necessarily in the trace

fc3e18c

use actual number of chains to compute totals

0e3cd76

mention new SamplerReport properties

4b9dc7b

fall back to tune and len(trace) if tune stat is unavailable

d1c8498

Merge branch 'master' into record-sampling-metadata

03b21d1

account for differences in stat shape + CompoundStep causes 2D stats + fix calculation without stats

2b191f9

Merge branch 'record-sampling-metadata' of https://github.com/michael…

2cb5dba

…osthege/pymc3 into record-sampling-metadata

stop timer before to avoid distortions

49798a2
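
Several of the commit messages above describe counting tune and draw iterations from the recorded "tune" stat instead of trusting the requested parameters, with a fallback when the stat is unavailable and special handling for two-dimensional stats. A minimal sketch of that idea follows. The function name, its arguments, and the nested-list representation of a 2D stat are hypothetical; this is not the code merged in this PR.

```python
def count_tune_and_draws(tune_stat, n_stored, tune_requested):
    """Return (n_tune, n_draws) for one chain.

    tune_stat: sequence of booleans marking tuning iterations, or None if
        the stat is unavailable. Entries may themselves be sequences when
        several step methods each report a flag (assumed layout).
    n_stored: number of iterations stored for the chain.
    tune_requested: number of tuning iterations that was asked for.
    """
    if tune_stat is None:
        # Fallback: may be off after an interruption, but better than nothing.
        n_tune = min(tune_requested, n_stored)
        n_draws = max(0, n_stored - n_tune)
        return n_tune, n_draws

    flags = []
    for entry in tune_stat:
        if isinstance(entry, (list, tuple)):
            # 2D case: use the flag of the first step method.
            entry = entry[0]
        flags.append(bool(entry))
    return flags.count(True), flags.count(False)


print(count_tune_and_draws([True, True, False, False, False], 5, 2))
print(count_tune_and_draws([[True, True], [False, False], [False, False]], 3, 1))
print(count_tune_and_draws(None, 3, 500))
```

Counting the flags directly gives sensible numbers even when sampling was interrupted partway through tuning, which the requested parameters alone cannot reflect.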

michaelosthege merged commit 6c5254f into pymc-devs:master Mar 11, 2020

michaelosthege deleted the record-sampling-metadata branch March 11, 2020 11:34

michaelosthege mentioned this pull request Mar 30, 2020

Add warmup iterations and _group_warmup arviz-devs/arviz#1126

Merged

5 tasks

michaelosthege mentioned this pull request Apr 11, 2020

Automatically handle warmup draws and sampling metadata from_pymc3 arviz-devs/arviz#1146

Closed
