Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include n_tune, n_draws and t_sampling in SamplerReport #3827

Merged

Conversation

michaelosthege
Copy link
Member

This PR adds three new (optional) properties to SamplerReport:

  • n_tune: Number of tune iterations.
  • n_draws: Number of draw iterations.
  • t_sampling: Number of seconds that the sampling procedure took. (Includes parallelization overhead.)

While n_draws may be retrieved from the trace, information about n_tune is lost unless discard_tuned_samples=True. However, ArviZ currently does not look at the "tune" sampler stat, making it very inconvenient to keep tuning iterations and use ArviZ at the same time.

The t_sampling (wall-clock) time is not kept automatically, but super useful for comparing sampler efficiency. We could also use it directly in the benchmarks..

I've included it in the _log_summary at INFO level:
image

What do you think?

@codecov
Copy link

codecov bot commented Mar 7, 2020

Codecov Report

Merging #3827 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3827      +/-   ##
==========================================
+ Coverage   90.77%   90.79%   +0.01%     
==========================================
  Files         135      135              
  Lines       21074    21113      +39     
==========================================
+ Hits        19130    19169      +39     
  Misses       1944     1944              
Impacted Files Coverage Δ
pymc3/backends/report.py 92.96% <100.00%> (+0.79%) ⬆️
pymc3/sampling.py 85.16% <100.00%> (+0.33%) ⬆️
pymc3/tests/test_sampling.py 99.62% <100.00%> (+<0.01%) ⬆️

Copy link
Member

@ColCarroll ColCarroll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this! I think it will be especially valuable if we implement some alternate stopping logic: like, if we make it easy to sample 1000 effective samples, this logging will be useful.

Left a few thoughts, but you're also welcome to merge and fix later.

pymc3/sampling.py Outdated Show resolved Hide resolved
@@ -151,7 +175,8 @@ def _add_warnings(self, warnings, chain=None):
warn_list.extend(warnings)

def _log_summary(self):

if self._n_tune is not None and self._n_draws is not None and self._t_sampling is not None:
logger.info(f'Sampling {self.n_tune} tune and {self.n_draws} draw iterations took {self.t_sampling:.0f} seconds.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this include the number of chains and total number of draws? That might clarify how we count: For 1000 draws in 4 chains, the progressbar will go to 4000, so a message like "Sampling 500 tune and 1000 draws in 4 chains (2000 plus 4000 draws total) took 18 seconds." might be helpful.

What do you think?

Also, I might use {self.n_tune:,d} and {self.n_draws:,d} to get commas in the number, but that's a very weak desire.

Copy link
Member Author

@michaelosthege michaelosthege Mar 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add implement your suggestions tomorrow - thanks!

As a German, I read 10,000 as float(10), but maybe we can go with the pythonic neutral ground of 10_000? (According to SI, one should use a thin space U+202F, but in monospaced fonts that's somewhat pointless.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is with the update I just pushed:

5 chains sequentially, interrupted during tuning of the first:
image

Interrupted during the second chain:
image

Not interrupted:
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha! I thought there was a locale-safe version of this. this looks great to me -- feel free to merge.

@michaelosthege
Copy link
Member Author

Finally. This took embarrassingly many commits to go green.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants