Fix keep_size for arviz structures. #4006

rpgoldman · 2020-07-09T17:28:22Z

The keep_size kwarg for sample_posterior_predictive and fast_sample_posterior_predictive didn't handle arviz.InferenceData or xarray.Dataset arguments correctly when the keep_size option is chosen.

Addresses !4004

codecov · 2020-07-09T17:57:24Z

Codecov Report

Merging #4006 into master will increase coverage by 0.03%.
The diff coverage is 88.57%.

@@            Coverage Diff             @@
##           master    #4006      +/-   ##
==========================================
+ Coverage   86.62%   86.65%   +0.03%     
==========================================
  Files          88       88              
  Lines       14062    14090      +28     
==========================================
+ Hits        12181    12210      +29     
+ Misses       1881     1880       -1

Impacted Files	Coverage Δ
pymc3/distributions/posterior_predictive.py	`89.29% <84.84%> (+0.50%)`	⬆️
pymc3/util.py	`93.75% <89.65%> (-0.54%)`	⬇️
pymc3/sampling.py	`86.35% <95.74%> (+0.31%)`	⬆️

kyleabeauchamp · 2020-07-09T23:09:52Z

I can confirm that this PR leads to a more reasonable output shape in my driver code.

For the purposes of code review, the linter changes in posterior_predictive.py lead to a pretty massive diff. I wonder if it might make sense to do linting as a separate PR to ensure high visibility of math changes.

pymc3/distributions/posterior_predictive.py

michaelosthege

Generally looks good, but I have to admin that I probably missed half of the changes because of black formatting 🙄.

pymc3/sampling.py

michaelosthege

actually I wanted to request changes - see the thread on instantiate_steppers

rpgoldman · 2020-07-10T13:23:51Z

I can confirm that this PR leads to a more reasonable output shape in my driver code.

For the purposes of code review, the linter changes in posterior_predictive.py lead to a pretty massive diff. I wonder if it might make sense to do linting as a separate PR to ensure high visibility of math changes.

I'm not sure that's feasible, since the CI checks for conformance with black formatting. I think the problem is that somehow this file slipped in without being black formatted to begin with.

Another time I would probably make the black formatting a separate commit but I didn't expect such a major upset. Also, even if it's a separate commit, people mostly just compare the end state of a MR, so that's probably extra work to no effect.

I wish I had a better answer for you.

pymc3/sampling.py

rpgoldman · 2020-07-10T15:37:11Z

I'm going to rebase and rewrite history to try to separate out the reformatting, please wait for more updates.

Previously posterior predictive sampling functions did not properly handle the `keep_size` keyword argument when getting an xarray Dataset as parameter. Also extended these functions to accept InferenceData object as input.

rpgoldman · 2020-07-10T16:42:32Z

@kyleabeauchamp @michaelosthege I have just (force) pushed a drastic restructuring that splits the MR into code changes and reformatting. We should make sure it passes tests again -- in case I messed something up in the rebase.

Now you should be able to quickly review the code changes and separately evaluate the reformatting -- if you even care about the latter.

I appreciate your patience.

michaelosthege · 2020-07-10T20:23:00Z

@rpgoldman much nicer to read!

I didn't find a problem. Your changes are also covered by the tests - except for raising the error raised by passing the wrong argument type.

rpgoldman · 2020-07-10T21:18:47Z

I didn't find a problem. Your changes are also covered by the tests - except for raising the error raised by passing the wrong argument type.

Do you mean https://github.com/pymc-devs/pymc3/blob/de2d6149453251b9b2496627c8fadffe59a83873/pymc3/distributions/posterior_predictive.py#L210 ? Should I supply a test with an argument of inappropriate type here?

If not, please LMK what error you had in mind. Thanks!

kyleabeauchamp · 2020-07-11T03:16:23Z

I think this LGTM as well (good tests, improved typing, fixing target issue), but I'll defer to others for approval as I'm not very familiar with this part of the codebase.

Make errors consistent across sample_posterior_predictive and fast_sample_posterior_predictive, and add 2 tests.

rpgoldman · 2020-07-12T03:37:01Z

@michaelosthege Added the two tests. If this passes, and is satisfactory, please merge. Thanks!

michaelosthege · 2020-07-12T10:36:12Z

Entry into release notes is missing.. I'm in a hurry and can't find the button to edit it online..

rpgoldman · 2020-07-12T21:37:17Z

Entry into release notes is missing.. I'm in a hurry and can't find the button to edit it online..

Fixed.

* Update GP NBs to use standard notebook style (pymc-devs#3978) * update gp-latent nb to use arviz * rerun, run black * rerun after fixes from comments * rerun black * rewrite radon notebook using ArviZ and xarray (pymc-devs#3963) * rewrite radon notebook using ArviZ and xarray Roughly half notebook has been updated * add comments on xarray usage * rewrite 2n half of notebook * minor fix * rerun notebook and minor changes * rerun notebook on pymc3.9.2 and ArviZ 0.9.0 * remove unused import * add change to release notes * SMC: refactor, speed-up and run multiple chains in parallel for diagnostics (pymc-devs#3981) * first attempt to vectorize smc kernel * add ess, remove multiprocessing * run multiple chains * remove unused imports * add more info to report * minor fix * test log * fix type_num error * remove unused imports update BF notebook * update notebook with diagnostics * update notebooks * update notebook * update notebook * Honor discard_tuned_samples during KeyboardInterrupt (pymc-devs#3785) * Honor discard_tuned_samples during KeyboardInterrupt * Do not compute convergence checks without samples * Add time values as sampler stats for NUTS (pymc-devs#3986) * Add time values as sampler stats for NUTS * Use float time counters for nuts stats * Add timing sampler stats to release notes * Improve doc of time related sampler stats Co-authored-by: Alexandre ANDORRA <andorra.alexandre@gmail.com> Co-authored-by: Alexandre ANDORRA <andorra.alexandre@gmail.com> * Drop support for py3.6 (pymc-devs#3992) * Drop support for py3.6 * Update RELEASE-NOTES.md Co-authored-by: Colin <ColCarroll@users.noreply.github.com> Co-authored-by: Colin <ColCarroll@users.noreply.github.com> * Fix Mixture distribution mode computation and logp dimensions Closes pymc-devs#3994. * Add more info to divergence warnings (pymc-devs#3990) * Add more info to divergence warnings * Add dataclasses as requirement for py3.6 * Fix tests for extra divergence info * Remove py3.6 requirements * follow-up of py36 drop (pymc-devs#3998) * Revert "Drop support for py3.6 (pymc-devs#3992)" This reverts commit 1bf867e. * Update README.rst * Update setup.py * Update requirements.txt * Update requirements.txt Co-authored-by: Adrian Seyboldt <aseyboldt@users.noreply.github.com> * Show pickling issues in notebook on windows (pymc-devs#3991) * Merge close remote connection * Manually pickle step method in multiprocess sampling * Fix tests for extra divergence info * Add test for remote process crash * Better formatting in test_parallel_sampling Co-authored-by: Junpeng Lao <junpenglao@gmail.com> * Use mp_ctx forkserver on MacOS * Add test for pickle with dill Co-authored-by: Junpeng Lao <junpenglao@gmail.com> * Fix keep_size for arviz structures. (pymc-devs#4006) * Fix posterior pred. sampling keep_size w/ arviz input. Previously posterior predictive sampling functions did not properly handle the `keep_size` keyword argument when getting an xarray Dataset as parameter. Also extended these functions to accept InferenceData object as input. * Reformatting. * Check type errors. Make errors consistent across sample_posterior_predictive and fast_sample_posterior_predictive, and add 2 tests. * Add changelog entry. Co-authored-by: Robert P. Goldman <rpgoldman@sift.net> * SMC-ABC add distance, refactor and update notebook (pymc-devs#3996) * update notebook * move dist functions out of simulator class * fix docstring * add warning and test for automatic selection of sort sum_stat when using wassertein and energy distances * update release notes * fix typo * add sim_data test * update and add tests * update and add tests * add docs for interpretation of length scales in periodic kernel (pymc-devs#3989) * fix the expression of periodic kernel * revert change and add doc * FIXUP: add suggested doc string * FIXUP: revertchanges in .gitignore * Fix Matplotlib type error for tests (pymc-devs#4023) * Fix for issue 4022. Check for support for `warn` argument in `matplotlib.use()` call. Drop it if it causes an error. * Alternative fix. * Switch from pm.DensityDist to pm.Potential to describe the likelihood in MLDA notebooks and script examples. This is done because of the bug described in arviz-devs/arviz#1279. The commit also changes a few parameters in the MLDA .py example to match the ones in the equivalent notebook. * Remove Dirichlet distribution type restrictions (pymc-devs#4000) * Remove Dirichlet distribution type restrictions Closes pymc-devs#3999. * Add missing Dirichlet shape parameters to tests * Remove Dirichlet positive concentration parameter constructor tests This test can't be performed in the constructor if we're allowing Theano-type distribution parameters. * Add a hack to statically infer Dirichlet argument shapes Co-authored-by: Brandon T. Willard <brandonwillard@users.noreply.github.com> Co-authored-by: Bill Engels <w.j.engels@gmail.com> Co-authored-by: Oriol Abril-Pla <oriol.abril.pla@gmail.com> Co-authored-by: Osvaldo Martin <aloctavodia@gmail.com> Co-authored-by: Adrian Seyboldt <aseyboldt@users.noreply.github.com> Co-authored-by: Alexandre ANDORRA <andorra.alexandre@gmail.com> Co-authored-by: Colin <ColCarroll@users.noreply.github.com> Co-authored-by: Brandon T. Willard <brandonwillard@users.noreply.github.com> Co-authored-by: Junpeng Lao <junpenglao@gmail.com> Co-authored-by: rpgoldman <rpgoldman@goldman-tribe.org> Co-authored-by: Robert P. Goldman <rpgoldman@sift.net> Co-authored-by: Tirth Patel <tirthasheshpatel@gmail.com> Co-authored-by: Brandon T. Willard <971601+brandonwillard@users.noreply.github.com>

rpgoldman requested review from michaelosthege, aloctavodia and lucianopaz July 9, 2020 18:20

rpgoldman self-assigned this Jul 9, 2020

rpgoldman linked an issue Jul 9, 2020 that may be closed by this pull request

sample_posterior_predictive flattens chains and draws #4004

Closed

rpgoldman added bug request discussion labels Jul 9, 2020

michaelosthege reviewed Jul 10, 2020

View reviewed changes

pymc3/distributions/posterior_predictive.py Outdated Show resolved Hide resolved

michaelosthege reviewed Jul 10, 2020

View reviewed changes

pymc3/distributions/posterior_predictive.py Outdated Show resolved Hide resolved

michaelosthege approved these changes Jul 10, 2020

View reviewed changes

pymc3/sampling.py Show resolved Hide resolved

michaelosthege requested changes Jul 10, 2020

View reviewed changes

michaelosthege approved these changes Jul 10, 2020

View reviewed changes

pymc3/sampling.py Show resolved Hide resolved

rpgoldman force-pushed the iss4004 branch from 60a029c to 9014096 Compare July 10, 2020 15:36

rpgoldman added 2 commits July 10, 2020 11:10

Fix posterior pred. sampling keep_size w/ arviz input.

de2d614

Previously posterior predictive sampling functions did not properly handle the `keep_size` keyword argument when getting an xarray Dataset as parameter. Also extended these functions to accept InferenceData object as input.

Reformatting.

0cb6306

rpgoldman force-pushed the iss4004 branch from 9014096 to 0cb6306 Compare July 10, 2020 16:40

Check type errors.

eb59c5e

Make errors consistent across sample_posterior_predictive and fast_sample_posterior_predictive, and add 2 tests.

rpgoldman changed the title ~~WIP: Fix keep_size for arviz structures.~~ Fix keep_size for arviz structures. Jul 12, 2020

Add changelog entry.

1df5e4c

michaelosthege approved these changes Jul 12, 2020

View reviewed changes

michaelosthege merged commit 692a09f into pymc-devs:master Jul 12, 2020

eigenfoo mentioned this pull request Jul 19, 2020

New commits to pymc3/sampling.py or pymc3/step_methods/hmc/ eigenfoo/littlemcmc#81

Closed

twiecki mentioned this pull request Jul 23, 2020

sample_posterior_predictive does not handle InferenceData well #4025

Closed

kyleabeauchamp added this to the 3.9.3 milestone Jul 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix keep_size for arviz structures. #4006

Fix keep_size for arviz structures. #4006

rpgoldman commented Jul 9, 2020

codecov bot commented Jul 9, 2020 •

edited

Loading

kyleabeauchamp commented Jul 9, 2020 •

edited

Loading

michaelosthege left a comment

michaelosthege left a comment

rpgoldman commented Jul 10, 2020

rpgoldman commented Jul 10, 2020

rpgoldman commented Jul 10, 2020

michaelosthege commented Jul 10, 2020

rpgoldman commented Jul 10, 2020

kyleabeauchamp commented Jul 11, 2020

rpgoldman commented Jul 12, 2020

michaelosthege commented Jul 12, 2020

rpgoldman commented Jul 12, 2020

Fix keep_size for arviz structures. #4006

Fix keep_size for arviz structures. #4006

Conversation

rpgoldman commented Jul 9, 2020

codecov bot commented Jul 9, 2020 • edited Loading

Codecov Report

kyleabeauchamp commented Jul 9, 2020 • edited Loading

michaelosthege left a comment

Choose a reason for hiding this comment

michaelosthege left a comment

Choose a reason for hiding this comment

rpgoldman commented Jul 10, 2020

rpgoldman commented Jul 10, 2020

rpgoldman commented Jul 10, 2020

michaelosthege commented Jul 10, 2020

rpgoldman commented Jul 10, 2020

kyleabeauchamp commented Jul 11, 2020

rpgoldman commented Jul 12, 2020

michaelosthege commented Jul 12, 2020

rpgoldman commented Jul 12, 2020

codecov bot commented Jul 9, 2020 •

edited

Loading

kyleabeauchamp commented Jul 9, 2020 •

edited

Loading