Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding new notebook for conjugate sampling #4199

Merged
merged 11 commits into from
Nov 19, 2020
Merged

Adding new notebook for conjugate sampling #4199

merged 11 commits into from
Nov 19, 2020

Conversation

ckrapu
Copy link
Contributor

@ckrapu ckrapu commented Oct 28, 2020

This is a short piece of documentation showing how to combine NUTS with Gibbs sampling from a closed form posterior for a hierarchical Dirichlet multinomial model.

I acknowledge this may be a somewhat controversial item to include in the documentation because of the frequency with which users inquire about replacing NUTS with something that draws samples more rapidly despite being less efficient in ESS / second. In the particular case that is discussed in this notebook, it happens to be more efficient to do alternating NUTS / conjugate sampling updates in a compound Gibbs sampling scheme.

Any thoughts?

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@MarcoGorelli
Copy link
Contributor

I can't (yet) pretend to be able to comment on whether it's a good idea to "combine NUTS with Gibbs sampling from a closed form posterior for a hierarchical Dirichlet multinomial model" (maybe attending PyMCon will help 😄 ), but I'll just leave a quick comment regarding the CI failure (in case it's not clear):

nbqa-black...............................................................Failed
- hook id: nbqa-black
- files were modified by this hook

you can fix it with

pre-commit run nbqa-black --files docs/source/notebooks/sampling_conjugate_step.ipynb

@ckrapu
Copy link
Contributor Author

ckrapu commented Oct 29, 2020

pre-commit run nbqa-black --files docs/source/notebooks/sampling_conjugate_step.ipynb

Ah darn it - I didn't use the pre-commit process right. Thanks for the command.

@codecov
Copy link

codecov bot commented Oct 29, 2020

Codecov Report

Merging #4199 (193266c) into master (c5c9a14) will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4199      +/-   ##
==========================================
+ Coverage   88.93%   88.95%   +0.01%     
==========================================
  Files          92       92              
  Lines       14806    14806              
==========================================
+ Hits        13168    13170       +2     
+ Misses       1638     1636       -2     
Impacted Files Coverage Δ
pymc3/distributions/continuous.py 93.16% <0.00%> (+0.23%) ⬆️

@ricardoV94
Copy link
Member

Might be interesting to mention if anything changes with the introduction of the Dirichlet-Multinomial distribution (PR: #3639). Does having an explicit compound distribution change performance to a similar extent as this strategy?

@ckrapu
Copy link
Contributor Author

ckrapu commented Nov 16, 2020

Might be interesting to mention if anything changes with the introduction of the Dirichlet-Multinomial distribution (PR: #3639). Does having an explicit compound distribution change performance to a similar extent as this strategy?

These two topics are closely related but this notebook shows how to sample from the Dirichlet random vector used to parameterize the multinomial distribution while the compound distribution marginalizes over that random vector instead. It is probably worth updating this notebook to mention that using the DCM is likely much faster if you are not interested in the low-level Dirichlet draws.

Copy link
Contributor

@AlexAndorra AlexAndorra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good @ckrapu! Thanks for the initiative, and sorry for the long time to review, the last few weeks have been hectic...
I like the way you introduce, contextualize and explain the problem 👌 It's almost ready to go; I just left some comments and suggestions below -- feel free to ask if anything's unclear 😉
Note you'll have to merge master into this branch to fix the conflict.

Regarding the Dirichlet-Multinomial PR, I agree with you:

It is probably worth updating this notebook to mention that using the DCM is likely much faster if you are not interested in the low-level Dirichlet draws.

BTW this PR has unfortunately stalled, so if you or @ricardoV94 feel like pushing it over the finish line that'd really be a great addition 🤩

@@ -0,0 +1,634 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are often constrained by the of the worst-performing


Reply via ReviewNB

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot this typo ;)

Copy link
Contributor

@AlexAndorra AlexAndorra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ckrapu, that looks really good now! Just one last typo you forgot and one last question, and it'll be good to go 😉

@ckrapu
Copy link
Contributor Author

ckrapu commented Nov 18, 2020

Thanks @ckrapu, that looks really good now! Just one last typo you forgot and one last question, and it'll be good to go 😉

Thanks for the review! I really appreciate the attention to detail.

Copy link
Contributor

@AlexAndorra AlexAndorra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good, thanks for this nice notebook @ckrapu ! I'll merge once tests pass 🎉

PS: I relaunched some of the jobs, which weirdly failed with a very cryptic message 🤷‍♂️

There was an error in the .travis.yml file from which we could not recover.
Unfortunately, we do not know much about this error.
Please review https://docs.travis-ci.com, or contact us at support@travis-ci.com with the error ID: 38be43ecc95a469f8a3b50222008d246

@twiecki
Copy link
Member

twiecki commented Nov 18, 2020

Should add this to release notes too, it's a good one.

@AlexAndorra
Copy link
Contributor

Ah right! Will merge once you've added that then @ckrapu 😉

@twiecki twiecki merged commit ec0ee58 into pymc-devs:master Nov 19, 2020
@twiecki
Copy link
Member

twiecki commented Nov 19, 2020

Thanks @ckrapu, this is a great contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants