Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation of some distributions is confusing #3688

Closed
michaelosthege opened this issue Nov 21, 2019 · 4 comments · Fixed by #3837
Closed

Documentation of some distributions is confusing #3688

michaelosthege opened this issue Nov 21, 2019 · 4 comments · Fixed by #3837

Comments

@michaelosthege
Copy link
Member

Description of your problem

Most distributions support multiple parameterizations that are hard to grasp from the little description that we have. In some cases the scipy.stats documentation is not helping either.

Proposal / Ideas for improvement

  • state what the default parameterization would be
  • show examples with both (equal) parameterizations
  • link to the scipy.stats distribution that we use internally
  • link to the PyMC3 source code of the distribution (because that's what ultimately counts)
  • maybe we should insert , *, to the kwargs at some point so parameterization becomes explicit?
@michaelosthege michaelosthege changed the title Documentation of distributions is confusing Documentation of some distributions is confusing Nov 21, 2019
@svjunkie
Copy link

svjunkie commented Dec 4, 2019

@michaelosthege, can you provide some examples of distributions that could use clearer documentation? I reviewed some of the continuous distributions and found the following:

  • Normal distributions in the Normal family define the relationship between sd and tau; they use the mean-sd parameterization in the pdf definition
  • Beta distribution uses alpha and beta parameters in its definition of the pdf, followed by formulas to convert to mean-sd parameterization
  • studentT distribution accepts arguments named either sigma or sd, but if sd is given, that value is assigned to sigma as shown in source here, i.e. they're equivalent. I don't see that reflected in the documentation; are you suggesting making this explicit in the docs?

After exploring a bit, I'm curious - are you asking to make explicit which parameters take precedence if two parameters describing the same characteristic of a distribution are given?

This wasn't an exhaustive review, but it'd be helpful to understand which distributions you're referencing to help focus additional efforts

@lucianopaz
Copy link
Contributor

@svjunkie, the parameter name sd is deprecated, and should be replaced by sigma everywhere. If you find places without a deprecation warning, it would be great if you added it. The deprecation was done two releases ago, I think, so this could be the last release that still hassd on the distribution's signature.

@michaelosthege
Copy link
Member Author

If I remember correctly, I found the Lognormal quite confusing.

Ahanmr added a commit to Ahanmr/pymc3 that referenced this issue Mar 19, 2020
Ahanmr added a commit to Ahanmr/pymc3 that referenced this issue Mar 19, 2020
@Ahanmr
Copy link
Contributor

Ahanmr commented Mar 19, 2020

@lucianopaz I have added deprecationWarning whenever sd is passes into Normal function or any continuous, mixture or timeseries distribution function where sd is still an argument and sd is passed instead of sigma. Please do review it and let me know if there are any changes to make. Thanks.

@junpenglao junpenglao linked a pull request Mar 19, 2020 that will close this issue
twiecki pushed a commit that referenced this issue Mar 19, 2020
* partially fixes #3688

* fixes #3688 sd deprecated to sigma
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants