Add icdf functions for distributions #6612

Open
16 of 38 tasks
Tracked by #7053
michaelraczycki opened this issue Mar 18, 2023 · 21 comments

Comments

@michaelraczycki
Contributor

michaelraczycki commented Mar 18, 2023

Description

We are looking for help to implement inverse cumulative distribution (ICDF) functions for our distributions!

How to help?

This PR should give a template on how to implement and test new icdf functions for distributions: #6528

ICDF functions allow users to get the value associated with a specific cumulative probability.

So far we've added two examples for continuous distributions:

  • Uniform:
    def icdf(value, lower, upper):
        res = lower + (upper - lower) * value
        res = check_icdf_value(res, value)
        return check_icdf_parameters(res, lower < upper)
  • Normal:
    def icdf(value, mu, sigma):
        res = mu + sigma * -np.sqrt(2.0) * pt.erfcinv(2 * value)
        res = check_icdf_value(res, value)
        return check_icdf_parameters(
            res,
            sigma > 0,
            msg="sigma > 0",
        )
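
As a rough plain-Python illustration of the two formulas above (standard library only, no PyTensor; the helper names are hypothetical and not part of PyMC), the sketch below checks that each quantile function undoes the corresponding CDF. The Normal case uses `statistics.NormalDist.inv_cdf` as a stand-in, since `erfcinv` is not available in the standard library.

```python
import math
from statistics import NormalDist


def uniform_icdf(q, lower, upper):
    # Same closed form as the Uniform example: linear interpolation in q.
    return lower + (upper - lower) * q


def uniform_cdf(x, lower, upper):
    return (x - lower) / (upper - lower)


def normal_cdf(x, mu, sigma):
    # Normal CDF written with erfc; solving it for x gives
    # x = mu - sigma * sqrt(2) * erfcinv(2 * q), the Normal formula above.
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))


for q in (0.05, 0.5, 0.95):
    x_u = uniform_icdf(q, lower=-1.0, upper=3.0)
    assert math.isclose(uniform_cdf(x_u, -1.0, 3.0), q)

    x_n = NormalDist(mu=1.0, sigma=2.0).inv_cdf(q)
    assert math.isclose(normal_cdf(x_n, 1.0, 2.0), q, rel_tol=1e-9)
```

A round trip like this is only a sanity check; it does not exercise the parameter validation done by `check_icdf_parameters`.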

And an example for a discrete distribution:

  • Geometric:
    def icdf(value, p):
        res = pt.ceil(pt.log1p(-value) / pt.log1p(-p)).astype("int64")
        res = check_icdf_value(res, value)
        return check_icdf_parameters(
            res,
            0 <= p,
            p <= 1,
            msg="0 <= p <= 1",
        )
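
For a discrete distribution the ICDF returns the smallest integer whose cumulative probability reaches the requested value. A minimal plain-Python sketch of the Geometric formula above (hypothetical helper names, standard library only) makes that property explicit:

```python
import math


def geometric_icdf(q, p):
    # Smallest k >= 1 with 1 - (1 - p)**k >= q, via the closed form above.
    return math.ceil(math.log1p(-q) / math.log1p(-p))


def geometric_cdf(k, p):
    # P(X <= k), where X counts trials up to and including the first success.
    return 1.0 - (1.0 - p) ** k


p = 0.3
for q in (0.1, 0.5, 0.9):
    k = geometric_icdf(q, p)
    # k reaches probability q, and k - 1 does not.
    assert geometric_cdf(k, p) >= q > geometric_cdf(k - 1, p)
```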

Multiple sources describing the icdf function for any specific distribution can be found; you're free to choose whichever works for you. To start with, I recommend checking:

New tests have to be added in test_continuous.py for continuous distributions, and test_discrete.py for discrete ones. You can use existing tests as a template:

check_icdf(
    pm.Normal,
    {"mu": R, "sigma": Rplus},
    lambda q, mu, sigma: st.norm.ppf(q, mu, sigma),
)
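
The idea behind such a test is to compare a candidate ICDF against a trusted reference over a grid of parameters and probabilities. The sketch below is not PyMC's `check_icdf`; `compare_icdf` is a hypothetical helper, and `statistics.NormalDist` stands in for the SciPy reference so that the example needs only the standard library.

```python
import itertools
import math
from statistics import NormalDist


def compare_icdf(candidate, reference, param_grid, qs, rel_tol=1e-9):
    # Hypothetical helper: evaluate both functions on every combination
    # of parameter values and probabilities, and require agreement.
    names = list(param_grid)
    for values in itertools.product(*param_grid.values()):
        params = dict(zip(names, values))
        for q in qs:
            got = candidate(q, **params)
            want = reference(q, **params)
            assert math.isclose(got, want, rel_tol=rel_tol, abs_tol=1e-12), (q, params)
    return True


def normal_icdf(q, mu, sigma):
    # Location-scale form of the Normal quantile function.
    return mu + sigma * NormalDist().inv_cdf(q)


ok = compare_icdf(
    normal_icdf,
    lambda q, mu, sigma: NormalDist(mu, sigma).inv_cdf(q),
    {"mu": [-2.0, 0.0, 3.5], "sigma": [0.5, 1.0, 4.0]},
    qs=[0.01, 0.25, 0.5, 0.75, 0.99],
)
```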

Don't hesitate to ask any questions. You can grab as many distributions to implement icdf functions for as you want. Just make sure to write in this issue so that we can keep track of it.

Profit with your new open source KARMA!

The following distributions don't have an icdf method implemented:

Note that not all of the icdf equations have a closed-form solution, so it's recommended to start with the ones that do: they are easier to implement and will give other contributors templates that help them understand the topic. The list above is not final, and I'll try to update it to contain all distributions available for taking.

@gokuld
Contributor

gokuld commented Mar 23, 2023

Hi @michaelraczycki or anyone else reviewing this, this is my first attempt to contribute to PyMC, so please let me know if this PR makes sense.
If this turns out to be useful, I am happy to add ICDF functions for more distributions.

@michaelraczycki
Contributor Author

Hey @gokuld! Thank you for your contribution, it looks promising. For future reference, please add a comment under the issue to let others know that you're starting to work on a specific issue or part of it. This ensures that you're not working in parallel with someone else on the same part of the development.

@gokuld
Contributor

gokuld commented Mar 24, 2023

Hey @gokuld! Thank you for your contribution, it looks promising.

Thanks @michaelraczycki .

For future reference, please add a comment under the issue to let others know that you're starting to work on a specific issue or part of it. This ensures that you're not working in parallel with someone else on the same part of the development.

Sure, from now on I will post a comment when I start work, to avoid duplicating effort!

@gokuld
Contributor

gokuld commented Mar 31, 2023

Hey all, I am starting work on the ICDF for the continuous beta distribution. Let me know if anyone else is working on this already.
(@michaelraczycki)

@michaelraczycki
Contributor Author

@gokuld if there's no comment under this issue saying that someone reserves it you don't need to ask. Just call it and it's yours :)
Also, if for any reason you can't or don't want to work on the issue anymore, please let us know here.
Good luck!

@ricardoV94
Member

ricardoV94 commented Mar 31, 2023

@gokuld AFAICT the inverse CDF of the beta distribution doesn't have a closed form solution, so you would need an iterative algorithm which may not be trivial to write if you are not familiar with PyTensor. Ignore if you were aware of the fact ;)
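
To give a sense of what "iterative" means here, the sketch below inverts a CDF by bisection in plain Python. It is only an illustration under strong assumptions: it uses the special case Beta(2, 2), whose CDF is the polynomial 3x^2 - 2x^3, and hypothetical helper names. A general implementation would need the regularized incomplete beta function and gradients in PyTensor, which this does not address.

```python
def beta22_cdf(x):
    # CDF of Beta(2, 2): the integral of 6 * t * (1 - t) from 0 to x.
    return 3.0 * x**2 - 2.0 * x**3


def icdf_by_bisection(cdf, q, lo=0.0, hi=1.0, tol=1e-12):
    # Generic inversion of a continuous, increasing CDF on [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x = icdf_by_bisection(beta22_cdf, 0.5)
assert abs(x - 0.5) < 1e-9
assert abs(beta22_cdf(icdf_by_bisection(beta22_cdf, 0.9)) - 0.9) < 1e-9
```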

@gokuld
Contributor

gokuld commented Mar 31, 2023

@michaelraczycki sure! Thank you.

In addition I will also be implementing ICDFs for these:
pymc.distributions.continuous.Kumaraswamy
pymc.distributions.continuous.Exponential

@ricardoV94 Yes, however I discovered this only after I started working on the ICDF function for the beta distribution. I was about to create a betaincinv function in PyTensor. However, I will likely need some review of the approach and some help here (especially with implementing the gradient in PyTensor), and was about to open that as a draft PR.
If the iterative approach you mention turns out to be simpler to implement, I will go for it. I need to know more about that. Perhaps we can discuss this in the draft PR for the beta ICDF.

@james-2001
Contributor

I'll be picking up

  • Laplace
  • Pareto

Cheers 😄
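
Both of these have closed-form quantile functions. A plain-Python sketch (standard library only; the function names and the parameter names `mu`, `b`, `alpha`, `m` are chosen for illustration and are not taken from the PyMC implementation) with a round-trip check against the CDFs:

```python
import math


def laplace_icdf(q, mu, b):
    # Piecewise inverse of the Laplace CDF around the median mu.
    if q <= 0.5:
        return mu + b * math.log(2.0 * q)
    return mu - b * math.log(2.0 * (1.0 - q))


def laplace_cdf(x, mu, b):
    if x <= mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)


def pareto_icdf(q, alpha, m):
    # Inverse of 1 - (m / x)**alpha for x >= m.
    return m * (1.0 - q) ** (-1.0 / alpha)


def pareto_cdf(x, alpha, m):
    return 1.0 - (m / x) ** alpha


for q in (0.1, 0.5, 0.9):
    assert math.isclose(laplace_cdf(laplace_icdf(q, 1.0, 2.0), 1.0, 2.0), q)
    assert math.isclose(pareto_cdf(pareto_icdf(q, 3.0, 2.0), 3.0, 2.0), q)
```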

@gokuld
Contributor

gokuld commented May 4, 2023

I have no immediate plans to finish the work on the beta distribution; anyone else interested may pick it up.

@michaelraczycki
Contributor Author

Good luck @james-2001, and thank you for your contribution @gokuld!

james-2001 added a commit to james-2001/pymc that referenced this issue May 6, 2023
Adds ICDF (quantile) function for the laplace distribution. Source https://en.wikipedia.org/wiki/Laplace_distribution

Issue pymc-devs#6612
james-2001 added a commit to james-2001/pymc that referenced this issue May 6, 2023
Adds ICDF (quantile) function for the Pareto distribution. Source https://en.wikipedia.org/wiki/Pareto_distribution

Issue pymc-devs#6612
james-2001 added a commit to james-2001/pymc that referenced this issue May 6, 2023
Adds ICDF (quantile) function for the Pareto distribution. Source https://en.wikipedia.org/wiki/Pareto_distribution

Issue pymc-devs#6612
ricardoV94 pushed a commit that referenced this issue May 11, 2023
Adds ICDF (quantile) function for the laplace distribution. Source https://en.wikipedia.org/wiki/Laplace_distribution

Issue #6612
ricardoV94 pushed a commit that referenced this issue May 11, 2023
Adds ICDF (quantile) function for the Pareto distribution. Source https://en.wikipedia.org/wiki/Pareto_distribution

Issue #6612
@amyoshino
Member

I will work on the LogNormal :)

@amyoshino
Member

I'm tackling now:

  • Cauchy distribution
  • Logistic distribution

😄
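
For reference, both distributions have closed-form quantile functions. The following plain-Python sketch (standard library only; names and parameterization are assumptions for illustration, not the merged PyMC code) checks them with a round trip through the CDFs:

```python
import math


def cauchy_icdf(q, alpha, beta):
    # Inverse of 0.5 + atan((x - alpha) / beta) / pi.
    return alpha + beta * math.tan(math.pi * (q - 0.5))


def cauchy_cdf(x, alpha, beta):
    return 0.5 + math.atan((x - alpha) / beta) / math.pi


def logistic_icdf(q, mu, s):
    # Inverse of 1 / (1 + exp(-(x - mu) / s)): the scaled log-odds of q.
    return mu + s * math.log(q / (1.0 - q))


def logistic_cdf(x, mu, s):
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


for q in (0.1, 0.5, 0.9):
    assert math.isclose(cauchy_cdf(cauchy_icdf(q, 1.0, 2.0), 1.0, 2.0), q)
    assert math.isclose(logistic_cdf(logistic_icdf(q, 1.0, 2.0), 1.0, 2.0), q)
```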

@ricardoV94
Member

ricardoV94 commented Jun 16, 2023

This should suffice for all the "Half"Distributions:

def icdf(value, *args):
    return icdf(abs(Full.dist(*args)), value)

Where Full is the equivalent non-half version of the distribution (Normal for HalfNormal, StudentT for HalfStudentT, and so on).

Update: No, I don't think it will work, because our automatic cdf is also wrong for these
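
Independently of the automatic approach, the half-normal case can be written directly in terms of the standard Normal quantile function: for x >= 0 the CDF is 2 * Phi(x / sigma) - 1, so the ICDF is sigma * Phi^{-1}((1 + q) / 2). The plain-Python sketch below illustrates that relation with hypothetical helper names; it is not the pseudocode above and not PyMC code.

```python
import math
from statistics import NormalDist


def halfnormal_icdf(q, sigma):
    # Solve q = 2 * Phi(x / sigma) - 1 for x >= 0.
    return sigma * NormalDist().inv_cdf(0.5 * (1.0 + q))


def halfnormal_cdf(x, sigma):
    return 2.0 * NormalDist().cdf(x / sigma) - 1.0


for q in (0.1, 0.5, 0.9):
    x = halfnormal_icdf(q, sigma=2.0)
    assert x > 0.0
    assert math.isclose(halfnormal_cdf(x, 2.0), q, rel_tol=1e-9)
```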

@amyoshino
Member

amyoshino commented Jun 16, 2023

Thanks for the tips. I am going to work on the HalfCauchy and HalfNormal implementations and add them to the next PR, since I am already working on the LogNormal and just got the Cauchy icdf merged.

Once I get used to this approach I can try to adapt it to the other "Halfs".

@ricardoV94
Member

ricardoV94 commented Jun 23, 2023

Thanks for checking out the half-dist idea @amyoshino, you made me realize these don't work like other transforms and the automatic icdf should raise. I opened a PR to that effect: #6793

@amyoshino
Member

amyoshino commented Jun 24, 2023

Thanks for checking out the half-dist idea @amyoshino, you made me realize these don't work like other transforms and the automatic icdf should raise. I opened a PR to that effect: #6793

@ricardoV94 I'm glad I was able to help! 😄

@amyoshino
Member

amyoshino commented Jun 27, 2023

I will now work on the Triangular, Weibull, Gumbel and Moyal distributions :)
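
Of these, Weibull and Gumbel have particularly short closed forms. A plain-Python sketch follows (standard library only; it assumes `alpha` is the Weibull shape and `beta` the scale, and `mu`, `beta` the Gumbel location and scale; the function names are hypothetical):

```python
import math


def weibull_icdf(q, alpha, beta):
    # Inverse of 1 - exp(-(x / beta)**alpha); alpha is the shape, beta the scale.
    return beta * (-math.log1p(-q)) ** (1.0 / alpha)


def weibull_cdf(x, alpha, beta):
    return 1.0 - math.exp(-((x / beta) ** alpha))


def gumbel_icdf(q, mu, beta):
    # Inverse of exp(-exp(-(x - mu) / beta)).
    return mu - beta * math.log(-math.log(q))


def gumbel_cdf(x, mu, beta):
    return math.exp(-math.exp(-(x - mu) / beta))


for q in (0.1, 0.5, 0.9):
    assert math.isclose(weibull_cdf(weibull_icdf(q, 2.0, 3.0), 2.0, 3.0), q)
    assert math.isclose(gumbel_cdf(gumbel_icdf(q, 1.0, 2.0), 1.0, 2.0), q)
```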

@amyoshino
Member

It looks like the remaining ones have no closed form (at least none that I have found so far). I will try developing the icdf functions for them. It might take a while, but I will do my best to get up to speed as fast as possible.
So, to get some focus here, I will start with some that require the inverse regularized gamma function and the inverse regularized beta function, to get used to their implementation:

  • Gamma Distribution
  • ChiSquared Distribution
  • Beta Distribution
  • StudentT Distribution

@niknow

niknow commented Jun 15, 2024

I will work on Binomial now.

@fireddd

fireddd commented Sep 20, 2024

Hi, are there any more functions that need to be added here?

@ricardoV94
Member

Hi, are there any more functions that need to be added here?

I'm not 100% sure the list is up to date, but it suggests there are still missing distributions.
