Difference in beta support between pymc3 and scipy results in error in sample_prior_predictive() #3898

zack-kimble · 2020-04-30T20:01:15Z

Description of your problem

Please provide a minimal, self-contained, and reproducible example.

import numpy as np
import pymc3 as pm

idx = np.random.randint(0,1000,10000)

with pm.Model() as model:
    mu = pm.Beta('mu', alpha=.01, beta=.01, shape=1000)
    nu = pm.Gamma('nu', mu=30, sigma=5, shape=1000)
    theta = pm.Beta('theta', alpha=mu[idx] * nu[idx],  beta=(1-mu[idx])*nu[idx], shape=10000)
    
    
with model:
    prior_sample = pm.sample_prior_predictive()

Please provide the full traceback.

Traceback (most recent call last):

  File "<ipython-input-39-03487933ffab>", line 14, in <module>
    prior_sample = pm.sample_prior_predictive()

  File "/home/zack/miniconda3/envs/_/lib/python3.8/site-packages/pymc3/sampling.py", line 1495, in sample_prior_predictive
    values = draw_values([model[name] for name in names], size=samples)

  File "/home/zack/miniconda3/envs/_/lib/python3.8/site-packages/pymc3/distributions/distribution.py", line 617, in draw_values
    value = _draw_value(next_,

  File "/home/zack/miniconda3/envs/_/lib/python3.8/site-packages/pymc3/distributions/distribution.py", line 787, in _draw_value
    return param.random(point=point, size=size)

  File "/home/zack/miniconda3/envs/_/lib/python3.8/site-packages/pymc3/model.py", line 56, in __call__
    return getattr(self.obj, self.method_name)(*args, **kwargs)

  File "/home/zack/miniconda3/envs/_/lib/python3.8/site-packages/pymc3/distributions/continuous.py", line 1264, in random
    return generate_samples(stats.beta.rvs, alpha, beta,

  File "/home/zack/miniconda3/envs/_/lib/python3.8/site-packages/pymc3/distributions/distribution.py", line 960, in generate_samples
    samples = generator(size=dist_bcast_shape, *args, **kwargs)

  File "/home/zack/miniconda3/envs/_/lib/python3.8/site-packages/scipy/stats/_distn_infrastructure.py", line 966, in rvs
    raise ValueError("Domain error in arguments.")

ValueError: Domain error in arguments.

Please provide any additional information below.

I am using beta as a hyperprior for other beta distributions and parameterizing based on mean and sample size. As a result, if the random variable from the hyper prior is 0 or 1, alpha or beta will end up 0, which is outside their domain. pymc3.Beta has support on (0,1), so I haven't seen an issue with sampling posteriors. When running sample_prior_predictive(), I get a domain error, which I believe is because scipy.stats.beta().rvs() will happily return 0 or 1.

Maybe this is just because of truncation in float64 (1 is far more common than 0), but I noticed the scipy.stats.beta documentation does state support on [0,1]. Not sure whether or not that makes it a scipy issue instead, but thought it was worth mentioning here.

Versions and main components

PyMC3 Version: 3.8
Theano Version: 1.0.4
Python Version: 3.8.1
Operating system: Ubuntu 16.04
How did you install PyMC3: (conda/pip) conda-forge

AlexAndorra · 2020-05-01T08:55:11Z

Hi Zachary,
Thanks for reporting! This seems like a bug to me, as I'm under the impression that the Beta distribution's support should be [0, 1] and alpha and beta are both >= 0 (see here for instance). But maybe someone knows why alpha and beta must be > 0 in pm.Beta?

zack-kimble · 2020-05-01T17:20:28Z

Alpha and beta have to be > 0. The pdf of Beta contains Γ(α) and Γ(β). The gamma function is infinite at zero.

fonnesbeck · 2020-05-05T04:02:47Z

Some parameterizations/transformations have support on the closed interval, but for ours it should be open.

rpgoldman · 2020-05-05T23:14:37Z

Some parameterizations/transformations have support on the closed interval, but for ours it should be open.

@fonnesbeck Shouldn't the beta only be 0 when α = β = 1? This is the definition in the doc page for scipy.stats.beta:

And if x = 0, then x^{a - 1} can only be non-zero when a = 1.

@zack-kimble 's example shows (and I checked this) that scipy.stats.beta(a, b).rvs() can return 0 when a = b = 0.1, which seems wrong.

This is not sufficiently my thing that I don't feel confident reporting this to the scipy.stats project -- @zack-kimble would you do that?

I also don't know what we should do about this. Writing a high performance RNG for betas using numpy is really not my thing, but if someone has one that doesn't have this pathology, I'd be happy to try to plug it in.

Alternatively, should we just return np.amax(samples, np.finfo(float).eps) where we previously returned samples? That's a quick and brutal fix.

fonnesbeck · 2020-05-13T02:12:15Z

When alpha and beta are 1, the beta distribution just becomes a uniform distribution.

rpgoldman · 2020-05-13T04:42:13Z

@fonnesbeck I have replicated this issue, though, and we can get x = 0 when alpha and beta are 0.1, which is definitely wrong.

@zack-kimble We are using scipy.stats.beta so I think yes, you should report this as a bug to them. I was easily able to replicate the issue invoking their beta random generator with a=b=0.01 as in your example, without any involvement of PyMC3. Please keep us posted!

lucianopaz · 2020-05-13T21:57:20Z

@zack-kimble, we've managed to identify that this is a problem caused by numpy.random.beta, and have opened an issue and a PR over at numpy. If the numpy devs agree with us that the beta's pdf domain should be the open (0, 1) interval, the fix will only become available once numpy makes a new release.

zack-kimble · 2020-05-14T02:06:27Z

@lucianopaz, @rpgoldman Thanks for tracking it back to numpy and opening a PR there. Sorry I didn't follow up myself there earlier!

lucianopaz · 2020-05-14T21:57:44Z

After some discussion with the numpy devs, their random generator will remain as is, and we will clip the values of the samples it generates to lie in the open (0, 1) interval in pymc3. I hope to be able to open a PR tomorrow.

lucianopaz mentioned this issue May 13, 2020

Beta random number generator can produce values outside its domain numpy/numpy#16230

Closed

lucianopaz self-assigned this May 14, 2020

lucianopaz mentioned this issue May 15, 2020

Cap the values that Beta.random can generate. #3924

Merged

5 tasks

AlexAndorra closed this as completed in #3924 May 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Difference in beta support between pymc3 and scipy results in error in sample_prior_predictive() #3898

Difference in beta support between pymc3 and scipy results in error in sample_prior_predictive() #3898

zack-kimble commented Apr 30, 2020 •

edited

Loading

AlexAndorra commented May 1, 2020

zack-kimble commented May 1, 2020

fonnesbeck commented May 5, 2020

rpgoldman commented May 5, 2020

fonnesbeck commented May 13, 2020

rpgoldman commented May 13, 2020

lucianopaz commented May 13, 2020

zack-kimble commented May 14, 2020

lucianopaz commented May 14, 2020

Difference in beta support between pymc3 and scipy results in error in sample_prior_predictive() #3898

Difference in beta support between pymc3 and scipy results in error in sample_prior_predictive() #3898

Comments

zack-kimble commented Apr 30, 2020 • edited Loading

Versions and main components

AlexAndorra commented May 1, 2020

zack-kimble commented May 1, 2020

fonnesbeck commented May 5, 2020

rpgoldman commented May 5, 2020

fonnesbeck commented May 13, 2020

rpgoldman commented May 13, 2020

lucianopaz commented May 13, 2020

zack-kimble commented May 14, 2020

lucianopaz commented May 14, 2020

zack-kimble commented Apr 30, 2020 •

edited

Loading