Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shape issue when prior predictive sampling from ZeroInflatedPoisson #3310

Closed
twiecki opened this issue Dec 18, 2018 · 3 comments · Fixed by #3319
Closed

Shape issue when prior predictive sampling from ZeroInflatedPoisson #3310

twiecki opened this issue Dec 18, 2018 · 3 comments · Fixed by #3319

Comments

@twiecki
Copy link
Member

twiecki commented Dec 18, 2018

with pm.Model() as model:
    θ = pm.Beta('θ', alpha=1, beta=1)
    ψ = pm.HalfNormal('ψ', sd=1)
    s = pm.ZeroInflatedPoisson('suppliers', psi=ψ, theta=θ, 
                               shape=20)
    gen_data = pm.sample_prior_predictive(samples=5000)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-aef3a9b7b63a> in <module>()
      4     s = pm.ZeroInflatedPoisson('suppliers', psi=ψ, theta=θ, 
      5                                shape=20)
----> 6     gen_data = pm.sample_prior_predictive(samples=5000)

~/working/projects/pymc/pymc3/sampling.py in sample_prior_predictive(samples, model, vars, random_seed)
   1323     names = get_default_varnames(model.named_vars, include_transformed=False)
   1324     # draw_values fails with auto-transformed variables. transform them later!
-> 1325     values = draw_values([model[name] for name in names], size=samples)
   1326 
   1327     data = {k: v for k, v in zip(names, values)}

~/working/projects/pymc/pymc3/distributions/distribution.py in draw_values(params, point, size)
    374                     # the stack of nodes to try to draw from. We exclude the
    375                     # nodes in the `params` list.
--> 376                     stack.extend([node for node in named_nodes_parents[next_]
    377                                   if node is not None and
    378                                   node.name not in drawn and

~/working/projects/pymc/pymc3/distributions/distribution.py in _draw_value(param, point, givens, size)
    468                 # shape inspection for ObservedRV
    469                 dist_tmp = param.distribution
--> 470                 try:
    471                     distshape = param.observations.shape.eval()
    472                 except AttributeError:

~/working/projects/pymc/pymc3/model.py in __call__(self, *args, **kwargs)
     41 
     42     def __call__(self, *args, **kwargs):
---> 43         return getattr(self.obj, self.method_name)(*args, **kwargs)
     44 
     45 

~/working/projects/pymc/pymc3/distributions/discrete.py in random(self, point, size)
    848                              dist_shape=self.shape,
    849                              size=size)
--> 850         return g * (np.random.random(np.squeeze(g.shape)) < psi)
    851 
    852     def logp(self, value):

ValueError: operands could not be broadcast together with shapes (5000,20) (5000,) 
@twiecki
Copy link
Member Author

twiecki commented Dec 18, 2018

CC @lucianopaz

@lucianopaz
Copy link
Contributor

theta and psi are scalars but when we draw 5000 samples from their distribution, we get them as 1D arrays of shape (5000,) instead of (5000, 1) which would broadcast well. I'll check to see what we need to do to get these things to work

@lucianopaz
Copy link
Contributor

This is kind of a recurrent problem that we face with distributions that inside of their random method do many calls to different random number generators apart from the main generate_samples. All ZeroInflated* classes have the same problem. The Mixture class has a similar but harder problem (#3270). The Multinomial class had a similar problem (#3271). The thing is that generate_samples does some logic on how things should broadcast and returns samples of the correct shape in the end, but the random method should also rebroadcast the distribution parameters being aware of the size input. I'll try to make a more generic fix than just patching up ZeroInflatedPoisson.

lucianopaz added a commit to lucianopaz/pymc that referenced this issue Dec 21, 2018
…ps broadcasting multiple rvs calls with different size and distribution parameter shapes. Added shape guards to other continuous distributions.
twiecki pushed a commit that referenced this issue Dec 22, 2018
* Fixed #3310. Added broadcast_distribution_samples, which helps broadcasting multiple rvs calls with different size and distribution parameter shapes. Added shape guards to other continuous distributions.

* Fixed broken continuous distributions. Did not notice that _random got a parameter shape aware size input thanks to generate_samples.

* Fixed lint error

* Addressed comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants