Custom distribution's random method fails if it depends on transformed RVs #2900

Closed
lucianopaz opened this issue Mar 14, 2018 · 0 comments
lucianopaz commented Mar 14, 2018

Description of your problem

When trying to define a custom Continuous distribution which depends on random variables that are automatically transformed to *_log__ or *_logodds__ inside a Model, pymc3.draw_values fails.

Please provide a minimal, self-contained, and reproducible example.

Provided in the following gist

The example that I show uses a custom normally distributed variable c that depends on the variables a and b, which are automatically transformed.

Please provide the full traceback.

bar.c.random() failed with the following traceback
Traceback (most recent call last):
  File "test_pymc3_custom_distribution.py", line 39, in <module>
    bar.c.random()
  File "/usr/local/lib/python2.7/dist-packages/pymc3/model.py", line 41, in __call__
    return getattr(self.obj, self.method_name)(*args, **kwargs)
  File "test_pymc3_custom_distribution.py", line 16, in random
    a, b = pm.distributions.draw_values([self.a, self.b], point=point)
  File "/usr/local/lib/python2.7/dist-packages/pymc3/distributions/distribution.py", line 218, in draw_values
    givens[name] = (node, _draw_value(node, point=point))
  File "/usr/local/lib/python2.7/dist-packages/pymc3/distributions/distribution.py", line 285, in _draw_value
    func = _compile_theano_function(param, variables)
  File "/usr/local/lib/python2.7/dist-packages/pymc3/memoize.py", line 20, in memoizer
    cache[key] = obj(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pymc3/distributions/distribution.py", line 247, in _compile_theano_function
    allow_input_downcast=True)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 317, in function
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 486, in pfunc
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1839, in orig_function
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1487, in __init__
    accept_inplace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 181, in std_fgraph
    update_mapping=update_mapping)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/fg.py", line 175, in __init__
    self.__import_r__(output, reason="init")
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/fg.py", line 356, in __import_r__
    raise MissingInputError("Undeclared input", variable=variable)
MissingInputError: Undeclared input

Testing pm.distributions.draw_values([c])
pm.distributions.draw_values([bar.c]) failed with the following traceback
Traceback (most recent call last):
  File "test_pymc3_custom_distribution.py", line 49, in <module>
    pm.distributions.draw_values([bar.c])
  File "/usr/local/lib/python2.7/dist-packages/pymc3/distributions/distribution.py", line 221, in draw_values
    values.append(_draw_value(param, point=point, givens=givens.values()))
  File "/usr/local/lib/python2.7/dist-packages/pymc3/distributions/distribution.py", line 279, in _draw_value
    return param.random(point=point, size=None)
  File "/usr/local/lib/python2.7/dist-packages/pymc3/model.py", line 41, in __call__
    return getattr(self.obj, self.method_name)(*args, **kwargs)
  File "test_pymc3_custom_distribution.py", line 16, in random
    a, b = pm.distributions.draw_values([self.a, self.b], point=point)
  File "/usr/local/lib/python2.7/dist-packages/pymc3/distributions/distribution.py", line 218, in draw_values
    givens[name] = (node, _draw_value(node, point=point))
  File "/usr/local/lib/python2.7/dist-packages/pymc3/distributions/distribution.py", line 285, in _draw_value
    func = _compile_theano_function(param, variables)
  File "/usr/local/lib/python2.7/dist-packages/pymc3/memoize.py", line 20, in memoizer
    cache[key] = obj(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pymc3/distributions/distribution.py", line 247, in _compile_theano_function
    allow_input_downcast=True)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 317, in function
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 486, in pfunc
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1839, in orig_function
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1487, in __init__
    accept_inplace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 181, in std_fgraph
    update_mapping=update_mapping)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/fg.py", line 175, in __init__
    self.__import_r__(output, reason="init")
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/fg.py", line 356, in __import_r__
    raise MissingInputError("Undeclared input", variable=variable)
MissingInputError: Undeclared input

Please provide any additional information below.

After taking a closer look at what happens when draw_values is called inside Foo.random, it appears that in the call draw_values([self.a, self.b]), the random methods of self.a and self.b never get called, because the named nodes Bar_a_log__ and Bar_b_logodds__ are found first. As a result, drawing a value of a or b first tries to draw values of Bar_a_log__ and Bar_b_logodds__ to use as givens. When _draw_value is called on Bar_a_log__, this variable has its random attribute set to None, so it is interpreted as a theano function with no inputs given to it, which raises the MissingInputError shown above.
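
The mechanism can be illustrated with a small pure-Python toy that does not use PyMC3 or Theano. Everything in it (`Node`, `draw_value`, `named_inputs`, `draw_values_old`, and the local `MissingInputError`) is a hypothetical stand-in written for this illustration, not the library's code. It only mimics the order of operations described above: named inputs are drawn before the requested variable's own random method is consulted.

```python
class MissingInputError(Exception):
    """Toy stand-in for Theano's MissingInputError (not the real class)."""


class Node:
    """Toy graph node. `random` is a callable or None; `parents` are named inputs."""

    def __init__(self, name, random=None, parents=()):
        self.name = name
        self.random = random
        self.parents = tuple(parents)


def draw_value(node, point=None):
    """Toy analogue of _draw_value for a single node."""
    point = point or {}
    if node.name in point:
        return point[node.name]
    if node.random is not None:
        return node.random()
    # No value in `point` and no random method: the node would have to be
    # evaluated as a function, but a root node has nothing to compute it from.
    raise MissingInputError("Undeclared input")


def named_inputs(node):
    """Collect every named ancestor of `node`."""
    found = {}
    for parent in node.parents:
        found[parent.name] = parent
        found.update(named_inputs(parent))
    return found


def draw_values_old(params, point=None):
    """Toy analogue of the reported behaviour: named inputs are drawn first."""
    for param in params:
        for node in named_inputs(param).values():
            draw_value(node, point=point)  # fails for Bar_a_log__
    return [draw_value(p, point=point) for p in params]


# Bar_a_log__ plays the transformed free variable: random is None, no inputs.
a_log = Node("Bar_a_log__")
# Bar_a has a usable random method, but it is never reached.
a = Node("Bar_a", random=lambda: 1.5, parents=[a_log])

try:
    draw_values_old([a])
    failed = False
except MissingInputError:
    failed = True
```

In this toy, supplying a value for `Bar_a_log__` through `point` avoids the error, because the lookup in `point` happens before the node's `random` attribute is checked.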

Versions and main components

  • PyMC3 Version: 3.3
  • Theano Version: 1.0.1
  • Python Version: Both 2.7.12 and 3.5.2
  • Operating system: Ubuntu 16.04
  • How did you install PyMC3: pip
junpenglao pushed a commit that referenced this issue Mar 21, 2018
* Fix for #2900. Changed the way in which draw_values handles the named node-inputs. Now the tree dependence is constructed to set the givens dict.

* Fixed conflicts

* Fixed more conflicts

* Fixed typo

* Changed test_dep_vars to test for successful draws even in the cases of dependent variables.

* Removed comments from test_random.py, and distribution.py. Added content to RELEASE-NOTES. Fixed bug in the interaction between draw_values, _draw_value and _compile_theano_function. In some cases, draw_values would set an item of the givens dictionary to a theano.tensor.TensorConstant. In _draw_value(param, ...), if param was a theano.tensor.TensorVariable without a random method, and not set in point, _compile_theano_function would be called, using as one of its variables a theano.tensor.TensorConstant. This led to TypeError: ("Constants not allowed in param list",...) exceptions being raised. The fix was to skip the inclusion into the givens dictionary of named nodes that were instances of theano.tensor.TensorConstant, because their value would already be available for theano during the function compilation.

* Fixed another bug which was similar to the theano.tensor.TensorConstant one, but it occurred on theano.tensor.sharedvar.SharedVariable instances. The error that was raised was similar: SharedVariables cannot be supplied as raw input to theano.function. The fix was the same as for TensorConstants: skip them when constructing the givens dictionary.

* Guarded against a potential bug. In draw_values, when skipping TensorConstant and SharedVariable types, these nodes could be added to the stack again later because their names would not be in givens.keys(). To counter that, a separate set, `stored`, with the names of nodes that are either stored in givens or whose values should be available to theano.function, is used to choose which nodes to add to the stack.

* Syntax change based on twiecki's comment.

* Extended RELEASE-NOTES.md to also mention the sharedvar.SharedVariable fix.
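
The traversal these commit notes describe (a stack of nodes, a `givens` dictionary, and a `stored` set that also covers constants and shared variables) might look roughly like the pure-Python sketch below. It is an illustration under assumptions, not the code that was merged: `Node`, its `kind` field, `draw_one`, `draw_values_sketch`, and the local `MissingInputError` are all hypothetical names, and no PyMC3 or Theano calls are used.

```python
class MissingInputError(Exception):
    """Toy stand-in for Theano's MissingInputError (not the real class)."""


class Node:
    """Toy graph node; `kind` is "constant", "shared", or "variable"."""

    def __init__(self, name, kind, value=None, random=None, parents=(), fn=None):
        self.name = name
        self.kind = kind
        self.value = value        # fixed value for constants and shared variables
        self.random = random      # callable returning a draw, or None
        self.parents = tuple(parents)
        self.fn = fn              # deterministic function of the parent values


def draw_one(node, point, givens):
    """Draw one node, or raise MissingInputError if an input is still unknown."""
    if node.name in point:
        return point[node.name]
    if node.random is not None:
        return node.random()
    args = []
    for parent in node.parents:
        if parent.kind in ("constant", "shared"):
            args.append(parent.value)          # value is already available
        elif parent.name in givens:
            args.append(givens[parent.name])
        else:
            raise MissingInputError(parent.name)
    if node.fn is None:
        raise MissingInputError(node.name)
    return node.fn(*args)


def draw_values_sketch(params, point=None):
    point = point or {}
    givens = {}       # name -> drawn value
    stored = set()    # names in givens, plus names whose values need no draw
    stack = list(params)
    while stack:
        node = stack.pop()
        if node.name in stored:
            continue
        if node.kind in ("constant", "shared"):
            stored.add(node.name)              # skipped: never placed in givens
            continue
        try:
            givens[node.name] = draw_one(node, point, givens)
            stored.add(node.name)
        except MissingInputError:
            pending = [p for p in node.parents if p.name not in stored]
            if not pending:
                raise                          # nothing left that could help
            stack.append(node)                 # retry once its inputs are drawn
            stack.extend(pending)
    return [givens[p.name] if p.name in givens else p.value for p in params]


# Hypothetical example: Bar_c has no random method and depends on Bar_a and a constant.
a_log = Node("Bar_a_log__", "variable")
a = Node("Bar_a", "variable", random=lambda: 1.5, parents=[a_log])
scale = Node("scale", "constant", value=2.0)
c = Node("Bar_c", "variable", parents=[a, scale], fn=lambda x, s: x * s)

result = draw_values_sketch([c])
```

Because `stored` is checked before a node is pushed or processed, a skipped constant is not revisited, and `Bar_a_log__` is never drawn since `Bar_a` can be sampled through its own `random` method.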