Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Theano error when only 1 value is masked #3122

Closed
JackCaster opened this issue Jul 28, 2018 · 9 comments
Closed

Theano error when only 1 value is masked #3122

JackCaster opened this issue Jul 28, 2018 · 9 comments

Comments

@JackCaster
Copy link
Contributor

This issues follows the discussion on discourse. I encountered an error when I have a numpy masked array with only 1 observation missing. In this case, Theano raises the error:

TypeError: Cannot convert Type TensorType(int64, vector) (of Variable obs_t_minus_1_missing_missing_shared__) into Type TensorType(int64, (True,)). You can try to manually convert obs_t_minus_1_missing_missing_shared__ into a TensorType(int64, (True,)).

Instead, if I have 2 obervations missing, everything works fine. I have attached a Jupyter notebook to recreate the problem (link). In the notebook change the variable NUM_MISSING_DATAPOINTS to 1 or 2 to investigate the issue.

PyMC3 version 3.5
Theano version 1.0.2

@junpenglao
Copy link
Member

Thanks. Oh the good old shape problem...

@mattpitkin
Copy link
Contributor

mattpitkin commented Jan 9, 2019

I think I've found a fix to this (although it may be a bit of a hack) that works for me. In model.py, if I add the following after line 439:

if isinstance(var.tag.test_value, np.ndarray):
    if len(var.tag.test_value) == 1:
        shared.type = theano.tensor.TensorType(var.dtype, (True,))

the problem is solved. I don't know if this could introduce any other issue, but I think it should be ok.

@junpenglao should I make a PR with this change?

mattpitkin added a commit to mattpitkin/pymc3 that referenced this issue Jan 9, 2019
 - this patch fixes an issue highlighted in pymc-devs#3122
   where imputation of a single missing observation fails.
   It implements the suggestion from the theano error message
   to manual force a TensorType change in cases where the
   variable has a length of one.
mattpitkin added a commit to mattpitkin/pymc3 that referenced this issue Jan 9, 2019
 - this patch fixes an issue highlighted in pymc-devs#3122
   where imputation of a single missing observation fails.
   It implements the suggestion from the theano error message
   to manual force a TensorType change in cases where the
   variable has a length of one.
@twiecki
Copy link
Member

twiecki commented Jan 10, 2019

@mattpitkin Sure, PRs are welcome. It does seem like a hack but maybe OK to get around a corner case.

@lucianopaz
Copy link
Contributor

@mattpitkin, I think that what you propose is a bit dangerous because it applies a 1d tag.test_value to anything. If it did so to a matrix or higher ranked tensor, then I imagine all sorts of errors would pop out. I think that this change should be constrained to as_tensor, where the missing values are handled. Maybe just adding a np.atleast_1d here could do the trick.

@mattpitkin
Copy link
Contributor

@lucianopaz I'd done some playing around with setting things in as_tensor, but my approaches hadn't worked there. I'll test using np.atleast_1d though to see if fixes the problem.

@mattpitkin
Copy link
Contributor

mattpitkin commented Jan 10, 2019

I think I've figured out the cause of the problem. When the fakedist variable in as_tensor is created, the distribution's TensorType is set by the pymc3-defined TensorType here, rather than theano's TensorType class. For a 1d int64 array with a single value this sets the distributions TensorType to:

TensorType(int64, (True,))

whereas for a 1d array containing more than one value this sets the distributions TensorType to:

TensorType(int64, vector)

This is fine... until in the ValueGradFunction class the shared variables are being set. Here, it takes the variable's test_value (which for our problem will be a 1d np.array of length 1), and uses theano's shared class. When this converts the array to a theano tensor variable it obviously isn't using pymc3's TensorType definition, and the shared variable is created with the type:

TensorType(int64, vector)

So, the variable's distribution and its test value have incompatible TensorTypes, which causes this problem.

Therefore, I don't think that this can be fixed in as_tensor (it seems you can't set the testval as a tensor value and change it's TensorType using the pymc3 TensorType definition, because in ValueGradFunction it uses the variable's tag value, which is the numpy array (and shared has to be passed a numerical array and not a tensor anyway!)), and I imaging you wouldn't want to change pymc3's TensorType definition. But, a potentially safer fix than mine current one in ValueGradFunction might be (after line 439):

from .distributions import TensorType  # have this at the start of the class
if hasattr(var.tag.test_value, 'shape'):
    testtype = TensorType(var.dtype, var.tag.test_value.shape)

    # fix the type
    if testtype != shared.type:
        shared.type = testtype

@lucianopaz @twiecki what are your thoughts on this option? I can test this and update my PR #3335.

@mattpitkin
Copy link
Contributor

The above change, works in my own basic tests, but I'll have to wait and see if #3335 passes the Travis CI build tests.

@mattpitkin
Copy link
Contributor

mattpitkin commented Jan 11, 2019

A different option would be to make a change in as_tensor (after line 1271) like the following:

if data.mask.sum() == 1:
    broadcastable = [False] # manually set as vector
else:
    broadcastable = None

fakedist = NoDistribution.dist(shape=data.mask.sum(), dtype=dtype, testval=testval, parent_dist=distribution, broadcastable=broadcastable)

But, this means that single values will be set as a vector, which I assume pymc3's own TensorType definition was put in to avoid.

mattpitkin added a commit to mattpitkin/pymc3 that referenced this issue Jan 14, 2019
mattpitkin added a commit to mattpitkin/pymc3 that referenced this issue Jan 15, 2019
twiecki pushed a commit that referenced this issue Jan 15, 2019
* model.py: fix issue with length 1 shared variables
 - this patch fixes an issue highlighted in #3122
   where imputation of a single missing observation fails.
   It implements the suggestion from the theano error message
   to manual force a TensorType change in cases where the
   variable has a length of one.

* model.py: add a comment about the previous change

* model.py: extra check to deal with test errors

* model.py: changes to the length 1 fix

* test_model.py: check shared tensor type conversion in ValueGradFunction

 - test the error described in #3122 and fixed in #3335.

* RELEASE-NOTES.md: added mention of fix to #3122

* Update RELEASE-NOTES.md

fix typo

Co-Authored-By: mattpitkin <m@ttpitk.in>
@junpenglao
Copy link
Member

Close by #3335

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants