Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Last dimension is dropped by random when size is one #3896

Closed
huard opened this issue Apr 28, 2020 · 26 comments
Closed

Last dimension is dropped by random when size is one #3896

huard opened this issue Apr 28, 2020 · 26 comments
Labels

Comments

@huard
Copy link

huard commented Apr 28, 2020

Description of your problem

The last dimension is squeezed when generating random samples. This creates a problem when converting posterior predictive samples to DataArray objects.

In the following example, I would expect the second case to have shape (10, 1, 2, 1).

>>> pm.distributions.Uniform.dist(lower=np.zeros((1,2,3)), upper=np.ones((1,2,3))).random(size=10).shape
(10, 1, 2, 3)

>>> pm.distributions.Uniform.dist(lower=np.zeros((1,2,1)), upper=np.ones((1,2,1))).random(size=10).shape
(10, 1, 2)

Versions and main components

  • PyMC3 Version: 3.8
  • Theano Version: 1.0.4
  • Python Version: 3.6.8
  • Operating system: Ubuntu
  • How did you install PyMC3: pip
@twiecki
Copy link
Member

twiecki commented Apr 29, 2020

Hey David, I'll ping @lucianopaz -- wizard of shapes.

Thanks for PyMC2 :).

@lucianopaz
Copy link
Contributor

@huard, @twiecki, I became aware of this some time ago, while refactoring pymc3.distributions.distribution.generate_samples. When the last axis of a distribution's shape is 1, or when we ask for random(..., size=1) of a scalar distribution, the last singleton axis is dropped. I'm totally against dropping these axis, but it is something that pymc3 has always done, and it is tested to do so (for some reason, probably regarding samples from scalar distributions). I'm in favor of changing generate_samples to prevent it from dropping the singleton axis, the main challenge with this will be to refactor all the tests that currently explicitly test that random(..., size=1) of a scalar distribution returns a scalar, instead of an array of shape (1,).

@rpgoldman rpgoldman changed the title Last dimension if dropped by random when size is one Last dimension is dropped by random when size is one May 5, 2020
@huard
Copy link
Author

huard commented May 6, 2020

@lucianopaz Thanks for the explanation. If you make the fix, I can then spend some time adjusting the tests.

@twiecki You're welcome, fun times !

@ricardoV94
Copy link
Member

#4214 seems to have fixed this. I can reproduce before but not after.

Feel free to re-open if the problem subsists.

@twiecki
Copy link
Member

twiecki commented Jan 19, 2021

Nice!

@MarcoGorelli
Copy link
Contributor

Nice that it's fixed, thanks for checking @ricardoV94 !

Reopening though as it would be good to include this little example as a test case to make sure this continues working in the future, and this makes for a good beginner-friendly issue

@ricardoV94
Copy link
Member

@MarcoGorelli that's a good point. I assumed the new tests after that PR would cover this issue but that is not necessarily the case (even less so since the issue was not even linked)

@canyon289
Copy link
Member

canyon289 commented Mar 17, 2021

@almostmeenal Assigning you to this one per your comment here #4214

Quick tip: Its better to comment on the issue you want to work on, rather than a related PR, as this issue would be the more relevant thread of discussion

Edit. Seems that github is not allowing me to do so but consider yourself assigned.

@mjhajharia
Copy link
Member

@almostmeenal Assigning you to this one per your comment here #4214

Quick tip: Its better to comment on the issue you want to work on, rather than a related PR, as this issue would be the more relevant thread of discussion

Edit. Seems that github is not allowing me to do so but consider yourself assigned.

cool, thanks.

@OriolAbril
Copy link
Member

Seems that github is not allowing me to do so but consider yourself assigned.

GitHub only allows assigning issues to team members or people who have commented on it, which is again another motive to comment on the issue, then we can assign it to you to indicate you are working on it

@canyon289
Copy link
Member

@almostmeenal Do you feel like you have a good idea of how to approach this issue or do you want to talk it through? Were happy answer any questions now or later if you have them too

@mjhajharia
Copy link
Member

thanks, um so I'm sorry for the delay i was occupied with some college work, I'll try to get to it today and get back to you tomorrow with concerns, does that work?

@OriolAbril
Copy link
Member

You can get back to us with questions at any time, and don't worry too much about the timeline or speed of work, we all have other commitments in addition to open source, you should work at a pace that is comfortable for you and fits with the rest of your schedule.

@canyon289
Copy link
Member

canyon289 commented Mar 20, 2021

@almostmeenal I agree with what Oriol said. The code will be here, take care of yourself first!

@OriolAbril
Copy link
Member

I'm not sure about what are we supposed to take a look at to see if it works, if you can clarify I'll happily do it.

@OriolAbril
Copy link
Member

um I attached a png image of the graph and the code below

Ok, I have already taken a look at the code and image, I was confused because it is on a different issue. Thanks for clarifying :)

@canyon289
Copy link
Member

@almostmeenal Can you repost the image and code in this issue? Im unable to find it as well :)

@OriolAbril
Copy link
Member

You have to go to the issue about the interpolated docstring: #3859 (comment)

@mjhajharia
Copy link
Member

I'm really sorry, I replied from mail and hence replied to the wrong issue in confusion.

@mjhajharia
Copy link
Member

idk if im missing something but I don't see the drop of last dimension here
image

@OriolAbril
Copy link
Member

The issue was already fixed by #4214, however, the PR did not add a test to check for this and ensure that posterior PRs don't break this again. You'll see that the issue was originally closed but was then reopened as a remainder to add the test.

@Sayam753
Copy link
Member

Sayam753 commented Mar 23, 2021

Sorry for joining in late to the discussion.
In V3, we have a base test class that checks for shape correctness of random samples. The parametrized tests there are reused by each distribution.

@michaelosthege had already added unit tests including parametrizing distribution_shape and size ending in 1 via #4214. See this file diff. So, I suspect this issue has already been resolved.

Also, I think it would be better if efforts are redirected towards solving #4554 which aims at a complete revamp of test_distributions_random :)

@mjhajharia
Copy link
Member

mjhajharia commented Mar 23, 2021

@MarcoGorelli that's a good point. I assumed the new tests after that PR would cover this issue but that is not necessarily the case (even less so since the issue was not even linked)
@Sayam753 Yeah I initially thought it was fixed and now I'm not sure? does this mean all the tests for distribution shape are done? @canyon289 um essentially, should I unassign myself and look at #4554 or go through the previous PR to see if something is missing or not robust enough.

@canyon289
Copy link
Member

@almostmeenal It seems like theres a lot of "noise" around this issue so unassigning it to you since its not exactly clear what the need is.

For the other pymc devs. Removing the beginner label since this conversation, and timing with our new release, is too confusing for a beginner to navigate now,

@Sayam753
Copy link
Member

Since the discussion was about having tests and we already have that tests ----> closing the issue.

@canyon289
Copy link
Member

Thanks @Sayam753

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

9 participants