
Refactor AR distribution #5734

Merged · 3 commits · May 5, 2022
Conversation

@ricardoV94 (Member) commented Apr 29, 2022

  • Adds random method
  • Adds arbitrary noise for the innovations (not yet)
  • Batched dimensions are now on the left (like other RVs)
  • More extensive tests for batching
  • Removes AR1 class
    • Could not figure out why tau_e is specified as it is; in general it lacked a good docstring and other kwargs. We can always reintroduce it, but given the tau_e confusion it may be better to deprecate it first

Closes #5291
Supersedes #5388

Also considered relying on Aeppl, but the returned Scan based logp performs extremely slowly: https://gist.github.com/ricardoV94/2167ab7214affef47a86582141205bf5

@ricardoV94 changed the title from "Refactor AR distributions" to "Refactor AR distribution" on Apr 29, 2022
codecov bot commented Apr 29, 2022

Codecov Report

Merging #5734 (21c2e6c) into main (21c2e6c) will not change coverage.
The diff coverage is n/a.

❗ Current head 21c2e6c differs from pull request most recent head 45d0c08. Consider uploading reports for the commit 45d0c08 to get more accurate results

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #5734   +/-   ##
=======================================
  Coverage   88.95%   88.95%           
=======================================
  Files          75       75           
  Lines       13751    13751           
=======================================
  Hits        12232    12232           
  Misses       1519     1519           

@ricardoV94 ricardoV94 marked this pull request as ready for review May 2, 2022 18:06
@twiecki (Member) commented May 3, 2022

This is pretty intense; I could never have written that. I suppose the complexity is somewhat warranted, because the distribution has to handle multiple orders and batch dimensions, so I'd expect some shape complexity. But my hope was that with scan this could look much simpler.

Anyway, that's more of a general statement as to what it takes to create a new distribution (which I think @canyon289 also expressed), not a comment on this PR which I trust could not be simplified.

@twiecki (Member) commented May 3, 2022

> Also considered relying on Aeppl, but the returned Scan based logp performs extremely slowly: https://gist.github.com/ricardoV94/2167ab7214affef47a86582141205bf5

Is there an issue about this? Also, curious if this would have made things much simpler?

@ricardoV94 (Member, Author) commented May 3, 2022

> This is pretty intense, I could have never written that. I suppose the complexity is somewhat warranted because the distribution has to handle multiple orders and batch dimensions so I'd expect there to be some shape complexity. But I suppose my hope was that with scan this could look much simpler.
>
> Anyway, that's more of a general statement as to what it takes to create a new distribution (which I think @canyon289 also expressed), not a comment on this PR which I trust could not be simplified.

The reason this is complicated is that we are not talking about a single distribution but something more like a self-contained model factory. The extra logic is there to 1) coordinate the different distributions and 2) make life easier for users, so that they don't need to create the building blocks with the right shapes in advance.

Once we have it in place it is also easy to extend. For instance, it is now pretty straightforward to allow for arbitrary noise distributions instead of only Normal, by letting users pass a noise_dist, like they can already do with init_dist, and plugging it in the appropriate places.
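[Editor's note] For intuition about what the new random method has to do, here is a minimal NumPy sketch of forward-simulating an AR(p) process. This is not the PR's Aesara implementation; `simulate_ar` and its lag-ordering convention (`rhos[0]` multiplies the most distant lag) are invented here for illustration.

```python
import numpy as np

def simulate_ar(rhos, sigma, init, steps, rng=None):
    # y[t] = rhos . y[t-p:t] + Normal(0, sigma) noise.
    # `init` supplies the p initial values; `steps` new points are drawn.
    rng = np.random.default_rng() if rng is None else rng
    rhos = np.asarray(rhos, dtype=float)
    p = rhos.shape[0]
    y = np.empty(p + steps)
    y[:p] = init
    for t in range(p, p + steps):
        y[t] = y[t - p:t] @ rhos + rng.normal(0.0, sigma)
    return y

draws = simulate_ar(rhos=[0.9], sigma=0.1, init=[0.0], steps=100)
print(draws.shape)  # (101,)
```

The batched version in the PR additionally has to broadcast `rhos`, `sigma`, and the init values along leading batch dimensions, which is where much of the shape logic discussed above comes from.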

@ricardoV94 (Member, Author) commented May 3, 2022

> > Also considered relying on Aeppl, but the returned Scan based logp performs extremely slowly: https://gist.github.com/ricardoV94/2167ab7214affef47a86582141205bf5
>
> Is there an issue about this? Also, curious if this would have made things much simpler?

Not really, it would only save us having to write the logp method ourselves. We still need the logic to generate the random method and do automatic resizing, which is where the new complexity comes from. This was completely missing in V3!

For the logp performance question, there is a somewhat related issue here: aesara-devs/aesara#617
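[Editor's note] For intuition about why a scan-free logp is possible at all, here is a hedged NumPy sketch of the innovations trick: conditional on the first p values, each innovation is an independent Normal draw, so the log-density is one vectorized Gaussian logpdf rather than a symbolic loop. `ar_logp`, its lag convention, and the conditioning on the first p values are simplifications made up here, not the PR's code.

```python
import numpy as np

def ar_logp(value, rhos, sigma, p):
    # eps[t] = value[t] - rhos . value[t-p:t] is iid Normal(0, sigma)
    # given the first p values, so the logp is a single vectorized sum.
    value = np.asarray(value, dtype=float)
    rhos = np.asarray(rhos, dtype=float)
    n = value.shape[0]
    # Stack the p lagged copies of the series: shape (p, n - p).
    lagged = np.stack([value[i : i + n - p] for i in range(p)])
    eps = value[p:] - rhos @ lagged
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - eps**2 / (2 * sigma**2))
```

A scan-based logp computes the same quantity one time step at a time, which is where the slowdown reported in the gist comes from.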

)

ar = ar_op(rhos, sigma, init_dist, steps)
noise_rng.default_update = ar.owner.outputs[0]
@ricardoV94 (Member, Author):

This will be changed in #5738, or here, depending on what gets merged first

@canyon289 (Member):

> Once we have it in place it is also easy to extend. For instance, it is now pretty straightforward to allow for arbitrary noise distributions instead of only Normal, by letting users pass a noise_dist like they can do with init_dist, and plug it in the appropriate places.

In the following comment I mean this with complete respect: it's easy for you to extend, Ricardo, because you're the genius who knows what's going on. What would help new contributors a lot are comments and type hints everywhere, including in tests, and even for lines of code that look trivial.

I didn't want to bug you too much, but if the above sounds good I could indicate everywhere that I personally could use a comment to understand what's going on. Would that be helpful?

@ricardoV94 (Member, Author):

> I didn't want to bug you too much, but if the above sounds good I could indicate everywhere that I personally could use a comment to understand what's going on. Would that be helpful?

Perhaps first take a look at how Mixtures are implemented; it's basically the same process as here. Seeing the commonalities might help you understand what is going on: https://github.com/pymc-devs/pymc/blob/main/pymc/distributions/mixture.py

@canyon289 (Member):

> > I didn't want to bug you too much, but if the above sounds good I could indicate everywhere that I personally could use a comment to understand what's going on. Would that be helpful?
>
> Perhaps take a look first at how Mixtures are implemented, it's basically the same process as here. Seeing the commonalities might help you understand what is going on: https://github.com/pymc-devs/pymc/blob/main/pymc/distributions/mixture.py

I certainly will! Be prepared that I might start requesting docstrings there as well!

Here's where I'm coming from:
Nicoleta will (likely) join us for GSoC soon. When she pulls up this code, how will she know to look at Mixtures to understand what's going on here? And then there'll be a dev after that, and another dev.

PyMC is an awesome project and it's going to be around for years, so in addition to building great functionality now, it'll be good to ensure the code stays understandable.

I'm not saying you haven't done this, btw; I haven't done a review yet as I've been awake for 20 minutes, just responding to the comments.

@canyon289 (Member):

@ricardoV94 is it fine if I review on Saturday my time? I really want to do a thorough job, both for my own learning and so I can support the Time Series GSoC project.

That's my side of things; let me know what your "I want this merged by X date" timeline is and I'll work around that :)

@ricardoV94 (Member, Author):

Sure, I am in no rush

@ricardoV94 (Member, Author):

> Nicoleta will (likely) join us for GSoC soon. When she pulls up this code how will she know to look to Mixtures to understand what's going on from the code here? And then there will be a dev after that and another dev.

They will have to read the code and check documentation of the functions used, and if that's not enough set up a debugger and start breaking the code.

Taken to the extreme, we can't include a whole Aesara tutorial as comments for future devs. Whether the same functionality can be achieved by simpler, more maintainable means is a different question.

@canyon289 (Member) commented May 3, 2022

> Taken to the extreme, we can't include a whole Aesara tutorial as comments for future devs. If the same functionality can be achieved by simpler, more maintainable means, that's a different question.

Agree we can't take this to the extreme, just exploring whether there is an optimum between the two.

> They will have to read the code and check documentation of the functions used, and if that's not enough set up a debugger and start breaking the code.

Normally I'd agree, but realize this is also more difficult than normal code because of the way Aesara and PyMC are integrated. The code isn't run directly but is, in a sense, sucked into Aesara and recompiled, and it goes through a number of transformation steps that make it difficult to trace what exactly is going on.

Anyhow, this is stuff that's outside the scope of this PR. Overall I'm really excited to keep building out great functionality, and you're always quite kind and generous in mentoring others ❤️

Looking forward to reviewing this and asking a bunch of questions!

"of rho or remove constant_term"
)

if init_dist is not None:
Member:

Is this something we should make a utility function for all time series eventually? No need in this PR, just asking.

@ricardoV94 (Member, Author) May 4, 2022:

Yeah something like check_valid_dist(dist, allowed_ndim_supp=(x, y)), as this is used in all the distributions that take other distributions as inputs. Want to open an issue/PR for that?

)
eps = value[self.p :] - self.rho[0] - x
@classmethod
def ndim_supp(cls, *args):
Member:

Please add a docstring to explain this as well. I think I get it, but a short docstring would reduce friction for those who don't know, or in case I forget :)

@ricardoV94 (Member, Author):

Same

Member:

Added a docstring suggestion

innov_like = Normal.dist(mu=0.0, tau=self.tau).logp(eps)
init_like = self.init.logp(value[: self.p])
@classmethod
def change_size(cls, rv, new_size, expand=False):
Member:

Add docstring, especially for expand


@_get_measurable_outputs.register(AutoRegressiveRV)
def _get_measurable_outputs_ar(op, node):
# This tells Aeppl that the second output is the measurable one
Member:

Suggested change
# This tells Aeppl that the second output is the measurable one
"""This tells Aeppl that the second output is the measurable one."""

@ricardoV94 (Member, Author):

I don't think this should be a docstring... but I can't quite put into words why


@_moment.register(AutoRegressiveRV)
def ar_moment(op, rv, rhos, sigma, init_dist, steps, noise_rng):
# Use last entry of init_dist moment as the moment for the whole AR
Member:

Suggested change
# Use last entry of init_dist moment as the moment for the whole AR
"""Use last entry of init_dist moment as the moment for the whole AR."""

@ricardoV94 (Member, Author):

Same here

np.testing.assert_allclose(ar_like, reg_like)

@pytest.mark.parametrize("constant", (False, True))
def test_batched_size(self, constant):
Member:

Add a docstring; it's bordering on obvious, but not obvious enough that I think a newcomer would get it.

@ricardoV94 (Member, Author):

What do you think a useful docstring would be?

assert y_eval[0].shape == (batch_size, steps)
assert not np.any(np.isclose(y_eval[0], y_eval[1]))

def test_batched_rhos(self):
Member:

Same comment as above

assert y_eval.shape == (batch_size, steps)
assert np.all(abs(y_eval[1]) < 5)

def test_batched_sigma(self):
Member:

Same for all batched tests

@canyon289 (Member) left a review comment:

This is awesome. Most of my comments essentially amount to: add more docstrings on internal functions to aid future (or forgetful) developers.

return 1

@classmethod
def ndim_supp(cls, *dist_params):
@canyon289 (Member) May 4, 2022:

Suggested change
def ndim_supp(cls, *dist_params):
def ndim_supp(cls, *dist_params):
"""Refer to base class SymbolicDistribution for documentation."""

@@ -76,6 +76,14 @@ def dist(cls, dist, lower, upper, **kwargs):
check_dist_not_registered(dist)
return super().dist([dist, lower, upper], **kwargs)

@classmethod
def num_rngs(cls, *args, **kwargs):
Member:

Suggested change
def num_rngs(cls, *args, **kwargs):
def num_rngs(cls, *args, **kwargs):
"""Refer to base class SymbolicDistribution for documentation."""

@canyon289 (Member):

Thanks Ricardo, and nicely done. Definitely ready for a merge, and if there's anything I still have questions about, or think we need docstrings for (which I'm clearly a fan of), I can do it outside of this PR.

…of RNGs needed

Reordered methods for consistency
* Deprecates AR1 distribution
* Implements random and moment methods
* Batching works on the left as with other distributions
@canyon289 (Member):

@ricardoV94 I saw your comments but won't be able to give you proper replies for a couple of days. There's nothing that should hold up a merge. Please feel free to do so, but keep your branch around so I can come back to this PR later and refer to the comments to make proposals in separate issues or PRs 🎉

@ricardoV94 ricardoV94 merged commit c76b9b9 into pymc-devs:main May 5, 2022
@ricardoV94 (Member, Author):

Thanks for reviewing @twiecki and @canyon289


Successfully merging this pull request may close these issues.

Add autoregressive model to V4