Option for a functional API? #84

adamboche · 2019-02-20T09:42:39Z

Hello! I'm excited to see all the cool ideas going on in the new PyMC, and I'm looking forward to using it for real. I've been following the development a little, and had an idea I wanted to run by you. I'm still new to PyMC, so please correct me if I get anything wrong.

One of the distinctive features of PyMC is its usage of context managers for building models, like this:

with pm.Model() as model:
    eta = pm.Normal("eta", 0, 1, shape=J)
    mu = pm.Normal("mu", 0, sd=1e6)
    tau = pm.HalfCauchy("tau", 5)
    theta = pm.Deterministic("theta", mu + tau * eta)
    obs = pm.Normal("obs", theta, sd=sigma, observed=y)
    trace_h = pm.sample(1000)

plot_summary(model)

This kind of API is powerful in that it allows users to transparently access the sampling backend without extra work, and it makes common workflows really quick and easy. The decorator-based @pm.model API has similar advantages. The developer guide explains the power and flexibility that comes out of this design.

The design also has some side effects:

It relies on hidden global mutable state to manage the context, which can be hard for some users to understand. It's not always clear what must be done inside versus outside the context manager, or what state is attached to which objects.
It couples the model to the data -- there's no concept of a model in the absence of its observed data.
It requires passing the name of each variable to the variable's constructor. This could be avoided by hacking the AST, but that would be rather less robust, and the Python AST is documented as unstable: "The abstract syntax itself might change with each Python release".

I've been wondering about some possible API designs. Some of them may have been discussed and rejected already; please forgive me if I'm being redundant.

One idea that may be familiar to Python developers is to use a class per model, something like this:

@model
class MyModel:
    J = ConstantInteger()
    eta = Normal(0, 1, shape=J)
    mu = Normal(0, sd=1e6)
    tau = HalfCauchy(5)
    theta = Deterministic(mu + tau * eta)


# Any of these functions could be methods instead.
model = MyModel()
observed = observe(model, data)
trace = sample(observed)
plot_summary(trace)

I'm not 100% sure that it can do everything PyMC needs, but, from my (possibly naive) perspective, having an option like this might have some benefits:

All the necessary state can live on the model instance, rather than in a global context or on the distribution objects. Simple functions (or methods) connect the objects of the API, making it composable and easy to use in a library.
The model can exist independent of any observed data.
No AST hacking is necessary to give each distribution a name. The setup can be done in a class decorator, as in the popular attrs library, or in the attribute initialization through the descriptor protocol, each of which produces plain ol' python objects without hidden state.
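As a rough sketch of the naming point, the descriptor protocol's `__set_name__` hook lets each attribute learn its own name when the class is created. The `Distribution` base class and the `model` decorator below are hypothetical stand-ins, not an existing PyMC API.

```python
class Distribution:
    """Hypothetical base class that records the attribute name it is bound to."""

    def __init__(self, *params, **kwargs):
        self.params = params
        self.kwargs = kwargs
        self.name = None

    def __set_name__(self, owner, name):
        # Python calls this automatically while the owning class is created.
        self.name = name


class Normal(Distribution):
    pass


class HalfCauchy(Distribution):
    pass


def model(cls):
    """Hypothetical class decorator: collect the distributions defined on cls."""
    cls.distributions = {
        name: value
        for name, value in vars(cls).items()
        if isinstance(value, Distribution)
    }
    return cls


@model
class MyModel:
    mu = Normal(0, sd=1e6)
    tau = HalfCauchy(5)


names = list(MyModel.distributions)
```

Here `MyModel.mu.name` is `"mu"` without the string ever being passed to the constructor. The sketch does not address variables that depend on other variables, such as `theta` in the proposal above.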

There's a lot to explore in this design space. If this seems interesting to people, I'm happy to discuss or try out some implementation ideas, to see if something like this could be possible, and if it'd be nice. I'd love to hear your thoughts! 🙂


twiecki · 2019-02-20T10:31:06Z

I like it. One key question is if the model class can be initialized twice as we currently do with the function. We have to call it twice in different contexts currently, once to create RVs and gather the tensors, and then once to create the logp tensor with inputs from step 1.
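A minimal sketch of the two-pass idea described above, under simplifying assumptions: the context is passed explicitly rather than held globally, plain floats stand in for tensors, and `Collect`, `LogP`, and `model_fn` are hypothetical names rather than PyMC internals.

```python
import math


class Collect:
    """First pass: record each variable name and return a placeholder value."""

    def __init__(self):
        self.names = []

    def normal(self, name, mu, sd):
        self.names.append(name)
        return 0.0  # placeholder standing in for a symbolic tensor


class LogP:
    """Second pass: substitute supplied values and accumulate the log density."""

    def __init__(self, values):
        self.values = values
        self.total = 0.0

    def normal(self, name, mu, sd):
        x = self.values[name]
        self.total += (
            -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mu) / sd) ** 2
        )
        return x


def model_fn(ctx):
    # The same model code runs under both contexts.
    mu = ctx.normal("mu", 0.0, 1.0)
    ctx.normal("x", mu, 1.0)


first = Collect()
model_fn(first)  # pass 1: discover the variables

second = LogP({name: 0.0 for name in first.names})
model_fn(second)  # pass 2: compute logp using inputs named in pass 1
```

A class-based model would need some equivalent of re-running the model body under a second context, which is the question raised here.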

twiecki · 2019-03-02T13:50:43Z

@adamboche Any more thoughts on this?

adamboche · 2019-03-02T20:30:01Z

@twiecki Sorry for the delay; I'm hoping for a moment to experiment with it this week.

jt-lab · 2019-03-12T20:18:22Z

I like the idea of separating model and data in this way. Would make applying the model to batches of datasets (e.g. to simulations for power estimations) much more convenient.

adamboche · 2019-03-16T23:12:00Z

I started trying out some basic ideas, but they'll require more thought and experimentation -- nothing I have is usable, and I wouldn't recommend implementing them in their current incarnation. In case anyone wants to read along, my code is available.

Some things I like so far:

A model is defined declaratively on a class which becomes a plain python class immediately
The model produces instances which are very basic
Defining the model happens separately from combining it with data

On the other hand, there is still a bit more magic involved than I'd like, and there's an issue of how to represent variables that depend on other variables.

twiecki · 2019-03-17T22:07:35Z

that looks quite interesting, is this functional or just pseudo code?

twiecki · 2019-03-18T08:36:45Z

Just looked a bit over the code base, looks really nice. Unless I missed it, the key missing piece is construction of a tensor-in-tensor-out logp function.
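For readers unfamiliar with the term, a logp function maps values for every variable to the model's joint log density. The sketch below writes one by hand for the hierarchical model from the opening post, using plain floats and lists in place of tensors; the helper names are hypothetical, and a real implementation would be built from the model definition rather than written manually.

```python
import math


def normal_logpdf(x, mu, sd):
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mu) / sd) ** 2


def half_cauchy_logpdf(x, beta):
    # Density is 2 / (pi * beta * (1 + (x / beta)^2)) for x >= 0.
    if x < 0:
        return -math.inf
    return math.log(2) - math.log(math.pi) - math.log(beta) - math.log1p((x / beta) ** 2)


def logp(mu, tau, eta, y, sigma):
    """Joint log density of the example model: values in, one number out."""
    total = normal_logpdf(mu, 0.0, 1e6)        # mu ~ Normal(0, sd=1e6)
    total += half_cauchy_logpdf(tau, 5.0)      # tau ~ HalfCauchy(5)
    for eta_j, y_j, sigma_j in zip(eta, y, sigma):
        total += normal_logpdf(eta_j, 0.0, 1.0)      # eta ~ Normal(0, 1)
        theta_j = mu + tau * eta_j                   # theta = mu + tau * eta
        total += normal_logpdf(y_j, theta_j, sigma_j)  # obs ~ Normal(theta, sd=sigma)
    return total


value = logp(mu=0.0, tau=0.0, eta=[0.0], y=[0.0], sigma=[1.0])
```

The function has no dependence on a model object or context, which is what makes it easy to hand to a sampler.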

twiecki · 2019-09-05T12:58:34Z

Closing due to inactivity.

Padarn · 2023-03-31T13:35:21Z

Curious if this ever went anywhere? I see no linked issues, but I wouldn't be surprised if it was picked up elsewhere.

twiecki · 2023-04-06T11:53:12Z

@Padarn pymc4 is no more, check out pymc 5: https://github.com/pymc-devs/pymc

Padarn · 2023-04-08T00:11:30Z

Thanks @twiecki. I had seen that but couldn't find discussion on this topic there. Would you suggest opening an issue to discuss in the pymc repo?

twiecki · 2023-04-09T14:08:49Z

I'd be curious what use-case you're after.

Padarn · 2023-04-10T00:08:16Z

Sure. Nothing very specific, I just thought the proposal here was quite nice and made the API easier to understand. Teaching people about the context manager API has been a hurdle in getting people to work on pymc code in my team.

twiecki · 2023-04-12T13:55:29Z

Curious, it hasn't been a problem in my experience, but I also don't go into the details of what it does. In any case, we probably won't provide an alternative API in PyMC.

Padarn · 2023-04-12T14:41:48Z

Got it, totally reasonable. Thanks for your responses.

junpenglao added the discussion Open for input label Mar 18, 2019

twiecki closed this as completed Sep 5, 2019
