
Switch to DifferentiationInterface #111

Open
wants to merge 3 commits into main

Conversation

@gdalle commented Jul 29, 2024

Hi @olivierlabayle,
My new package DifferentiationInterface.jl is pretty much becoming the successor of AbstractDifferentiation.jl, so I took the liberty of opening this small PR to help you make the switch, if you're interested.

@gdalle (Author) commented Jul 30, 2024

Okay, this is a bit more complicated than I thought, and I can't do it blindly. Can you tell me a little more about the objects f and point_estimates for which you compute derivatives? Apparently the output of f is either a scalar or a vector? What is the type of the individual point estimates?

@olivierlabayle (Member) commented:

> Okay, this is a bit more complicated than I thought, and I can't do it blindly. Can you tell me a little more about the objects f and point_estimates for which you compute derivatives? Apparently the output of f is either a scalar or a vector? What is the type of the individual point estimates?

Hi @gdalle, thank you for taking the time to open this PR. Could you tell me a bit more about your new DifferentiationInterface.jl compared to AbstractDifferentiation.jl, and the problems it solves? I can't see any deprecation or migration notes on its GitHub.

The f function is any differentiable function taking a vector of real numbers point_estimates and outputting a real number or a vector thereof. I have mostly resorted to splatting to pass the inputs for now, since point_estimates is usually not very high-dimensional in the use cases of this package.
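For illustration, here is a minimal Julia sketch of the pattern described above; the functions f and g and the sample values are hypothetical stand-ins for user-provided estimands, and only the splatting convention is taken from the comment:

```julia
# Hypothetical examples of user-provided functions over point estimates.
f(x, y, z) = y - x            # scalar output
g(x, y, z) = [y - x, z - y]   # vector output

point_estimates = [1.0, 2.0, 3.5]
f(point_estimates...)  # splat the vector into the arguments -> 1.0
g(point_estimates...)  # -> [1.0, 1.5]
```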

@gdalle (Author) commented Jul 30, 2024

DI takes most of its inspiration from AbstractDifferentiation, while learning from its few design shortcomings. The main improvements with respect to AbstractDifferentiation are the breadth of coverage (a dozen supported backends), the caching mechanism and support for mutation, the solid testing and benchmarking infrastructure, as well as the reliance on ADTypes for backend specification. The main limitation of DI right now is that it only supports a single argument, but that should be fine for your use case if we put the estimates into a vector.

DI is already being adopted by the SciML ecosystem, and aims to become a central component of the Julia package ecosystem. Part of this involves me spontaneously asking users (like yourself) what they need 😉

Is there a way to know a priori whether f outputs a number or a vector? In DI, the relevant operators have different names: gradient and Jacobian, respectively.
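As a sketch of what that distinction looks like on the DifferentiationInterface side (the gradient/jacobian entry points and the AutoForwardDiff backend type are assumed from the DI and ADTypes documentation, with ForwardDiff chosen here only as an example backend):

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff  # provides the implementation behind AutoForwardDiff()

backend = AutoForwardDiff()

# Scalar output -> gradient
f(v) = v[2] - v[1]
gradient(f, backend, [1.0, 2.0, 3.5])  # ≈ [-1.0, 1.0, 0.0]

# Vector output -> Jacobian
g(v) = [v[2] - v[1], v[3] - v[2]]
jacobian(g, backend, [1.0, 2.0, 3.5])  # 2×3 matrix
```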

@olivierlabayle (Member) commented:

Thanks, that looks very promising, and I'm looking forward to integrating DifferentiationInterface.

The problem I see at the moment with single-argument functions is that it would be breaking, since users would have to define f as a single-argument function. Most of the functions f provided by users are pretty simple (e.g. f(x, y, z) = [y - x, z - y]) and the Jacobian could easily be computed by hand. The interface is more a convenience than a necessity, and changing it would make the function less readable and less easy to write for a user.

Unfortunately, I don't know of an easy way to look into f's output; it is a priori completely unknown to this package.

So I'd say supporting multiple-argument functions and a seamless way to compute a Jacobian or gradient would be ideal for the change :-)

@gdalle (Author) commented Jul 31, 2024

Multiple-argument support is not really an issue, because we can always collect these arguments into a Vector before differentiating.
The uncertainty about the output type of f is more problematic, because the gradient and the Jacobian are fundamentally different objects (even though in the scalar-output case one is the transpose of the other). What do those two situations correspond to in your package?
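One way to picture the "collect into a Vector" idea, sketched here under the assumption that the wrapper name f_vec is free to choose: keep the multi-argument f for users and differentiate a single-argument wrapper over the collected vector.

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff

f(x, y, z) = [y - x, z - y]   # user-facing, multi-argument
f_vec(v) = f(v...)            # single-argument wrapper over a vector

jacobian(f_vec, AutoForwardDiff(), [1.0, 2.0, 3.5])  # 2×3 Jacobian
```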

@olivierlabayle (Member) commented:

I'm sorry, I'm not sure I understand the question, but basically the purpose of this package is to estimate statistical quantities that can be multidimensional, e.g. [x, y, z]. Once this is done, further quantities can be of interest, e.g. differences. One could be interested in only one difference, f(x, y, z) = y - z, or in all of them, f(x, y, z) = [y - x, z - y]. The function f is really specified dynamically as a post-analysis step, and even though we could force the user to return a vector, f(x, y, z) = [y - z], this is not very natural. Does that make sense?

@gdalle (Author) commented Aug 1, 2024

I think I have a way to handle this, which is to turn everything into a vector under the hood and always compute a Jacobian. Will update the PR accordingly.
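A possible shape for that idea, given here only as a sketch and not as the actual PR code (the helper names as_vector and estimate_jacobian are hypothetical): wrap f so that a scalar output becomes a one-element vector, and let a single code path compute the Jacobian in both cases.

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff

as_vector(y::Real) = [y]          # scalar output -> 1-element vector
as_vector(y::AbstractVector) = y  # vector output passes through unchanged

function estimate_jacobian(f, point_estimates; backend=AutoForwardDiff())
    f_vec(v) = as_vector(f(v...))  # splat the estimates, vectorize the output
    return jacobian(f_vec, backend, point_estimates)
end

estimate_jacobian((x, y, z) -> y - z, [1.0, 2.0, 3.5])           # 1×3 Jacobian
estimate_jacobian((x, y, z) -> [y - x, z - y], [1.0, 2.0, 3.5])  # 2×3 Jacobian
```

With such a wrapper, the user-facing convention (multi-argument f, scalar or vector output) stays unchanged while the differentiation backend only ever sees a vector-to-vector function.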
