-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unbalanced OT in OptimalTransport.jl #9
Comments
I saw, very nice work! When I created this package, OptimalTransport.jl had a python dependency and GPL license and so I decided to try my own. In retrospect those things probably were fine anyway for my purpose which was to make VisualStringDistances.jl to add a visual distance check to RegistryCI.jl (well, maybe not the python dependency, not sure we want big deps for RegistryCI). But it was a good chance to learn a bit anyway. I think currently UnbalancedOptimalTransport implements a different algorithm than OptimalTransport for the unbalanced sinkhorn. It’s been awhile since I’ve looked at these things but I if I remember correctly the one here is more numerically stable for small epsilon. I also test against Convex.jl which provides some additional reassurance. |
Yes, this was very unfortunate and therefore I also reimplemented some things in some other packages initially. Fortunately, the Python dependency was removed (JuliaOptimalTransport/OptimalTransport.jl#29 and JuliaOptimalTransport/OptimalTransport.jl#64) and the license was changed to MIT (JuliaOptimalTransport/OptimalTransport.jl#21).
This was my first impression as well. It would be interesting to check if the algorithm here is numerically more stable, I assumed that the algorithm by Chizat et al. was already optimized for numerical stability (but it was a while ago when I read the paper so this might be wrong). In any case I guess we might want to add the alternative algorithm to OptimalTransport.jl as well (in particular with JuliaOptimalTransport/OptimalTransport.jl#100 it should be easy to provide multiple algorithms with a unified interface). |
Regarding numerical stability, I went back to check the paper; I followed Section 6.1.2 of
and saw they wrote there that
where CPSV18 refers to the Chizat paper. So at least they claim so, aha. I remember I wasn't sure if I wanted regularization or not so taking ε small was of interest to me. I think it would be great to add this algorithm to OptimalTransport.jl. I will probably keep maintaining this package though (at least while it doesn't require much maintenance, as is the current state), since I think it's nice to have a simple dependency-free version for RegistryCI (and I believe @baggepinnen's SpectralDistances uses it as well, though perhaps he might be interested in the greater variety of options provided by OptimalTransport.jl). |
@zsteve @davibarreira I think maybe it would make sense to reorganize OptimalTransport.jl at some point and move implementations to more lightweight packages such as ExactOptimalTransport.jl, EntropicOptimalTransport.jl, UnbalancedOptimalTransport.jl or QuadraticOptimalTransport.jl (if it actually helps to reduce dependencies), possibly with a lightweight interface and types that are used by multiple packages in something like OptimalTransportBase.jl. Then OptimalTransport.jl could just reexport these individual packages, similar to DifferentialEquations.jl, but users could also depend on the more lightweight packages directly. |
I like that! I think the DiffEq system is a great model to follow. I think they actually have a paper on the design of the API: https://escholarship.org/uc/item/0tz878vq (edit: though you probably know it better than I do, since you helped develop some DiffEq libraries if I'm remembering correctly?). I think with an OptimalTransportBase, that "moving out" part wouldn't even need to happen all at once or immediately, because having that would already allow e.g. this UnbalancedOptimalTransport to implement the API while only depending on the light OptimalTransportBase. |
Yeah, I like the DiffEq system but I guess I'm biased a bit since I am involved there quite a bit. This is also why I tried to push JuliaOptimalTransport/OptimalTransport.jl#100 😄 |
I have no problem breaking down in many packages. I just don't know how much dependencies this would actually free-up. Would you say that OptimalTransport.jl is a "heavy" package? |
Of course, there are much heavier packages but it accumulated quite some dependencies over time (e.g., now also NNlib for batched multiplication which however is not required for something like ExactOptimalTransport). We would have to check to what extent dependencies can be reduced in individual packages but in any case smaller packages would lead to reduced compilation and loading time. |
Definitely not super heavy, but MOI for example is a very big package. Actually, since I have these numbers already, these are the 15 biggest packages in General:
(Several of the bigger ones use automated code generation). Of course, it's great to use MOI when appropriate, but for say UnbalancedOptimalTransport which is currently 447 LoC with no dependencies, MOI alone is adding two orders of magnitude more code. |
Here's the full list of transitive dependencies of OptimalTransport.jl btw:
edit: numbers unfortunately include lines of docstrings, ref XAMPPRocky/tokei#763 |
One other good part about splitting them up is that it could actually make the precompilation time of the meta package OptimalTransport itself faster, even if it just re-exports all the individual packages, because on Julia 1.6+ it precompiles in parallel. (But the code that it can now precompile in parallel is the code of the OptimalTransport* packages themselves since it already can precompile the dependencies in parallel, so it might not be a such big gain because OptimalTransport is only 1k LoC. But it could be more important in the future as more algos are implemented). |
@devmotion not sure if OptimalTransport has this issue, but I found a somewhat subtle bug in my implementation, so I thought I'd let you know in case similar issues affect the other packages: #11 |
Thanks @ericphanson. At the moment we've not implemented any of the proxdiv operators for divergences other than the KL (... so it seems like we're OK for now?) but indeed I will keep it in mind. Just a question regarding the way sinkhorn_divergence is implemented in your package: you're using Prop 12 of [SFVTP19], which is a dual formula for the unbalanced sinkhorn divergence. This lower bounds the primal value in general and would be equal if convergence is achieved (+ convexity conditions on the choice of \phi-divergence, I guess). Is there any particular reason to use the dual formulas, rather than the primal formula? |
I just learnt about the existence of this package and thought you might be interested to know that OptimalTransport.jl (former zsteve/OptimalTransport.jl that was transferred to JuliaOptimalTransport/OptimalTransport.jl) also contains an implementation of the Sinkhorn algorithm for unbalanced entropically regularized optimal transport problems: https://juliaoptimaltransport.github.io/OptimalTransport.jl/dev/#Unbalanced-optimal-transport It is an implementation of algorithm 2 in "Scaling algorithms for unbalanced optimal transport problems" (https://doi.org/10.1090/mcom/3303) and supports general soft marginal constraints (including, of course, the KL divergence with a simplified interface) and GPUs (also part of the CI tests).
The text was updated successfully, but these errors were encountered: