Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unbalanced OT in OptimalTransport.jl #9

Open
devmotion opened this issue Jun 27, 2021 · 14 comments
Open

Unbalanced OT in OptimalTransport.jl #9

devmotion opened this issue Jun 27, 2021 · 14 comments

Comments

@devmotion
Copy link

I just learnt about the existence of this package and thought you might be interested to know that OptimalTransport.jl (former zsteve/OptimalTransport.jl that was transferred to JuliaOptimalTransport/OptimalTransport.jl) also contains an implementation of the Sinkhorn algorithm for unbalanced entropically regularized optimal transport problems: https://juliaoptimaltransport.github.io/OptimalTransport.jl/dev/#Unbalanced-optimal-transport It is an implementation of algorithm 2 in "Scaling algorithms for unbalanced optimal transport problems" (https://doi.org/10.1090/mcom/3303) and supports general soft marginal constraints (including, of course, the KL divergence with a simplified interface) and GPUs (also part of the CI tests).

@ericphanson
Copy link
Owner

I saw, very nice work!

When I created this package, OptimalTransport.jl had a python dependency and GPL license and so I decided to try my own. In retrospect those things probably were fine anyway for my purpose which was to make VisualStringDistances.jl to add a visual distance check to RegistryCI.jl (well, maybe not the python dependency, not sure we want big deps for RegistryCI). But it was a good chance to learn a bit anyway.

I think currently UnbalancedOptimalTransport implements a different algorithm than OptimalTransport for the unbalanced sinkhorn. It’s been awhile since I’ve looked at these things but I if I remember correctly the one here is more numerically stable for small epsilon. I also test against Convex.jl which provides some additional reassurance.

@devmotion
Copy link
Author

When I created this package, OptimalTransport.jl had a python dependency and GPL license and so I decided to try my own.

Yes, this was very unfortunate and therefore I also reimplemented some things in some other packages initially. Fortunately, the Python dependency was removed (JuliaOptimalTransport/OptimalTransport.jl#29 and JuliaOptimalTransport/OptimalTransport.jl#64) and the license was changed to MIT (JuliaOptimalTransport/OptimalTransport.jl#21).

I think currently UnbalancedOptimalTransport implements a different algorithm than OptimalTransport for the unbalanced sinkhorn.

This was my first impression as well. It would be interesting to check if the algorithm here is numerically more stable, I assumed that the algorithm by Chizat et al. was already optimized for numerical stability (but it was a while ago when I read the paper so this might be wrong). In any case I guess we might want to add the alternative algorithm to OptimalTransport.jl as well (in particular with JuliaOptimalTransport/OptimalTransport.jl#100 it should be easy to provide multiple algorithms with a unified interface).

@ericphanson
Copy link
Owner

Regarding numerical stability, I went back to check the paper; I followed Section 6.1.2 of

Séjourné, Thibault, Jean Feydy, François-Xavier Vialard, Alain Trouvé, and Gabriel Peyré. “Sinkhorn Divergences for Unbalanced Optimal Transport.” (2019) http://arxiv.org/abs/1910.12958

and saw they wrote there that

We provide here a slightly different exposition of the algorithm than the one detailed in [CPSV18]. Indeed, they describe the Sinkhorn algorithm through an operator called “proxdiv”, while we express it as the composition of a Soft- min (i.e. a Log-Sum-Exp reduction) with the operator aprox^ε_{φ^∗} . Considering directly iterations over the dual potentials as we do makes the algorithm easier to stabilize for small value of ε.

where CPSV18 refers to the Chizat paper. So at least they claim so, aha. I remember I wasn't sure if I wanted regularization or not so taking ε small was of interest to me.

I think it would be great to add this algorithm to OptimalTransport.jl. I will probably keep maintaining this package though (at least while it doesn't require much maintenance, as is the current state), since I think it's nice to have a simple dependency-free version for RegistryCI (and I believe @baggepinnen's SpectralDistances uses it as well, though perhaps he might be interested in the greater variety of options provided by OptimalTransport.jl).

@devmotion
Copy link
Author

I will probably keep maintaining this package though (at least while it doesn't require much maintenance, as is the current state), since I think it's nice to have a simple dependency-free version for RegistryCI (and I believe @baggepinnen's SpectralDistances uses it as well, though perhaps he might be interested in the greater variety of options provided by OptimalTransport.jl).

@zsteve @davibarreira I think maybe it would make sense to reorganize OptimalTransport.jl at some point and move implementations to more lightweight packages such as ExactOptimalTransport.jl, EntropicOptimalTransport.jl, UnbalancedOptimalTransport.jl or QuadraticOptimalTransport.jl (if it actually helps to reduce dependencies), possibly with a lightweight interface and types that are used by multiple packages in something like OptimalTransportBase.jl. Then OptimalTransport.jl could just reexport these individual packages, similar to DifferentialEquations.jl, but users could also depend on the more lightweight packages directly.

@ericphanson
Copy link
Owner

ericphanson commented Jun 27, 2021

I like that! I think the DiffEq system is a great model to follow. I think they actually have a paper on the design of the API: https://escholarship.org/uc/item/0tz878vq (edit: though you probably know it better than I do, since you helped develop some DiffEq libraries if I'm remembering correctly?). I think with an OptimalTransportBase, that "moving out" part wouldn't even need to happen all at once or immediately, because having that would already allow e.g. this UnbalancedOptimalTransport to implement the API while only depending on the light OptimalTransportBase.

@devmotion
Copy link
Author

Yeah, I like the DiffEq system but I guess I'm biased a bit since I am involved there quite a bit. This is also why I tried to push JuliaOptimalTransport/OptimalTransport.jl#100 😄

@davibarreira
Copy link

davibarreira commented Jun 27, 2021

I will probably keep maintaining this package though (at least while it doesn't require much maintenance, as is the current state), since I think it's nice to have a simple dependency-free version for RegistryCI (and I believe @baggepinnen's SpectralDistances uses it as well, though perhaps he might be interested in the greater variety of options provided by OptimalTransport.jl).

@zsteve @davibarreira I think maybe it would make sense to reorganize OptimalTransport.jl at some point and move implementations to more lightweight packages such as ExactOptimalTransport.jl, EntropicOptimalTransport.jl, UnbalancedOptimalTransport.jl or QuadraticOptimalTransport.jl (if it actually helps to reduce dependencies), possibly with a lightweight interface and types that are used by multiple packages in something like OptimalTransportBase.jl. Then OptimalTransport.jl could just reexport these individual packages, similar to DifferentialEquations.jl, but users could also depend on the more lightweight packages directly.

I have no problem breaking down in many packages. I just don't know how much dependencies this would actually free-up. Would you say that OptimalTransport.jl is a "heavy" package?
I mean, I can see some future implementation that perhaps should be done separately. I just don't know about the current ones. But again, sure, we could break down in many packages.

@devmotion
Copy link
Author

Would you say that OptimalTransport.jl is a "heavy" package?

Of course, there are much heavier packages but it accumulated quite some dependencies over time (e.g., now also NNlib for batched multiplication which however is not required for something like ExactOptimalTransport). We would have to check to what extent dependencies can be reduced in individual packages but in any case smaller packages would lead to reduced compilation and loading time.

@ericphanson
Copy link
Owner

ericphanson commented Jun 27, 2021

Would you say that OptimalTransport.jl is a "heavy" package?

Definitely not super heavy, but MOI for example is a very big package. Actually, since I have these numbers already, these are the 15 biggest packages in General:

name Lines of Julia code in src
AWSSDK 264115
AWS 222122
Hecke 129242
TensorFlow 115544
OrdinaryDiffEq 91894
GSL 69761
DSGE 58835
ClimateMachine 58283
Kuber 54331
Azure 44766
MathOptInterface 44725
Nemo 41083
NIDAQ 40913
GlobalSensitivityAnalysis 40719
LazySets 35232

(Several of the bigger ones use automated code generation). Of course, it's great to use MOI when appropriate, but for say UnbalancedOptimalTransport which is currently 447 LoC with no dependencies, MOI alone is adding two orders of magnitude more code.

@ericphanson
Copy link
Owner

ericphanson commented Jun 27, 2021

Here's the full list of transitive dependencies of OptimalTransport.jl btw:

name Lines of Julia code in src
MathOptInterface 44725
Distributions 12682
DataStructures 8040
HTTP 6411
StatsBase 6302
SpecialFunctions 5394
IterativeSolvers 4130
NNlib 3891
Parsers 3809
MutableArithmetics 3686
MbedTLS 2281
TranscodingStreams 1476
Distances 1310
ChainRulesCore 1310
BenchmarkTools 1294
DocStringExtensions 1150
StatsFuns 1109
JSON 1025
Compat 969
ZipFile 794
FillArrays 785
OrderedCollections 734
URIs 648
PDMats 612
QuadGK 561
JSONSchema 554
JLLWrappers 455
Missings 441
SortingAlgorithms 408
CodecZlib 388
LogExpFunctions 342
RecipesBase 337
CodecBzip2 333
Rmath 304
Preferences 267
DataAPI 178
Adapt 176
Requires 170
StatsAPI 4

edit: numbers unfortunately include lines of docstrings, ref XAMPPRocky/tokei#763

@ericphanson
Copy link
Owner

One other good part about splitting them up is that it could actually make the precompilation time of the meta package OptimalTransport itself faster, even if it just re-exports all the individual packages, because on Julia 1.6+ it precompiles in parallel. (But the code that it can now precompile in parallel is the code of the OptimalTransport* packages themselves since it already can precompile the dependencies in parallel, so it might not be a such big gain because OptimalTransport is only 1k LoC. But it could be more important in the future as more algos are implemented).

@ericphanson
Copy link
Owner

@devmotion not sure if OptimalTransport has this issue, but I found a somewhat subtle bug in my implementation, so I thought I'd let you know in case similar issues affect the other packages: #11

@zsteve
Copy link

zsteve commented Nov 23, 2023

Thanks @ericphanson. At the moment we've not implemented any of the proxdiv operators for divergences other than the KL (... so it seems like we're OK for now?) but indeed I will keep it in mind.

Just a question regarding the way sinkhorn_divergence is implemented in your package: you're using Prop 12 of [SFVTP19], which is a dual formula for the unbalanced sinkhorn divergence. This lower bounds the primal value in general and would be equal if convergence is achieved (+ convexity conditions on the choice of \phi-divergence, I guess). Is there any particular reason to use the dual formulas, rather than the primal formula?

@ericphanson
Copy link
Owner

It seemed to be the most numerically stable option. In section 6.1 they write:
Screenshot 2023-11-23 at 13 41 07

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants