use Gather/Scatter instead of distributive laws #51
base: master
Conversation
Fixes recursion-schemes#50.

Distributive monads and comonads looked like a principled way to combine recursion schemes, by reusing the way in which monad and comonad transformers combine. For this approach to work, we also need to combine the distributive laws, but unfortunately this is error-prone, because sometimes the only implementation which type-checks, e.g. `distZygoT`, produces distributive laws which aren't lawful, e.g. `distZygoT g distHisto`. See recursion-schemes#50 for details.

Here is an alternative approach.

For folds, the idea is that the information should always flow from the leaves to the root. For `cata`, the f-algebra computes an `a` for the leaves, then uses those `a`s to compute an `a` for their parents, then for their grand-parents, and so on. For `gcata`, the algebra still computes an `a` for the leaves, but when computing the `a` for a node from the information computed for its direct children, that information includes more than just an `a`; it could be `(t, a)`, or `Cofree f a`, etc. -- let's call it `s`. Since the algebra only computes an `a` from this `f s`, something else, a "gathering function", needs to compute the rest of the `s` from this `f s`. We can combine those gathering functions in the same way we were combining the distributive laws; except this time `gatherZygoT g gatherHisto` behaves sensibly.

For unfolds, the dual idea is that the information flows from the root to the leaves. For `ana`, the f-coalgebra computes an `f a` from the root's `a`, thereby creating the root's direct children, and then we use those `a`s to compute the `a`s for the grand-children, and so on. For `gana`, the coalgebra still computes an `f s` from an `a`, but it has the opportunity to describe the sub-trees using representations other than an `a`; it could be `Either t a`, or `Free f a`, etc. -- let's call it `s`. Since the coalgebra only knows how to expand an `a` into an `f s`, something else, a "scattering function", needs to handle the rest of `s`'s constructors.
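The gathering idea above can be sketched in a few lines. This is a minimal, self-contained illustration, not the PR's actual code: the `Gather` synonym and the `gcataL`/`gatherCata` names are inferred from the description, and the base functor is specialized to `[Int]` rather than using the library's `Base` type family.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A gathering function rebuilds the full `s` for a node from the `a`
-- the algebra produced and the `f s` computed for the children.
type Gather f a s = a -> f s -> s

-- Toy base functor for [Int].
data ListF r = NilF | ConsF Int r deriving Functor

projectL :: [Int] -> ListF [Int]
projectL []     = NilF
projectL (x:xs) = ConsF x xs

-- The algebra only computes an `a` from an `f s`; the gather function
-- supplies the rest of the `s` at each recursive position.
gcataL :: Gather ListF a s -> (ListF s -> a) -> [Int] -> a
gcataL gather f = f . fmap go . projectL
  where go = (\fs -> gather (f fs) fs) . fmap go . projectL

-- `cata` as the trivial special case: s = a, so the gather keeps only
-- the algebra's result.
gatherCata :: Gather f a a
gatherCata a _ = a

main :: IO ()
main = print (gcataL gatherCata alg [1, 2, 3])
  where alg NilF        = 0 :: Int
        alg (ConsF x s) = x + s
```

With `gatherCata` this degenerates to an ordinary sum; richer gathers (zygo, histo, para) would store more than just the `a` in their `s`.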
Also includes #49, so the diff will be easier to look at if the PR's target is temporarily set to its branch, or if #49 is merged first. #49 fixes the type of `ghisto`.
Here are the recursion schemes whose type has changed. They all include a change from asking for a distributive law to asking for a gathering or scattering function.
As you can see, a common theme is to use a plain type, such as
or maybe even
But I have not explored the ramifications of that alternate design. One advantage of the no-transformers design is that
That is, with the no-transformers design, the coalgebra generates a subtree of depth at least 1 (because of the outer
```haskell
, scatterGApo
, scatterGApoT
, scatterFutu
, scatterGFutu
```
I took the opportunity to move the `futu`- and `gfutu`-related entries from the folding section to the unfolding section.
```diff
@@ -159,10 +156,10 @@ class Functor (Base t) => Recursive t where
   cata f = c where c = f . fmap c . project

   para :: (Base t (t, a) -> a) -> t -> a
-  para t = p where p x = t . fmap ((,) <*> p) $ project x
+  para f = p where p x = f . fmap ((,) <*> p) $ project x
```
For consistency, I tried to always use `f` for algebras and `g` for coalgebras.
```diff
-gpara :: (Corecursive t, Comonad w) => (forall b. Base t (w b) -> w (Base t b)) -> (Base t (EnvT t w a) -> a) -> t -> a
-gpara t = gzygo embed t
+gpara :: Corecursive t => Gather (Base t) a s -> (Base t (t, s) -> a) -> t -> a
+gpara = gzygo embed
```
Should I keep the `t` like in the original version? It's not clear to me why it was eta-expanded in the first place.
```haskell
=> (forall b. m (Base t b) -> Base t (m b)) -- distributive law
-> (forall c. Base t c -> Base t c)         -- natural transformation
-> (a -> Base t (m a))                      -- a (Base t)-m-coalgebra
-> a                                        -- seed
```
Sadly, I had to remove the comments because I wasn't sure what to call the `a -> Base t s`; it's certainly not a "`(Base t)`-`m`-coalgebra" anymore. I guess I could have kept the other comments...
```haskell
gcata gather f = f . fmap go . project where
  go :: t -> s
  go = uncurry gather . (f &&& id) . fmap go . project

gfold gather f t = gcata gather f t
```
This time I kept the eta-expanded version of `gfold`.
```diff
    -> t
    -> a
-zygoHistoPrepro f g t = gprepro (distZygoT f distHisto) g t
+zygoHistoPrepro f g t = gprepro (gatherZygoT f gatherHisto) g t
```
The behaviour of `zygoHistoPrepro` has changed; it was using the problematic `distZygoT f distHisto` I argued against in #49, and now it's using the fixed version which passes the sanity check I gave there.
I confess this is getting far enough away from any of the established material for recursion schemes that I'm not terribly well equipped to or comfortable in maintaining it personally. I can see the general advantages of this approach, however. @gelisam Would you be interested in taking over maintainership of `recursion-schemes`?
Sure, I would love to! In order to do that though, I would have to understand the whole codebase, and there's still one part which bugs me: why does the library provide both
The library in its current state supplies the types it supplies so that you can do things like build instances for existing data types rather than have to go through a single fixed type:

```haskell
type instance Base [a] = ListF a
```

The benefit of this flexibility is that you can supply instances and base functors that can work for multiple types. All my previous code in

The boilerplate of writing manual fold/unfold combinators to a separate data type with slightly redundant constructor names is usually cheaper than the code burden of using a one-size-fits-all

To your point, though, in the event you are just trying to tie a knot in a type,

That said, there are reasons to consider using something like the other forms in limited situations. e.g. https://www.schoolofhaskell.com/user/edwardk/moore/for-less uses a form of

Also, there is a niggling caveat. Consider the scenario where
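The flexibility described above can be shown in a small self-contained sketch. These are simplified re-declarations of the library's `Base`/`Recursive` machinery (the real classes carry more methods); the point is that the existing `[a]` type gets a base functor and a `Recursive` instance directly, with no fixed-point wrapper in sight.

```haskell
{-# LANGUAGE TypeFamilies, DeriveFunctor #-}

-- A base functor that can serve multiple list-like types.
data ListF a r = NilF | ConsF a r deriving Functor

-- Each recursive type names its own base functor...
type family Base t :: * -> *
type instance Base [a] = ListF a

-- ...and explains how to unfold one layer of itself.
class Functor (Base t) => Recursive t where
  project :: t -> Base t t

instance Recursive [a] where
  project []     = NilF
  project (x:xs) = ConsF x xs

-- A catamorphism now works on ordinary lists directly.
cata :: Recursive t => (Base t a -> a) -> t -> a
cata f = f . fmap (cata f) . project

main :: IO ()
main = print (cata alg "abc")
  where alg NilF        = 0 :: Int
        alg (ConsF _ n) = n + 1
```

Here `cata` computes the length of a plain `String`, i.e. `[Char]`, without ever converting it to a dedicated fixed-point type.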
Ah! That makes sense: we can use the strictness trick to create a recursive datatype for which we can only create finite values, and so Haskell does support a way to express least fixed points after all.

The base functor

As for

All right, I'm ready. Bestow maintainership upon me, my liege!
Just a heads up that I'm trying out gather/scatter in Scala via the evolving https://github.com/andyscott/droste project (@sellout pointed me your way!).
This is a huge, yet backwards-compatible change! It introduces a new API which is both more readable and more correct than the previous API, and it also changes the name of the recursion-schemes import.

First, the name of the module. "Data.Functor.Foldable" is a strange choice of module name by modern standards; "RecursionSchemes" is better because it matches the name of the library instead of trying to fit into the "Data.<...>" / "Control.<...>" semantic hierarchy. By introducing the new API and the new module name at the same time, it becomes possible to preserve backwards compatibility simply by keeping the old API at the old module name and adding the new API at the new module name.

Second, the new API. One big advantage of Mendler-style recursion-schemes such as `mcata` and `mhisto` is that the resulting code is more readable. At least in my opinion, since readability is always subjective. Still, compare the following four implementations of `fib`:

```haskell
data Unary = Zero | Succ Unary
makeBaseFunctor ''Unary

fibA :: Unary -> Int
fibA = \case
  Zero -> 1
  Succ Zero -> 1
  Succ nMinus1@(Succ nMinus2) -> fibA nMinus1 + fibA nMinus2

fibB :: Unary -> Int
fibB = histo $ \case
  ZeroF -> 1
  SuccF (_ :< ZeroF) -> 1
  SuccF (fibNMinus1 :< SuccF (fibNMinus2 :< _)) -> fibNMinus1 + fibNMinus2

fibC :: Fix (Base Unary) -> Int
fibC = mhisto $ \recur split -> \case
  ZeroF -> 1
  SuccF nMinus1 -> case split nMinus1 of
    ZeroF -> 1
    SuccF nMinus2 -> recur nMinus1 + recur nMinus2

fibD :: RecFun Unary Int
fibD = recFun histo $ \case
  ZeroF -> 1
  SuccF nMinus1 -> case project nMinus1 of
    ZeroF -> 1
    SuccF nMinus2 -> recur nMinus1 + recur nMinus2
```

I would say that `fibA` is the most readable, as it directly expresses that `fib 0 = 1`, `fib 1 = 1`, and `fib n = fib (n-1) + fib (n-2)`. But that definition uses general recursion, which does not guarantee that we're only making recursive calls on smaller terms. It is also very inefficient.
`fibB` improves upon `fibA` by using the `histo` recursion scheme, which performs the recursion on `fibB`'s behalf; and since `fibB` does not make any recursive calls itself, it cannot accidentally make recursive calls on terms which are not smaller. It is also much more efficient, because the recursion structure no longer branches into two; instead it combines two pre-computed results. However, `fibB`'s pattern-matching is a lot messier: the structure of the `Unary` and the pre-computed results are interleaved, and you have to be very familiar with the details of the `Base Unary (Cofree (Base Unary) Int)` type on which it is pattern-matching in order to know which position holds what. This becomes even messier when combining multiple recursion-schemes, e.g. with a zygo-paramorphism. Furthermore, while the word "histo" is helpful to those who are familiar with that name, as it immediately conveys the idea that we are going to be combining recursively-pre-computed values from multiple depths, it is deeply unhelpful to those who aren't, as the recursion structure of `fibB` is no longer explicit; it is now hidden inside the implementation of `histo`.

Next, `fibC` has the best of both worlds: the word `mhisto` conveys the same idea to those who are familiar with the name "histo", while the explicit `recur` calls clarify the recursion structure for those who aren't. Furthermore, while the pattern-matching is not as readable as in `fibA`, it is much more readable than in `fibB`, because we don't have to bring all of the information into scope all at once. Instead, the helper functions `recur` and `split` extract the bits we need when we need them. We also get the same efficient implementation; `recur` extracts the pre-computed value for that recursive position rather than actually making a recursive call, but `recur` is still a very good name for it because it returns the same result as if we were making a recursive call to `fibC`.
Moreover, the type of `recur` is restricted in a way which guarantees that it can only be called on smaller terms.

Finally, `fibD` demonstrates the new API. Like `fibC`, it has the best of both worlds, but in addition, the `recur` and `split` (renamed to `project`, because we can reuse `Recursive.project`) functions are now global instead of local. Also, while `Data.Functor.Foldable.histo` has a `ghisto` variant which makes it possible to combine recursion-schemes, `Data.Functor.Foldable.mhisto` doesn't. In contrast, the new API's design does make it possible to provide a `histoT` variant while retaining the advantages of the Mendler style. See #105 (comment) for a similar example with `para` instead of `histo`.

One last advantage of the new API is that it is more correct than the old API. See #50 (comment) for an explanation of what is wrong with the old API, and #51 for an explanation of the Gather/Scatter solution to that problem. In short, generalized recursion-schemes such as `ghisto` can sometimes combine in ways that appear type-correct but actually give subtly-incorrect results, while the new API's transformers, such as `histoT`, use a much simpler approach, in the hope of making the library closer to "so simple that there are obviously no deficiencies" than to "so complicated that there are no obvious deficiencies".

In the new API, everything is a catamorphism applied to an algebra of type `base pos -> pos`. This `pos` is different for every recursion scheme, and it contains all the data which the recursion scheme needs to keep track of during the catamorphism. The user provides a function of type `base pos -> a`, so `pos` must also have instances which allow the user to call functions like `recur` on the recursive positions. Each recursion scheme picks its `pos` so that the user is only allowed to make the observations which make sense for that recursion scheme. That's it.

Well, sort of.
The user provides a function of type `base pos -> a`, but we need a function of type `base pos -> pos`. The user's function explains how one step of the recursion computes the value the user is interested in from the values provided by the `base pos`. What we need is a function explaining how one step of the recursion computes all the data we need in order to continue the fold, including both the data which the user is interested in and the data which the next incarnation of the user function will be able to draw from. Each recursion scheme must thus provide a "gather" function of type `a -> base pos -> pos`, which extracts all that information and stores it in the `pos` along with the `a` which the user function returned; after all, the next incarnation of the user function must be able to call `recur` on that `pos` in order to get back that `a`.

We might also want to combine recursion schemes; that's the feature for which the old API brought in complex things like comonad transformers and functions expressing distributive laws. In the new API, a recursion scheme transformer such as `paraT` is associated with a `pos` transformer, such as `ParaPosT`, and instead of providing a gather function, it transforms a gather function on `pos` into a gather function on `ParaPosT pos`. The `Scatter` version, where everything is an anamorphism, will be implemented in a later commit.
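A hypothetical sketch of the `pos` design described above, specialized to lists. All names here (`ParaPos`, `gatherPara`, `paraL`) are illustrative, not the PR's actual API: the user writes `base pos -> a`, the scheme supplies a gather of type `a -> base pos -> pos`, and the engine runs a plain catamorphism whose algebra has type `base pos -> pos`.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Toy base functor for [Int].
data ListF r = NilF | ConsF Int r deriving Functor

-- para's pos: the original subterm plus the user's result, so the user
-- can observe both at each recursive position.
type ParaPos a = ([Int], a)

-- The gather completes the pos: it rebuilds the original subterm from
-- the children's positions and stores the user's `a` next to it.
gatherPara :: a -> ListF (ParaPos a) -> ParaPos a
gatherPara a fp = (rebuild fp, a)
  where rebuild NilF              = []
        rebuild (ConsF x (t, _)) = x : t

-- The engine: a catamorphism with algebra `ListF (ParaPos a) -> ParaPos a`,
-- built from the user's function and the gather.
paraL :: (ListF (ParaPos a) -> a) -> [Int] -> a
paraL user = snd . go
  where
    go xs = let fp = fmap go (projectL xs)
            in gatherPara (user fp) fp
    projectL []     = NilF
    projectL (y:ys) = ConsF y ys

main :: IO ()
main = print (paraL alg [1, 2, 3])
  where -- uses the paramorphic power: each element sees its whole tail
        alg NilF             = 0 :: Int
        alg (ConsF x (t, s)) = x * length t + s
```

For `[1,2,3]` this computes `1*2 + 2*1 + 3*0 = 4`, a result that needs access to the original tails, which is exactly what this `pos` grants.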