use Gather/Scatter instead of distributive laws #51

Open
wants to merge 1 commit into master

Conversation

@gelisam (Collaborator) commented Jul 11, 2018

Fixes #50.

Distributive monads and comonads looked like a principled way to
combine recursion schemes, by reusing the way in which monad and
comonad transformers combine. For this approach to work, we also need
to combine the distributive laws, but unfortunately this is
error-prone because sometimes the only implementation which
type-checks, e.g. `distZygoT`, produces distributive laws which aren't
lawful, e.g. `distZygoT g distHisto`. See #50 for details.

Here is an alternative approach.

For folds, the idea is that the information should always flow from
the leaves to the root. For `cata`, the f-algebra computes an `a` for
the leaves, and then uses those `a`s to compute an `a` for their
parents, and then for their grand-parents, and so on. For `gcata`, the
algebra still computes an `a` for the leaves, but when computing the
`a` for a node from the information computed for the direct children,
that information includes more than just an `a`; it could be `(t, a)`,
or `Cofree f a`, etc. -- let's call it `s`. Since the algebra only
computes an `a` from this `f s`, something else, a "gathering
function", needs to compute the rest of the `s` from this `f s`. We
can combine those gathering functions in the same way we were
combining the distributive laws; except this time
`gatherZygoT g gatherHisto` behaves sensibly.

For unfolds, the dual idea is that the information flows from the root
to the leaves. For `ana`, the f-coalgebra computes an `f a` from the
root's `a`, thereby creating the root's direct children, and then we
use those `a`s to compute the `a`s for the grand-children, and so on.
For `gana`, the coalgebra still computes an `f` from an `a`, but it
has the opportunity to describe the sub-trees using a representation
other than an `a`; it could be `Either t a`, or `Free f a`, etc. --
let's call it `s`. Since the coalgebra only knows how to expand an `a`
into an `f s`, something else, a "scattering function", needs to
handle the rest of `s`'s constructors.
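
To make the two shapes concrete, here they are as Haskell sketches
(the synonyms match the type changes quoted later in this thread;
`gatherHisto` is a plausible histo-style gathering function, and its
exact name and definition in the branch may differ):

    -- A gathering function rebuilds the extra "s" information that the
    -- algebra does not compute; a scattering function consumes the "s"
    -- constructors that the coalgebra does not know how to expand.
    type Gather  f a s = a -> f s -> s
    type Scatter f a s = s -> Either a (f s)

    -- Sketch of a histo-style gathering function: the new "s" is the
    -- freshly computed "a" consed onto the children's annotated
    -- subtrees.  (Assumes Cofree from Control.Comonad.Cofree.)
    gatherHisto :: Gather f a (Cofree f a)
    gatherHisto = (:<)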

@gelisam (Collaborator, Author) commented Jul 11, 2018

Also includes #49, so the diff will be easier to look at if the PR's target is temporarily set to #49's branch, or if #49 is merged first.

#49 fixes the type of `ghisto` to use `CofreeT (Base t) w a` instead of `Cofree h a`; and now in #50 we change its type again to use `Cofree (Base t) s`. The types of all the generalized recursion schemes are changed: they use `Gather`/`Scatter` instead of distributive laws, and the rest of their types use an `s` instead of a `w` or an `m`.

@gelisam (Collaborator, Author) commented Jul 11, 2018

Here are the recursion schemes whose types have changed. They all include a change from asking for a distributive law to asking for a `Gather` or a `Scatter`, but it's the other changes I want to highlight, so to make that part of the type signatures more uniform I'll pretend that the `Gather` and `Scatter` type synonyms used to mean the two directions of the distributive laws, and that this PR changes the definitions of those type synonyms.

-type Gather f w = forall b. f (w b) -> w (f b)
+type Gather f a s = a -> f s -> s

-type Scatter f m = forall b. m (f b) -> f (m b)
+type Scatter f a s = s -> Either a (f s)

-gcata :: (Recursive t, Comonad w)
-      => Gather (Base t) w
-      -> (Base t (w a) -> a)
-      -> t -> a
+gcata :: Recursive t
+      => Gather (Base t) a s
+      -> (Base t s -> a)
+      -> t -> a

-gana :: (Corecursive t, Monad m)
-     => Scatter (Base t) m
-     -> (a -> Base t (m a))
-     -> a -> t
+gana :: Corecursive t
+     => Scatter (Base t) a s
+     -> (a -> Base t s)
+     -> a -> t

-ghylo :: (Comonad w, Functor f, Monad m)
-      => Gather f w
-      -> Scatter f m
-      -> (f (w b) -> b)
-      -> (a -> f (m a))
-      -> a -> b
+ghylo :: Functor f
+      => Gather f b r
+      -> Scatter f a s
+      -> (f r -> b)
+      -> (a -> f s)
+      -> a -> b

-gzygo :: (Recursive t, Comonad w)
-      => (Base t b -> b)
-      -> Gather (Base t) w
-      -> (Base t (EnvT b w a) -> a)
-      -> t -> a
+gzygo :: Recursive t
+      => (Base t b -> b)
+      -> Gather (Base t) a s
+      -> (Base t (b, s) -> a)
+      -> t -> a

-gpara :: (Recursive t, Corecursive t, Comonad w)
-      => Gather (Base t) w
-      -> (Base t (EnvT t w a) -> a)
-      -> t -> a
+gpara :: (Recursive t, Corecursive t)
+      => Gather (Base t) a s
+      -> (Base t (t, s) -> a)
+      -> t -> a

-gprepro :: (Recursive t, Corecursive t, Comonad w)
-        => Gather (Base t) w
-        -> (forall c. Base t c -> Base t c)
-        -> (Base t (w a) -> a)
-        -> t -> a
+gprepro :: (Recursive t, Corecursive t)
+        => Gather (Base t) a s
+        -> (forall c. Base t c -> Base t c)
+        -> (Base t s -> a)
+        -> t -> a

-gpostpro :: (Corecursive t, Recursive t, Monad m)
-         => Scatter (Base t) m
-         -> (forall c. Base t c -> Base t c)
-         -> (a -> Base t (m a))
-         -> a -> t
+gpostpro :: (Corecursive t, Recursive t)
+         => Scatter (Base t) a s
+         -> (forall c. Base t c -> Base t c)
+         -> (a -> Base t s)
+         -> a -> t

-ghisto :: (Recursive t, Comonad w)
-       => Gather (Base t) w
-       -> (Base t (CofreeT (Base t) w a) -> a)
-       -> t -> a
+ghisto :: Recursive t
+       => Gather (Base t) a s
+       -> (Base t (Cofree (Base t) s) -> a)
+       -> t -> a

-gfutu :: (Corecursive t, Monad m)
-      => Scatter (Base t) m
-      -> (a -> Base t (FreeT (Base t) m a))
-      -> a -> t
+gfutu :: Corecursive t
+      => Scatter (Base t) a s
+      -> (a -> Base t (Free (Base t) s))
+      -> a -> t

-gchrono :: (Functor f, Comonad w, Monad m)
-        => Gather f w
-        -> Scatter f m
-        -> (f (CofreeT f w b) -> b)
-        -> (a -> f (FreeT f m a))
-        -> a -> b
+gchrono :: Functor f
+        => Gather f b r
+        -> Scatter f a s
+        -> (f (Cofree f r) -> b)
+        -> (a -> f (Free f s))
+        -> a -> b

-zygoHistoPrepro :: (Corecursive t, Recursive t)
-                => (Base t b -> b)
-                -> (forall c. Base t c -> Base t c)
-                -> (Base t (EnvT b (Cofree (Base t)) a) -> a)
-                -> t -> a
+zygoHistoPrepro :: (Corecursive t, Recursive t)
+                => (Base t b -> b)
+                -> (forall c. Base t c -> Base t c)
+                -> (Base t (b, Cofree (Base t) a) -> a)
+                -> t -> a

As you can see, a common theme is to use a plain type, such as `(b, f a)`, instead of the transformer version, `EnvT b f a`. If we wanted, I think it might be possible to change the definitions of `Gather` and `Scatter` so that transformers would again be used everywhere:

type Gather f a w = a -> f (w a) -> w a
type Scatter f a m = m a -> Either a (f (m a))

or maybe even

type Gather f w = forall b. b -> f (w b) -> w b
type Scatter f m = forall b. m b -> Either b (f (m b))

But I have not explored the ramifications of that alternate design. One advantage of the no-transformers design is that `gfutu scatterApo`'s type is a bit more useful than `gfutu distApo`'s:

gfutu scatterApo :: (Corecursive t, Recursive t)
  => (a -> Base t (Free (Base t) (Either t a)))
  -> a -> t
gfutu distApo :: (Corecursive t, Recursive t)
  => (a -> Base t (FreeT (Base t) (Either t) a))
  -> a -> t

That is, with the no-transformers design, the coalgebra generates a subtree of depth at least 1 (because of the outer `Base t` wrapper) whose leaves are seeds of type `Either t a`, that is, either seeds which terminate the recursion with a `t` subtree or seeds which continue the recursion with an `a`. With the transformers design, the coalgebra also generates a subtree of depth at least 1 whose leaves are seeds of type `Either t a`, except there is this weird restriction that `t` can only be used at depth 1 (because a `lift (Left t)` would short-circuit the `FreeT` computation). So the no-transformers version seems a lot more pleasant and natural, to me at least.
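
For reference, here is a plausible `scatterApo` under the no-transformers design (hypothetical definition; the one in the branch may differ). It is just the two arms of that `Either t a`:

    -- A seed which is already a finished subtree expands one level and
    -- keeps its children as finished subtrees; a seed which is still an
    -- "a" is handed back to the coalgebra.
    scatterApo :: Recursive t => Scatter (Base t) a (Either t a)
    scatterApo (Left t)  = Right (fmap Left (project t))
    scatterApo (Right a) = Left a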

, scatterGApo
, scatterGApoT
, scatterFutu
, scatterGFutu
@gelisam (Collaborator, Author):

I took the opportunity to move the futu- and gfutu-related entries from the folding section to the unfolding section.

@@ -159,10 +156,10 @@ class Functor (Base t) => Recursive t where
cata f = c where c = f . fmap c . project

para :: (Base t (t, a) -> a) -> t -> a
-para t = p where p x = t . fmap ((,) <*> p) $ project x
+para f = p where p x = f . fmap ((,) <*> p) $ project x
@gelisam (Collaborator, Author):

For consistency, I tried to always use f for algebras and g for coalgebras.

-gpara :: (Corecursive t, Comonad w) => (forall b. Base t (w b) -> w (Base t b)) -> (Base t (EnvT t w a) -> a) -> t -> a
-gpara t = gzygo embed t
+gpara :: Corecursive t => Gather (Base t) a s -> (Base t (t, s) -> a) -> t -> a
+gpara = gzygo embed
@gelisam (Collaborator, Author):

Should I keep the t like in the original version? It's not clear to me why it was eta-expanded in the first place.

=> (forall b. m (Base t b) -> Base t (m b)) -- distributive law
-> (forall c. Base t c -> Base t c) -- natural transformation
-> (a -> Base t (m a)) -- a (Base t)-m-coalgebra
-> a -- seed
@gelisam (Collaborator, Author):

Sadly, I had to remove the comments because I wasn't sure what to call the a -> Base t s; it's certainly not a "(Base t)-m-coalgebra" anymore. I guess I could have kept the other comments...

gcata gather f = f . fmap go . project where
  go :: t -> s
  go = uncurry gather . (f &&& id) . fmap go . project
gfold gather f t = gcata gather f t
@gelisam (Collaborator, Author):

This time I kept the eta-expanded version of `gfold`.
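
For readers following along, here is how the stages of `go` line up, assuming the `Gather` synonym from this PR:

    -- project        :: t -> Base t t
    -- fmap go        :: Base t t -> Base t s
    -- f &&& id       :: Base t s -> (a, Base t s)
    -- uncurry gather :: (a, Base t s) -> s   -- gather :: a -> Base t s -> s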

-> t
-> a
-zygoHistoPrepro f g t = gprepro (distZygoT f distHisto) g t
+zygoHistoPrepro f g t = gprepro (gatherZygoT f gatherHisto) g t
@gelisam (Collaborator, Author):

The behaviour of `zygoHistoPrepro` has changed; it was using the problematic `distZygoT f distHisto` I argued against in #49, and now it's using the fixed version which passes the sanity check I gave there.
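
For the record, one plausible shape for `gatherZygoT` under the new `Gather` synonym (hypothetical definition; the one in the branch may differ) pairs the helper algebra's result with whatever the inner gathering function produces:

    -- Layer a zygomorphism's helper value "b" on top of an existing
    -- gathering function.
    gatherZygoT :: Functor f
                => (f b -> b)        -- helper algebra
                -> Gather f a s      -- inner gathering function
                -> Gather f a (b, s)
    gatherZygoT g gather a fbs = (g (fmap fst fbs), gather a (fmap snd fbs))

With that shape, `gatherZygoT f gatherHisto` has type `Gather (Base t) a (b, Cofree (Base t) a)`, which lines up with the new `Base t (b, Cofree (Base t) a) -> a` algebra in `zygoHistoPrepro`'s signature quoted above.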

@ekmett (Collaborator) commented Jul 12, 2018

I confess this is getting far enough away from any of the established material for recursion schemes that I'm not terribly well equipped to or comfortable in maintaining it personally.

I can see the general advantages of this approach, however.

@gelisam Would you be interested in taking over maintainership of recursion-schemes?

@gelisam (Collaborator, Author) commented Jul 12, 2018

> Would you be interested in taking over maintainership of recursion-schemes?

Sure, I would love to!

In order to do that though, I would have to understand the whole codebase, and there's still one part which bugs me: why does the library provide all three of `Fix`, `Mu`, and `Nu`? In a total language, `Mu` and `Nu` would provide the least and greatest fixpoints, but in Haskell the distinction seems moot. Especially in a library about recursion-schemes, in which an infinite structure would be constructed using `ana` and friends, not using `Nu`. Are users encouraged to use `Nu f` to document the fact that they intend the values to be possibly infinite, and `Mu f` to document the fact that they intend the values to be definitely finite? And `Fix f` would be what, the default, to indicate that the author of the datatype hasn't yet given any thought as to whether they want to allow infinite values or not?

@ekmett (Collaborator) commented Jul 13, 2018

Fix, Mu and Nu are mostly just illustrative/pedagogical.

The library in its current state supplies the types it supplies so that you can do things like build instances for existing data types rather than have to go through a single fixed Fix or Mu or Nu, e.g.

type instance Base [a] = ListF a

The benefit of this flexibility is that you can supply instances and base functors that can work for multiple types.
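
For readers unfamiliar with the library, the list case looks roughly like this (abridged):

    -- The base functor is a separate, non-recursive data type, so an
    -- existing type like [a] gets a Recursive instance directly.
    data ListF a b = Nil | Cons a b
      deriving Functor

    type instance Base [a] = ListF a

    instance Recursive [a] where
      project []       = Nil
      project (x : xs) = Cons x xs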

All my previous code in category-extras, before I wrote recursion-schemes, used to force you to go through one "fixed" Fix type. This did make inference work ever so slightly better, but it came at a cost.

The boilerplate of writing manual fold/unfold combinators for a separate data type with slightly redundant constructor names is usually cheaper than the code burden of using a one-size-fits-all Fix for a large body of code.

To your point, though, in the event you are just trying to tie a knot in a type, Fix is almost always the one of Fix, Mu or Nu you should be using in practice, unless you are working with an existing type out of the box like [a], rather than trying to mock it up through some Fix (Compose Maybe ((,) a)) and dealing with the much noisier construction/deconstruction.

That said, there are reasons to consider using something like the other forms in limited situations. e.g. https://www.schoolofhaskell.com/user/edwardk/moore/for-less uses a form of Nu modified to work with some seed that is the representation of a representable functor to great effect, but it is worth noting that that usage doesn't fit into the formalism offered here.

Also, there is a niggling caveat. Consider the scenario where f is made strict in its argument by using bang patterns or an Identity-like newtype, and then re-compare Mu vs. Nu.

@gelisam (Collaborator, Author) commented Jul 13, 2018

> Also, there is a niggling caveat. Consider the scenario where f is made strict in its argument by using bang patterns or an Identity-like newtype, and then re-compare Mu vs. Nu.

Ah! That makes sense: we can use the strictness trick to create a recursive datatype for which we can only create finite values, and so Haskell does support a way to express least fixed points after all. The base functor `F` of such a finite recursive datatype is also strict, and so `Fix F` constructs the least fixed-point when given a strict base functor, and the greatest fixed-point when given a lazy base functor. `Nu` expresses the greatest fixed-point regardless of the strictness of the base functor, because `F` only forces the seeds, not some recursive datatype made of more `F`s.

As for `Mu`, the situation is more complicated because we don't know which `a` and `f :: F a -> a` the caller will choose. Let's insist that the `mu :: forall a. (F a -> a) -> a` implementation must never bottom out, regardless of the choice of `a` or `f`, as long as `f` is itself total (i.e. it never returns ⊥ unless given ⊥). Since `a` and `f` could be `Fix F` and `Fix`, `mu` must construct its `a` under the same constraints as if it were building a `Fix F`, and so may only apply `f` a finite number of times. These constraints do not apply when `F` is lazy, and so like `Fix`, `Mu F` constructs the least fixed-point when given a strict base functor, and the greatest fixed-point when given a lazy base functor.
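
For reference, the three fixpoint types under discussion, roughly as defined in the library, together with a strict base functor of the kind mentioned in the caveat above:

    newtype Fix f = Fix (f (Fix f))

    -- Church/fold encoding: a value is the fold it supports.
    newtype Mu f = Mu (forall a. (f a -> a) -> a)

    -- Unfold encoding: a value is a seed together with a coalgebra.
    data Nu f where
      Nu :: (a -> f a) -> a -> Nu f

    -- A strict base functor, as in the caveat: per the discussion above,
    -- values of Fix NatF and Mu NatF can only be finite, while Nu NatF
    -- can still describe an infinite unfold, e.g. Nu SuccF ().
    data NatF r = ZeroF | SuccF !r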

All right, I'm ready. Bestow maintainership upon me, my liege!

@andyscott (Contributor) commented:

Just a heads up that I'm trying out gather/scatter in Scala via the evolving https://github.com/andyscott/droste project (@sellout pointed me your way!).

@gelisam mentioned this pull request Jan 28, 2019
gelisam added a commit that referenced this pull request Sep 19, 2020
This is a huge, yet backwards-compatible change! It introduces a new API
which is both more readable and more correct than the previous API, and
it also changes the name of the recursion-schemes import.

First, the name of the module. "Data.Functor.Foldable" is a strange
choice of module name by modern standards; "RecursionSchemes" is
better because it matches the name of the library instead of trying to
fit into the "Data.<...>" / "Control.<...>" semantic hierarchy. By
introducing the new API and the new module name at the same time, it
becomes possible to preserve backwards compatibility simply by keeping
the old API at the old module name and adding the new API at the new
module name.

Second, the new API. One big advantage of Mendler-style
recursion-schemes such as `mcata` and `mhisto` is that the resulting
code is more readable. At least in my opinion, since readability is
always subjective. Still, compare the following four implementations of
`fib`:

    data Unary = Zero | Succ Unary
    makeBaseFunctor ''Unary

    fibA :: Unary -> Int
    fibA = \case
      Zero
        -> 1
      Succ Zero
        -> 1
      Succ nMinus1@(Succ nMinus2)
        -> fibA nMinus1 + fibA nMinus2

    fibB :: Unary -> Int
    fibB = histo $ \case
      ZeroF
        -> 1
      SuccF (_ :< ZeroF)
        -> 1
      SuccF (fibNMinus1 :< SuccF (fibNMinus2 :< _))
        -> fibNMinus1 + fibNMinus2

    fibC :: Fix (Base Unary) -> Int
    fibC = mhisto $ \recur split -> \case
      ZeroF
        -> 1
      SuccF nMinus1 -> case split nMinus1 of
        ZeroF
          -> 1
        SuccF nMinus2
          -> recur nMinus1 + recur nMinus2

    fibD :: RecFun Unary Int
    fibD = recFun histo $ \case
      ZeroF
        -> 1
      SuccF nMinus1 -> case project nMinus1 of
        ZeroF
          -> 1
        SuccF nMinus2
          -> recur nMinus1 + recur nMinus2

I would say that `fibA` is the most readable, as it directly expresses
that `fib 0 = 1`, `fib 1 = 1`, and `fib n = fib (n-1) + fib (n-2)`. But
that definition uses general recursion, which does not guarantee that
we're only making recursive calls on smaller terms. It is also very
inefficient.

`fibB` improves upon `fibA` by using the `histo` recursion scheme,
which performs the recursion on `fibB`'s behalf; and since `fibB` does
not make any recursive calls itself, it cannot accidentally make
recursive calls on terms which are not smaller. It is also much more
efficient, because the recursion structure is no longer branching into
two; instead, each step combines two pre-computed results. However,
`fibB`'s pattern-matching is a lot messier: the structure of the
`Unary` and the pre-computed results are interleaved, and you have to
be very familiar with the details of the
`Base Unary (Cofree (Base Unary) Int)` type on which it is
pattern-matching in order to know which position holds what. This
becomes even messier when combining multiple recursion-schemes, e.g.
with a zygo-paramorphism.

Furthermore, while the word "histo" is helpful to those who are
familiar with that name, as it immediately conveys the idea that we
are going to be combining recursively-pre-computed values from
multiple depths, it is deeply unhelpful to those who aren't, as the
recursion structure of `fibB` is no longer explicit: it is now hidden
inside the implementation of `histo`.

Next, `fibC` has the best of both worlds: the word `mhisto` conveys
the same idea to those who are familiar with the name "histo", while
the explicit `recur` calls clarify the recursion structure for those
who aren't. Furthermore, while the pattern-matching is not as readable
as in `fibA`, it is much more readable than in `fibB`, because we
don't have to bring all of the information into scope all at once.
Instead, the helper functions `recur` and `split` extract the bits we
need when we need them. We also get the same efficient implementation:
`recur` extracts the pre-computed value for that recursive position,
rather than actually making a recursive call, but `recur` is still a
very good name for it because it returns the same result as if we were
making a recursive call to `fibC`. Moreover, the type of `recur` is
restricted in a way which guarantees that it can only be called on
smaller terms.

Finally, `fibD` demonstrates the new API. Like `fibC`, it has the best
of both worlds, but in addition, the `recur` and `split` (renamed to
`project`, because we can reuse `Recursive.project`) functions are now
global instead of local. Also, while `Data.Functor.Foldable.histo` has a
`ghisto` variant which makes it possible to combine recursion-schemes,
`Data.Functor.Foldable.mhisto` doesn't. In contrast, the new API's
design does make it possible to provide a `histoT` variant while
retaining the advantages of the Mendler style.

See
#105 (comment)
for a similar example with `para` instead of `histo`.

One last advantage of the new API is that it is more correct than the
old API. See
#50 (comment)
for an explanation of what is wrong with the old API, and
#51
for an explanation of the Gather/Scatter solution to that problem. In
short, generalized recursion-schemes such as `ghisto` can sometimes
combine in ways that appear type-correct but actually give
subtly-incorrect results, while the new API's transformers, such as
`histoT`, use a much simpler approach, in the hope of making the library
closer to "so simple that there are obviously no deficiencies" than to
"so complicated that there are no obvious deficiencies".

In the new API, everything is a catamorphism applied to an algebra of
type `base pos -> pos`. This `pos` is different for every recursion
scheme, and it contains all the data which the recursion scheme needs
to keep track of during the catamorphism. The user provides a function
of type `base pos -> a`, so `pos` must also have instances which allow
the user to call functions like `recur` on the recursive positions.
Each recursion scheme picks its `pos` so that the user is only allowed
to make the observations which make sense for that recursion scheme.
That's it. Well, sort of.

The user provides a function of type `base pos -> a`, but we need a
function of type `base pos -> pos`. The user's function explains how one
step of the recursion computes the value the user is interested in from
the values provided by the `base pos`. What we need is a function
explaining how one step of the recursion computes all the data we need
in order to continue the fold, including both the data which the user is
interested in and the data which the next incarnation of the user
function will be able to draw from. Each recursion scheme must thus
provide a "gather" function of type `a -> base pos -> pos`, which
extracts all that information, and stores it in the `pos` along with the
`a` which the user function returned; after all, the next incarnation of
the user function must be able to call `recur` on that `pos` in order to
get back that `a`.
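
Schematically (the names here are illustrative, not the actual API),
one step of such a catamorphism composes the user's function with the
scheme's gather function like this:

    -- One step of the fold: run the user's function on the children's
    -- positions, then let the scheme's gather function pack the
    -- resulting "a" and the children's positions into the parent's pos.
    step :: (base pos -> a)         -- supplied by the user
         -> (a -> base pos -> pos)  -- supplied by the recursion scheme
         -> (base pos -> pos)       -- what cata needs
    step userAlg gather fp = gather (userAlg fp) fp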

We might also want to combine recursion schemes; that's the feature for
which the old API brought in complex things like comonad transformers
and functions expressing distributive laws. In the new API, a recursion
scheme transformer such as `paraT` is associated with a `pos`
transformer, such as `ParaPosT`, and instead of providing a gather
function, it transforms a gather function on `pos` into a gather
function on `ParaPosT pos`.

The `Scatter` version, where everything is an anamorphism, will be
implemented in a later commit.