Yet another … yet another recursion scheme library for Haskell
Total recursion scheme library for Haskell.
Recursion schemes allow you to separate any recursion from your business logic, writing step-wise operations that can be applied in a way that guarantees termination (or, dually, progress).
How's this possible? You can’t have totality and Turing-completeness, can you? Oh, but you can – there is a particular type, Partial a
(encoded with a fixed-point) that handles potential non-termination, akin to the way that Maybe a
handles exceptional cases. It can be folded into IO
in your main function, so that the runtime can execute a Turing-complete program that was modeled totally.
NB: The tests for this package are unfortunately included in yaya-hedgehog
instead, to avoid a dependency cycle.
This organization is intended to make this a lightly-opinionated library. You should only need to import one module (per package) into any module of yours.
Pattern
– This is what you should use most of the time. It provides common pattern functors that aren’t found elsewhere as well as other operations that are useful when you’re defining your own algebras.Fold
– This (and its submodules) provides algebra transformers, fixed-point operators, and other things you use when applying folds.Retrofit
– Utilities for making your existing data types compatible with recursion schemes.Applied
– A number of commonly-useful utilities defined as folds. Intended both as examples and code that you can actually use in your projects.Zoo
– Names that you may have seen in the recursion scheme literature, but that we generally avoid using here. In general, prefer the right-hand side of these definitions, which shouldn’t require importing this module.
Greek characters (and names) for things can often add confusion, however, there are some that we’ve kept here because I think (with the right mnemonic) they're actually clarifying.
φ
– an algebra – “phi” (pronounced /faɪ/)ψ
– a coalgebra – “psi” (pronounced /ˈ(p)saɪ/)
These are the symbols used in “the literature”, but I think they also provide a good visual mnemonic – φ, like an algebra, is folded inward, while ψ, like a coalgebra, opens up. So, I find these symbols more evocative than f
and g
or algebra
and coalgebra
, or any other pair of names I’ve come across for these concepts.
There are two other names, Mu
and Nu
(for the inductive and coinductive fixed-point operators), that I don’t think have earned their place, but I just haven’t come up with something more helpful yet.
There's a set of conventions around the naming of the operations. There are many variants of each operation (and they're all ultimately variants of cata
and ana
), so understanding this convention should help make it easier to understand the myriad possibilities rather than learning them by rote. The general pattern is
[
e
][g
]operation
[T
][M
]
“Generalized” variant – This parameterizes the fold over some DistributiveLaw
that generalizes the (co)algebra over some Monad
or Comonad
. This is normally only applied to the fundamental operations – cata
, ana
, and hylo
, but there is also a gapo
(dual to zygo
) that really only coincidentally follows this naming pattern.
Many of the other “well-known” named folds are specializations of this:
- when specialized to
((,) T)
, it’spara
; - when
((,) B)
,zygo
; - when
Free f
,futu
; - etc.
“Elgot” variant – Named after the form of coalgebra used in an “Elgot algebra”. If there is an operation that takes some f (x a) -> a
, the Elgot variant takes x (f a) -> a
, which often has similar but distinct properties from the original.
As a mnemonic, you can read the e
as “exterior” as with a regular generalized fold, the x
is on the inside of the f
, while with the Elgot variant, it's on the outside of the f
.
“Transformer” variant – For some fold that takes an algebra like f (x a) -> a
, and where t
is the (monad or comonad) transformer of x
, the transformer variant takes an algebra like f (t m a) -> a
.
Kleisli (“monadic”) variant – This convention is much more widespread than simply recursion schemes. A fold that returns its result in a Monad
, by applying a Kleisli algebra (that is, f a -> m a
rather than f a -> a
. The dual of this might be something like anaW
(taking a seed value in a Comonad
), but those are uninteresting. Having Kleisli variants of unfolds is unsafe, as it can force traversal of an infinite structure. If you’re looking for an operation like that, you are better off with an effectful streaming library.
This project largely follows the Haskell Package Versioning Policy (PVP), but is more strict in some ways.
The version always has four components, A.B.C.D
. The first three correspond to those required by PVP, while the fourth matches the “patch” component from Semantic Versioning.
Here is a breakdown of some of the constraints:
PVP recommends that clients follow these import guidelines in order that they may be considered insensitive to additions to the API. However, this isn’t sufficient. We expect clients to follow these additional recommendations for API insensitivity
If you don’t follow these recommendations (in addition to the ones made by PVP), you should ensure your dependencies don’t allow a range of C
values. That is, your dependencies should look like
yaya >=1.2.3 && <1.2.4
rather than
yaya >=1.2.3 && <1.3
If your imports are package-qualified, then a dependency adding new modules can’t cause a conflict with modules you already import.
Because of the transitivity of instances, orphans make you sensitive to your dependencies’ instances. If you have an orphan instance, you are sensitive to the APIs of the packages that define the class and the types of the instance.
One way to minimize this sensitivity is to have a separate package (or packages) dedicated to any orphans you have. Those packages can be sensitive to their dependencies’ APIs, while the primary package remains insensitive, relying on the tighter ranges of the orphan packages to constrain the solver.
Type class instances are imported transitively, and thus changing them can impact packages that only have your package as a transitive dependency.
This is a consequence of instances being transitively imported. A new major version of a dependency can remove instances, and that can break downstream clients that unwittingly depended on those instances.
A library may declare that it always bumps the A
component when it removes an instance (as this policy dictates). In that case, only A
widenings need to induce A
bumps. B
widenings can be D
bumps like other widenings, Alternatively, one may compare the APIs when widening a dependency range, and if no instances have been removed, make it a D
bump.
Consumers have to contend not only with our version bounds, but also with those of other libraries. It’s possible that some dependency overlapped in a very narrow way, and even just restricting a particular patch version of a dependency could make it impossible to find a dependency solution.
Making a license more restrictive may prevent clients from being able to continue using the package.
A new dependency may make it impossible to find a solution in the face of other packages dependency ranges.
This is also what PVP recommends. However, unlike in PVP, this is because we recommend that package-qualified imports be used on all imports.
This is fairly uncommon, in the face of ^>=
-style ranges, but it can happen in a few situations.
NB: This case is weaker than PVP, which indicates that packages should bump their major version when adding deprecation
pragmas.
We disagree with this because packages shouldn’t be publishing with -Werror
. The intent of deprecation is to indicate that some API will change. To make that signal a major change itself defeats the purpose. You want people to start seeing that warning as soon as possible. The major change occurs when you actually remove the old API.
Yes, in development, -Werror
is often (and should be) used. However, that just helps developers be aware of deprecations more immediately. They can always add -Wwarn=deprecation
in some scope if they need to avoid updating it for the time being.
This package is licensed under The GNU AGPL 3.0 or later. If you need a license for usage that isn’t covered under the AGPL, please contact Greg Pfeil.
You should review the license report for details about dependency licenses.
Other projects similar to this one, and how they differ.
This project has been deprecated. Check out Droste instead.
Yaya is a sister library to Turtles – the same approach, but implemented in Scala. Here are some differences to be aware of:
- the
Zoo
modules in Turtles are both larger and their use is more encouraged, because Scala’s inference makes it harder to usegcata
etc. directly; - the
Unsafe
andNative
modules have different contents, because different structures are strict or lazy between the two languages. For example., in Scala,scala.collection.immutable.List
is strict, so theRecursive
instance is inNative
, while theCorecursive
instance is inUnsafe
, but Haskell’sData.List
is lazy, so theCorecursive
instance is inNative
while theRecursive
instance is inUnsafe
.
The c
type parameter specifies the arrow to use, so while it's common to
specialize to (->)
, other options can give you polymorphic recursion over
nested data types (for example., GADTs). Among other things, you can use this to define
folds of fixed-sized structures:
data VectF elem a (i :: Nat) where
EmptyVect :: VectF elem a 0
VCons :: KnownNat n => elem -> a i -> VectF elem a (n + 1)
type Vect elem n = HMu (VectF elem) n
Yaya tries to encourage you to define things in ways that are likely to maintain promises of termination. Sometimes, the compiler can even tell you when you've broken these promises, but it falls short of any guarantee of totality.
Anything known to be partial is relegated to the yaya-unsafe
package -- mostly
useful when you're in the process of converting existing directly-recursive
code.
NB: There are a number of instances (for example, Corecursive [a] (XNor a)
) that
are actually safe, but they rely on Haskell’s own recursion. We could
potentially add a module/package in between the safe and unsafe ones, containing
Corecursive
instances for types that are lazy in their recursive parameters
and Recursive
instances for ones that are strict.
We try to provide fewer fold operations (although all the usual ones can be
found in the Zoo
modules). Instead, we expect more usage of gcata
, and we
provide a collection of "algebra transformers" to make it easier to transform
various functions into generalized algebras, and between different generalized
algebras to maximize the opportunities for fusion. Although, more importantly,
it allows you to write "proto-algebras", which are functions that you expect to
use in a fold but that aren't strictly in the shape of an algebra.
Yaya has productive metamorphisms (see streamAna
and streamGApo
-- also
stream
and fstream
for more specialized versions). The naïve composition of
cata
and ana
has no benefits.
recursion-schemes combines cata
and project
into a single type
class. However, the laws for project
require either embed
or ana
, never
cata
. Similarly, the laws for cata
either stand alone or require embed
,
never project
. And you can restate this paragraph, replacing each operation
with its dual.
Also, it's impossible to define embed
for some pattern functors where it's
still possible to define project
, so project
and embed
need to be
independent.
One unfortunate consequence of the above conditions is that Projectable
is
lawless on its own. However, we expect there to be a corresponding instance of
either Corecursive
or Steppable
in all cases.
A purely ergonomic difference, yaya uses multi-parameter type classes instead of
a Base
type family.
The latter frequently requires constraints in the form of
(Recursive t, Base t ~ f)
, so we prefer Recursive t f
.
Pattern functors and algebras tend to be named independently of their
fixed-points. For example, we use Maybe
directly instead of some NatF
, XNor a b
instead of ListF
, and AndMaybe a b
instead of NonEmptyF
.
This is because many pattern functors and algebras can be applied differently in different situations, so we try to avoid pigeon-holing them and rather trying to understand what the definition itself means, rather than in the context of a fold.
I’m not as familiar with compdata, so I’ll have to look at it more before fleshing this out.
- poly-kinded recursion schemes instead of separate classes for type-indexed
recursion schemes. Using
PolyKinds
also allows for a wider variety of folds, for example, where the type index has kindType -> Type
rather than kindType
.