In The Point of Laziness, Bob Harper mentioned a core duality between lazy and eager languages[1]. Eager languages have sum types but not product types and lazy languages have product types but no sum types. I've seen a few questions about what he means. I thought I'd try to explain. It all has to do with non-termination, our old friend "bottom" which I'll write ⊥. In this entire article I'm focusing on languages that don't allow any side effects other than ⊥ - that way I don't have to deal with ordering print statements or whatever. Also, I'm using an MLish notation that hopefully isn't too confusing.
Update: This comment by Dan Doel does a better job of explaining the issues than my whole article.
Product Types
A product type is a pair. The pair type (A,B) is the product of A and B, and is often written as A * B or A × B. A 3-tuple[2] (A,B,C) can be seen as another way of saying ((A,B),C), etc. so any n-Tuple is just a pretty way to write nested pairs. The simplest way to define exactly what a pair means is through projection functions "first" and "second" which pull out the first or second element. The equations we want are
first (a,b) == a second (a,b) == b
Pretty straight forward right? Well, those equations hold just fine in a lazy language even when a or b is ⊥. But in an eager language, first (a,⊥) == ⊥. Same deal with second. Oops. Hence lazy languages have product types in the sense that they completely meet these equations where eager languages don't.
That has a practical consequence. A compiler for an eager language can't necessarily simplify first(a, someComplicatedExpression) unless it can prove that someComplicatedExpression is never ⊥, and in general it can't do that. A lazy language compiler would always be free to throw away someComplicatedExpression.
Sum Types
A sum type is an alternative between two choices. A | B is the sum of the two things, A and B, and is frequently written A + B. Choices among three elements, like A | B | C can be considered as just a pretty form of (A | B) | C - a nesting of binary choices. The simplest sum type is good old Boolean, the choice between true or false. Here's one equation we might expect to hold for Booleans and something that examines booleans, the if expression[3]
((if condition then x else y), z) == if condition then (x,z) else (y,z)
Ignore the possibility of x, y, and z being ⊥ and focus on "condition". In an eager language, condition must be evaluated in either of the two forms above and either way you get ⊥ if condition is ⊥. But in a lazy language, the left side will compute (⊥, z) where he right side will compute ⊥. Pulling the second element from those two results gives differing results, proving they are unequal.
Again, this has consequences. An eager language can simplify the right side form to the left side form if it wants to. A lazy language can't.
Footnotes
- Strictly speaking I should say strict and non-strict instead of eager and lazy, but I'll go with Dr. Harper's wording because the difference isn't that important to the point.
- Or a record (aka a struct). A tuple can be seen as a record where the field labels are numbers or a record can be seen as a tuple with some convenient labels for the positions.
- If you are more familiar with C like languages that treat "if/else" as statements instead of expressions then pretend I'm using a ternary operator "condition ? x : y" instead of "if condition then x else y"
15 comments:
Great tidbit as always, thanks!
Eh. Saying that:
((if condition then x else y), z) == if condition then (x,z) else (y,z)
is an invariant is akin to saying that:
(a/b)*b == a
holds for all numeric types and then pulling claiming that language X "doesn't support numbers" because you can construct a case where it doesn't hold.
-- MarkusQ
James, is not it about, roughly speaking, List being a monad and Stream being a comonad, and having an adjunction in opposite directions, so that one forgetful functor preserves limits and another preserves colimits?
Bottom can also signify "undefinedness" in addition to non-termination.
@openid-70529:disqus , the same objection holds for the eager language problem with products - the invariant holds up just fine as long as we ignore 0's (bottom). Bob Harper's point, I think, is that there's a nifty duality here where bottom causes problems in invariants in different but equal ways for both eager and lazy languages.
@c4aceff2d37f44f053b54f1427e3e2e6:disqus , I'm not as versed in category theory as you are. But it is related to the same kind of dualities (sum is often called coproduct). However, in this case we don't even need infinite co-data structures like stream to see the differences.
@be34e931ba2fc0eed93a0c47d616fce9:disqus true, I linked to another article to talk more about bottoms. I didn't want to go back over the same territory.
Agreed. It's simply not true that eager/lazy languages lack products/sums. They both do well with both.
There are simpler, cleaner ways to describe these differences in strict and functional languages. For the products example, the issue is that beta reduction is an available equational rewrite in lazy languages but not eager languages; anyone who has implemented a function inliner becomes familiar with this issue one way or another.
For the sums example, I'm less sure how to explain it. One possibility is to dwell on the number of different bottoms that are available. Eager languages just have _|_, whereas lazy languages have exotica like (_|_, 5, _|_, "foo"). The only sound equational rewrites are those that preserve not just bottom-ness, but the precise choice of bottom.
What do sum types have to do with _|_? unless you've solved the halting problem, any value, of any type can non-terminate.
also, given that the principle lazy language (Haskell) has extensive sum types, you might want to reconsider your title.
That difference is interesting, but the "X languages do not have Y" is very overwrought to the point of being false. You explanation was mainly that different evaluation strategies allow different transformations. Also, your Product example assumed (,) could not be bottom. And yet you assume that the sum (False|True) can be bottom to make your point. And the Sum example you use depends on the presence of the Product with z while the Produce example has no Sum present. The examples thus seem mainly unrelated to each other and I cannot divine any "core duality" from them.
What definition of product and 'sum' are we using? If it's categorical, then I'm not sure either set of languages have either. In particular, I'm pretty sure Haskell has neither products nor coproducts in its usual repertoire.
You mentioned the rules:
fst (a, b) = a
snd (a, b) = b
But that is only half of what is required of a categorical product. The other half is an eta rule:
(fst p, snd p) = p
And Haskell fails this for its products, because when you put bottom in for p, you get:
(_|_, _|_) = _|_
which is false. A categorical product for Haskell would be unlifted, and would be considered non-bottom if either component were non-bottom, but bottom if both were. But we don't have those available.
Coproducts also have two sets of requirements. One is:
(case Left e of Left x -> f x ; Right y -> g y) = f e
(case Right e of Left x -> f x ; Right y -> g y) = g e
This seems satisfied by both languages. The other is harder to write down as a single equation:
if f = h . Left and g = h . Right
then (case s of Left x -> f x ; Right y = g y) = h s
And this is what is violated by Haskell, because const 1 _|_ = 1, but case _|_ of ... = _|_.
I can't think of a way that an eager language would fail the sum requirements, so it may be the case that eager languages have sums but not products, while most lazy languages don't have either.
This comment is fascinating. Dr. Harper isn't the first person from whom I've heard "lazy languages have products but not sums" and the argument I've seen is the fst (a, b) = a business. I don't think I've ever seen anybody carry it on to (fst p, snd p) = p. I wonder if we could get the unlifted behavior you describe using parallel execution.
Of course Haskell "has sum types" in its library and of course SML "has products types" in its library. This article is an attempt to explain what Dr. Harper was getting at. However, Dan Doel's comment has put me to shame.
Dan Doel's comment is that cleaner way to describe the differences and why in some sense it's technically accurate to say that design choices mean that a language "doesn't have sums or products."
I dimly recall that an explanation of why Haskell doesn't have unlifted products (an implementation difficulty). Do you know why?
Another way to express the issue is via an isomorphism (which is unsurprising if you know how sums and products are involved in adjunctions). For products, the isomorphism is:
a -> b*c ~ (a -> b) and (a -> c)
In Haskell, this fails because the latter has no way to encode the former's "const _|_" differently from "const (_|_,_|_)". The ML (and similar) failure is that if either function on the right is "const _|_", so must be the left.
For sums, it's:
a+b -> c ~ (a -> c) and (b -> c)
And a dual problem occurs for Haskell, having no way to distinguish lazy from strict constant functions when encoding on the right.
This, however, makes the eager success clear. The reason they fail the product case is that function spaces lift their codomains (when you do semantics of eager languages). So in ML, you get a failure that:
a -> Lifted (b*c) /~ (a -> Lifted b) and (a -> Lifted c)
However, there is no lifting going on on the left of a function arrow, so it has no problem with:
a+b -> Lifted c ~ (a -> Lifted c) and (b -> Lifted c)
In Haskell, everything is lifted, so there are problems on both sides.
There's probably an interplay of a couple reasons. One choice of behavior would probably incur a performance penalty. For instance, suppose we want seq to behave correctly for unlifted products. Then you need:
(undefined,undefined) `seq` ()
to be undefined, but neither of:
(3, undefined) `seq` ()
(undefined, 3) `seq` ()
So you can't just seq both components, because that will be wrong. You essentially need to implement lub, so:
p `seq` e = (fst p `seq` e) `lub` (snd p `seq` e)
And in GHC core, you can implement seq by:
x `seq` y = case x of _ -> y
because case statements always evaluate their arguments in core. So you're making your fundamental evaluation model more complicated and probably less efficient.
But, maybe you don't care about seq. In that case you can just make all product matches irrefutable. Which means when you see:
case p of (x, y) -> ...
it behaves instead like:
case p of ~(x, y) -> ...
This probably works fine, but a lot of people seem to have qualms about the meaning of the above pattern changing when you go from:
data T = P X Y Z
to:
data T = P X Y Z | Q W
Adding a constructor to my datatype makes it lifted, and the match becomes 'strict' instead of 'lazy'. This doesn't bother me too much, but it seems to turn some people off.
And of course, in this latter scenario, the presence of seq in the language means you still don't really have products.
Post a Comment