## Monday, March 21, 2016

### Agda is not a purely functional language

One of the more surprising things I've learned as a lecturer is the importance of telling lies to students. Basically, the real story regarding nearly everything in computer science is diabolically complicated, and if we try to tell the whole truth up front, our students will get too confused to learn anything.

So what we have to do is to start by telling a simple story which is true in broad outline, but which may be false in some particulars. Then, when they understand the simple story, we can point out the places where the story breaks down, and then use that to tell a slightly less-false story, and so on and so on, until they have a collection of models, which range from the simple-but-inaccurate to the complex-but-accurate. Then our students can use the simplest model they can get away with to solve their problem.

The same thing happens when teaching physics, where we start with Newtonian mechanics, and then refine it with special relativity and quantum mechanics, and then refine it further (in two incompatible ways!) with general relativity and quantum field theories. The art of problem solving is to pick the least complicated model that will solve the problem.

But despite how essential lies are, it's always a risk that our students discover that we are telling lies too early -- because their understanding is fragile, they don't have a sense of the zone of applicability of the story, and so an early failure can erode the confidence they need to try tackling problems. Happily, though, one advantage that computer science has over physics is that computers are more programmable than reality: we can build artificial worlds which actually do exactly obey the rules we lay out.

This is why my colleague Martín Escardó says that when teaching beginners, he prefers teaching Agda to teaching Haskell -- he has to tell fewer lies. The simple story that we want to start with when teaching functional programming is that data types are sets, and computer functions are mathematical functions.

In the presence of general recursion, though, this is a lie. Martín has done a lot of work on more accurate stories, such as data types are topological spaces and functions are continuous maps, but this is not what we want to tell someone who has never written a computer program before (unless they happen to have a PhD in mathematics)!

Agda's termination checking means that every definable function is total, and so in Agda it's much closer to true that types are sets and functions are functions. But it's not quite true!

Here's a little example, as an Agda program. First, let's get the mandatory bureaucracy out of the way -- we're defining the program name, and importing the boolean and identity type libraries.

module wat where

open import Data.Bool
open import Relation.Binary.PropositionalEquality using (_≡_; refl)


Next, let's define the identity function on booleans.


f : Bool → Bool
f true = true
f false = false


In fact, let's define it a second time!

g : Bool → Bool
g true = true
g false = false


So now we have two definitions of the identity function f, and g, so they must be equal, right? Let's see if we can write a proof this fact.

wat : f ≡ g
wat = refl


Oops! We see that refl is in red -- Agda doesn't think it's a proof. In other words, two functions which are identical except for their name are not considered equal -- α-equivalence is not a program equivalence as far as Agda is concerned!

This is somewhat baffling, but does have an explanation. Basically, an idea that was positively useful in non-dependent functional languages has turned toxic in Agda. Namely, type definitions in ML and Haskell are generative -- two declarations of the same type will create two fresh names which cannot be substituted for each other. For example, in Ocaml the following two type definitions

type foo = A | B
type bar = A | B


are not interchangeable -- Ocaml will complain if you try to use a foo where a bar is expected:

OCaml version 4.02.3

let x : foo = A

let bar_id (b : bar) = b

let _ = bar_id x;;

Characters 96-97:
let _ = bar_id x;;
^
Error: This expression has type foo but an
expression was expected of type bar


In other words, type definitions have an effect -- they create fresh type names. Haskell alludes to this fact with the newtype keyword, and in fact that side-effect is essential to its design: the type class mechanism permits overloading by type, and the generativity of type declarations is the key feature that lets users define custom overloads for their application-specific types.

Basically, Agda inherited this semantics of definitions from Haskell and ML. Unfortunately, this creates problems for Agda, by complicating its equality story. So we can't quite say that Agda programs are sets and functions. We could say that they are sets and functions internal to the Schanuel topos, but (a) that's something we may not want to hit students with right off the bat, and (b) Agda doesn't actually include the axioms that would let you talk about this fact internally. (In fact, Andy Pitts tells me that the right formulation of syntax for nominal dependent types is still an open research question.)

1. Fascinating post! I was going to complain that whilst Agda signatures/developments can definitely be seen as formulated internally to the Schanuel topos, Agda *programs* need not be seen in this light, but then I remembered that it is possible to define a fresh datatype inside a where-clause!

For what it's worth, in the Red JonPRL proof assistant we have implemented a nominal variant of Computational Type Theory, which can be seen as taking Stuart Allen's meaning explanation for CTT and running it internally inside the Schanuel topos. The correct syntax for this type theory is quite straightforward—we essentially use the abstract binding trees from PFPL, which have nominal characteristics built-in; I've written up the categorical semantics for this, since the precise semantics for the framework were not spelled out in PFPL. As far as the type theory is concerned, do not yet exploit the nominal character of our framework in any significant way, but it will be interesting to see what can be done in the future.

-- Jon Sterling

2. One other remark I meant to say is, I am not sure I would agree that this aspect of Agda makes it "impure". Impurity, as far as I (and Longley, and most people) am concerned is when you can define an operation that does not respect equality; Agda may fail to recognize extensional equalities in many cases, but I do not believe it is possible to actually *subvert* extensional equality (i.e. observe that it fails).

3. Great post -- I was looking forward to it!

But as we discussed, the picture is somewhat cloudier than that because alpha equivalence sometimes holds:

_∧_ : Bool → Bool → Bool
true ∧ b = b
false ∧ b = false

f : Bool → Bool → Bool
f x y = x ∧ y

g : Bool → Bool → Bool
g x y = x ∧ y

p : f ≡ g
p = refl

q : f ≡ _∧_
q = refl

As for interpreting this into the Schanuel topos, it seems to me as an effort to make a bug look like a feature :)

drg

4. Excellent point! f ≡ g holds in a univalent type theory though, so would you say that HoTT is a purely functional language?

5. I would not consider datatype generativity to be a real "effect" in the usual sense. In my first work on ML modules (POPL'13), we called it a "static effect" in the sense that it somehow only takes effect in a kind of static prepass over the program, prior to execution. In particular, you do not want to view datatype generativity as causing a dynamic, computational effect. Zhong Shao did effectively (ho, ho) just this in his ICFP'99 paper, and as a result it meant that in his type system any datatype definitions in the body of a functor forced it to be generative rather than applicative, even though there's no good reason for that. My first real contribution to science, as a matter of fact, was pointing out to my advisors that Shao's treatment of datatype generativity, which matched their own intuitions about generativity as a computational effect, did not in fact match the semantics of OCaml (or Moscow ML). For example, in OCaml, you can have a "generative" definition of datatype T in the body of a functor F, but each application of F to the same argument X will give you the same type F(X).T. As we argue in the F-ing modules paper, this is exactly as it should be because signature ascription (sealing) by itself is not a true effect. A functor should only be treated as generative if the body has true computational effects (whatever those are).

Sorry, that was kind of rambling...I'm not sure what is the takeaway for teaching about datatype generativity. Generally, I feel this is a kind of tricky point to explain. There are good "engineering" reasons in practice for datatypes to be opaque rather than transparent, and there are times where this bites you, but it is not so easy to explain the matter directly in terms of effects.

6. Argh, my first work on ML modules was POPL'03, not POPL'13. Wishful thinking, old man...

7. I tried to address this issue (that equality is not structural) in the definition of PiSigma which is a partial language with dependent types (hence it is not pure because of non-termination). See http://www.cs.nott.ac.uk/~psztxa/publ/pisigma-new.pdf

8. I think this is a laudable but still probably too high standard for what pure functional programming should mean. You seem to be implying that a language is not a pure functional programming language unless all its programs can be treated, using the features of the language, exactly as mathematical functions. So while computationally, programs in Agda behave as pure (total) mathematical functions (their outputs are always defined and always completely determined by their inputs), the reasoning facilities of Agda don't let you equate extensionally equal functions. But I think this is not a fair criticism. You can just as well define a notion of extensional equality, and then functions will be identified if they are input-output equivalent. The fact that the standard propositional equality is not extensional just means that it is not intended for this kind of reasoning. I feel your criticism would be like objecting to the use of propositional equality for reasoning about co-recursively defined values. "Hey, equality is broken for these!" No, we all know you are supposed to use bisimulation for reasoning about streams and such. So I have to say I don't accept your argument that Agda is not a pure functional language. I do appreciate that you raise the point in an interesting way.

9. What about saying to students that Agda is incomplete—f = g is true in at least a model, just not provable. Indeed, postulating extensional equality is known to be consistent (though it does break canonicity of course), hence Jon Sterling (I guess) is right when claiming you can't subvert extensional equality:

> Agda may fail to recognize extensional equalities in many cases, but I do not believe it is possible to actually *subvert* extensional equality (i.e. observe that it fails).

10. How is declaring a new data type an "effect"? At best, it's a hidden tag on the type. A piece of data that has no effect on the structure other than distinguishing it from other types which are not tagged the same way.

The issue with f not being reflexively equal to g seems like an implementation quirk to me. I wonder if they would be equal if expressed via explicit case expressions. It's really more of an artifact that case-wise pattern match definitions are not done in any principled way in Agda. Which I suppose is a fair criticism.

11. > But as we discussed, the picture is somewhat cloudier than that because alpha equivalence sometimes holds:

That example confused me. But in that example proving f, g, and _∧_ equal, you only need eta rules for function types, which Agda supports. Normalization reduces f to 𝜆 x y → f x y and g to 𝜆 x y → g x y (via eta-expansion). Those two functions reduce to 𝜆 x y → x ∧ y by inlining their definitions. _∧_ reduces to 𝜆 x y → x ∧ y by eta-expansion. Since 𝜆 x y → x ∧ y is a normal form, and f, g and _∧_ all reduce to it, they're definitionally equal.

Instead, proving equal functions f and g from the blog would require an eta rule on datatypes. Such a rule is false on datatypes with indexes. And it's not enough on structurally recursive functions. Even just eta on sum types is pretty tricky to implement.

> The issue with f not being reflexively equal to g seems like an implementation quirk to me.

Not quite.

> I wonder if they would be equal if expressed via explicit case expressions.

You can prove f and g are equal on all inputs (see fx≡gx below), by case analysis on the input, but you can't prove f and g are equal unless you *postulate* extensional equality for functions. This is a consistent postulate, but you can't derive it in Agda or in intensional Martin-Löf type theory.

fx≡gx : ∀ x → f x ≡ g x
fx≡gx true = refl
fx≡gx false = refl

> It's really more of an artifact that case-wise pattern match definitions are not done in any principled way in Agda.

That seems too harsh. The view from the left explains how you could reduce pattern matching to eliminators, even though that's not implemented.

I see that if you reduced to eliminators this particular problem would disappear.
Still, the category of sets and functions has extensional equality and Agda and Coq don't—two different stable sorts are extensionally equivalent functions, but not equal.