Monday, December 14, 2020

TypeFoundry: new ERC Consolidator Grant

I am very pleased to have received an ERC Consolidator Grant for my TypeFoundry proposal.

This will be a five year project to develop the foundations of bidirectional type inference. If you are interested in pursuing a PhD in this area, or conversely, are finishing a PhD in this area, please get in touch!

Many modern programming languages, whether developed in industry, like Rust or Java, or in academia, like Haskell or Scala, are typed. All the data in a program is classified by its type (e.g., as strings or integers), and at compile-time programs are checked for consistent usage of types, in a process called type checking. Thus, the expression 3 + 4 will be accepted, since the + operator takes two numbers as arguments, but the expression 3 + "hello" will be rejected, as it makes no sense to add a number and a string. Though this is a simple idea, sophisticated type systems can track properties like algorithmic complexity, data-race freedom, differential privacy, and data abstraction.
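
For instance, in a typed language like Haskell (just a small illustrative snippet, not part of the project itself), the first definition below compiles and the second is rejected at compile time:

ok :: Int
ok = 3 + 4            -- accepted: (+) is applied to two numbers

-- bad :: Int
-- bad = 3 + "hello"  -- rejected by the type checker: a string is not a number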

In general, programmers must annotate programs to tell compilers the types to check. In theoretical calculi, it is easy to demand enough annotations to trivialize typechecking, but this can make the annotation burden unbearable: often larger than the program itself! So, to transfer results from formal calculi to new programming languages, we need type inference algorithms, which reconstruct missing data from partially-annotated programs.

However, the practice of type inference has outpaced its theory. Compiler authors have implemented many type inference systems, but the algorithms are often ad-hoc or folklore, and the specifications they are meant to meet are informal or nonexistent. This makes it hard to learn how to implement type inference, hard to build alternative implementations (whether for new compilers or for the analysis engines inside IDEs), and hard for programmers to predict whether refactorings will preserve typability.

In TypeFoundry, we will use recent developments in proof theory and semantics (like polarized type theory and call-by-push-value) to identify the theoretical structure underpinning type inference, and use this theory to build a collection of techniques for type inference capable of scaling up to the advanced type system features in both modern and future languages.

One of the things that makes me happy about this (beyond the obvious benefits of research funding and the recognition from my peers) is that it shows off the international character of science. I'm an Indian-American researcher, working in the UK, and being judged and funded by researchers in Europe. There has been a truly worrying amount of chauvinism and jingoism in public life recently, and the reminder that cosmopolitanism and universalism are real things too is very welcome.

Also, if you are planning on submitting a Starting Grant or Consolidator proposal to the ERC in the coming year about programming languages, verification or the like, please feel free to get in touch, and I'll be happy to share advice.

Wednesday, December 2, 2020

Church Encodings, Inductive Types, and Relational Parametricity

My blogging has been limited this past year due to RSI, but I do not want to leave things entirely fallow, and last year I wrote an email which can be edited into a decent enough blog post.

Quite often, people will hear that System F, the polymorphic lambda calculus, satisfies a property called relational parametricity. We also often hear that the parametricity property of System F lets you prove that the Church encoding of an inductive datatype actually satisfies all the equational properties we expect of inductive types.

But it's surprisingly hard to find an explicit account of how you go from the basic properties of parametricity (the relational interpretation of System F and the identity extension principle) to the proof that inductive types are definable. I learned how this works from Bob Atkey's paper Relational Parametricity for Higher Kinds, but the proofs there are quite terse, since his page count was mostly spent on the new material he was introducing.

In the sequel, I'm going to assume that you know what the relational model of System F looks like, that the term calculus satisfies the abstraction theorem (i.e., the fundamental theorem of logical relations), and also that the model satisfies the identity extension property -- if you take the relational interpretation of a type B(α), and fill in the type variable α with the equality relation for the type A, then the relational interpretation of B[A/α] will be the equality relation for the type B[A/α]. In what follows I will often write Id[X] to mean the equality relation for the type X.

For completeness' sake, I also give a quick summary of the model at the end of the post.

Haskell-style functors in System F

Given all this preface, we now define a functor F as a type expression F(-) with a hole, along with an operation fmap : ∀a,b. (a → b) → F(a) → F(b), such that

fmap _ _ id = id
fmap _ _ f ∘ fmap _ _ g = fmap _ _ (f ∘ g)

I've written _ to indicate System F type arguments I'm suppressing. Basically think of these as Haskell-style functors.
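
If it helps, here is roughly how one such functor looks in Haskell (a hypothetical sketch; NatF is just my name for the example functor F(a) = 1 + a):

data NatF a = ZeroF | SuccF a

instance Functor NatF where
  fmap _ ZeroF     = ZeroF
  fmap f (SuccF a) = SuccF (f a)

-- The two functor laws we demand:
--   fmap id      = id
--   fmap (f . g) = fmap f . fmap g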

Next, we can define the inductive type μF using the usual Church encoding as:

μF = ∀a. (F(a) → a) → a

foldF : ∀a. (F(a) → a) → μF → a
foldF = Λa. λf : F(a) → a. λx:μF. x [a] f

inF : F(μF) → μF
inF x = Λa. λf : F(a) → a. 
          let g : μF → a       = foldF [a] f in
          let h : F(μF) → F(a) = fmap [μF] [a] g in
          let v : F(a)          = h x in
          f v

I've written out inF with type-annotated local bindings to make it easier to read. If you inlined all the local bindings and suppressed type annotations, then it would read:

inF : F(μF) → μF
inF x = Λa. λf : F(a) → a. 
          f (fmap (foldF f) x)
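
For comparison, here is a Haskell rendering of the same encoding (a sketch assuming the RankNTypes extension; the names Mu, foldMu and inMu are mine, and NatF is the example functor from the sketch above):

{-# LANGUAGE RankNTypes #-}

newtype Mu f = Mu (forall a. (f a -> a) -> a)

foldMu :: (f a -> a) -> Mu f -> a
foldMu k (Mu e) = e k

inMu :: Functor f => f (Mu f) -> Mu f
inMu x = Mu (\k -> k (fmap (foldMu k) x))

-- Example with NatF from above: Mu NatF is the Church naturals.
zero :: Mu NatF
zero = inMu ZeroF

suc :: Mu NatF -> Mu NatF
suc n = inMu (SuccF n)

toInt :: Mu NatF -> Int
toInt = foldMu (\x -> case x of ZeroF -> 0; SuccF n -> n + 1)

-- toInt (suc (suc zero)) evaluates to 2.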

F-algebra Homomorphisms and the Graph Lemma

Now, we can prove a lemma about F-algebra homomorphisms, which is the tool we will use to prove initiality. But what are F-algebra homomorphisms?

An F-algebra is a pair (X, g : F(X) → X). An F-algebra homomorphism between two F-algebras (X, k : F(X) → X) and (Y, k' : F(Y) → Y) is a function f : X → Y such that the following ASCII-art diagram commutes:

 F(X) — k  → X
  |           |
 fmap X Y f   f
  ↓           ↓
 F(Y) — k' → Y

That is, for all u : F(X), we want f(k u) = k'(fmap [X] [Y] f u)
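
As a concrete (made-up) instance in the Haskell sketch: take F = NatF from above, with Int carrying the obvious "count the successors" algebra and Bool carrying a parity algebra. Then the standard function even is an F-algebra homomorphism between them:

-- Two NatF-algebras, using NatF and its Functor instance from above.
algInt :: NatF Int -> Int
algInt ZeroF     = 0
algInt (SuccF n) = n + 1

algPar :: NatF Bool -> Bool
algPar ZeroF     = True        -- zero is even
algPar (SuccF b) = not b       -- successor flips parity

-- even :: Int -> Bool is a homomorphism from (Int, algInt) to (Bool, algPar):
-- for every u :: NatF Int,  even (algInt u) == algPar (fmap even u).
-- E.g. u = SuccF 3:  even 4 == True == algPar (SuccF (even 3)).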

Before we prove something about homomorphisms, we need to prove a technical lemma called the graph lemma, which will let us connect parametricity with our functorial definitions.

Graph Lemma: Let (F, fmap) be a functor, and h : A → B be a function. Define the graph relation <h> to be the relation {(a,b) | b = h(a)}. Then F<h> ⊆ <fmap [A] [B] h>.

Proof:

  1. First, note that fmap has the type ∀a b. (a → b) → F(a) → F(b).
  2. Since fmap is parametric, we can instantiate its type variable a with the relation <h> ∈ Rel(A,B) and b with the identity relation Id[B], giving us:

    (fmap [A] [B], fmap [B] [B]) ∈ (<h> → Id[B]) → (F<h> → F Id[B])
  3. By identity extension, F Id[B] = Id[F(B)], so this is the same as:

    (fmap [A] [B], fmap [B] [B]) ∈ (<h> → Id[B]) → (F<h> → Id[F(B)])
  4. Note that (h, id) ∈ <h> → Id[B].
  5. Hence (fmap [A] [B] h, fmap [B] [B] id) ∈ F<h> → Id[F(B)].
  6. By functoriality, fmap [B] [B] id = id, so (fmap [A] [B] h, id) ∈ F<h> → Id[F(B)].
  7. Assume (x, y) ∈ F<h>.
    1. Hence (fmap [A] [B] h x, id y) ∈ Id[F(B)].
    2. So fmap [A] [B] h x = y.
    3. Therefore (x, y) ∈ <fmap [A] [B] h>.
  8. Therefore F<h> ⊆ <fmap [A] [B] h>.

Graph relations are just functions viewed as relations, and the graph lemma tells us that the relational semantics of a type constructor applied to a graph relation <h> will behave like the implementation of the fmap term. In other words, it connects the relational semantics to the code implementing functoriality. (As an aside, this feels like it should be an equality, but I only see how to prove the inclusion.)

We use this lemma in the proof of the homomorphism lemma, which we state below:

Homomorphism Lemma: Let (F, fmap) be a functor, let (A, k : F(A) → A) and (B, k' : F(B) → B) be F-algebras, and let h : (A,k) → (B,k') be an F-algebra homomorphism. Then for all e : μF, we have

e [B] k' = h (e [A] k)

Proof:

  1. First, by the parametricity of e, we know that

     (e [A], e [B]) ∈ (F<h> → <h>) → <h>
  2. We want to apply the arguments (k, k') to (e [A], e [B]), so we have to show that (k, k') ∈ F<h> → <h>.
  3. To show this, assume that (x, y) ∈ F<h>.

    1. Now we have to show that (k x, k' y) ∈ <h>.
    2. Unfolding the definition of <h>, we need (h (k x), k' y) ∈ Id[B].
    3. Since h is an F-algebra homomorphism, we have that h (k x) = k' (fmap [A] [B] h x).
    4. So we need to show (k' (fmap [A] [B] h x), k' y) ∈ Id[B].
    5. Now, we know that (x, y) ∈ F<h>.
    6. By the graph lemma, we know F<h> ⊆ <fmap [A] [B] h>.
    7. So (x, y) ∈ <fmap [A] [B] h>.
    8. Unfolding the definition, we know y = fmap [A] [B] h x.
    9. So we want (k' (fmap [A] [B] h x), k' (fmap [A] [B] h x)) ∈ Id[B].
    10. Since Id[B] is an equality relation, this is immediate.
  4. Hence (k, k') ∈ F<h> → <h>.
  5. Therefore (e [A] k, e [B] k') ∈ <h>.
  6. Unfolding the definition of <h>, we know e [B] k' = h(e [A] k). This is what we wanted to show.

Discussion:

The whole machinery of F-algebra homomorphisms basically exists to phrase the commuting conversions in a nice way. We just proved that for e : μF, we have

 e [B] k' = h (e [A] k)

Recall that for Church encoded inductive types, the fold is basically the identity, so this result is equivalent (up to beta) to:

 foldF [B] k' e = h (foldF [A] k e)

So this lets us shift contexts out of iterators, so long as those contexts are F-algebra homomorphisms. Note that this is also the proof where the graph lemma actually gets used.
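
In the running Haskell sketch, this is the familiar fact that a fold can absorb a homomorphic post-processing step. With Mu, foldMu, NatF, algInt, algPar and even as above, the lemma specializes (hypothetically) to:

-- For every n :: Mu NatF:  foldMu algPar n == even (foldMu algInt n),
-- i.e. folding with the parity algebra agrees with folding to an Int
-- and then taking the parity afterwards.
parityDirect, parityViaInt :: Mu NatF -> Bool
parityDirect = foldMu algPar
parityViaInt = even . foldMu algInt

-- Both return False on suc (suc (suc zero)).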

The Beta and Eta Rules for Inductive Types

Next, we'll prove the beta- and eta-rules for inductive types. I'll do it in three steps:

  • The beta rule (eg, let (x,y) = (e1, e2) in e' == [e1/x, e2/y]e'),
  • The basic eta rule (eg, let (x,y) = e in (x,y) == e)
  • The commuting conversions (eg, C[e] = let (x,y) = e in C[(x,y)])

Theorem (beta rule): If k : F(A) → A and e : F(μF), then

 foldF [A] k (inF e) = k (fmap [μF] [A] (foldF [A] k) e)

Proof: This follows by unfolding the definitions and beta-reducing. (You don't even need parametricity for this part!)

This shows that for any F-algebra (A, k), foldF [A] k is an F-algebra homomorphism from (μF, inF) to (A, k).
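
In the Haskell sketch, the beta rule is just how foldMu and inMu were defined to interact. For instance (reusing NatF, algInt, foldMu, inMu, suc and zero from above):

-- Beta rule instantiated at NatF and the algebra algInt:
betaLHS, betaRHS :: Int
betaLHS = foldMu algInt (inMu (SuccF (suc zero)))
betaRHS = algInt (fmap (foldMu algInt) (SuccF (suc zero)))
-- Both evaluate to 2: unfolding foldMu and inMu turns the left-hand
-- side into the right-hand side by beta-reduction alone.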

Theorem (basic eta): For all e : μF, we have e = e [μF] inF.

Proof:

  1. Assume an arbitrary B and k : F(B) → B.
    1. Note h = foldF [B] k is an F-algebra homomorphism from (μF, inF) to (B, k) by the beta rule.
    2. Then by our lemma, e [B] k = h (e [μF] inF).
    3. Unfolding, h (e [μF] inF) = foldF [B] k (e [μF] inF).
    4. Unfolding, foldF [B] k (e [μF] inF) = e [μF] inF [B] k.
    5. So e [B] k = e [μF] inF [B] k.
  2. So for all B and k : F(B) → B, we have e [B] k = e [μF] inF [B] k.
  3. By extensionality, e = e [μF] inF.

Theorem (commuting conversions): If k : F(A) → A and f : μF → A and f is an F-algebra homomorphism from (μF, inF) to (A, k), then f = foldF [A] k.

Proof:

  1. We want to show f = foldF [A] k.
  2. Unfolding foldF, we want to show f = λx:μF. x [A] k
  3. Assume e : μF.
    1. Now we will show f e = e [A] k.
    2. By the homomorphism lemma, and the fact that f : (μF, inF) → (A, k) we know that e [A] k = f (e [μF] inF).
    3. By the basic eta rule, e [μF] inF = e, so e [A] k = f e.
  4. So for all e, we have f e = e [A] k.
  5. Hence by extensionality f = λx:μF. x [A] k.

Note that what I (as a type theorist) called "commuting conversions" is exactly the same as "initiality" (to a category theorist), so now we have shown that the Church encoding of a functor actually lets us define the initial algebra.

Thus, we know that inductive types are definable!

Appendix: the Relational Model of System F

In this appendix, I'll quickly sketch one particular model of System F, basically inspired by the PER model of Longo and Moggi. What we will do is define types as partial equivalence relations (PERs for short) on terms. Recall that a partial equivalence relation R on a set X is a relation on X which is (a) symmetric (i.e., if (x, y) ∈ R then (y, x) ∈ R), and (b) transitive (i.e., if (x,y) ∈ R and (y,z) ∈ R then (x,z) ∈ R).

We take the set of semantic types to be the set of PERs on terms which are closed under evaluation in either direction:

Type = { X ∈ PER(Term) | ∀x, x', y, y'. 
                           if x ↔∗ x' and y ↔∗ y' 
                           then (x,y) ∈ X ⇔ (x',y') ∈ X}

PERs form a complete lattice, with meets given by intersection, and this is what lets us interpret the universal quantifier of System F. We can then take a type environment θ to be a map from type variables to semantic types, and interpret types as follows:

〚a〛θ ≜ θ(a)
〚A → B〛θ ≜ { (e, e') ∈ Term × Term | ∀ (t,t') ∈ 〚A〛θ. 
                                          (e t,e' t') ∈ 〚B〛θ }
〚∀a. A〛θ ≜ ⋂_{X ∈ Type} 〚A〛(θ, X/a)

This is the PER model of System F, which is quite easy to work with but not quite strong enough to model parametricity. To model parametricity, we need a second semantics for System F, the so-called relational semantics.

A relation R between two PERs A and B over a set X is a relation on X which respects A and B: that is, whenever (a,a') ∈ A and (b,b') ∈ B, we have (a, b) ∈ R if and only if (a', b') ∈ R.

Now, suppose we have two environments θ₁ and θ₂ sending type variables to types, and a relation environment ρ that sends each type variable a to a relation respecting θ₁(a) and θ₂(a). Then we can define the relational interpretation of System F types as follows:

R〚a〛θ₁ θ₂ ρ ≜ ρ(a)
R〚A → B〛θ₁ θ₂ ρ ≜ { (e, e') ∈ Term × Term | ∀ (t,t') ∈ R〚A〛θ₁ θ₂ ρ. 
                                          (e t,e' t') ∈ R〚B〛θ₁ θ₂ ρ }
R〚∀a. A〛θ₁ θ₂ ρ ≜ ⋂_{X, Y ∈ Type, S ∈ Rel(X,Y)} 
                  R〚A〛(θ₁, X/a) (θ₂, Y/a) (ρ, S/a)

Additionally, we need to redefine the PER model's interpretation of the ∀a.A case, to use the relational model:

〚∀a. A〛θ ≜ ⋂_{X, Y ∈ Type, S ∈ Rel(X,Y)} 
                 R〚A〛(θ, X/a) (θ, Y/a) (Id[θ], S/a)

Here, to give a PER interpretation for the forall type, we use the relational interpretation, duplicating the type environment, extending it with X and Y for the quantified variable, and using the identity relation for each of the remaining variables (written Id[θ]). By identity relation, we mean that any PER A can itself be viewed as a relation between A and A which respects A.

This modified definition ensures that the model satisfies the identity extension property -- if you take the relational interpretation of a type B(α), and fill in the type variable α with the equality relation for the type A, then the relational interpretation of B[A/α] will be the equality relation for the type B[A/α] -- which is exactly the property the proofs above relied upon.

The term calculus for System F also satisfies the abstraction property (aka the fundamental property of logical relations), which says that given a well-typed term Θ; · ⊢ e : A, two type environments θ₁ and θ₂, and a relation environment ρ between them, the term e is related to itself in R〚A〛θ₁ θ₂ ρ.