## Monday, December 3, 2012

### Total Functional Programming in a Partial Impure Language

In my last post, I showed how you could implement the Int construction over resumptions, to derive a higher-order programming language from a first-order one. However, this is not yet a practical tool. The code I gave presented the language as a bunch of combinators, which rapidly get quite painful to write serious programs. Most of us prefer the lambda calculus to SKI combinators for good reason --- programs with variables and binders are much easier to read, since the code that actually does work in not obscured in a rat's nest of plumbing combinators that route data from one combinator to another.

I had hoped to rectify that in this post, by showing how to compile the linear lambda calculus into combinator terms. But when I started to write it, I realized that I needed to write a warmup post first, which would introduce the basic ideas in a simpler setting. So I'll defer compiling linear types for a little while, and show how to embed total functional programming into OCaml.

## Wednesday, November 28, 2012

### The Geometry of Interaction, as an OCaml program

In this post, I'll show how to turn the Geometry of Interaction construction --- a model of linear logic --- into a runnable Ocaml code. . The basic idea behind this paper is that we can view higher-order behavior, as a disciplined mode of use of a first-order language with recursion --- basically, if you fix a calling convention, then you can compile first-class functions directly and compositionally into a first-order program.

This is interesting because there are a lot of combinator libraries out there which can be seen as DSLs with a natural notion of "first-order program", but which lack a good notion of "higher-order program". (For example, most Haskell libraries based on arrows qualify.) As a programming model, this is very unsatisfying, since this restriction basically says that the author of the library can build interesting abstractions, but clients of the library can't. Wouldn't it be nice if there were a systematic way to turn any first-order language (with a bit of structure) into a higher-order one?

That's exactly what the G construction (also called the Int construction) does. It says that if you have a first-order language with sequential and parallel composition, plus feedback, then you've really got a higher-order language. Stated more formally, in categorical jargon, it says that there is a way of constructing a monoidal closed category from any symmetric monoidal category with a trace operator. (Monoidal products mean parallel composition, categories give you sequential composition, and trace is how category theorists pronounce recursion.)

Instead of giving the categorical derivation, I'll work very concretely, following the same example Abramsky gave in his classic paper Retracing Some Paths in Process Algebra.

First, let's get some Ocaml preliminaries out of the way. I'm just defining function composition and empty and sum types, and their operations. These should really be in the Ocaml Pervasives environment, but they aren't. The Haskell prelude is nicer in this regard.

let ($) f g x = f (g x) type ('a,'b) sum = Inl of 'a | Inr of 'b let inl x = Inl x let inr y = Inr y let case f g = function Inl x -> f x | Inr y -> g y let sum f g = case (inl$ f) (inr $g) let swap v = case inr inl v type zero = Void of zero let rec abort (Void v) = abort v  Our running example will come from dataflow and visual programming languages like Simulink. These languages have a model of programming in which you have a bunch of stateful components (adders, multipliers, accumulators), which are connected together by means of wires. This is naturally a first-order formalism, in that it really only makes sense to send data over the wires --- you can't send components over the wire. One way of modeling these kinds of stateful components is by means of resumptions. A resumption from$I$(read "input") to$O$(read "output") is essentially an element of the solution to the domain equation$R = I \to (O \times R)$. You can think of a gadget of this type as something which takes an input, and then gives you an output, plus a new resumption to use on the next input. So it's a bit like a state machine, but formulated in such a way that you don't have to give an explicit state set. It's easy to turn this domain equation into a recursive ML type: module Resumption = struct type ('i,'o) r = R of ('i -> 'o * ('i,'o) r)  The identity resumption just echoes its input as its output.  (* val id : ('a, 'a) r *) let rec id = R(fun x -> (x, id))  To compose two resumptions$f$and$g$, we just feed an input into$f$, and take$f$'s output as the input into$g$.  (* val ( >> ) : ('a, 'b) r -> ('b, 'c) r -> ('a, 'c) r *) let rec (>>) (R f) (R g) = R(fun x -> let (y, f') = f x in let (z, g') = g y in (z, f' >> g'))  Note that with composition, we have a category$R$, the category of resumptions. The objects are types, a morphisms$f : I \to O$is a resumption taking$I$inputs and producing$O$outputs. This category is also "monoidal closed", with the tensor product$A \otimes C$getting defined as the sum type$A + C$. The intuition is that if we think of$A$and$C$as message types, we may want to consider how to react to$A$and$C$messages simultaneously. In particular, if we have two machines$f : A \to B$and$g : C \to D$, we can combine them by taking an$A+C$message and dispatching to either$f$or$g$respectively.  (* val ( ** ) : ('a, 'b) r -> ('c, 'd) r -> (('a, 'c) sum, ('b, 'd) sum) r *) let rec ( ** ) (R f) (R g) = R(function | Inl x -> let (y, f') = f x in (inl y, f' ** (R g)) | Inr x -> let (y, g') = g x in (inr y, (R f) ** g'))  We can also implement feedback --- if we have an$f : A \otimes B \to C \otimes B$, then we can construct a map$A \to C$by feeding$f$the$A$, and repeatedly sending the output back into the input until$f$coughs up a$C$rather than a$B$:  (* val trace : (('a, 'b) sum, ('c, 'b) sum) r -> ('a, 'c) r *) let rec trace f = R(fun a -> let rec loop (R f) v = match f v with | (Inl c, f') -> (c, trace f') | (Inr b, f') -> loop f' (inr b) in loop f (inl a))  Now, we can also implement various associativity and commutativity properties. The names are not particularly great (they are just the standard names for the natural transformations associated with a tensor product in category theory), but all the real action is in the types.  (* val sym : (('a, 'b) sum, ('b, 'a) sum) r *) let rec sym = R(fun v -> (swap v, sym)) (* val rho : (('a, zero) sum, 'a) r *) let rec rho : (('a,zero) sum, 'a) r = R(function | Inl a -> (a, rho) | Inr z -> abort z) (* val rho' : ('a, ('a, zero) sum) r *) let rec rho' : ('a, ('a, zero) sum) r = R(fun a -> (inl a, rho')) (* val lambda : ((zero, 'a) sum, 'a) r *) let rec lambda : ((zero,'a) sum, 'a) r = R(function | Inl z -> abort z | Inr a -> (a, lambda)) (* val lambda' : ('a, (zero, 'a) sum) r *) let rec lambda' : ('a, (zero, 'a) sum) r = R(fun a -> (inr a, lambda')) (* val alpha : ((('a, 'b) sum, 'c) sum, ('a, ('b, 'c) sum) sum) r *) let rec alpha : ((('a,'b) sum, 'c) sum, ('a, ('b,'c) sum) sum) r = R(function | Inl (Inl a) -> (inl a, alpha) | Inl (Inr b) -> (inr (inl b), alpha) | Inr c -> (inr (inr c), alpha)) (* val alpha' : (('a, ('b, 'c) sum) sum, (('a, 'b) sum, 'c) sum) r *) let rec alpha' : (('a, ('b,'c) sum) sum, (('a,'b) sum, 'c) sum) r = R(function | Inl a -> (inl (inl a), alpha') | Inr (Inl b) -> (inl (inr b), alpha') | Inr (Inr c) -> (inr c, alpha')) end  Now, we give the G construction in terms of the resumption module we've just defined. In the G construction, we build a new category$G(R)$from the old one we've just defined. The basic idea here is to talk about bidirectional communication, and so for objects in our new category$G$will be pairs of objects from our old one. Objects of$G$will be pairs$(A^+, A^-)$, where you can think of$A^+$as a type of messages sent out, and the type$A^-$as the type of messages that are accepted. A morphisms of$G$,$f : (A^+,A^-) \to (B^+, B^-)$will be resumption of type$A^+ \otimes B^- \to A^- \otimes B^+$. Note that the polarities of$A$and$B$are flipped in this mapping -- we are transforming$A$messages out to$B$messages out, and we need to accept$B$messages in order to transform them into$A$messages in. module G = struct open Resumption  Now, let's define the type of morphisms in$G(R)$. Since we don't have type-level pairing in ML, we end up having four type parameters. We use a dummy G constructor to get the type system to produce more useful inferred types.  type ('aplus, 'aminus, 'bplus, 'bminus) map = G of (('aplus, 'bminus) sum, ('aminus, 'bplus) sum) r  The identity map in$G(R)$is just the symmetry map in$R$.  let gid = G sym  Composition is trickier. If we have a map$f : (A^+,A^-) \to (B^+, B^-)$and$g : (B^+,B^-) \to (C^+,C^-)$, we can compose it by feeding$f$'s$B$-output into$g$'s input, and vice-versa, using the trace operator. It's best visualized with the following diagram: Actually doing so requires applying the associativity and commutativity properties of sum types$A + B \simeq B + A$and$(A + B) + C \simeq A + (B + C)$.  let rec assoc : ((('aplus,'cminus) sum, ('bminus,'bplus) sum) sum, (('aplus, 'bminus) sum, ('bplus, 'cminus) sum) sum) r = R(function | Inl (Inl aplus) -> (inl (inl aplus), assoc) | Inl (Inr cminus) -> (inr (inr cminus), assoc) | Inr (Inl bminus) -> (inl (inr bminus), assoc) | Inr (Inr bplus) -> (inr (inl bplus), assoc)) let rec assoc2 : ((('aminus, 'bplus) sum, ('bminus, 'cplus) sum) sum, (('aminus, 'cplus) sum, ('bminus, 'bplus) sum) sum) r = R(function | Inl (Inl aminus) -> (inl (inl aminus), assoc2) | Inr (Inr cplus) -> (inl (inr cplus), assoc2) | Inl (Inr bplus) -> (inr (inr bplus), assoc2) | Inr (Inl bminus) -> (inr (inl bminus), assoc2)) (* val gcompose : ('a, 'b, 'c, 'd) map -> ('c, 'd, 'e, 'f) map -> ('a, 'b, 'e, 'f) map *) let gcompose (G f) (G g) = G (trace (assoc >> (f ** g) >> assoc2))  Once we have composition, we can define a tensor product on the$G(R)$category as well, so that$(A^+,A^-) \oplus (B^+,B^-) = (A^+ + B^+, A^- + B^-)$. Again, implementing the functorial action requires implementing the associativity maps.  (* (A+, A-) * (X+, X-) = (A+ + X+, A- + A-) f : A+ + B- -> A- + B+ g : X+ + Y- -> X- + Y+ f ** g : (A+ + B-) + (X+ + Y-) -> (A- + B+) + (X- + Y+) r : (A+ + X+) + (B- + Y-) -> (A+ + B-) + (X+ + Y-) r' : (A- + B+) + (X- + Y+) -> (A- + X-) + (B+ + Y+) *) (* val ( + ) : ('a, 'b, 'c, 'd) map -> ('e, 'f, 'g, 'h) map -> (('a, 'e) sum, ('b, 'f) sum, ('c, 'g) sum, ('d, 'h) sum) map *) let (+) (G f) (G g) = let rec r = R(function | Inl (Inl aplus) -> (inl (inl aplus), r) | Inl (Inr xplus) -> (inr (inl xplus), r) | Inr (Inl bminus) -> (inl (inr bminus), r) | Inr (Inr yminus) -> (inr (inr yminus), r)) in let rec r' = R(function | Inl (Inl aminus) -> (inl (inl aminus), r') | Inl (Inr bplus) -> (inr (inl bplus), r') | Inr (Inl xminus) -> (inl (inr xminus), r') | Inr (Inr yplus) -> (inr (inr yplus), r')) in G (r >> f ** g >> r')  One cool thing about this category (which leads to the coolest fact about it) is that it naturally supports a dualization, which sends$(A^+,A^-)^\ast \triangleq (A^-,A^+)$. This also has an action on morphisms, which flips around the direction, so that from$f : (A^+,A^-) \to (B^+,B^-)$we can get$f^\ast : (B^+,B^-)^\ast \to (A^+,A^-)^\ast$.  (* val dualize : ('a, 'b, 'c, 'd) map -> ('d, 'c, 'b, 'a) map *) let dualize (G (R f)) = let rec dual f = R(fun v -> let (v', f') = f (swap v) in (swap v', dual f)) in G (dual f)  Now, here's what I consider the neatest fact about the Int construction -- it turns a traced monoidal category into a monoidal closed category. So we can define the linear exponential$(A^+,A^-) \multimap (B^+,B^-) \triangleq (A^+,A^-)^\ast \otimes (B^+,B^-)$.  (* Here's a bit of unrolling type definitions to make things easier to grok. (A+, A-) -o (B+, B-) = (A+, A-)* * (B+, B-) = (A-, A+) * (B+, B-) = (A- + B+, A+ + B-) Hom((A+,A-) * (B+,B-), (C+, C-)) = Hom((A+ + B+, A- + B-), (C+, C-)) = ((A+, B+) sum, (A- + B-) sum, C+, C-) map = (((A+, B+) sum, C+) sum, ((A- + B-) sum, C-) sum) r Hom((A+,A-), (B+,B-) -o (C+,C-)) = Hom((A+,A-), (B- + C+, B+ + C-)) = (A+, A-, (B-, C+) sum, (B+, C-) sum) map = ((A+, (B+, C-) sum) sum, (A-, (B-, C+) sum) sum) r *) (* val curry : (('a, 'b) sum, ('c, 'd) sum, 'e, 'f) map -> ('a, 'c, ('d, 'e) sum, ('b, 'f) sum) map *) let curry (G f) = let rec curry (R f) = R(fun v -> let (v', f') = f (case (inl$ inl) (case (inl $inr) inr) v) in let v'' = case (case inl (inr$ inl)) (inr $inr) v' in (v'', curry f')) in G(curry f) (* val uncurry : ('a, 'b, ('c, 'd) sum, ('e, 'f) sum) map -> (('a, 'e) sum, ('b, 'c) sum, 'd, 'f) map *) let uncurry (G f) = let rec uncurry (R f) = R(fun v -> let (v', f') = f (case (case inl (inr$ inl)) (inr $inr) v) in let v'' = case (inl$ inl) (case (inl $inr) inr) v' in (v'', uncurry f')) in G (uncurry f) end  The ML types for curry and uncurry somewhat obscure the fact that they implement the isomorphism $$\mathrm{Hom}((A^+,A^-) \otimes (B^+,B^-), (C^+,C^-)) \simeq \mathrm{Hom}((A^+,A^-), (B^+,B^-) \multimap (C^+,C^-))$$ However, now that we have this isomorphism, it's possible to compile arbitrary lambda-terms in multiplicative linear logic down into compositions of resumptions. But that's a story for another post! In the meantime, if this intrigues you, you can play with the Verity compiler by Dan Ghica and his students, as well as the IntML compiler by Ulrich SchĂ¶pp and Ugo Dal Lago, both of which are compilers using the GoI interpretation. Verity is aimed at distributed computing, and IntML is a programming language whose type system allows you to write only (and all) the LOGSPACE programs. As you can see, there are a lot of potential applications... ## Wednesday, November 21, 2012 ### Polymorphism and limit-colimit coincidence in reactive languages In my previous post, I mentioned limit-colimit coincidence, and how the ability to define fixed points makes that possible. It turns out there's a very pretty way of using parametricity to encode this. Now, inductive and coinductive types are definable in System F as follows: ÎĽÎ±. F(Î±) = ∀Î±. (F(Î±) → Î±) → Î± Î˝Î˛. F(Î˛) = ∃Î˛. Î˛ × (Î˛ → F(Î˛))  It's possible to show using parametricity that these type definitions are actually the inductive and coinductive types we expect, whenever F is a positive type. That is, a type constructor F is positive whenever we have a function:  map : ∀Î±,Î˛. (Î± → Î˛) → F(Î±) → F(Î˛)  with the expected properties that: 1. map [A] [A] id = id 2. (map [A] [B] f); (map [B] [C] g) = map [A] [C] (f; g) In lattice theory, you can apply the Knaster-Tarski theorem to get a fixed point of a function f : L → L when L is a complete lattice and f is monotone. Intuitively, you can see map as the proof that the functor F is monotone (really, functorial) in the lattice (really, category) of types. It turns out that in a temporal setting, we also need to specify that we can invoke the algebra/coalgebra operations at multiple times, which we can specify using the modal stability connective □A, which changes the definitions to: ÎĽÎ±. F(Î±) = ∀Î±. □(F(Î±) → Î±) → Î± Î˝Î˛. F(Î˛) = ∃Î˛. Î˛ × □(Î˛ → F(Î˛))  Now, suppose we have a Nakano style recursive type ÎĽ̂Î±.F(•Î±), where every occurrence of the recursive type occurs beneath a delay modality. Then, assuming that F is positive, we can show how to embed elements of Î˝Î˛. F(•Î˛) into elements of ÎĽÎ±. F(•Î±) as follows: embed : Î˝Î±. F(a) → ÎĽÎ±. F(Î±) embed pack(Î˛, (seed, h₁)) = // seed : Î˛, h₁ : □(Î˛ → F(•Î˛)) let stable(f) = h₁ in // f : Î˛ → F(•Î˛) at all times Î›Î±. Î»h₂. // h₂ : □(F(•Î±) → Î±) let stable(g) = h₂ in // g : F(•Î±) → Î± at all times let loop : Î˛ → Î± = fix loop = Î»b:Î˛. // in body, loop can only be used later g(map [•Î±] [•Î˛] (Î»d:•Î˛. let •b' = d in •(loop b')) // has type •Î˛ → •Î± (f b)) // has type F(•Î˛) in loop seed  This suggests that guarded term-level recursion is enough to make certain least and greatest fixed points collapse. I say "suggests", since proving that these types are equivalent requires a parametric model, which I haven't actually constructed yet. It also means that just polymorphism and the stability modality □A are enough to construct an interesting reactive language, even without a term-level fixed point fix x. e. This function, minus the temporal staging, should also work for F plus ordinary general recursion. I had just never thought about how the coincidence plays out syntactically when you don't take recursive types as a primitive! ## Tuesday, November 20, 2012 ### When it rains, it pours Having written a new paper on FRP of my own, I've just discovered that lots of other people have been hard at work in this area. • Alan Jeffrey has a new paper for PLPV 2013, entitled Causality For Free!: Parametricity Implies Causality for Functional Reactive Programs. This looks very interesting, since it looks like he's found a different way of enforcing causality than the one Nick and I hit on. We formalized the order structure of time (this happens before that), and required functions to respect that order. Alan has a discrete model of time (i.e., he doesn't bake temporal order into his category of times), and instead he uses parametricity to prove that all the definable functions (of the appropriate type) are inherently causal. • Wolfgang Jeltsch also has a new paper for PLPV 2013, entitled Temporal Logic with “Until”, Functional Reactive Programming with Processes, and Concrete Process Categories. He's also been looking at using full temporal logic for typing reactive programs. This paper looks especially focused on the distinction between strong (must eventually happen) and weak eventually (may eventually happen) operators. • Brigitte Pientka visited us here at MPI last week, and she told me about some work that some of her students (Francisco Ferreira and someone whose name I unfortunately cannot recall -- Andrew Cave?) have been doing with her and Prakash Panangaden, on giving a proof theory for the modal mu-calculus. At a high level, their calculus looks very similar to the system I just wrote up, but there are some very intriguing semantic differences. I use a Nakano-style guarded recursive type, which allows negative occurences. This permits defining general fixed points$(\bullet{A} \to A) \to A$, which collapses greatest and least fixed points. They use strict positivity instead, and so they can distinguish them, at the cost of more restrictive typing of recursive definitions. (This ties into Jeltsch's work on strong/weak eventually, which correspond to inductive versus coinductive definitions of the eventually type.) Limit-colimit coincidence is one of the most striking features of categories of pointed domains, and it's (a) interesting that it can be induced already with guarded recursion, and (b) it might be easier to study in this setting. • Marc Bagnol and Adrien Guatto have written a draft paper, Synchronous Machines: a Traced Category. The basic idea, roughly speaking, is to start with Mealy automata as a model of reactive programs. These automata lack a notion of higher-order behavior, but if you hit them with the Geometry of Interaction construction, you can get out a higher-order (linearly-typed) programming language, which is still implementable with finite state. This is a very nice idea, and reminiscent of Dan Ghica's work on synthesizing circuits. ## Wednesday, November 14, 2012 ### New Draft: Simple and Efficient Higher-Order Reactive Programming I've got a new draft out Simple and Efficient Higher-Order Reactive Programming. Here's the abstract: Functional reactive programming (FRP) is an elegant approach to declaratively specify reactive systems. However, the powerful abstractions of FRP have historically made it difficult to predict and control the resource usage of programs written in this style. In this paper we give a simple type theory for higher-order functional reactive programming, as well as a natural implementation strategy for it. Our type theory simplifies and generalizes prior type systems for reactive programming. At the same time, we give a an efficient implementation strategy which eagerly deallocates old values, ruling out space and time leaks, two notorious sources of inefficiency in reactive programs. Our language neither restricts the expressive power of the FRP model, nor does it require a complex substructural type system to track the resource usage of programs. We also show that for programs well-typed under our type system, our implementation strategy of eager deallocation is safe: we show the soundness of our type system under our implementation strategy, using a novel step-indexed Kripke logical relation. Last year, Nick Benton, Jan Hoffmann, and I worked out how to control the space usage of FRP programs. Our techniques worked, but it required a rather complicated linear type discipline to make everything work. In this paper, I've worked out how to dramatically simplify the type system, make the language more expressive, use simpler implementation strategies, which also make it more efficient. This is also basically the implementation strategy I am using in my AdjS compiler, which should see a release as soon as I add support for a few more type system features, and squash the soundness bugs the correctness proof in this paper revealed. ## Monday, August 27, 2012 ### An astonishing isomorphism I just learned today that in classical linear logic, the following type isomorphism holds: $$(!A \multimap R) \multimap R \qquad\simeq\qquad R^\bot \multimap (!A \otimes R^\bot)$$ So linear CPS and linear state are the same thing, up to a dual notion of observation, and this is the fundamental reason why SSA and CPS are equivalent program representations. Apparently, this is the observation that prompted Egger, MĂ¸gelberg and Simpson to invent the enriched effect calculus, which is a reformulation of Paul Levy's call-by-push-value to make the linear nature of control more apparent. I guess this is also why MĂ¸gelberg and Staton were able to prove that linear state is a universal effect. Now I'm curious how all this fits with the story of the algebraic theory of effects, such as Pretnar and Plotkin's effect handlers. ## Thursday, August 23, 2012 ### A Computational Lambda Calculus for Applicative Functors These days, most people are familiar with using monads to structure functional programs, and are also familiar with Haskell's do-notation for writing monadic code. Slightly less well-known is the fact that the do-notation actually originates in Moggi's original 1991 paper on monads, in which he gave the following type-theoretic introduction and elimination rules for monadic types: ## Wednesday, August 8, 2012 ### An ML implementation of match compilation In this post, I'll give an OCaml implementation of the pattern match compiler from my previous post. Just for fun, it will also pattern compilation into a series of tests. We begin by giving some some basic types. The type var of variables is just strings, and we have a type tp of types, which are just units, sums, products, void types, and a default Other to handle all non-matchable types (e.g., functions). The type pat is the type of patterns. We have patterns for variables, wildcards, units, pairs, and left and right injections. Finally, we have the type exp of expressions. We've thrown in the minimum number of constructs necessary to elaborate pattern matching. Namely, we have variables, let-binding, unit and pair elimination (both given in the "open-scope" style), an abort (for the empty type), and a case expression for sums. type var = string type tp = One | Times of tp * tp | Zero | Sum of tp * tp | Other type pat = PVar of var | PWild | PUnit | PPair of pat * pat | PInl of pat | PInr of pat type exp = EVar of var | ELet of var * exp * exp | ELetUnit of exp * exp | ELetPair of var * var * exp * exp | EAbort of exp | ECase of exp * var * exp * var * exp  First, we give some utility operations. gensym does fresh name generation. This is an effect, but a simple one, and should be easy enough to transcribe into (say) monadic Haskell. let gensym = let r = ref 0 in fun () -> incr r; Printf.sprintf "u%d" !r  The filtermap operation maps an optional function over a list, returning the non-None values. I really should work out the general story for this someday -- there's obviously a nice story involving distributive laws at work here. (* filtermap : ('a -> 'b option) -> 'a list -> 'b list *) let rec filtermap f = function | [] -> [] | x :: xs -> (match f x with | None -> filtermap f xs | Some y -> y :: filtermap f xs)  Now, we have the simplification routines -- these correspond to the transformations of$\overrightarrow{ps \Rightarrow e}$to$\overrightarrow{ps' \Rightarrow e'}\$ in the inference rules I gave.
let simplify_unit u =
List.map (function
| (PVar x :: ps, e) -> (ps, ELet(x, EVar u, e))
| (PWild :: ps, e)  -> (ps, e)
| (PUnit :: ps, e)  -> (ps, e)
| (_, e) -> assert false)

let simplify_pair u =
List.map (function
| (PVar x :: ps, e) -> (PWild :: PWild :: ps, ELet(x, EVar u, e))
| (PWild :: ps, e)  -> (PWild :: PWild :: ps, e)
| (PPair(p1, p2) :: ps, e) -> (p1 :: p2 :: ps, e)
| (_, e) -> assert false)

let simplify_other u =
List.map (function
| (PVar x :: ps, e) -> (ps, ELet(x, EVar u, e))
| (PWild :: ps, e)  -> (ps, e)
| (_, e) -> assert false)


Unlike the simplify_unit and simplify_pair routines, the simplification routines for sums return an option, which we filtermap over. This is because when we want the inl patterns, the inr patterns are well-typed, but not needed. So we need to filter them out, but it's not actually an error to see them in the branch list.

let simplify_inl u =
filtermap (function
| (PVar x :: ps, e) -> Some (PWild :: ps, ELet(x, EVar u, e))
| (PWild :: ps, e)  -> Some (PWild :: ps, e)
| (PInl p :: ps, e) -> Some (p :: ps, e)
| (PInr _ :: _, _)  -> None
| (_, e) -> assert false)

let simplify_inr u =
filtermap (function
| (PVar x :: ps, e) -> Some (PWild :: ps, ELet(x, EVar u, e))
| (PWild :: ps, e)  -> Some (PWild :: ps, e)
| (PInr p :: ps, e) -> Some (p :: ps, e)
| (PInl _ :: _, _)  -> None
| (_, e) -> assert false)


Now we reach coverage checking proper. Note that this is basically just a transcription of the inference rules in my previous post.

let rec cover arms tps =
match arms, tps with
| arms, [] -> snd (List.hd arms)
| arms, (u, One) :: tps ->
let e = cover (simplify_unit u arms) tps in
ELetUnit(EVar u, e)
| arms, (u, Times(tp1, tp2)) :: tps ->
let u1 = gensym() in
let u2 = gensym() in
let arms = simplify_pair u arms in
let e = cover arms ((u1, tp1) :: (u2, tp2) :: tps) in
ELetPair(u1, u2, EVar u, e)
| arms, (u, Zero) :: tps ->
EAbort (EVar u)
| arms, (u, Sum(tp1, tp2)) :: tps ->
let u1 = gensym() in
let u2 = gensym() in
let e1 = cover (simplify_inl u arms) ((u1, tp1) :: tps) in
let e2 = cover (simplify_inr u arms) ((u2, tp2) :: tps) in
ECase(EVar u, u1, e1, u2, e2)
| arms, (u, Other) :: tps ->
cover (simplify_other u arms) tps