In this post, I'll give an OCaml implementation of the pattern match compiler from my previous post. Just for fun, it will also pattern compilation into a series of tests.
We begin by giving some some basic types. The type var
of variables is just strings, and we have a type tp
of
types, which are just units, sums, products, void types, and a default
Other
to handle all non-matchable types (e.g.,
functions). The type pat
is the type of patterns. We have
patterns for variables, wildcards, units, pairs, and left and right
injections.
Finally, we have the type exp
of expressions. We've thrown in
the minimum number of constructs necessary to elaborate pattern matching. Namely,
we have variables, let-binding, unit and pair elimination (both given in the "open-scope"
style), an abort (for the empty type), and a case expression for sums.
type var = string type tp = One | Times of tp * tp | Zero | Sum of tp * tp | Other type pat = PVar of var | PWild | PUnit | PPair of pat * pat | PInl of pat | PInr of pat type exp = EVar of var | ELet of var * exp * exp | ELetUnit of exp * exp | ELetPair of var * var * exp * exp | EAbort of exp | ECase of exp * var * exp * var * expFirst, we give some utility operations.
gensym
does fresh name generation. This is an effect, but a simple one, and should be easy enough to transcribe into (say) monadic Haskell.
let gensym = let r = ref 0 in fun () -> incr r; Printf.sprintf "u%d" !r
The filtermap
operation maps an optional function over a list, returning the non-None
values. I really should work out the general story for this someday -- there's obviously a nice story involving distributive laws at work here.
(* filtermap : ('a -> 'b option) -> 'a list -> 'b list *) let rec filtermap f = function | [] -> [] | x :: xs -> (match f x with | None -> filtermap f xs | Some y -> y :: filtermap f xs)Now, we have the simplification routines -- these correspond to the transformations of $\overrightarrow{ps \Rightarrow e}$ to $\overrightarrow{ps' \Rightarrow e'}$ in the inference rules I gave.
let simplify_unit u = List.map (function | (PVar x :: ps, e) -> (ps, ELet(x, EVar u, e)) | (PWild :: ps, e) -> (ps, e) | (PUnit :: ps, e) -> (ps, e) | (_, e) -> assert false) let simplify_pair u = List.map (function | (PVar x :: ps, e) -> (PWild :: PWild :: ps, ELet(x, EVar u, e)) | (PWild :: ps, e) -> (PWild :: PWild :: ps, e) | (PPair(p1, p2) :: ps, e) -> (p1 :: p2 :: ps, e) | (_, e) -> assert false) let simplify_other u = List.map (function | (PVar x :: ps, e) -> (ps, ELet(x, EVar u, e)) | (PWild :: ps, e) -> (ps, e) | (_, e) -> assert false)
Unlike the simplify_unit
and simplify_pair
routines, the simplification routines for sums return
an option, which we filtermap
over. This is because when we want the inl
patterns, the inr
patterns are well-typed, but not needed. So we need to filter them out, but it's not actually an error to see them in the
branch list.
let simplify_inl u = filtermap (function | (PVar x :: ps, e) -> Some (PWild :: ps, ELet(x, EVar u, e)) | (PWild :: ps, e) -> Some (PWild :: ps, e) | (PInl p :: ps, e) -> Some (p :: ps, e) | (PInr _ :: _, _) -> None | (_, e) -> assert false) let simplify_inr u = filtermap (function | (PVar x :: ps, e) -> Some (PWild :: ps, ELet(x, EVar u, e)) | (PWild :: ps, e) -> Some (PWild :: ps, e) | (PInr p :: ps, e) -> Some (p :: ps, e) | (PInl _ :: _, _) -> None | (_, e) -> assert false)
Now we reach coverage checking proper. Note that this is basically just a transcription of the inference rules in my previous post.
let rec cover arms tps = match arms, tps with | arms, [] -> snd (List.hd arms) | arms, (u, One) :: tps -> let e = cover (simplify_unit u arms) tps in ELetUnit(EVar u, e) | arms, (u, Times(tp1, tp2)) :: tps -> let u1 = gensym() in let u2 = gensym() in let arms = simplify_pair u arms in let e = cover arms ((u1, tp1) :: (u2, tp2) :: tps) in ELetPair(u1, u2, EVar u, e) | arms, (u, Zero) :: tps -> EAbort (EVar u) | arms, (u, Sum(tp1, tp2)) :: tps -> let u1 = gensym() in let u2 = gensym() in let e1 = cover (simplify_inl u arms) ((u1, tp1) :: tps) in let e2 = cover (simplify_inr u arms) ((u2, tp2) :: tps) in ECase(EVar u, u1, e1, u2, e2) | arms, (u, Other) :: tps -> cover (simplify_other u arms) tps
Re. filter_map: there is a natural mapping from ('a option) to ('a list) -- for example if you consider the embedding of the tuples of fixed size 'a^n into 'a list, and see ('a option) as the disjoint union of 'a^0 and 'a^1. Once you've mapped ('a -> 'b option) into ('a -> 'b list), you have the "bind" for the list monad.
ReplyDeleteRegarding the algorithm per se: I feel there is some redundancy in the frequent repetition of a transformation from (PVar x) to (ELet(x, EVar u,, e)) in the simplification functions. Maybe you could rephrase it instead as a rewriting step turning PVar into PWild in a principled way, before doing the simplification?
Hi Gabriel, the original code I wrote had a separate simplification step, but I took it out, since it didn't end up saving any space. I generally only introduce functions either when they come from the structure of the correctness proof, or when it makes the program shorter.
ReplyDelete