Tuesday, September 13, 2022

The Golden Age of PL Research

I was chatting with a PhD student a while back, who remarked to me that many senior PL researchers seemed stuck in a defensive crouch. I thought that was quite a striking observation, because (a) he was not wrong, but (b) people his age don't have or need that posture, because we are currently in a golden age for PL research.

What I mean by that is that there was a period where the research tradition in programming languages nearly died outright in the US, and a lot of our more senior researchers remember this very keenly. Basically, AI winter and cheap microcomputers killed the Lisp machine, and as collateral damage managed to nearly wipe out PL research in America -- many PL researchers in the US ended up moving overseas or changing their research focus.

However, over the course of about a decade (from the mid 1990s to the mid 2000s), the field made the transition from being on life support to being very vigorous. I started grad school towards the end of this transition period, and so I ended up with a first-hand view of the change.

Here are some of the factors that triggered the transition, both practical and theoretical. (Note that the thing about a first-hand view is that it is not a global view: I am sure other people will have different opinions about what happened!)

  1. Syntactic Type Soundness Before Wright and Felleisen's 1994 paper A Syntactic Approach to Type Soundness appeared, proving type safety was surprisingly hard.

    E.g., if you look at Milner's A Theory of Type Polymorphism in Programming, you will see that what he does to prove type safety is:

    1. Solve a domain equation for a universal domain containing all the features of mini-ML
    2. Give a denotational interpretation of untyped mini-ML
    3. Define a logical relation as a family of retracts of the universal domain which exclude "wrong"
    4. Prove the fundamental theorem of logical relations.

    This is a lot more work than the Wright-Felleisen approach, in which you just write down the typing rules and the reduction relation, and then prove a handful of syntactic lemmas about them (essentially the progress and preservation lemmas sketched below).

    The technical simplicity of the Wright-Felleisen approach meant that it was very easy to apply PL design techniques to nearly arbitrary domains, and that made it possible to quickly build formal models of industrially interesting languages. Igarashi et al.'s Featherweight Java is probably one of the earliest examples of this, and more recently WebAssembly has demonstrated that it is possible to build the formal spec in sync with the language design effort.
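    To give a flavour of that "handful of lemmas", the heart of a Wright-Felleisen proof is the progress/preservation pair, stated here schematically (the exact form varies with the language):

    ```latex
    \begin{align*}
      \textbf{Progress:} \quad
        & \text{if } \vdash e : \tau \text{ then either } e \text{ is a value or } \exists e'.\ e \longrightarrow e'.\\
      \textbf{Preservation:} \quad
        & \text{if } \vdash e : \tau \text{ and } e \longrightarrow e' \text{ then } \vdash e' : \tau.
    \end{align*}
    ```

    Type safety ("well-typed programs don't get stuck") then follows by a simple induction on the length of the reduction sequence.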

  2. Java Java was released in 1996, and its immediate success had a critical impact in popularising garbage collection. Before Java, using languages with garbage collection for application programming was an extremely controversial idea among working programmers. Guy Steele (who co-authored the Java spec) remarked about this on the ll1-discuss mailing list:

    We were not out to win over the Lisp programmers; we were after the C++ programmers. We managed to drag a lot of them about halfway to Lisp.

    Java also ended up dragging programmers halfway to ML! In 1998, Gilad Bracha, Martin Odersky, David Stoutamire and Phil Wadler proposed adding parametric polymorphism to Java, and in 2004, after 5 years of arguments, it was added in Java 5. This also made a big impact in popularising advanced type system features.

  3. Dynamic Languages Languages like Perl, Python, Ruby, and JavaScript (in roughly that order) began achieving serious popularity in the late 90s/early 2000s, primarily as tools for building websites. Like Java, all of these languages had garbage collection (even if only reference counting), which helped normalise it.

    But in addition to automatic memory management, these languages all supported first-class functions of one variety or another, and so gave literally millions of programmers their first introduction to functional programming.

    Younger programmers will be surprised by just how controversial an addition first-class functions once were! Java omitted them initially, despite its designers being expert Lisp and Smalltalk programmers. Even C#, which was released 6 years after Java, did not support them at first. IMO it was the experience people had with dynamic languages that created the pressure to add them.

    Nowadays, thanks to the influence of dynamic languages on the design zeitgeist, even languages whose design aesthetic is deliberately conservative, such as Go, include first-class functions without a second thought.
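    For readers who have never known anything else, here is a minimal Python sketch (Python chosen purely for illustration) of what "first-class functions" amounts to: functions are ordinary values that can be stored, passed as arguments, and created on the fly.

    ```python
    def make_adder(n):
        # Return a freshly created function that closes over n.
        return lambda x: x + n

    add3 = make_adder(3)        # a function stored in a variable
    print(add3(4))              # prints 7

    # A function passed as an argument to another function:
    print(list(map(make_adder(10), [1, 2, 3])))   # prints [11, 12, 13]
    ```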

  4. SAT and SMT solvers CDCL-based SAT solvers were invented in the mid 1990s, and they revolutionised SAT solving. A few years later, by the early 2000s, the lazy approach to SMT was invented, which made it possible to reliably check whether quantifier-free formulas from all kinds of mixed domains were valid.

    This has proved to be an immense boon for PL research across a wide variety of research areas: verification of operating systems code (as in the Verve project), symbolic execution engines like KLEE, and refinement type systems like Liquid Haskell.
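    As an illustration, here is a small, hypothetical example using the Z3 Python bindings (one solver interface among several): a quantifier-free formula mixing linear integer arithmetic with an uninterpreted function, the kind of query that refinement type checkers and symbolic execution engines generate automatically.

    ```python
    # Hypothetical example; requires the z3-solver package.
    from z3 import Int, Function, IntSort, Solver, sat

    x, y = Int('x'), Int('y')
    f = Function('f', IntSort(), IntSort())   # an uninterpreted function on integers

    s = Solver()
    # A quantifier-free formula mixing arithmetic and uninterpreted functions.
    s.add(x + y > 10, x > 0, y > 0, f(x) == f(y), x != y)

    if s.check() == sat:
        print(s.model())   # a satisfying assignment for x, y and f
    else:
        print("unsatisfiable")
    ```

    The point is that the solver decides this completely automatically, letting a SAT core coordinate separate decision procedures for the arithmetic and the uninterpreted-function reasoning -- exactly the "lazy" architecture mentioned above.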

  5. Separation logic Hoare logic, introduced in 1969, was one of the first systematic approaches to proving the correctness of imperative programs. But despite serious efforts to use it in verification, such as the Euclid system, it never quite caught on.

    The major reason for this was that it was very difficult to reason about pointers and aliasing with Hoare logic. Every Hoare logic assertion talks about the whole state, and so you need to write down explicit disjointness assertions to make clear that modifying one pointer will not change the value of a different pointer. So if you have n pointers, then you need O(n^2) disjointness assertions.

    Separation logic, invented by John Reynolds and Peter O'Hearn, solved this problem by formalising the intuition that an assertion could really be talking about only part of the heap. This made modular reasoning about imperative programs radically easier, and it has been used in numerous verification efforts since.

    One minor fact about separation logic: John C. Reynolds invented it when he was 65. At an age when most people start thinking about retirement, he was making yet another giant contribution to the whole field!
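    To make the "an assertion only talks about part of the heap" idea concrete, here is the points-to assertion and the frame rule, in standard (slightly simplified) form:

    ```latex
    % x \mapsto v : the heap consists of exactly one cell, at address x, holding v.
    % P * Q       : the heap splits into two disjoint parts, one satisfying P, the other Q.
    %
    % The frame rule: a proof about a small footprint extends to any larger, disjoint heap.
    \frac{\{P\}\ c\ \{Q\}}
         {\{P * R\}\ c\ \{Q * R\}}
    \qquad \text{(provided $c$ does not modify any variable free in $R$)}
    ```

    The quadratic pile of disjointness assertions from plain Hoare logic collapses into the * connective: writing x ↦ v * y ↦ w already says that x and y are distinct cells.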

  6. Mature Proof Assistants Work on the Coq proof assistant began in 1984, but it went through multiple successive rounds of reimplementation as it influenced and was influenced by competing systems like LEGO, HOL and Isabelle. It probably achieved a recognizably modern form in 2002, when the 7.0 release was made.

    This was a milestone in terms of quality of implementation, since this is when (I think -- I wasn't there!) term normalisation started to happen via a bytecode VM. This made many things (like reflective proof) much faster, which made it feasible to build really large proofs in Coq. Indeed, Georges Gonthier reports that his 2005 mechanised proof of the 4-colour theorem began as an experiment to stress test the new fast VM!

    The result that made CS researchers outside of PL stand up and take notice was not the four color theorem, though. It was Xavier Leroy's correctness proof of a C compiler in the CompCert project, in 2006. People have been talking about program verification since the 1960s, and this was one of the first big programs that was actually verified -- and in a proof assistant, no less.

    Nowadays, you can go to a systems conference like SOSP and find whole sessions about verification, which would have been unthinkable when I was a student.

    Coq is not the only proof assistant, of course, and all its competitor systems (e.g., Isabelle) have seen substantial improvements in their maturity as well.

Some other important, but maybe less vital, factors:

  1. Step-indexing In 2001, Appel and McAllester showed how you could build a logical relations model using "step-indexing" as an alternative to the domain-theoretic approach. This is a less general, but more elementary and flexible, approach to modelling the semantics of recursion than techniques like domain theory. The flexibility meant that languages with features like ML-style modules, machine code, and self-modifying code could all be given straightforward models.

    One small thing that makes me smile is that this is now the modern approach to proving type safety for fancy languages -- and that it is very similar to the logical-relations approach that Wright and Felleisen's technique displaced! I feel (i.e., believe but don't have a real argument for) that you need a big enough community or ecosystem to support a technically sophisticated research tradition, and now that PL is a bigger field again we can once more consider using advanced techniques.
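    For a rough flavour of the technique, here is one simplified form of a step-indexed logical relation at function type (details vary a lot between papers; the key move is the quantification over strictly smaller indices j < k, which replaces the appeal to domain theory):

    ```latex
    % V_k[tau] : values that behave like inhabitants of tau for k more steps of execution.
    \mathcal{V}_k[\![\tau_1 \to \tau_2]\!] =
      \{\, \lambda x.\, e \;\mid\; \forall j < k.\ \forall v \in \mathcal{V}_j[\![\tau_1]\!].\
           e[v/x] \in \mathcal{E}_j[\![\tau_2]\!] \,\}

    % E_k[tau] : expressions that cannot get stuck within k steps; if they reach an
    % irreducible term in j < k steps, that term is in the value relation at index k - j.
    \mathcal{E}_k[\![\tau]\!] =
      \{\, e \;\mid\; \forall j < k.\ \forall e'.\
           (e \longrightarrow^{j} e' \ \wedge\ e' \not\longrightarrow)
           \Rightarrow e' \in \mathcal{V}_{k-j}[\![\tau]\!] \,\}
    ```

    The index is pure proof machinery: it never shows up in the language or the type system, which is part of why the method adapts so easily to features that are painful to fit into a domain equation.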

  2. Flexible Type Analysis, Crary and Weirich, ICFP 1999 Compared to some of the other papers I've linked to, this paper is mostly forgotten -- according to Google Scholar, it has fewer than 10 citations in the last five years. However, IMO it really is a pivotal paper.

    What makes it so very important is that after writing this paper, Stephanie Weirich began collaborating with Simon Peyton Jones. Stephanie, Simon, and their students and collaborators began systematically extending GHC Haskell with the features described in this paper. And once they finished with that, they carried on with many more extensions!

    This has been an immensely productive collaboration. It's why Haskell has grown so many "fancy" type system features, and in turn it has made it possible to actually try out many ideas from the literature in real programs. The value of this can't be overstated: a huge number of PL researchers got their first introduction to what type theory can do from the advanced extensions to GHC.

  3. Model Checking This goes under "smaller factors" not because it's actually a smaller factor, but because I am very ignorant about the state of the art! It's long been important in hardware, but in software, the biggest success I know of is Microsoft's Static Driver Verifier, which is used to model check drivers before they are certified for Windows. This is obviously a big deal, but I can't really contextualise this because I don't know much about model checking. However, I am under the impression that SAT/SMT-based bounded model checking supplanted the older BDD-based approach, which would mean that model checking also benefited from the SAT/SMT revolution.

The net effect of all this is that a motivated, ambitious student can aim at problems which simultaneously have significant theoretical depth and substantial practical impact -- results that would have been field-shaking 20 years ago are almost routine now.

For example, Maja Trela, one of my recently graduated Part III (i.e., Master's) students here at Cambridge, just completed her dissertation under the primary supervision of Conrad Watt. In her dissertation, she used a separation logic library for Isabelle to verify the correctness of an efficient WebAssembly interpreter, which is now being used as the reference interpreter by the Bytecode Alliance!