A few weeks ago, Shriram Krishnamurthi asked an interesting question on Twitter:
It's common in PL to have benchmarks for performance. But does anyone know of PL design benchmarks?
This caught my attention because my longest-running research program -- my work on functional reactive programming, with 7 papers so far -- has been explicitly guided by a handful of design benchmarks.
I've never written them down explcitly, so I thought I'd take the opportunity to do so.
The language must validate the full beta- and eta-laws for all of the type constructors in the language.
This is a proxy benchmark, of course. My actual goal is to make languages which are easy to program with and reason about.
However, this is an open-ended and vague goal, and so I needed to find a specific, concrete operationalisation of it. There are many possible proxies for this goal, and the one I picked was to look for programming languages which validate a rich collection of semantically-justified equations. This specific choice was made for three reasons.
First, this goal it pushes me away from spending time on quality-of-life features like rich standard libraries. These features obviously do make a language easier to use, but they do not directly touch upon any FRP-specific problems.
Second, the search for well-behaved equational properties pushes me towards taking features known to be useful, and trying to decompose them into smaller type-theoretic components.
As Jeff Raskin once noted, in user interfaces "intuitive" and "familiar" are basically synonyms, which means it is good to have a design benchmark which does not penalise doing things that are different from the old standard.
Third, déformation professionnelle. This particular goal is the one that the toolbox I've acquired over the years -- expertise in semantics and type theory -- is best-adapted to move me towards.
If my expertise lay in running user studies, writing model checkers, or programming with continuations, then I would probably approach the problem differently! For example, the work on the Garnet toolkit is both extremely good and very influential, and very different in character from anything I have pursued.
Obviously, I could learn how to write a constraint solver or run a user study, but the art is long, and life short. I think it's OK for people to focus on different things, and to learn from each other. :)
I have a small collection of tiny programs which are intended to model various common UI patterns, and the languages I design must express these programs.
The purpose of these programs is basically to prevent over-simplifying away real complexity. If we're trying to optimise a language design for ease of reasoning, the easiest way of "achieving" the goal is to design an impoverished language.
For example, I check whether I can write programs which dynamically create new widgets, to ensure that I can't design something which restricts programmers to a fixed dataflow graph. Likewise, a program with two widgets sending notifications "sideways" to each other keeps me from trying to rely on hierarchical encapsulation. (This is super tempting, but as far as I can tell, no actual program consistently does this.)
The language must admit a simple implementation strategy.
This design principle arose as a reaction to a negative lesson.
My first paper on higher-order FRP, Ultrametric Semantics of Reactive Programs, used a system of dynamic, incremental dependency propagation (a baby version of the stuff Matt Hammer and Umut Acar used to do), and it ended up requiring a really weird implementation strategy for closures. In addition to being able to call them, I also needed to be able to move forward them in time, and this required a really weird set of hacks to go into a closure and move everything in its captured environment forward in time as well.
This was painful to implement, difficult to prove correct, and it leaked memory like a sieve. Trying to deal with these issues led directly to my ICFP 2013 paper on FRP, which is basically a type system for immediate mode GUI programming.
Note that I don't mind if the implementation strategy is challenging to prove correct (eg, my POPL 2021 paper with Simon Spies and Derek Dreyer is about inventing some needed machinery for modelling retained-mode GUI implementations), but the actual running code has to be simple.
This means I haven't looked at any dataflow or constraint stuff in years, but sometimes a design rule is there to prevent you from doing certain things.