Thursday, December 30, 2021

What is a programming language?

One of the more surprising facts about the discipline of programming language theory is that it is actually possible to define what programming languages are in a reasonably mathematically satisfying way.

  1. A language is a presentation of a (generalised) algebraic theory. Basically, think of a language as a set of generators and relations in the style of abstract algebra.

    You need to beef up the universal algebra story a bit to handle variables and binding (e.g., see the work of Fiore and Hamana on higher-order algebraic theories), but the core intuition, that a language is a set of generators for terms plus a set of equations these terms satisfy, is exactly the right one. (A concrete, if naive, Haskell sketch follows the examples below.)

    For example:

    1. the simply-typed lambda calculus
    2. regular expressions
    3. relational algebra
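
    To make "generators and relations" concrete, here is a minimal Haskell sketch of a presentation of the simply-typed lambda calculus. It is deliberately naive (binding is handled with raw strings, and substitution is only correct for closed arguments), whereas the Fiore/Hamana story treats binding abstractly:

    ```haskell
    -- Generators of the simply-typed lambda calculus, as a plain datatype.
    data Ty = Base | Arr Ty Ty
      deriving (Eq, Show)

    data Tm
      = Var String        -- generator: variables
      | Lam String Ty Tm  -- generator: abstraction
      | App Tm Tm         -- generator: application
      deriving (Eq, Show)

    -- Naive substitution: correct only when the substituted term is closed
    -- (capture-avoidance is omitted to keep the sketch short).
    subst :: String -> Tm -> Tm -> Tm
    subst x s (Var y)     = if x == y then s else Var y
    subst x s (Lam y a b) = if x == y then Lam y a b else Lam y a (subst x s b)
    subst x s (App f e)   = App (subst x s f) (subst x s e)

    -- One of the equations the theory imposes: beta, i.e.
    --   App (Lam x a body) arg  =  subst x arg body
    -- The theory is these generators quotiented by the congruence such
    -- equations generate.
    ```
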
  2. A model of a language is literally just any old mathematical structure which supports the generators of the language and respects the equations. (The second example below is spelled out in code after the list.)

    For example:

    1. we can model the typed lambda calculus using sets for types and mathematical functions for terms,
    2. we can model regular expressions as denoting particular languages (i.e., sets of strings), and
    3. we can model relational algebra expressions as sets of tuples.
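
    The second of these is small enough to spell out. Here is a sketch of the sets-of-strings model, representing each denoted language by its membership predicate (the names are mine, and this is not an efficient matcher):

    ```haskell
    import Data.List (inits, tails)

    -- Generators of the regular-expression language.
    data Regex = Empty | Eps | Chr Char
               | Alt Regex Regex | Seq Regex Regex | Star Regex

    -- A model: each regex denotes a set of strings, represented here
    -- by its membership predicate.
    type Lang = String -> Bool

    -- All ways of cutting a string into a prefix and a suffix.
    splits :: String -> [(String, String)]
    splits w = zip (inits w) (tails w)

    denote :: Regex -> Lang
    denote Empty     _ = False
    denote Eps       w = null w
    denote (Chr c)   w = w == [c]
    denote (Alt r s) w = denote r w || denote s w
    denote (Seq r s) w = or [ denote r u && denote s v | (u, v) <- splits w ]
    denote (Star r)  w = null w || or [ denote r u && denote (Star r) v
                                      | (u, v) <- splits w, not (null u) ]

    -- This model respects the theory's equations: e.g. Alt r s = Alt s r
    -- and Seq r (Seq s t) = Seq (Seq r s) t become honest equalities of
    -- sets of strings.
    ```
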
  3. A model of computation or machine model is basically a description of an abstract machine that we think can be implemented with physical hardware, at least in principle. So these are things like finite state machines, Turing machines, Petri nets, pushdown automata, register machines, circuits, and so on. Basically, think of models of computation as the things you study in a computability class.

    Nearly all models of computation are transition systems, where you have a set of states, and a transition function which tells you how a computation can change one state into another. There are lots of generalisations of this (relations instead of functions, I/O, system calls, message passing communication, etc.), and the slick way of accounting for this generality is by modelling transition systems as coalgebras (i.e., a set S paired with a coalgebra map step : S → F(S), where F is a functor); a concrete instance is sketched below.
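
    For instance, a deterministic finite automaton is a coalgebra for the functor F(S) = Bool × (Char → S): the Bool component is an observation (is this state accepting?) and the function component gives the transitions. A small sketch, with a made-up example machine:

    ```haskell
    -- A transition system as a coalgebra: step : S -> F S.
    type Coalg f s = s -> f s

    -- The DFA functor: an acceptance observation plus Char-indexed transitions.
    data DfaF s = DfaF { accepting :: Bool, delta :: Char -> s }

    instance Functor DfaF where
      fmap g (DfaF acc d) = DfaF acc (g . d)

    -- Running the machine = unfolding the coalgebra along an input string.
    runDfa :: Coalg DfaF s -> s -> String -> Bool
    runDfa step s []       = accepting (step s)
    runDfa step s (c : cs) = runDfa step (delta (step s) c) cs

    -- Example: a two-state machine accepting strings with an even number of 'a's.
    evenAs :: Coalg DfaF Int
    evenAs s = DfaF (s == 0) (\c -> if c == 'a' then 1 - s else s)

    -- runDfa evenAs 0 "abab"  ==>  True
    ```
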

    In addition to this, every model of computation also comes with a notion of observation – in its most basic form, how do you tell when a program has finished executing? Observations can get quite intricate: to model computational complexity, you have to be able to observe runtime and memory usage, and for concurrent programs you can observe intermediate program states (due to the possibility of concurrent interference). More practically, the compiler correctness theorem for something like CompCert says that the sequence of system calls the compiled binary produces will be the same as the one the C abstract machine will produce.

    I don't know a nice algebraic/categorical characterisation of the notion of observation, though probably Alexandra Silva does.

    The Church-Turing thesis characterises which abstract machines we think it is possible to physically implement.

    Some examples of models of computation are:

    1. Turing machines,
    2. finite state machines, and
    3. RAM machine programs.
  4. A language is a programming language when you can give at least one model of the language via a machine model.

    For example:

    1. The types of the lambda calculus can be viewed as partial equivalence relations over Gödel codes for some universal Turing machine, and the terms of a type can be assigned to equivalence classes of the corresponding PER.

    2. Regular expressions can be interpreted into finite state machines quotiented by bisimulation.

    3. A set in relational algebra can be realised as equivalence classes of B-trees, and relational algebra expressions as nested loops walking over these trees.

    Note that in all three cases we have to quotient the states of the machine model by a suitable equivalence relation to preserve the equations of the language's theory.

    This quotient is very important, and is the source of a lot of confusion. It hides the equivalences the language theory wants to deny, but that is not always what the programmer wants – e.g., is merge sort equal to bubble sort? As mathematical functions they surely are, but if you consider them as operations running on an actual computer, you will have strong preferences! (Conversely, the ability to equate radically different machine configurations is the reason why optimising compilers and refactoring are things.)

    Moreover, we have to set up this quotient so it agrees with the machine model's notion of observation – for example, we want exactly and only the strings accepted by a regular expression to end in accepting states of the finite state machine. This property is called adequacy, and is what justifies the claim that this is a reasonable implementation of the programming language.
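
    To make adequacy concrete for the regular expression example, Brzozowski's derivative construction gives a machine whose states are regexes themselves: nullability is the observation, and the derivative is the transition. This sketch reuses Regex and denote from the model sketch above:

    ```haskell
    -- Does the regex accept the empty string? (The machine's observation.)
    nullable :: Regex -> Bool
    nullable Empty     = False
    nullable Eps       = True
    nullable (Chr _)   = False
    nullable (Alt r s) = nullable r || nullable s
    nullable (Seq r s) = nullable r && nullable s
    nullable (Star _)  = True

    -- The Brzozowski derivative: the machine's transition function.
    deriv :: Char -> Regex -> Regex
    deriv _ Empty     = Empty
    deriv _ Eps       = Empty
    deriv c (Chr c')  = if c == c' then Eps else Empty
    deriv c (Alt r s) = Alt (deriv c r) (deriv c s)
    deriv c (Seq r s)
      | nullable r    = Alt (Seq (deriv c r) s) (deriv c s)
      | otherwise     = Seq (deriv c r) s
    deriv c (Star r)  = Seq (deriv c r) (Star r)

    -- The machine model's matcher.
    match :: Regex -> String -> Bool
    match r w = nullable (foldl (flip deriv) r w)

    -- Adequacy, as a testable property:
    --   match r w == denote r w   for all r and w.
    -- To get a *finite* machine, the states must additionally be quotiented
    -- by equations such as Alt r r = r (exactly the quotient discussed above).
    ```
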

In other words, a programming language is a language which has an adequate implementation in a model of computation. Note that this definition does not mandate the implementation strategy!

A common source of confusion arises from the fact that if you have a nice type-theoretic language (like the STLC), then:

  1. the term model of this theory will be the initial model in the category of models, and
  2. you can turn the terms into a machine model by orienting some of the equations the lambda-theory satisfies and using them as rewrites.

These oriented equations are where operational semantics comes from, at least for purely functional languages. As a result we often abuse language to say the simply typed lambda calculus "is" a programming language.
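
To see what orienting the equations means, here is a sketch reusing Tm and subst from the first code block: reading beta left-to-right as a rewrite gives a small-step operational semantics (one simple strategy; it never reduces under a lambda):

```haskell
-- Beta, oriented left-to-right, as a rewrite step.
step :: Tm -> Maybe Tm
step (App (Lam x _ body) arg) = Just (subst x arg body)
step (App f e) =
  case step f of
    Just f' -> Just (App f' e)
    Nothing -> fmap (App f) (step e)
step _ = Nothing

-- Iterating the rewrite gives an evaluator. (Since we never typecheck,
-- this can loop on untypeable terms; on well-typed STLC terms it halts.)
eval :: Tm -> Tm
eval t = maybe t eval (step t)
```
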

But in fact, the lambda calculus is a presentation of an algebraic theory, and it is the same programming language regardless of whether it is implemented by term rewriting or by compilation to x86 machine code.

Comments:

  1. "Regular expressions can be interpreted into finite state machines quotiented by bisimulation." It can be done but would it be desirable? We would distinguish two non-deterministic machines that recognize the same language, which seems unusual. Do you mean mutual simulation? Or is the more fine-grained equivalence picked intentionally?

  2. I love this clear outline and clarifications on misconceptions about algebraic theories.

    > the work of Fiore and Hamana on higher-order algebraic theories

    With this do you mean Fiore and Mahmoud's "Second-Order Algebraic Theories"? Or the two authors' general work in this direction over the years?

    I have been reading Ola Mahmoud's PhD thesis recently which has been the most accessible resource I have found so far to understand second-order syntax and theories.

  3. @Burak: I had DFA minimisation (like the Hopcroft-Karp algorithm) in mind, which can be understood as (a) building a bisimulation relation on the states of a given DFA, and then (b) merging all bisimilar states. But I think you are right that this picture is too fine if you want to equate all machines which accept the same language.

    @Kartik: yes, I just meant their work (and the work of their collaborators) in general. If you know Agda, you might be interested in my student Dimitrij Szamozvancev's new POPL paper with Marcelo, in which they build an Agda library encoding all this stuff: Formal Metatheory of Second-Order Abstract Syntax.

    1. Thanks Neel!

      I actually first learned about that paper from your blog post in November and have since been trying to extend it with some support for substructural assumptions, which I need in order to encode a quantum algebraic theory.
