# Semantic Domain

## Monday, March 24, 2014

### Papers I'm reading

*Constructing Type Systems over an Operational Semantics*, Robert Harper. *A Realizability Interpretation of Martin-Loef Type Theory*, Catarina Coquand. These two papers give a very nice introduction to developing the metatheory of dependent type systems. Harper's paper is about extensional type theory, and Coquand's is about intensional type theory, but IMO they both demonstrate the benefits of using realizability to formulate the semantics of type theory -- they are both *impressively* simple, given the complexity of the type theories involved.

*Categorical Glueing and Logical Predicates for Models of Linear Logic*, Masahito Hasegawa. This is another paper on logical relations, this time for linear logic. This is one of the papers that has made me seriously interested in classical linear logic for the first time. Basically, the "double glueing" construction used to formulate models of CLL looks a bit like the notion of "biorthogonality" or "TT-closure" sometimes used to interpret higher-order programming languages, but with some very intriguing differences.

Pitts's *Parametric Polymorphism and Operational Equivalence* is a good introduction to biorthogonality, though the actual origins of the idea are a bit mysterious to me. At any rate, the idea is that if you have a program value, then its "orthogonal" is the set of continuations which won't crash when supplied the value as an argument. Similarly, if you have a continuation, then its orthogonal is the set of values that won't crash the continuation. So if you take a set of values, and then take their orthogonal, and then take their orthogonal again, you have a new, potentially bigger, set of values. A "biorthogonal" is a set of values which remains the same when you take its double orthogonal. This means you can equivalently think of a set of values as the set of continuations which accept it, which is often quite convenient for defining the meaning of a type.

With double glueing, you again think of a type as a set of values which make it up, and a set of continuations it can supply. But unlike with biorthogonality, there is no procedure for constructing the continuations from the values or vice-versa --- instead, you get to *define* the values and the continuations as anything you like, as long as everything plays nicely together. This extra degree of freedom is really intriguing, and I'm curious what can be done with it.
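The orthogonality construction above is easy to play with concretely. Here is a tiny finite model of my own invention (the values, continuations, and "ok" relation are all made up for illustration, not from Pitts's paper), showing how taking the double orthogonal can grow a set of values:

```ocaml
(* A finite toy model of orthogonality. A continuation k is orthogonal
   to a value v when running k on v does not crash; here "ok" encodes
   that relation directly. *)

type value = VTrue | VFalse | VZero
type cont = KIf | KSucc   (* KIf expects a boolean, KSucc a number *)

let ok k v = match k, v with
  | KIf, (VTrue | VFalse) -> true
  | KSucc, VZero -> true
  | _ -> false

let all_values = [VTrue; VFalse; VZero]
let all_conts = [KIf; KSucc]

(* orth_v V = the continuations that accept every value in V *)
let orth_v vs = List.filter (fun k -> List.for_all (ok k) vs) all_conts

(* orth_k K = the values accepted by every continuation in K *)
let orth_k ks =
  List.filter (fun v -> List.for_all (fun k -> ok k v) ks) all_values

(* The double orthogonal can enlarge a set of values: starting from
   just VTrue, we pick up VFalse, because any continuation that
   accepts VTrue also accepts VFalse. *)
let () =
  assert (orth_v [VTrue] = [KIf]);
  assert (orth_k (orth_v [VTrue]) = [VTrue; VFalse])
```

Here `[VTrue]` is not biorthogonally closed, while `[VTrue; VFalse]` is: applying the double orthogonal again leaves it fixed.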

## Friday, March 7, 2014

### 2014 Midlands Graduate School

The 2014 Midlands Graduate School is coming up. Please attend, or send your students!

**University of Nottingham, 22-26 April 2014**

The Midlands Graduate School (MGS) in the Foundations of Computing Science is a collaboration between researchers at the Universities of Birmingham, Leicester, Nottingham and Sheffield. It was established in 1999. The MGS has two main goals: to provide PhD students with a sound basis for research in the mathematical and practical foundations of computing and to give PhD students the opportunity to make contact with established researchers in the field and their peers who are at a similar stage in their research careers.

This year, the MGS is at the University of Nottingham. It will start on 22 April and finish on 26 April.

# Program

MGS2014 consists of ten courses, each with five hours of lectures plus exercise sessions. One third of the programme offers introductory (or core) courses, which all participants attend. The remainder consists of advanced (or specialised) courses, from which each participant selects a subset depending upon his or her interests, plus one course given by an invited lecturer (Conor McBride).

## Core Courses

Course Title | Acronym | Lecturer | Affiliation |
---|---|---|---|
Category Theory | CAT | Roy Crole | Leicester |
Denotational Semantics | DEN | Achim Jung | Birmingham |
Typed Lambda Calculus | LAM | Paul Blain Levy | Birmingham |

## Advanced Courses

Course Title | Acronym | Lecturer | Affiliation |
---|---|---|---|
Concurrency, Causality, Reversibility | CCR | Irek Ulidowski | Leicester |
Theory of Randomised Search Heuristics | HEU | Dirk Sudholt, Per Kristian Lehre, Pietro S. Oliveto, Christine Zarges | Birmingham, Nottingham, Sheffield |
Homotopy Type Theory | HOT | Thorsten Altenkirch | Nottingham |
Infinite Data Structures | INF | Venanzio Capretta | Nottingham |
Logical Relations and Parametricity | PAR | Uday Reddy | Birmingham |
Higher-Order Functional Reactive Programming | REA | Neelakantan Krishnaswami | Birmingham |

# Invited Lecturer

In addition to the standard programme, Conor McBride from the University of Strathclyde will give an invited course on **Dependently Typed Programming** (DTP).

# Registration

The deadline for registration for MGS 2014 is **Friday, 21 March**.
The registration fee is **£440**.

To register for MGS 2014, you need to book via the University of Nottingham online store. The site will ask you to register if this is your first time using it. You can pay using a credit or debit card.

**We recommend registering as early as possible, because places will be allocated on a first-come, first-served basis, and it is quite possible that we will run out of space before 21 March.**

## Contact

For any additional information, contact the local organiser, Thorsten Altenkirch, at the address txa@cs.nott.ac.uk.

## Monday, February 3, 2014

### Retiring a joke

I just saw that Andy Pitts has a new draft out, *An Equivalent Presentation of the Bezem-Coquand-Huber Category of Cubical Sets*, with the following abstract:

Staton has shown that there is an equivalence between the category of presheaves on (the opposite of) finite sets and partial bijections and the category of nominal restriction sets: see [2, Exercise 9.7]. The aim here is to see that this extends to an equivalence between the category of cubical sets introduced in [1] and a category of nominal sets equipped with a "01-substitution" operation. It seems to me that presenting the topos in question equivalently as 01-substitution sets rather than cubical sets will make it easier (and more elegant) to carry out the constructions and calculations needed to build the intended univalent model of intensional constructive type theory.

Nominal methods have been an active area of research in PL theory for a while now, and have achieved a rather considerable level of sophistication. I have occasionally joked that it seems like we had to aim an awful lot of technical machinery at the problem of alpha-equivalence -- so what happens when we realize that there are other equivalence relations in mathematics?

But with this note, it looks like the last laugh is on me!

Pitts shows that you can use nominal techniques to give a simpler presentation of the cubical sets model of univalent type theory. Since HoTT gives us a clean constructive account of quotient types and higher inductive types, all the technical machinery invented to handle alpha-equivalence scales smoothly up to handle any equivalence relation you like.

## Saturday, January 25, 2014

### OBT 2014

I just attended the Off the Beaten Track workshop, which is a POPL workshop where people gather to offer their new, untested, and (ideally) radical ideas. Here are some quick and incomplete reactions.

Chris Martens gave a talk, *Languages for Actions in Simulated Worlds*, in which she described some of her ideas for using linear logic programming to model interactive fiction.

The idea behind this is that you can view the state of a game (say, an interactive fiction) as a set of hypotheses in linear logic, and the valid state transitions as hypotheses of the form $A \multimap B$. So if your game state is
$$
\Gamma = \{ A, X, Y, Z \}
$$
Then matching the rule $A \multimap B$ against the context and applying the $\multimap{}L$ rule will give you the modified context
$$
\Gamma = \{ \mathbf{B}, X, Y, Z \}
$$
Then, a *story* is a proof that the ending state of the story is derivable from the initial state.
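This transition step is easy to prototype as multiset rewriting. The sketch below is my own illustration, not Martens's actual system: it treats the context as a multiset of atoms, and a rule $A \multimap B$ as something that consumes its left-hand side and produces its right-hand side.

```ocaml
(* Toy multiset rewriting for the linear-logic reading of game state.
   Atoms are strings; a rule consumes its premises and produces its
   conclusions, mirroring one use of the -oL rule. *)

type atom = string
type rule = { consumes : atom list; produces : atom list }

(* Remove one occurrence of x; None if x is absent (rule can't fire). *)
let rec remove x = function
  | [] -> None
  | y :: ys when x = y -> Some ys
  | y :: ys ->
      (match remove x ys with
       | Some ys' -> Some (y :: ys')
       | None -> None)

(* Fire a rule against a state, if all its premises are present. *)
let apply rule state =
  let consumed =
    List.fold_left
      (fun acc a ->
         match acc with
         | None -> None
         | Some st -> remove a st)
      (Some state) rule.consumes
  in
  match consumed with
  | None -> None
  | Some st -> Some (rule.produces @ st)

let () =
  let fire = { consumes = ["A"]; produces = ["B"] } in
  (* Matching A -o B against {A, X, Y, Z} yields {B, X, Y, Z}. *)
  assert (apply fire ["A"; "X"; "Y"; "Z"] = Some ["B"; "X"; "Y"; "Z"]);
  (* The rule cannot fire if A is not in the context. *)
  assert (apply fire ["X"; "Y"; "Z"] = None)
```

A story, in this reading, is a sequence of successful `apply` steps from the initial state to the ending state.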

Here are some thoughts I had about her talk:

- A story is actually a *linearization* of a proof term. The proof term just records the dependencies in the state transitions, and there is no intrinsic order beyond that. There are many possible linearizations for each proof term, and different linearizations yield different stories.

So one question is whether various literary effects (like plot structure) can be seen to arise as properties of the linearization. There has been a lot of work on linearizing proof nets, and I wonder if that has some applications to this.

- Can control operators be used to model flashbacks? That is, not all stories occur in a linear chronological order. Sometimes narratives jump forwards and backwards in time. Can we somehow use control operators to model time travel? Call/cc is often explained intuitively in terms of time travel, and it would be cool if that could be turned into a *literal* explanation of what is happening.

- In Martens's account, proof search is used by the computer to work out the consequences of user actions. Is there any way we can make the *player* do some deduction? For example, some stories gradually reveal information to the reader until the reader has a flash of insight about what has really been happening. So is there any way to "make the player do proof search", in order to achieve game effects related to hidden information? (I'm not even sure what I mean, honestly.)

- Can we use theorem proving to model fate and prophecies in games?
So, if you have a Lincoln simulator in which John Wilkes Booth kills Abraham Lincoln at the end of the game, then you might want to let the player do anything which does not make this impossible. In current games, this might be done by giving Booth rules-breaking defenses or simply forbidding players from aiming at Booth. But this lacks elegance, not to mention literary value!

It would be better for the actions available to the player to be only those that don't rule out this event -- not just immediately, but eventually into the future. We might imagine that the game could use linear theorem proving, and then forbid an action if there is any path along which Booth dies $n$ steps in the future. It's still removing choice, but perhaps if we are subtle enough about it, it will seem a bit more natural, as if coincidence were subtly working to ensure the outcome, rather than a game hack.
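That last idea can be sketched as a bounded lookahead check: after an action, search all futures up to depth $n$, and forbid the action if any of them breaks fate. The `step`/`bad` interface below is invented for illustration; a real system would use the linear-logic state and rules.

```ocaml
(* Bounded-lookahead "fate" filter: an action is allowed only if no
   reachable future within n steps makes the fated outcome impossible.
   The state type, step function, and bad predicate are placeholders. *)

(* Does some path of length <= n from s reach a bad state? *)
let rec exists_path step bad n s =
  bad s || (n > 0 && List.exists (exists_path step bad (n - 1)) (step s))

(* Allow an action only if taking it cannot lead to a bad state
   within n further steps. *)
let allowed step bad n s action =
  not (exists_path step bad n (action s))
```

For instance, with integer states, `step x = [x + 1]`, and `bad x = (x = 3)`, the action `(fun x -> x + 1)` from state `0` is allowed at lookahead `1`, while `(fun x -> x + 2)` is forbidden, since it puts a bad state one step away.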

Note that all of these ideas were spurred by just the very first section of her talk -- she gave me a lot to think about!

Bret Victor gave the keynote talk. On the internet, he has a reputation as a bomb-throwing radical interaction designer, but his in-person persona is very mild. He opened his talk with a paean to the virtues of direct manipulation, and then he moved on to demo some sample pieces of software which worked more like the way he wanted.

His argument, loosely paraphrased, was that people write programs to create interactive behaviors, and that it would be easier to do in a direct manipulation style. By this, he meant that you want a programming environment in which the behavior and the means for creating and modifying them "lived together" -- that is, you use the same conceptual framework for both the relevant behaviors and the operators on them. So if you want to create a visual program, he thought it would be better to offer visual operators to modify pictures rather than code.

He illustrated this with a pair of demo programs. The first of these was a program that let you build interactive animations with timelines and some vector graphics primitives, and the second was a data visualization program that worked similarly, but which replaced the timeline with a miniature spreadsheet.

Both of these programs seemed like extremely mild and conservative designs to me. The basic vocabulary he worked with --- grids, vectors, timelines, spreadsheets --- was all familiar primitives from drawing, animation and data-manipulation programs, and then he augmented it all by letting you parameterize these things with variables, to get procedural abstraction. He was careful to make variables visualizable using sliders, a la Conal Elliott's tangible values, but beyond that he really seemed to hook things together in the most direct way possible.

Something I was really curious about was how Victor might design an interface that was *not* visual --- e.g., how would you apply his design principles for a blind user? What does it mean for sounds, rather than pictures?

## Friday, December 20, 2013

### PhD Opportunities at the University of Birmingham

My university, the University of Birmingham, is looking for applicants to the CS PhD program. I'm putting our advertisement on my blog, in case you (or your students, if you're a professor) are looking for a graduate program -- well, we're looking for students!

Let me just say that my colleagues are a terrific bunch, who know and do work on all kinds of fantastic things. I am very happy to be able to work with them, and can confirm that this is a very exciting group to be a part of.

We invite applications for PhD study at the University of Birmingham.

We are a group of (mostly) theoretical computer scientists who explore fundamental concepts in computation and programming language semantics. This often involves profound and surprising connections between different areas of computer science and mathematics. From category theory to lambda-calculus and computational effects, from topology to constructive mathematics, from game semantics to program compilation, this is a diverse field of research that continues to provide new insight and underlying structure.

- See our webpage, with links to individual researchers, here:
- Information about PhD applications may be found here:
- If you are considering applying, please contact any of us. We will be
very happy to discuss the opportunities available.
- Martin Escardo (topology, computation with infinite objects, constructive mathematics, intuitionistic type theory)
- Dan Ghica (game semantics, heterogeneous computing, model checking)
- Achim Jung (mathematical structures in the foundations of computing: logic, topology, order)
- Neel Krishnaswami (type theory, verification, substructural logic, interactive computation)
- Paul Levy (denotational semantics, lambda-calculus with effects, nondeterminism, category theory, game semantics)
- Uday Reddy (semantics of state, separation logic)
- Eike Ritter (security protocol verification)
- Hayo Thielecke (abstract machines, concurrent and functional programming, software security)
- Steve Vickers (constructive mathematics and topology, category theory and toposes)

## Thursday, November 7, 2013

### Antimirov Derivatives for Regular Expressions

Brzozowski derivatives are one of the shibboleths of functional programming: if you ask someone about implementing regular expressions, and you get back an answer involving derivatives of regular expressions, then you have almost surely identified a functional programmer.

The reason that functional programmers like derivatives so much is that they offer an elegantly algebraic and inductive approach to string matching. The derivative function $\delta_c$ takes a character $c$ and a regular expression $r$, and returns a new regular expression $r'$, which accepts a string $s$ if and only if $r$ accepts $c\cdot s$. This function can be defined as follows: $$ \begin{array}{lcl} \delta_c(c) & = & \epsilon \\ \delta_c(c') & = & \bot \\ \delta_c(\epsilon) & = & \bot \\ \delta_c(r_1 \cdot r_2) & = & \left\{ \begin{array}{ll} \delta_c(r_1) \cdot r_2 \vee \delta_c(r_2) & \mbox{if } r_1 \mbox{ nullable} \\ \delta_c(r_1) \cdot r_2 & \mbox{otherwise} \end{array} \right. \\ \delta_c(\bot) & = & \bot \\ \delta_c(r_1 \vee r_2) & = & \delta_c(r_1) \vee \delta_c(r_2) \\ \delta_c(r\ast) & = & \delta_c(r)\cdot r\ast \end{array} $$

Now we can turn the 1-character derivative into a word derivative as follows: $$ \begin{array}{lcl} \delta_\epsilon(r) & = & r \\ \delta_{c\cdot w}(r) & = & \delta_w(\delta_c(r)) \end{array} $$

Again, this is a simple inductive definition. So to match a string
we just call the 1-character derivative for each character, and then
check to see if the final regular expression is nullable. However,
there's a catch! As you can see from the definition of $\delta_c$,
there is no guarantee that the size of the derivative is bounded. So
matching a long string can potentially lead to the construction of
very large derivatives. This is especially unfortunate, since it is
possible (using DFAs) to match strings using regular expressions in
*constant* space.
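For concreteness, the definitions above transcribe directly into OCaml. This is a sketch of my own, not the code this post develops (which uses Antimirov derivatives instead):

```ocaml
(* A direct transcription of the Brzozowski derivative. *)
type re = C of char | Nil | Seq of re * re | Bot | Alt of re * re | Star of re

(* Does the regex accept the empty string? *)
let rec null = function
  | C _ | Bot -> false
  | Nil | Star _ -> true
  | Alt (r1, r2) -> null r1 || null r2
  | Seq (r1, r2) -> null r1 && null r2

(* deriv c r accepts s iff r accepts c.s *)
let rec deriv c = function
  | C c' when c = c' -> Nil
  | C _ | Nil | Bot -> Bot
  | Seq (r1, r2) ->
      let d = Seq (deriv c r1, r2) in
      if null r1 then Alt (d, deriv c r2) else d
  | Alt (r1, r2) -> Alt (deriv c r1, deriv c r2)
  | Star r -> Seq (deriv c r, Star r)

(* Match by deriving on each character, then checking nullability.
   Note how the intermediate regexes can blow up in size. *)
let matches r s =
  let n = String.length s in
  let rec go r i = if i = n then null r else go (deriv s.[i] r) (i + 1) in
  go r 0
```

For example, `matches (Seq (C 'a', Star (C 'b'))) "abb"` holds, but the derivative taken at each step grows, which is exactly the space problem described above.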

In his 1962 paper, Brzozowski showed that the number of derivatives of a regular expression is finite, if you quotient by the associativity, commutativity, unit, and idempotence axioms for union $r_1 \vee r_2$. That is, suppose we consider regular expressions modulo an equivalence relation $\equiv$ such that: $$ \begin{array}{lcl} r \vee r & \equiv & r \\ r \vee \bot & \equiv & r \\ r \vee r' & \equiv & r' \vee r \\ r_1 \vee (r_2 \vee r_3) & \equiv & (r_1 \vee r_2) \vee r_3 \\ \end{array} $$

Then his theorem guarantees that the set of derivatives is finite. However, computing with derivatives up to equivalence is rather painful. Even computing equality and ordering is tricky, since union can occur anywhere in a regular expression, and without that it's difficult to implement higher-level data structures such as sets of regular expressions.

So for the most part, derivatives have remained a minor piece of functional programming folklore: they are cute, but a good implementation of derivatives is not really simple enough to displace the usual Dragon Book automata constructions.

However, in 1995 Valentin Antimirov introduced the notion of a
*partial derivative* of a regular expression. The idea is that
if you have a regular expression $r$ and a character $c$, then a
partial derivative is a regular expression $r'$ such that if $r'$
accepts a word $s$, then $r$ accepts $c \cdot s$. Unlike a
derivative, this is only an if-then relationship, and not an
if-and-only-if relationship.

Then, rather than taking regular expressions modulo the ACUI
equations, we can construct *sets* of partial derivatives,
which collectively accept the same strings as the Brzozowski
derivative. The idea is that we can use the algebraic properties of
sets to model the effect of the ACUI equations. Below, I give
the partial derivative function:
$$
\begin{array}{lcl}
\alpha_c(c) & = & \{ \epsilon \} \\
\alpha_c(c') & = & \emptyset \\
\alpha_c(\epsilon) & = & \emptyset \\
\alpha_c(r_1 \cdot r_2) & = &
\left\{
\begin{array}{ll}
\{ r \cdot r_2 \mid r \in \alpha_c(r_1) \} \cup \alpha_c(r_2) & \mbox{if } r_1 \mbox{ nullable} \\
\{ r \cdot r_2 \mid r \in \alpha_c(r_1) \} & \mbox{otherwise}
\end{array}
\right.
\\
\alpha_c(\bot) & = & \emptyset \\
\alpha_c(r_1 \vee r_2) & = & \alpha_c(r_1) \cup \alpha_c(r_2) \\
\alpha_c(r\ast) & = & \{ r' \cdot r\ast \mid r' \in \alpha_c(r) \}
\end{array}
$$

Partial derivatives can be lifted to words just as ordinary derivatives can be, and it is relatively easy to prove that the set of partial word derivatives of a regular expression is finite. We can show this even without taking a quotient, and so partial derivatives lend themselves even more neatly to an efficient implementation than Brzozowski derivatives do.

I'll illustrate this point by using Antimirov derivatives to construct a DFA-based regular expression matcher in ~50 lines of code. You can find the Ocaml source code here.

First, let's define a datatype for regular expressions.

    type re = C of char | Nil | Seq of re * re | Bot | Alt of re * re | Star of re

So `C` is the constructor for single-character strings, `Nil` and `Seq` correspond to $\epsilon$ and $r \cdot r'$, and `Bot` and `Alt` correspond to $\bot$ and $r \vee r'$.

Next, we'll define the nullability function, which returns `true` if the regular expression accepts the empty string, and `false` if it doesn't. It's the obvious recursive definition.

    let rec null = function
      | C _ | Bot -> false
      | Nil | Star _ -> true
      | Alt(r1, r2) -> null r1 || null r2
      | Seq(r1, r2) -> null r1 && null r2

Now comes some scaffolding code. The `R` module is the type of finite sets of regular expressions, the `M` module is finite maps keyed by sets of regexps, and `I` is the type of finite sets of `int`s. The `rmap` function maps a function over a set of regexps, building a new set of regexps. (I don't know why this is not in the standard `Set` signature.)

    module R = Set.Make(struct type t = re let compare = compare end)
    let rmap f rs = R.fold (fun r -> R.add (f r)) rs R.empty
    module M = Map.Make(R)
    module I = Set.Make(struct type t = int let compare = compare end)

The `aderiv` function implements the Antimirov derivative function $\alpha_c$ described above. It's basically just a direct transcription of the mathematical definition into Ocaml. The `deriv` function applies the derivative to a whole set of regular expressions and takes the union.

    let rec aderiv c = function
      | C c' when c = c' -> R.singleton Nil
      | C _ | Nil | Bot -> R.empty
      | Alt(r, r') -> R.union (aderiv c r) (aderiv c r')
      | Seq(r1, r2) -> R.union (rmap (fun r1' -> Seq(r1', r2)) (aderiv c r1))
                               (if null r1 then aderiv c r2 else R.empty)
      | Star r -> rmap (fun r' -> Seq(r', Star r)) (aderiv c r)

    let deriv c rs = R.fold (fun r acc -> R.union (aderiv c r) acc) rs R.empty

Since the set of partial derivatives is finite, this means that the powerset of this set is also finite, and so by treating sets of partial derivatives as states, we can construct a deterministic finite-state automaton for matching regular expressions. Let's define an Ocaml type for DFAs:

    type dfa = { size : int; fail : int; trans : (int * char * int) list; final : int list }

Here, `size` is the number of states, and we'll use integers in the range `[0,size)` to label the states. We'll use `fail` to label the sink state for non-matching strings, and take `trans` to be the list of transitions. The `final` field is a list of accepting states for the DFA.

Now, we'll need a little more scaffolding. The `enum` function is a "functional for-loop" looping from `i` to `max`. We use this to write `charfold`, which lets us fold over all of the ASCII characters.

    let rec enum f v i max = if i < max then enum f (f i v) (i+1) max else v
    let charfold f init = enum (fun i -> f (Char.chr i)) init 0 256

We use this to define the `dfa` function, which constructs a DFA from a regular expression:

    let find rs (n, m) =
      try M.find rs m, (n, m)
      with _ -> n, (n+1, M.add rs n m)

    let dfa r =
      let rec loop s v t f rs =
        let (x, s) = find rs s in
        if I.mem x v then (s, v, t, f)
        else charfold (fun c (s, v, t, f) ->
               let rs' = deriv c rs in
               let (y, s) = find rs' s in
               loop s v ((x, c, y) :: t) f rs')
               (s, I.add x v, t, if R.exists null rs then x :: f else f)
      in
      let (s, v, t, f) = loop (0, M.empty) I.empty [] [] (R.singleton r) in
      let (fail, (n, m)) = find R.empty s in
      { size = n; fail = fail; trans = t; final = f }

The `find` function takes a set of regular expressions and returns a numeric index for it. To do this, it uses a state `(n, m)`, where `n` is a counter for a gensym, and `m` is the map we use to map sets of regular expressions to their indices.

The main work happens in the `loop` function. The `s` parameter is the state parameter for `find`, and `v` is the visited set storing the states we have previously visited. The `t` parameter is the list of transitions built to date, and `f` is the list of final states generated so far.

The `rs` parameter is the current set of regular expressions. We first look up its index `x`, and if we have visited it, we return. Otherwise, we add the current state to the visited set (and to the final set if any of its elements are nullable), and iterate over the ASCII characters. For each character `c`, we can take the derivative of `rs`, and find its index `y`. Then, we can add the transition `(x, c, y)` to `t`, and loop on the derivative. Essentially, this does a depth-first search of the spanning tree.

We then kick things off with empty states and the singleton set of the argument regexp `r`, and build a DFA from the return value. Note that the failure state is simply the index corresponding to the empty set of partial derivatives. The initial state will always be 0, since the first state we `find` will be the singleton set containing `r`.

Since we have labelled states by integers from 0 to `size`, we can easily build a table-based matcher from our `dfa` type.

    type table = { m : int array array; accept : bool array; error : int }

    let table d =
      { error = d.fail;
        accept = Array.init d.size (fun i -> List.mem i d.final);
        m = (let a = Array.init d.size (fun _ -> Array.make 256 0) in
             List.iter (fun (x, c, y) -> a.(x).(Char.code c) <- y) d.trans;
             a) }

Here, the `table` type has a field `m` for the transition table, an array of booleans indicating whether each state accepts or not, and the error state. We build the table in the obvious way in the `table` function, by building an array of arrays of integers and initializing it using the list of transitions.

Then, we can give the matching function. The `matches'` function takes a table `t`, a string `s`, an index `i`, and a current state `x`. It will just trundle along updating its state, as long as we have not yet reached the end of the string (or hit the error state), and then it will report whether it ends in an accepting state or not. The `re_match` function just calls `matches'` at index 0, in state 0.

    let rec matches' t s i x =
      if i < String.length s && x != t.error
      then matches' t s (i+1) t.m.(x).(Char.code s.[i])
      else t.accept.(x)

    let re_match t s = matches' t s 0 0

And that's it! Obviously more work is needed to make this a real regexp implementation, but the heart of the story is here. It's also possible to improve this implementation a bit further by adding a DFA minimization step, but that's a tale for another day. (DFA minimization is a very pretty little coinductive algorithm, and so makes a nice companion to the inductive story here.)

## Tuesday, July 23, 2013

### What Declarative Languages Are

On his blog, Bob Harper asks what, if anything, a declarative language is. He notes that "declarative" is often used to mean "logic or functional programming", and is (justly) skeptical that this is a useful pairing.

However, there's actually a surprisingly simple and useful definition of declarative language: a declarative language is any language whose semantics has some nontrivial existential quantifiers in it. To illustrate this definition, let's begin by looking at some examples of declarative languages:

- Regular expressions
- Context-free grammars
- Database query languages based on relational algebra (eg, SQL)
- Logic programming languages (eg, Prolog)
- Constraint-based languages for layout (eg, CSS)

The common factor is that all of these languages have a semantics which relies on some existential quantifiers which it is not immediately obvious how to discharge. For example, the semantics of regular expressions can be explained in terms of string membership, by giving a judgement $w \in r$ meaning that the string $w$ is an inhabitant of the regular expression $r$. $$ \begin{array}{c} \frac{} {\epsilon \in \epsilon} \\[1em] \frac{} { c \in c} \\[1em] \frac{ w \in r_i \qquad i \in \{1,2\} } { w \in r_1 \vee r_2} \\[1em] \mbox{(no rule for $w \in \bot$)} \\[1em] \frac{\exists w_1, w_2.\; w = w_1 \cdot w_2 \qquad w_1 \in r_1 \qquad w_2 \in r_2 } {w \in r_1 \cdot r_2} \\[1em] \frac{} {\epsilon \in r\ast} \\[1em] \frac{\exists w_1, w_2.\; w = w_1 \cdot w_2 \qquad w_1 \in r \qquad w_2 \in r\ast } {w \in r\ast} \end{array} $$

In particular, note the appearance of an existential quantifier in the premises of the sequential composition and Kleene star cases, and note the nondeterministic choice of a branch in the alternation case. So read as a logic program, this semantics is not well-moded.
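To see exactly where those quantifiers bite, here is a deliberately naive OCaml matcher (a sketch of my own, not anything one should actually use) that discharges the existentials by brute-force search over all splits of the word, and the nondeterministic choice by backtracking:

```ocaml
(* A naive matcher for the declarative membership relation: every
   existential in the rules becomes a search over split points, and
   the alternation rule becomes backtracking via (||). *)
type re = C of char | Eps | Seq of re * re | Bot | Alt of re * re | Star of re

let rec mem w r =
  let n = String.length w in
  (* Search for w1, w2 with w = w1 ^ w2, w1 in r1, w2 in r2,
     trying split points from lo upward. *)
  let split lo r1 r2 =
    let rec go i =
      i <= n
      && ((mem (String.sub w 0 i) r1 && mem (String.sub w i (n - i)) r2)
          || go (i + 1))
    in
    go lo
  in
  match r with
  | Eps -> w = ""
  | C c -> w = String.make 1 c
  | Bot -> false
  | Alt (r1, r2) -> mem w r1 || mem w r2      (* guess a branch *)
  | Seq (r1, r2) -> split 0 r1 r2             (* guess a split *)
  | Star r1 -> w = "" || split 1 r1 (Star r1) (* w1 nonempty, so this terminates *)
```

The ill-modedness shows up as the worst-case exponential search in `split`; the finite-state construction is precisely what eliminates it.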

Of course, to implement a regular expression matcher, you need an operational semantics which is well-moded and functional, which we get by constructing a finite state machine from the regular expression. But now we have a proof obligation to show that the operational semantics agrees with the declarative semantics.

We can make a similar distinction for each of the examples above. Context-free languages also have a declarative membership relation, but are recognized with parsing algorithms such as CYK and Earley parsing. Query languages have a declarative semantics in terms of the relational algebra, where relational composition is hiding an existential, but are implemented in terms of nested loops and indices. Logic programming has a declarative semantics in terms of the model theory of first-order logic, and an implementation in terms of backtracking and unification. Constraint languages have a declarative semantics in terms of simultaneous satisfaction of a collection of equations, with the existentials lurking in the values assigned to the free variables, but are implemented in terms of simplification and propagation through imperative graphs.

This also lets us make the prediction that the least-loved features of any declarative language will be the ones that expose the operational model, and break the declarative semantics. So we can predict that people will dislike (a) backreferences in regular expressions, (b) ordered choice in grammars, (c) row IDs in query languages, (d) cut in Prolog, (e) constraint priorities in constraint languages. And lo and behold, these are indeed the features which programmers are encouraged to avoid as much as possible!

This definition also lets us say that functional programming is *not* declarative -- the reduction relation of the lambda calculus is well-moded, in that we do not need to guess any part of a term to figure out how to reduce it. (And if you fix an evaluation order, the reduction relation is even deterministic.) The same is true for imperative languages like ML or Haskell: now we're just additionally threading a store through the reduction relation.