I recently read Shriram Krishnamurthi and Kathi Fisler's ICER 2021 paper, Developing Behavioral Concepts of Higher-Order Functions. In this paper, they study not the theory of higher-order functions, but rather the pedagogy of higher-order functions: how do we teach students how to program with them?
In particular, they studied how students understood different higher-order functions -- e.g., map, filter, fold, and co. -- by asking them to perform tasks like classifying which functions were similar, judging which example input/outputs could conceivably be produced by particular HOFs, and so on.
Since this is a blog, I can say that one thing I really liked about this paper is that it confirmed my prejudices! In particular, they found that fold was much harder for students to understand than any of the other higher-order functions they studied.
This is something I have long believed, both from anecdotal evidence and for mathematical reasons. I'll pass over the anecdotes (since the paper has actual data), and focus instead on what I see as the core mathematical issue: the specification of fold is dramatically more complex than the specification of map or filter.
Both map and filter have extremely simple equational specifications. For example, map is characterised by the following three laws:
map f [] = []
map f (xs ++ ys) = (map f xs) ++ (map f ys)
map f [x] = [f x]
Basically, this says that map takes nils to nils, distributes over appends, and applies f to elements pointwise. All three of these equations are universally-quantified, unconditional equations, which is about as simple as a specification can get.
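These laws can be spot-checked directly in Haskell. This is just a sketch of my own, not anything from the paper: a real test would quantify over f, xs, and ys with a property-based testing tool, whereas here I just pick f = (+1) and a couple of small lists.

```haskell
-- Spot-check map's three laws on concrete inputs.
mapLaws :: Bool
mapLaws =
  map f [] == ([] :: [Int])                        -- nils to nils
    && map f (xs ++ ys) == map f xs ++ map f ys    -- distributes over appends
    && map f [3] == [f 3]                          -- pointwise on singletons
  where
    f  = (+ 1)
    xs = [1, 2]
    ys = [3, 4, 5]

main :: IO ()
main = print mapLaws   -- prints True
```

Note that because the laws are unconditional equations, each check is a single equality test: there is no side condition to establish first.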
The specification for filter is only a little more complex:
filter p [] = []
filter p (xs ++ ys) = (filter p xs) ++ (filter p ys)
filter p [x] = if p(x) then [x] else []
filter does nothing to nils, distributes over appends, and for each element either keeps it or discards it. Again, all three of these equations are universally-quantified, unconditional equations.
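The same kind of spot check works for filter's laws. Again, this is a sketch of my own with a sample predicate (even), not a substitute for the universally-quantified laws themselves:

```haskell
-- Spot-check filter's three laws on concrete inputs.
filterLaws :: Bool
filterLaws =
  filter p [] == ([] :: [Int])
    && filter p (xs ++ ys) == filter p xs ++ filter p ys
    && filter p [2] == [2]    -- kept: p 2 holds
    && filter p [3] == []     -- discarded: p 3 fails
  where
    p  = even
    xs = [1, 2]
    ys = [3, 4]

main :: IO ()
main = print filterLaws   -- prints True
```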
On the other hand, what is the specification for fold? That is given by its universal property, which reads:
[h [] = nil ∧
(∀x, xs. cons x (h xs) = h (x :: xs))] // Uh-oh!
iff
(fold nil cons = h)
This is not an equational specification at all! It is a bi-implication, and while the right-to-left direction is basically the definition of fold, the left-to-right direction contains a formula with a logical quantifier in the premise. We know from nearly a century of experience with high school calculus that quantifier alternations are a giant obstacle to understanding: every year thousands of students find themselves utterly baffled by epsilon-delta arguments, and here we have something harder – we have an implication with a universally quantified premise.
So I've always been very careful whenever I teach students about the fold function, because I think it has a very challenging spec for students to wrap their heads around.
If you're watching out for quantifiers, that also suggests that take-while will be one of the more challenging functions for students -- takewhile p xs returns the longest prefix of xs satisfying the predicate p, and "longest prefix" is going to involve another higher-rank quantified formula.
The evidence in the paper seems weakly consistent with this idea: they report that take-while does seem harder for students, though not by as much as I would have expected.
Anyway, I found this a really interesting paper!