I recently read Shriram Krishamurthi and Kathi Fisler's ICER 2021 paper, Developing Behavioral Concepts of Higher-Order Functions. In this paper, they study not the theory of higher-order functions, bur rather the pedagogy of higher-order functions: how do we teach students how to program with them?
In particular, they studied how students understood different higher-order functions -- e.g., map
, filter
, fold
and co. -- by asking them to perform tasks like classifying which functions were similar, which example input/outputs could conceivably be produced by particular HOFs, and so on.
Since this is a blog, I can say that one thing I really liked about this paper is that it confirmed my prejudices! In particular, they found that fold
was much harder for students to understand than any of the other higher-order functions they studied.
This is something I have long believed, both from anecdotal evidence and for mathematical reasons. I'll pass over the anecdotes (since the paper has actual data), and focus instead on what I see as the core mathematical issue: the specification of fold
is dramatically more complex than the specification of map
or filter
.
Both map
and filter
have extremely simple equational specifications. For example, map
is characterised by the following three laws:
map f [] = []
map f (xs ++ ys) = (map f xs) ++ (map f ys)
map f [x] = [f x]
Basically, this says that map takes nils to nils, distributes over appends, and applies f
to elements pointwise. All three of these equations are universally-quantified, unconditional equations, which is about as simple as a specification can get.
The specification for filter is only a little more complex:
filter p [] = []
filter p (xs ++ ys) = (filter p xs) ++ (filter p ys)
filter p [x] = if p(x) then [x] else []
filter
does nothing to nils, distributes over appends, and for each element either keeps it or discards it. Again, All three of these equations are universally-quantified, unconditional equations.
On the other hand, what is the specification for fold
? That is given by its universal property, which reads:
[h [] = nil ∧
(∀x, xs. cons x (h xs) = h (x :: xs))] // Uh-oh!
iff
(fold nil cons = h)
This is not an equational specification at all! It is a bi-implication, and while the right-to-left direction is basically the definition of fold
, the left-to-right direction contains a formula with a logical quantifier in the premise. We know from nearly a century of experience with high school calculus that quantifier alternations are a giant obstacle to understanding: every year thousands of students find themselves utterly baffled by epsilon-delta arguments, and here we have something harder – we have an implication with a universally quantified premise.
So I've always been very careful whenever I teach students about the fold
function, because I think it has a very challenging spec for students to wrap their heads around.
If you're watching out for quantifiers, that also suggests that take-while
will also be one of the more challenging for functions students – takewhile p xs
returns the longest prefix of xs
satisfying the predicate p
, and "longest prefix" is going to be involve another higher-rank quantified formula.
The evidence in the paper seems weakly consistent with this idea, in that they report that take-while
does seem harder for students, but it does not seem to be as much harder as I would have expected.
Anyway, I found this a really interesting paper!
I think I'm missing something. Your "specifications" for map and filter seem indistinguishable from the way those functions are implemented. Correspondingly, why can't I just present the following analogous "specification" for fold?
ReplyDeletefold z s [] = z
fold z s (x::xs) = s x (fold z s xs)
I don't see why this "specification" is any harder to understand than the ones for map and filter.
Hi, Derek!
DeleteThe specifications for map and filter are not implementations: the ++ symbol means the append function, and you can't pattern-match on function calls like that. Append and nil form the monoid structure of a free monoid, and the equations I give assert that map and filter are monoid homomorphisms: and that's enough to fully characterise them.
The definition of fold you give is actually the right-to-left direction of the bi-implication I wrote. Assume fold nil cons = h, and then after you replace h with fold nil cons, you get exactly the equations you wrote down. fold has a stronger spec than that, though -- it is the unique solution to those equations. So for any h such that h [] = nil and h (x :: xs) = cons x (h xs), we know that h = fold nil cons.
From the paper "A tutorial on the universality and expressiveness of fold," the author defines the universal property of fold like this. If g is a function such that its definition can be written like this:
Deleteg [] = v
g (x:xs) = f x (g xs)
Then g can be rewritten as "foldr f v", and vice versa.
This is similar, but not the same as the definition of foldr.
I was somewhat confused by the explanation of the universal property given by the author of this blog post.
Ah, that explains why I always tend to use foldMap! It has a similarly simple equational specification.
ReplyDeleteOh, but it is really easy to teach fold. You just draw a syntax tree for a list, and explain that fold is going to replace every cons with some function f and the (sole) nil with some value a. Sure, this is not the universal property or anything, but understanding how a function works maybe is not the same as fully grasping its spec...?
ReplyDeleteWhat confused me was that people would often write code like "foldr (\x xs -> )", leading me to believe at first that the second parameter is the rest of the list. It is if you're returning a list, e.g. "map f = foldr (\x xs -> f x : xs) []", but what it's actually doing is abstracting over the recursive call of foldr to itself. It took me a bit of time to really "get" foldr, but now I understand it deeply.
Delete
ReplyDeleteComputer Science | Gyansetu offers top-notch courses that cater to every level of expertise. With knowledgeable instructors and a variety of subjects, from programming to cybersecurity, it's a perfect platform to expand your tech skills and stay ahead in the ever-evolving world of computer science.
For more info:- https://www.gyansetu.in/blogs/how-can-a-b-com-graduate-become-a-data-scientist/