Ever since I learned about them, I've thought of call-by-push-value and focusing (aka polarization) as essentially two different views of the same problem: they both give a fine-grained decomposition of higher-order effectful programs which permits preserving the full βη-theory of the language.
Until this morning, I had thought that the differences were merely cosmetic, with CBPV arising from Paul Levy's analysis of the relationship between denotational semantics and operational semantics, and focusing arising an analysis of the relationship between operational semantics and proof theory (a lot of people have looked at this, but I learned about it from Noam Zeilberger). Both systems decompose a Moggi-style computational monad into a pair of adjoint operators, which mediate between values and computations (in CBPV) and positive and negative types (in focusing). So I thought this meant that “value type” and “positive type” were synonyms, as were “computation type” and “negative type”.
This morning, I realized I was wrong! Focusing and call-by-push-value make precisely the opposite choices in their treatment of variables! To understand this point, let's first recall the syntax of types for a call-by-push-value (on top) and a polarized (on bottom) calculus.
At first glance, these two grammars look identical, save only for the
renamings and
. But this is
misleading! If they are actually the same idea, the reason has to be
much more subtle. The reason for this is that the typing judgements
for these two systems are actually quite different.
In call-by-push-value, the idea is that is a functor which
is left adjoint to
. As a result, values are interpreted in a
category of values
, and computations are interpreted in a
category of computations
. The adjunction between values and
computations means that the hom-set
is ismorphic
to the hom-set
.
This adjunction gives rise to the two basic judgement forms of
call-by-push-value, the value judgement
and the computation judgement
. The idea
is that
and
.
The key bit is in the interpretation of contexts in computations, so let me highlight that:
Note that we interpret contexts as , and so this says that
variables refer to values.
However, in a polarized type theory, we observe that positive types are “left-invertible”, and negative types are “right-invertible”. In proof theory, a rule is invertibile when the conclusion implies the premise. For example, the right rule for implication introduction in intuitionistic logic reads
This is invertible because you can prove, as a theorem, that
is an admissible rule of the system. Similarly, sums have a left rule:
such that the following two rules are admissible:
The key idea behind polarization is that one should specify the calculus modulo the invertible rules. That is, the judgement on the right should fundamentally be a judgement that a term has a positive type, and the hypotheses in the context should be negative. That is, the two primary judgements of a polarized system are the positive introduction judgement
which explains how introductions for positive types work, and the negative elimination (or spine judgement)
which explains how eliminations for negative types work. The
eliminations for positive types are derived and the introductions for
negative types are derived judgements (which end up being rules for
pattern matching and lambda-abstractions) which make cut-elimination
hold, plus a few book-keeping rules to hook these two judgements
together. The critical point is that the grammar for consists
of negative types:
This is because positive types are (by definition) left-invertible, and so there is no reason to permit them to appear as hypotheses. As a result, the context clearly has a very different character than in call-by-push-value.
I don't have a punchline for this post, in the sense of “and therefore the following weird things happen as a consequence”, but I would be astonished if there weren't some interesting consequences! Both focalization and call-by-push-value teach us that it pays large dividends to pay attention to the fine structure of computation, and it's really surprising that they are apparently not looking at the same fine structure, despite apparently arising from the same dichotomy at the type level.