### Strong reduction in big steps

It has been a long while since I updated this blog, let me finally revive it with an easy but (hopefully) fun post, well ehm… at least some usable reference material!

Strong reduction is the ability for the interpreter of a functional language to reduce under function declaration. Usually, in OCaml for example, evaluation stops at the λ boundary; we might like to reduce under them for various reasons:

• partial evaluation (e.g. λx. 2+2+x ⇒ λx. 4+x)
• conversion checking in dependent type systems (is λx. (λy. y) x the same as λx. x?)

I never remember the big-step semantics of strong reduction, so I’m going to write them down here in glorious Unicode-art. Over the years I accumulated many small variants of reduction machines, both weak and strong, call-by-name or call-by-value… Here is a first shot, starting with the canonical, CBV weak head reduction and incrementally building up strong reduction; I’ll be probably extending this list when I remember other variants.

The idea behind strong reduction is to iterate weak head reduction over strong values, as found in Grégoire & Leroy [1] (and probably many other articles that I don’t know of).

[1] : Benjamin Grégoire and Xavier Leroy. A compiled implementation of strong reduction. In International Conference on Functional Programming 2002, pages 235-246. ACM Press, 2002.

The start language is the usual lambda-calculus:

M ::= x | λx. M | M M

This is the kind of reduction performed by functional language interpreters. No need to reduce under a λ, and the whole expression to be interpreted should be closed: the only values are λ.

### Call-by-value

This is über-standard and serves only to set notations:

#### Without closures

The values are:

V ::= λx. M

And we only need value substitution: M [x/V]. I’m not going to talk about them but remember that in this simple case, since a value is always closed, it is correct to implement substitution naïvely,
i.e. not capture-avoiding.

—————-
⊢ λx. M → λx. M

⊢ M → λx. P     ⊢ N → V      ⊢ P [x/V] → V’
————————————————-
⊢ M N → V’

There is no case for variables since M is supposed to be closed. This is right-to-left evaluation of argument. Just invert the two first premisses in the last rule to obtain left-to-right.

#### With closures

We can amortize the cost of performing substitions with a closure. Values change to:

V ::= λx. M [η]
η ::= · | η, x/V

———————
η ⊢ λx. M → λx. M [η]

η(x) = V
———-
η ⊢ x → V

η ⊢ M → λx. P [η’]     η ⊢ N → V      η’, x/V ⊢ P → V’
———————————————————-
η ⊢ M N → V’

To obtain lexical scoping instead of dynamic, we resume the closure saved in the λ.

### Call-by-name

#### Without closures

Values are the same as before, but substitution must be a little richer: M [x/N] where N isn’t necessary a value.

——————–
⊢ λx. M → λx. M

⊢ M → λx. P     ⊢ P [x/N] → V’
———————————–
⊢ M N → V’

#### With closures

Values are still the same, but closures change so as to remember the closure in which a possibly open, unevaluated term was saved:

V ::= λx. M [η]
η ::= · | η, x/M[η]

———————
η ⊢ λx. M → λx. M [η]

η(x) = M[η’]        η’ ⊢ M → V
——————————–
η ⊢ x → V

η ⊢ M → λx. P [η’]        η’, x/N[η] ⊢ P → V
————————————————
η ⊢ M N → V

As a first step toward strong reduction, we modify weak head reduction to be able to cope with free variables. With the machines above, reducing (λx. x) y z results in an error since y’s value is unknown; we now want it to succeed with value y z.

### Call-by-value

#### Without closures

Values now can also be variables, applied to values. They are thus changed to:

V ::= λx. M | U
U ::= U V | x

There is now way to write a redex in this syntax, except under a λ.

——————-
⊢ λx. M → λx. M

⊢ M → λx. P       ⊢ N → V        ⊢ P [x/V] → V’
—————————————————
⊢ M N → V’

⊢ M → U         ⊢ N → V
—————————
⊢ M N → U V

———-
⊢ x → x

There is now a variable case, and a new application case: if the functional part of the application turns out to be a symbolic term U (a variable applied to value arguments) then we just continue the
symbolic application.

#### With closures

Absolutely no surprise here, it’s as above (weak head case) but extended with our two new rules:

V ::= λx. M [η] | U
U ::= U V | x

————————-
η ⊢ λx. M → λx. M [η]

η(x) = V
———-
η ⊢ x → V

η(x) undef
———-
η ⊢ x → x

η ⊢ M → λx. P [η’]        η ⊢ N → V        η’, x/V ⊢ P → V’
———————————————————–
η ⊢ M N → V’

η ⊢ M → U       η ⊢ N → V
——————————
η ⊢ M N → U V

### Call-by-name

#### Without closures

In the weak head case, values didn’t differ between CBV and CBN. They do in the symbolic case: x ((λx. M) N) is not a CBV value (because argument is not evaluated), but it is a CBN one: we don’t evaluate function arguments, so we don’t evaluate symbolic arguments either. Here is the value syntax:

V ::= λx. M | U
U ::= U M | x

—————-
⊢ λx. M → λx. M

⊢ M → λx. P      ⊢ P [x/M] → V
————————————
⊢ M N → V

⊢ M → U
—————–
⊢ M N → U N

———-
⊢ x → x

#### With closures

Amortizing substitution with closures, we need to add closure annotations on λ values as before, but also on each argument of a symbolic spine, since we leave them unevaluated: the evaluation of (λx. a x) b x gives the value a x[x/b[]] x[].

V ::= λx. M [η] | U
U ::= U M[η] | x
η ::= · | η, x/M[η]

———————
η ⊢ λx. M → λx. M [η]

η ⊢ M → λx. P [η’]      η’, x/N[η] ⊢ P → V
———————————————–
η ⊢ M N → V

η ⊢ M → U
———————
η ⊢ M N → U N[η]

η(x) = M[η’]      η’ ⊢ M → V
——————————-
η ⊢ x → V

η(x) undef
————-
η ⊢ x → x

## Strong reduction

We are now looking for an evaluator from λ-terms M to strong values, that is any λ-term without any redex, even under a lambda (canonical terms). We call these W, and are mutually recursively defined with X’s (atomic terms):

W ::= λx. W | X
X ::= x | X W

The idea is to iterate symbolic reduction. There are two mutually recursive judgements: strong reduction itself ⊢ M ⇒ W, weak head reducing M and passing it to a readback procedure ⊢ V ↝ W which in turn traverses weak values to iterate strong reduction on subterms:

### Without closures

#### Strong reduction

⊢ M → V       ⊢ V ↝ W
—————————
⊢ M ⇒ W

⊢ M ⇒ W
——————–
⊢ λx. M ↝ λx. W

⊢ U ↝ X       ⊢ V ↝ W
————————– (CBV)
⊢ U V ↝ X W

⊢ U ↝ X         ⊢ M ⇒ W
—————————– (CBN)
⊢ U M ↝ X W

———–
⊢ x ↝ x

When reading back a lambda, we reduce strongly its body. The difference between call-by-value and call-by-name is in the symbolic application rule: in CBV, arguments are already values, so we just need to read them back; in CBN, arguments are not evaluated at all, so we need to strongly reduce them.

EDIT: There is an inefficiency here that can be problematic. The “symbolic hat” of a term, i.e. the non-reducible, purely applicative tip of a term is traversed twice: once in symbolic head-reduction, once in readback. In call-by-name, head-reduction only traverse applications on the left, leaving the right part untouched; in call-by-value however, the whole symbolic hat is traversed to form values. For instance, the strong reduction of x (y z) (λa. (λb. b) a):

• in CBN, traverses the spine until the symbolic function x, saving arguments (y z) and (λa. …) untouched. Then comes readback, which reads the spine again until the x and then evaluate (for the first time) the arguments, leaving (y z) intact and reducing (λa. (λb. b) a) to (λa. a).
• in CBV, traverses and turns symbolic functional x and arguments into values, reconstructing both arguments. The readback re-traverses them a second time to find λs to go under.

Call-by-name is thus more efficient in this presentation than call-by-value. In call-by-name though, we still traverse symbolic spines on the left twice to find the functional, but that can be fixed by exposing this functional at the head of the application, together with a list of arguments in the natural order (reverse of the usual one): this is the ƛ-calculus.

### With closures

Adding closures to this algorithm, we must only thread them during strong reduction, readback doesn’t need them because each weak value comes with its saved closure (λx. M [η] or U M [η] in CBN), that we restore in the call to strong reduction.

#### Strong reduction

η ⊢ M → V ⊢ V ↝ W
—————————–
η ⊢ M ⇒ W

η ⊢ M ⇒ W
————————
⊢ λx. M [η] ↝ λx. W

⊢ U ↝ X        ⊢ V ↝ W
————————— (CBV)
⊢ U V ↝ X W

⊢ U ↝ X        η ⊢ M ⇒ W
—————————– (CBN)
⊢ U M [η] ↝ X W

———-
⊢ x ↝ x

That’s it! Next time, hopefully very soon, I’ll restart the whole exercise starting with my favorite variant of the λ-calculus: ƛ, a calculus with n-ary application. We’ll see that, although the difference seems anecdotal, it exposes more structure to the reduction and allow for optimizations and interesting variants of the reduction relation.

Doviđenja! (I’m just back from Serbia, Belgrade is the Berlin of the 21th century, I tell you!)