Reversing data structures

by Matthias Puech

A reversed list is not really a list anymore. It is isomorphic to a list, but it is not a list. Let me explain why.

Prelude: zippers and contexts

How do we define in general reversing a data structure? Intuitively, constructors at the bottom of an original value must appear at the top of the resulting value. A good formal definition arises from the zipper of a data structure. The idea of the zipper is to represent a particular point(er) inside a deep value v : t by a pair of a value v' : t, which is what’s underneath the current point, and a value c : t', which is the path from the root of v to the current point, i.e. what’s above current point. It is sometimes referred as the one-hole context or derivative of the data type (see this of course). The type of the context depends on the type the value. A classic example, binary trees:

module type Btree = struct

  type 'a t =
    | Node of 'a t * 'a * 'a t
    | Leaf

  type 'a t' =
    | Top
    | Left of 'a t * 'a * 'a t'
    | Right of 'a t' * 'a * 'a t

  type 'a zipper = 'a t * 'a t'

Type 'a t is the data type of binary trees, 'a t' is the context, and 'a zipper represents positions in a 'a t. The context has 3 possible value: either we are at the Top of the structure (there is no context), or we just went Left of a Node, or Right. These types traditionally come with functions

  val top : 'a t -> 'a zipper
  val up : 'a zipper -> 'a zipper
  val left : 'a zipper -> 'a zipper
  val right : 'a zipper -> 'a zipper

to initialize a zipper and navigate through the data structure. But let’s forget about traversal for now and look closer at the type of contexts t'. To every Leaf of a tree v, there is a corresponding context which allows to reconstruct the whole tree, and it is stored reverse order, from bottom to top. That’s exactly what we want!

Andante: reversing lists

What is the context corresponding to lists? Likewise, we construct it by enumerating all possible ways of making a hole in a list constructor; there is just one:

module List = struct

  type 'a t =
    | Cons of 'a * 'a t
    | Nil

  type 'a t' =
    | Top
    | Down of 'a * 'a t'

As announced, t' is isomorphic to t! And since there is just one leaf in a list, reversing a list is a deterministic operation, from t to t':

  let rec rev_append acc : 'a t -> 'a t' = function
    | Nil -> acc
    | Cons (x, xs) -> rev_append (Down (x, acc)) xs

  let rev l = rev_append Top l

which is usually written from t to t. Let us write both variants of fold, on lists and reversed lists:

  let rec fold_left f acc = function
    | Nil -> acc
    | Cons (x, xs) -> fold_left f (f acc x) xs

  let rec fold_right' f acc = function
    | Top -> acc
    | Down (x, xs) -> f (fold_right' f acc xs) x

These functions are interchangeable: one acts on lists, the other on the reversed list and we have

fold_left f acc l = fold_right' f acc (rev l)

Note that fold_left is tail-recursive, while fold_right' is not. Conversely,

  let rec fold_right f acc = function
    | Nil -> acc
    | Cons (x, xs) -> f (fold_right f acc xs) x

  let rec fold_left' f acc = function
    | Top -> acc
    | Down (x, xs) -> fold_right' f (f acc x) xs

We have

fold_right f acc l = fold_left' f acc (rev l)

and fold_right is not tail-recursive, while fold_left' is. Of course, both usual equations without the primes don’t hold since the types of lists don’t match anymore.

The moral is that if you use fold_left a lot, you will likely prefer lists rather than reversed lists, but if you use fold_right, you’d better store (or convert on-the-fly) your lists in reversed form. Probably no big news.

Variations on the same theme: herds

Let us redo the same thing on a variation of list: a herd (I just made up the name) is a list of elements, with a distinguished element of another type at the bottom:

module Herd = struct

  type ('sheep, 'dog) t =
    | Cons of 'sheep * ('sheep, 'dog) t
    | Nil of 'dog

Its context should make the 'dog lead the way, followed by all the 'sheeps in reverse order:

  type 'sheep t'' =
    | Down of 'sheep * 'sheep t''
    | Top

  type ('sheep, 'dog) t' = Bot of 'dog * 'sheep t''

This time, the type t' of contexts is not isomorphic to t. The rev function is written

  let rec rev_append acc : ('a, 'b) t -> ('a, 'b) t' = function
    | Cons (x, xs) -> rev_append (Down (x, acc)) xs
    | Nil z -> Bot (z, acc)

  let rev l = rev_append Top l

Let’s now write the iterators on herds. As with lists, there are four of them, alternatively tail-recursive and not, and they satisfy the same equations. Because of the 'dog at the tail of the Herd.t (and at the head of its context), the types of these functions differ from the usual lists: they take a supplementary function g that’s used to wrap around the 'dog.

fold_left f g acc l = fold_right' f g acc (rev l)
fold_right f g acc l = fold_left' f g acc (rev l)

  let rec fold_left f g acc = function  (* tail-rec *)
    | Nil z -> g acc z
    | Cons (x, xs) -> fold_left f g (f acc x) xs

  let rec fold_right f g acc = function (* non-tail-rec *)
    | Nil z -> g acc z
    | Cons (x, xs) -> f x (fold_right f g acc xs)

  let fold_left' f g acc = function     (* tail-rec *)
    | Bot (z, l) ->
      let rec aux acc = function
        | Top -> acc
        | Down (x, xs) -> aux (f acc x) xs
      in aux (g acc z) l

  let fold_right' f g acc = function    (* non-tail-rec *)
    | Bot (z, l) ->
      let rec aux acc = function
        | Top -> acc
        | Down (x, xs) -> f (aux acc xs) x
      in g (aux acc l)


  • fold_left f g x (Cons (1, Cons (2, Cons (3, Nil true)) = g (f (f (f x 1) 2) 3) true
  • fold_right f g x (Cons (1, Cons (2, Cons (3, Nil true)) = f (f (f (g x true) 1) 2) 3

Coda: the (lack of) punchline

There is no real punchline here, other than a strict construction of what reversing a list-like structure is by means of context, and what effect it has on the tail-recursivity of iterators. In the next post, I will show how this technique may shed some light on a logical object, namely the term assignment of an intuitionistic sequent calculus.

I wonder if — no, I am sure that — this construction can be made mechanical. I’ve been showed recently by Ian the magic of defunctionalization of the CPS-transform of programs (see this), and how it gives rise to the one-hole contexts. Yet I am not comfortable enough yet with these ideas to have managed to derive the t' from the t mechanically, and observe the inversion of tail-recursivity of the iterators. Maybe someone will?