Syntax!

A research blog about programming languages, formal logics, software development and their interactions, by Matthias Puech.

Typeful disjunctive normal form

This is the answer to last post’s puzzle. I gave an algorithm to put a formula in disjunctive normal form, and suggested proving it correct in OCaml, thanks to GADTs. My solution happens to include a wealth of little exercises that could, I think, be reused, so here it is.

I put the code snippets in the order that I think is most pedagogical, and leave it to the reader to reorganize them into the right one.
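For reference, here is an untyped version of the algorithm in question (a sketch of my own, not the typeful GADT code from the post): a formula in DNF is represented as a list of clauses, each clause a list of atoms.

```ocaml
type 'a form =
  | Atom of 'a
  | And of 'a form * 'a form
  | Or of 'a form * 'a form

(* dnf returns the list of disjuncts; And distributes by conjoining
   every clause of the left operand with every clause of the right. *)
let rec dnf : 'a form -> 'a list list = function
  | Atom a -> [ [ a ] ]
  | Or (f, g) -> dnf f @ dnf g
  | And (f, g) ->
      List.concat_map (fun c -> List.map (fun d -> c @ d) (dnf g)) (dnf f)
```

For instance, `dnf (And (Or (Atom "a", Atom "b"), Atom "c"))` yields `[["a"; "c"]; ["b"; "c"]]`, i.e. (a ∧ c) ∨ (b ∧ c).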


Disjunctive normal forms in big steps

This is probably a second-semester functional programming exercise, but I found it surprisingly hard, and could not find a solution online. So, at the risk of depriving a TA of a problem for their mid-term exam, here is my take on it, which I painfully put together yesterday.

Given a formula built out of conjunctions, disjunctions and atoms, return its disjunctive normal form, in big-step (or natural) semantics, that is, not by repeatedly applying the distributivity and associativity rules, but in a single function run. Before you go any further, please give it a try and send me your solution!


Representing pattern-matching with GADTs

Here is a little programming pearl. I’ve been wanting to work on pattern-matching for a while now, and it seems like I will finally have this opportunity here at my new (academic) home, McGill.

Encoding some simply-typed languages with GADTs is now routine for a lot of OCaml programmers. You can even take (kind of) advantage of (some form of) convenient binding representation, like (weak) HOAS; you then use OCaml variables to denote your language’s variables. But what about pattern-matching? Patterns are possibly “deep”, i.e. they bind several variables at a time, and they don’t respect the usual discipline that a variable is bound for exactly its subterm in the AST.

It turns out that there is an adequate encoding, which relies on two simple ideas. The first is to treat variables in patterns as nameless placeholders bound by λ-abstractions on the right side of the arrow (this is how e.g. Coq encodes matches: match E₁ with (y, z) -> E₂ is actually sugar for match E₁ with (_, _) -> fun y z -> E₂); the second is to thread and accumulate type arguments in a GADT, much like we demonstrated in our printf example recently.

These ideas probably extend seamlessly to De Bruijn indices, by threading an explicit environment throughout the term. They stemmed from a discussion on LF encodings of pattern-matching with Francisco over lunch yesterday: what I will show also makes it possible to adequately represent pattern-matching in LF, which I do not think was ever done this way before.
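To make the second idea concrete, here is a minimal sketch (with my own names and simplifications, not necessarily the post’s code): a pattern accumulates the types of the variables it binds in a function type, exactly as a typed printf accumulates the types of its arguments.

```ocaml
(* ('a, 'f, 'r) pat: a pattern matching values of type 'a, whose
   right-hand side has type 'f and produces a result of type 'r;
   each Var prepends one bound-variable argument to 'f. *)
type ('a, 'f, 'r) pat =
  | Var : ('a, 'a -> 'r, 'r) pat
  | Lit : int -> (int, 'r, 'r) pat
  | Pair : ('a, 'f, 'g) pat * ('b, 'g, 'r) pat -> ('a * 'b, 'f, 'r) pat

exception Mismatch

(* Matching feeds the bound values, left to right, to the right-hand side. *)
let rec pmatch : type a f r. (a, f, r) pat -> a -> f -> r =
  fun p v k -> match p, v with
    | Var, v -> k v
    | Lit n, m -> if n = m then k else raise Mismatch
    | Pair (p, q), (v, w) -> pmatch q w (pmatch p v k)
```

For example, `pmatch (Pair (Var, Var)) (1, 2) (fun x y -> x + y)` evaluates to 3: the deep pattern binds two variables at once, and the right-hand side receives them as ordinary λ-bound arguments.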


Update on Typeful Normalization by Evaluation

In October, I publicized here a new draft on normalization by evaluation, which provoked some very helpful comments and constructive criticisms. Chantal, Olivier and I thus revised the draft profoundly, and a revamped version is available on my web site.

Besides the improved form and better explanations, we included a completely new section, which I am particularly excited about, on what it means for NbE to be written in continuation-passing style. This allowed us to extend our typeful NbE formalization beyond the minimal λ-calculus, to sums and call/cc (which is known to be difficult). Using our code, you can write a program with call/cc and if statements and partially evaluate it: all redexes will be removed and your code will be specialized. All this, as before, is done typefully, thanks to OCaml’s GADTs: the transformation is guaranteed to map well-typed terms to well-typed normal forms.


1 ≠ 0

Learning Coq is usually an enlightening experience. I know first-hand that it can also be quite a frustrating one, because at first it seems to involve memorizing quite a lot of arbitrarily-named tactics whose effects one might not fully understand. discriminate is one of them: “if you’re faced with an obvious inequality, just apply discriminate,” they say. If you look at the proof term, however, it is far from trivial and involves a “trick”, a kind of “lie” that Pierre has become a specialist of. There is an intermediate step in the comprehension, which Chantal and I put together: decomposing the proof generated by discriminate. Here is a very easy proof that 1 ≠ 0, which requires understanding neither proof terms nor return clauses in pattern-matching. Pure beginner Coq.
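The decomposition goes along these lines (a sketch in the spirit of the post, not necessarily its exact script): the “lie” is a function that computes to True on 0 and to False on everything else.

```coq
(* The "lie": a Prop-valued function distinguishing 0 from successors. *)
Definition is_zero (n : nat) : Prop :=
  match n with O => True | S _ => False end.

Lemma one_neq_zero : 1 <> 0.
Proof.
  intro H.             (* H : 1 = 0, goal : False *)
  change (is_zero 1).  (* is_zero 1 computes to False, so this is allowed *)
  rewrite H.           (* goal : is_zero 0 *)
  exact I.             (* is_zero 0 computes to True *)
Qed.
```

The rewrite along H is what smuggles the impossible equality into a trivially provable goal; discriminate generates essentially this proof for you.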

Quick, dirty and shallow definitions

Here is a quick hack. A few months ago, I advocated for pointer equality in OCaml ((==)) as a way to deal with fresh symbols in a toy compiler. Today, I’ll show another application of pointer equality: how to trivially implement a mechanism of definitions, to e.g. pretty-print programs in a more readable way. Once again, this is really easy, but I have never heard of this trick elsewhere, so here it is.
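The hack, in full (a sketch with names of my own choosing, not the post’s code): remember named subterms in a table, and recognize them later by pointer equality while pretty-printing.

```ocaml
type exp = Var of string | App of exp * exp | Lam of string * exp

(* Table of definitions: a subterm and the name it was defined under.
   Physical equality (==) recognizes the very node that was defined,
   not structurally equal copies. *)
let defs : (exp * string) list ref = ref []

let define name e = defs := (e, name) :: !defs; e

let rec pp e =
  match List.find_opt (fun (d, _) -> d == e) !defs with
  | Some (_, name) -> name
  | None ->
    (match e with
     | Var x -> x
     | Lam (x, b) -> "(fun " ^ x ^ " -> " ^ pp b ^ ")"
     | App (f, a) -> "(" ^ pp f ^ " " ^ pp a ^ ")")
```

With `let id = define "id" (Lam ("x", Var "x"))`, printing `App (id, id)` gives `"(id id)"`, while a structurally equal but freshly built `Lam ("x", Var "x")` still prints in full.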

New draft on Normalization by Evaluation using GADTs

There is a new draft on my web page! It is called Tagless and Typeful Normalization by Evaluation using Generalized Algebraic Data Types, which is a mouthful, but accurately describes the cool new OCaml development I elaborated together with Olivier and Chantal. The world of Normalization by Evaluation (NbE) is still relatively new to me, but its wonders keep amazing me. It’s a program specialization algorithm… no, it’s a completeness proof for first-order logic… wait, it’s a technique for turning an interpreter into a compiler! Definitely, Aarhus University, my new home, is not the worst place to learn about it.

Since the introduction of GADTs in OCaml, a whole new realm of applications has emerged, the most well-known being to faithfully represent typed languages: GADTs make it possible to define compact and correct interpreters, and type-preserving program transformations. If you have never seen this, here is a small snippet that should be self-explanatory:

type _ exp =
  | Val : 'a -> 'a exp
  | Eq : 'a exp * 'a exp -> bool exp
  | Add : int exp * int exp -> int exp

let rec eval : type a. a exp -> a = function
  | Val x -> x
  | Eq (e1, e2) -> eval e1 = eval e2
  | Add (e1, e2) -> eval e1 + eval e2
;;
assert (eval (Eq (Add (Val 2, Val 2), Val 4)))

Nice, isn’t it? Our question was: if we can so beautifully go from typed syntax (expressions) to typed semantics (OCaml values), can we do the converse, i.e., go from an OCaml value back to a (normal) expression of the same type? In other words, can we implement NbE in a typeful manner, using GADTs?

The answer is a proud “yes”, but more interestingly, we all learned some interesting GADT techniques and puzzling logical interpretations on the way. Read on to know more!

Tagless and Typeful Normalization by Evaluation using Generalized Algebraic Data Types

We present the first tagless and typeful implementation of normalization by evaluation for the simply typed λ-calculus in OCaml, using Generalized Algebraic Data Types (GADTs). In contrast to normalization by reduction, which is operationally carried out by repeated instances of one-step reduction, normalization by evaluation uses a non-standard model to internalize intermediate results: it is defined by composing a non-standard evaluation function with a reification function that maps the denotation of a term into the normal form of this term. So far, normalization by evaluation has been implemented either in dependently-typed languages such as Coq or Agda, or in general-purpose languages such as Scheme, ML or Haskell. In the latter case, denotations are either tagged or Church-encoded. Tagging means injecting denotations, either implicitly or explicitly, into a universal domain of values; Church encoding is based on the impredicativity of the metalanguage. Here, we show how to obtain not only tagless values, making evaluation efficient, but also typeful reification, guaranteeing type preservation and η-long, β-normal forms. To this end, we put OCaml’s GADTs to work. Furthermore, our implementation does not depend on any particular representation of binders (HOAS, de Bruijn levels or indices) nor on the style (direct style, continuation-passing style, etc.) of the non-standard evaluation function.

PS: My previous draft, Proofs, upside down was accepted for presentation at APLAS 2013. See you in Melbourne!

New draft: Proofs, upside down

There is a new draft on my web page that should be of interest to those who enjoyed my posts about reversing data structures and the relation between natural deduction and sequent calculus. It is an article submitted to APLAS 2013, and it is called Proofs, upside down. In a nutshell, I argue for the use of functional PL tools, in particular classic functional program transformations, to understand and explain proof-theory phenomena. Here, I show that the relationship between natural deduction and (a restriction of) the sequent calculus is the same as that between this recursive function:

let rec tower_rec = function
  | [] -> 1
  | x :: xs -> x ** tower_rec xs

let tower xs = tower_rec xs

written in “direct style”, and this equivalent, iterative version:

let rec tower_acc acc = function
  | [] -> acc
  | x :: xs -> tower_acc (x ** acc) xs

let tower xs = tower_acc 1 (List.rev xs)

written in “accumulator-passing style”. And that relationship is the composition of CPS transformation, defunctionalization and reforestation, the transformations we all came to know and love!
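As a taste of the derivation (my reconstruction of the intermediate step, not code from the paper), CPS-transforming tower_rec makes the continuation explicit; defunctionalizing that continuation then turns it into the reversed list of elements seen so far, which is exactly the accumulator version above.

```ocaml
(* Assuming an integer power ( ** ), as in the snippets above. *)
let rec ( ** ) b n = if n = 0 then 1 else b * (b ** (n - 1))

(* Step 1: CPS-transform the direct-style tower_rec. *)
let rec tower_cps k = function
  | [] -> k 1
  | x :: xs -> tower_cps (fun r -> k (x ** r)) xs

let tower xs = tower_cps (fun r -> r) xs

(* Step 2: each closure (fun r -> k (x ** r)) captures only x and the rest
   of the continuation; defunctionalized, the continuation is therefore the
   list of elements already visited, in reverse order, and applying it is
   precisely tower_acc: hence tower xs = tower_acc 1 (List.rev xs). *)
```

On `[2; 3; 2]` both versions compute 2 ** (3 ** 2) = 512, but the CPS version makes visible the “reversed” evaluation order that the accumulator version hard-codes.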

I hope you enjoy it. Of course, any comment will be much appreciated, so don’t hesitate to drop a line below!

Proofs, upside down
A functional correspondence between natural deduction and the sequent calculus
It is well-known in proof theory that sequent calculus proofs differ from natural deduction proofs by “reversing” elimination rules upside down into left introduction rules. It is also well-known that to each recursive, functional program corresponds an equivalent iterative, accumulator-passing program, where the accumulator stores the continuation of the iteration, in “reversed” order. Here, we compose these remarks and show that a restriction of the intuitionistic sequent calculus, LJT, is exactly an accumulator-passing version of intuitionistic natural deduction NJ. More precisely, we obtain this correspondence by applying a series of off-the-shelf program transformations à la Danvy et al. to a type checker for the bidirectional λ-calculus, and get a type checker for the λ-calculus, the proof term assignment of LJT. This functional correspondence revisits the relationship between natural deduction and the sequent calculus by systematically deriving the rules of the latter from the former, and allows us to derive new sequent calculus rules from the introduction and elimination rules of new logical connectives.

malloc() is the new gensym()

Teaching an introductory course on “compilation” this semester (it was actually called Virtual Machines, but it was really about compiling expressions to stack machines), I realized something I hadn’t heard before, and wish I had been told when I first learned OCaml many years ago. Here it is: as soon as you are programming in a functional language with physical equality (i.e. pointer equality, the (==) operator in OCaml), you are actually working in a “weakly impure” language, and you can, for example, implement a limited form of gensym. What? gensym is this classic “innocuously effectful” function returning a different symbol—usually a string—each time it is called. It is used pervasively to generate fresh variable names, notably in compilers. How? Well, you actually don’t have much to do, except let the runtime call malloc: it will return a “fresh” pointer at which to store your data. malloc and the garbage collector together ensure this freshness condition, and you can then compare two pointers with (==). As a bonus, you can even store data along with your fresh symbol.
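The whole trick fits in a few lines (a sketch with my own names; a ref cell is used because OCaml guarantees that each ref is a freshly allocated heap block):

```ocaml
(* A symbol is just a freshly allocated cell: malloc does the gensym-ing. *)
type 'a sym = 'a ref

let gensym (data : 'a) : 'a sym = ref data        (* fresh pointer at every call *)
let same (s : 'a sym) (s' : 'a sym) : bool = s == s'   (* pointer comparison *)
```

Two calls `gensym "x"` and `gensym "x"` carry the same data yet are distinct symbols: `same` answers false on them, and true only when a symbol is compared with itself.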

In this post, I’ll exploit that simple idea to develop an assembler for a little stack machine close to that of OCaml.


My thesis is out!

At last! The definitive, final and comprehensive version of my thesis manuscript is out. I defended it on April 8 in Bologna, Italy, and received both titles of “Dottore di ricerca” and “Docteur” in Computer Science, with great pride and relief. What an adventure! You can find my manuscript on my web page, precisely here; it’s called Certificates for incremental type-checking, and after much hesitation, I chose a blue cover for its printed version (it was a tough choice). It is already a little bit obsolete since I compulsively worked on that material even after its submission to avoid the baby blues, but I will nonetheless advertise it here, and eventually write about my advances in future posts. In short, if you are interested in proof certificates, manipulation of proof objects in a functional language, spine-form LF, incremental type-checking, contextual type theory, or the relationship between natural deduction and the sequent calculus, you might be interested in some parts of my manuscript.

