A research blog about programming languages, formal logics, software development and their interactions, by Matthias Puech.

malloc() is the new gensym()

Teaching an introductory course to “compilation” this semester (actually it was called Virtual Machines, but it was really about compiling expressions to stack machines), I realized something I hadn’t heard before, and wish I had been told when I first learned OCaml many years ago. Here it is: as soon as you are programming in a functional language with physical equality (i.e. pointer equality, the (==) operator in OCaml), then you are actually working in a “weakly impure” language, and you can for example implement a limited form of gensym. What? gensym is this classic “innocuously effectful” function returning a different symbol—usually a string—each time it is called. It is used pervasively to generate fresh variable names, in compilers notably. How? well, you actually don’t have much to do, except let the runtime call malloc: it will return a “fresh” pointer where to store your data. malloc and the garbage collector together ensures this freshness condition, and you can then compare two pointers with (==). As a bonus, you can even store data along your fresh symbol.

In this post, I’ll exploit that simple idea to develop an assembler for a little stack machine close to that of OCaml.

Read the rest of this entry »


My thesis is out!

At last! The definitive, final and comprehensive version of my thesis manuscript is out. I defended it on April 8 in Bologna, Italy, and received both titles of “Dottore di ricerca” and “Docteur” in Computer Science, with great pride and relief. What an adventure! You can find my manuscript on my web page, precisely here; it’s called Certificates for incremental type-checking, and after much hesitation, I chose a blue cover for its printed version (it was a tough choice). It is already a little bit obsolete since I compulsively worked on that material even after its submission to avoid the baby blues, but I will nonetheless advertise it here, and eventually write about my advances in future posts. In short, if you are interested in proof certificates, manipulation of proof objects in a functional language, spine-form LF, incremental type-checking, contextual type theory, or the relationship between natural deduction and the sequent calculus, you might be interested in some parts of my manuscript.

Read the rest of this entry »

Strong reduction in big steps

It has been a long while since I updated this blog, let me finally revive it with an easy but (hopefully) fun post, well ehm… at least some usable reference material!

Strong reduction is the ability for the interpreter of a functional language to reduce under function declaration. Usually, in OCaml for example, evaluation stops at the λ boundary; we might like to reduce under them for various reasons:

  • partial evaluation (e.g. λx. 2+2+x ⇒ λx. 4+x)
  • conversion checking in dependent type systems (is λx. (λy. y) x the same as λx. x?)

I never remember the big-step semantics of strong reduction, so I’m going to write them down here in glorious Unicode-art. Over the years I accumulated many small variants of reduction machines, both weak and strong, call-by-name or call-by-value… Here is a first shot, starting with the canonical, CBV weak head reduction and incrementally building up strong reduction; I’ll be probably extending this list when I remember other variants.

Read the rest of this entry »

Building terms incrementally: expressivity

A few weeks ago, I showed a very simple model for constructing and storing typed combinators incrementally. By incremental, I meant that type-checking was a sequential and iterative process: instead of giving it a whole term, type-checked at once (the usual, batch approach), you could split it up into pieces and commit each of them individually. Each commit would give you a unique identifier for the piece of term you just provided, and if the user bothers to write down these names, he can then elaborate new terms on top of them, possibly in a non-linear fashion. That way, we can model bottom-up term construction, and incrementally modify a term.

We noted also some defects to this approach, two of which we are going to address today: one is the absence of top-down construction (a very common way of building terms incrementally), the other is the burden of remembering names for successive versions. The interesting part is that we are going to simulate them inside our model, taking advantage of the extensibility of the signature.

Read the rest of this entry »

Reverse natural deduction and get sequent calculus

This is a follow-up on my previous post. It should be readable by itself if you just take a quick peek at herds.

Today, we are going to write type-checkers. And rewrite them. Again and again. Eventually, I’ll put in evidence that what seemed to be a programming hack in the last post turns out to be the difference between two well-known equivalent formulations of first-order logic, your good ol’ natural deduction and sequent calculus. Read the rest of this entry »

Reversing data structures

A reversed list is not really a list anymore. It is isomorphic to a list, but it is not a list. Let me explain why. Read the rest of this entry »

Building terms, incrementally

Here is the first post of a series about incremental verification in programming languages. This problem has become more-or-less the main subject of my Ph.D, and this is an exercise in explaining it from scratch, exploring all the possible alternatives.

Eventually, the goal is to have a clear foundation for supporting better typing incrementality in programming and proof development tools. By incrementality, I mean the ability to have feedback of type errors upon any change during the development. This requires the ability to check changes in programs or proofs and not have to retype a whole development or file as it is the case too often. Read the rest of this entry »

An OCaml hack: recover the abstraction of abstract types

Here is a scoop: in OCaml, your abstract types aren’t really abstract (unfortunately). This is because some magical functions in the standard library don’t respect the abstraction of data types:, Hashtbl.hash, Pervasive.(=) — yes, the very equality you use everyday — etc. The problem is that if you contribute to a large project written in OCaml and you want to take advantage of abstract types, these functions become your worst enemies because they are breaking the abstraction you were carefully designing. You’ll probably end up as I did finding and removing by hand every occurrence of these magical functions in your code. The following is for you then. Read the rest of this entry »

Set the root directory of a project in Emacs with dir-locals.el

Hi, and welcome to my blog. I’ll try to write here about the stuff I do, use or discover during my work as a Ph.D in Computer Science. There will hopefully be a bit of science (proof theory), some code (functional programming), some tips and tricks for developers.

For a gentle start, I’d like to share a nice little trick I put together for Emacs. I am developing in a big project (Coq) with multiple subdirectories, which compilation is managed by GNU make. When I open a file in a subdirectory and then hit M-x compile, make complains because I am now in the subdirectory and not at the root of the project anymore. So I took the habit for hitting very fast M-x cd .. every time I open a new file in a subdirectory. Annoying… Read the rest of this entry »