# Compiler and correctness

Back from the Open Source Experience 2021 meeting, I got the chance to learn
more about Pharo – a pure object-oriented programming language and a powerful
environment *à la* Smalltalk – and about Coq – a formal proof management
system. Even, I had the opportunity to discuss with Yves Bertot and
I had been so lucky that he explained me the demo one-to-one. To be honnest, I
am still rehashing many concepts and this experience is feeding my willing to
learn more.

Once at home, I want to try by myself. Ah, because I am running Guix as a computational environment manager, the story starts with a yak shaving: close this old bug#46016 about Proof General. Other said, I want to interact with Coq from my regular text editor Emacs. Therefore, let fix the annoyance with patch#51755. And, since we are at it, although it is not necessary, let also add the Coq package semantics with the patch#51896. Now we are ready, let's go!

**Our aim with this post is to sketch some ideas about compilation and
correctness using a very basic -- _{not} to say trivial_– toy implementation.** How
to test the implementation? How to know the implementation is correct? And
we get it, our aim is to use Coq.

Because we want to keep everything as simple as possible, the compiler is
about the simplest language ever: a calculator computing only additions but
with parenthesis, e.g., `(1+2)+(3+(4+0))`

. Again, keeping everything as
simple as possible, we are using the simplest computational model: a stack
machine with two instructions (`PUSH`

and `ADD`

). Therefore, we need:

- a parser reading the input and returning a tree,
- a compiler transforming the tree to instructions for the stack machine,
- an emulator running the stack machine.

## Our didactic goal

In our very restricted toy framework, the pipeline of a computation is defined by this composition –data flowing from right to left–,

`pop . emulator . compiler . push`

where

`push`

returns a tree,`compiler`

transforms the tree to a sequence of instructions,`emulator`

executes this list of instructions,`pop`

returns the last elements; we expect one value.

In this post, we are going to implement various `emulator`

and `compiler`

.
The aim is to discuss how it is possible to ensure the correctness of the
composition `emulator . compiler`

.

## Implementation using Haskell

We choose Haskell and not OCaml – only because Tuareg (Emacs OCaml mode) is not working for us as expected. Anyway, let start!

### Parsing with Haskell

Using the builtin Haskell parsing utilities, the `parser`

function reads,

parser :: String -> Maybe STree parser input = case parse expr "" input of Left _ -> Nothing Right e -> Just e

where `expr`

of type `Parser STree`

. Well, please read all the details in the
file `Parser.hs`

. Keeping simplicity in mind, the data type for the tree is
just:

data STree = Val Integer | Add STree STree

where the leaf is named `Val`

and carries the value of an integer; the
constructor `Add`

is a node and by referring recursively to `STree`

it defines
the tree . For instance, `(1 + 2) + 3`

is thus represented by the tree,

Add / \ Add Val 3 / \ Val 1 Val 2

where `Val`

are leaves and `Add`

are nodes. So far, so good!

### Emulator

The emulator (virtual machine) runs two instructions (`PUSH`

and `ADD`

), the
binary, here, is a list of instructions, and the stack is a list of integers.
The Haskell code implementing these words reads,

data Instr = PUSH Integer | ADD type Binary = [Instr] type Stack = [Integer]

Therefore, the emulator takes a binary, returns a stack and it executes (computes) starting with an empty stack,

emulator :: Binary -> Stack emulator b = exec b []

For the interested reader, the details of `exec`

is provided here.

### Compiler

The input of the compiler a tree `STree`

and it produces a binary (a list of
instructions), i.e., it reads,

compiler :: STree -> Binary compiler source = comp source []

where `comp`

is an auxiliary function dealing with each case of the tree: if
it is a leaf (`Val`

), then the instruction is to `PUSH`

the value in the stack
(`bytecode`

), otherwise if it is a node (`Add`

), then compile recursively the
subtrees and set the instruction `ADD`

to stack. These words are implemented
in Haskell with,

comp :: STree -> Binary -> Binary comp (Val n) bytecode = PUSH n : bytecode comp (Add x y) bytecode = comp x $ comp y $ ADD : bytecode

Nothing fancy!

### And…

…to complete a working example, the functions `push`

and `pop`

are here the
identity function (do nothing) and take the first element. Well, adding some
other minor code,

`cmd`

for the command line interface and parsing the input string,`go`

for gluing the compilation and running parts,`emit`

for debugging purpose,

and we end with,

$ ./calc-no-coq -x '(1+2)+3' = 6 $ ./calc-no-coq -c '(1+2)+3' Instructions: "PUSH 1, PUSH 2, ADD, PUSH 3, ADD, ;"

Neat!

## Implementation using Coq

As you see just above, we named the previous `calc-no-coq`

because the title
of this post is about «compilation and correctness», thus now we are going to
speak about correctness. The perfect tool is Coq the proof assistant. Before
proving, let use Coq as a programming language and implement these compiler
and emulator in Coq, then extract to Haskell. Other said, we reuse the same
pipeline as before,

`pop . emulator . compiler . push`

where `emulator`

and `compiler`

are going to be implemented in Coq, the `pop`

and `push`

will be adapted accordingly, i.e.,

`push`

converts from the Haskell tree datatype`STree`

to the Coq tree datatype`Tree`

, i.e, signature`STree -> Coq.Tree`

,`pop`

converts from the Coq stack datatype to Haskell integer, i.e., signature`Coq.Stack -> Integer`

This `Coq.Tree`

is defined by the module `Lang`

,

Inductive Tree : Type := Val (n : nat) | Add (x : Tree) (y : Tree).

which looks similar to definition of `STree`

, and `Coq.Stack`

is identically
defined as above, i.e., a list of number,

Definition Stack := list nat.

So far, so good!

### Emulator

Similarly to the previous implementation, we straightforwardly adpat the data structure for the instructions and the binary (list of instructions),

Inductive Instr:Type := PUSH (n : nat) | ADD. Definition Binary := list Instr.

The emulator executes a bytecode (`b : Binary`

) and returns a stack,

Fixpoint exec (b : Binary) (s : Stack) : Stack := match b, s with PUSH n :: rest, stack => exec rest (n :: stack) | ADD :: rest, m :: n :: stack => exec rest (m + n :: stack) | [], stack => stack | _, stack => stack end. Definition emulator (b : Binary) := exec b [].

where `exec`

is the function really computing; this function takes two
arguments: a list of instructions (`Binary`

) and stack and then returns
another stack.

### Compiler

The compiler requires a tree (named above `Coq.Tree`

) and an auxiliary function,

Fixpoint comp (source : Tree) (bytecode : Binary) : Binary := match source with Lang.Val n => PUSH n :: bytecode | Lang.Add x y => comp x (comp y (ADD :: bytecode)) end. Definition compiler (source : Tree) := comp source [].

where all is strongly similar with the previous implementation.

### And…

…to a complete working example, we just need to extract from Coq to Haskell with,

Require Export Lang Compiler Emulator. Require Extraction. Extraction Language Haskell. Extraction "Generated.hs" compiler emulator.

additing some Haskell glue code, as `push`

and `pop`

, and command-line and
parser, we get,

$ ./calc -x '(1+2)+3' = 6 $ ./calc -c '(1+2)+3' Instructions: "PUSH 1, PUSH 2, ADD, PUSH 3, ADD, ;" $ ./calc -c '1+(2+3)' Instructions: "PUSH 1, PUSH 2, PUSH 3, ADD, ADD, ;"

Note the order of the instructions depending on the parenthesis order, i.e., depending on the tree, for instance,

(1+2)+3 1+(2+3) Add Add / \ / \ Add Val 3 Val 1 Add / \ / \ Val 1 Val 2 Val 2 Val 3

Neat! However, let avoid to go out of rail and thus let recap!

- we are interested by the
`emulator`

and`compiler`

parts, - for both implementations, the command-line is done by the same Haskell
implementation (
`cmd`

); idem for parsing (`parser`

).

Now, we want to have a proof that the implementations of `emulator`

and
`compiler`

are correct. For the implementation in Haskell, we cannot do much,
except run QuickCheck (large number of examples picked randomly). For the
implementation in Coq, well Coq is a proof assistant after all, so could we
prove something about the correctness?

## Proving the `emulator`

and `compiler`

are correct

In place of a large test suite, we can reason about the logic of the functions
`emulator`

and `compiler`

. As we wrote in *Test = examples or theorem?*,
instead of ruling out many examples to convince us about the correctness, we
could try to prove a theorem. Which one?

We would like to prove that compiling the code and running it using the emulator is equal to interpret it using an (abstract) evaluator. It means prove this theorem,

Theorem correctness: forall (code : Tree), emulator (compiler code) = eval code :: [].

where `eval`

is an abstract interpreter implemented in Coq but never
effectively doing a single real computation. Such interpreter is easily
defined by the recursive function,

Fixpoint eval (source:Tree) := match source with Val n => n | Add x y => (eval x) + (eval y) end.

A formal proof of the theorem `correctness`

requires other theorems, lemma or
corollary. Note that for Coq, theorem, lemma and corollary are the same but
they help the human reader.

How do we prove this theorem `correctness`

assisted by Coq?

### Unwrap the emulator

The emulator is a layer on the top of the execution model – the function
`exec`

. Instead of the theorem `correctness`

, we could think about the
theorem,

Theorem correctness': forall (code : Tree) (stack : Stack), exec (compiler code) stack = eval code :: stack.

and if this theorem `correstness'`

holds, then the proof of the theorem
`correctness`

reads,

Corollary correctness'': forall (code : Tree), exec (compiler code) [] = eval code :: []. Proof. intro code. rewrite correctness'. reflexivity. Qed. Theorem correctness: forall (code : Tree), emulator (compiler code) = eval code :: []. Proof. intro code. unfold emulator. rewrite correctness''. reflexivity. Qed.

where the corollary is a straightforward application of the (admitted for now)
theorem. Then, the proof of `correctness`

just unfolds the function of
interest `emulator`

and applies (`rewrite`

) the previous proved corollary.

How do we prove this theorem `correctness'`

?

### Unwrap the compiler

The correctness is now dependent on the theorem `correctness'`

. Because
`compiler`

is a wrapper around the function `comp`

, we could think about the
lemma,

Lemma correctness_comp: forall (code : Tree) (bytecode : Binary) (stack : Stack), exec (comp code bytecode) stack = exec bytecode (eval code :: stack).

The key point of the proof is the use of the additing commutativity proof
`Nat.add_comm`

, \(\forall n, m : n + m = m + n\). Such could appear
straightforward, but it is not because it depends on how the numbers are
internally represented. For efficiency purpose, a computer somehow represents
only a finite set of natural numbers, commonly named integer. Instead Coq is
using, by default, the mathematical construction of natural numbers defining a
zero, noted `O`

and a successor, i.e., the data structure,

Inductive nat : Type := O | S n.

where e.g., 3 is thus `S (S (S 0))`

. That's why, the proof applies
`Nat.add_comm`

. Aside this consideration, the proof mainly unfolds and folds
the definitions; see below.

Using this `correctness_comp`

lemma, then the corollary `correctness_comp'`

just considers the case of an empty binary,

Corollary correctness_comp': forall (code:Tree) (stack:Stack), exec (comp code []) stack = eval code :: stack.

### And…

…ending the proof is just unfolding the functions of interest and applying
(`rewrite`

) the previous proved results. The complete proof of correctness is
therefore (presented linearly as in math textbook),

Lemma correctness_comp: forall (code:Tree) (bytecode:Binary) (stack:Stack), exec (comp code bytecode) stack = exec bytecode (eval code :: stack). Proof. intros code. induction code. - simpl. reflexivity. - unfold comp. fold comp. intros bytecode stack. rewrite IHcode1. rewrite IHcode2. unfold eval. fold eval. unfold exec. fold exec. rewrite Nat.add_comm. reflexivity. Qed. Corollary correctness_comp': forall (code:Tree) (stack:Stack), exec (comp code []) stack = eval code :: stack. Proof. intros code stack. rewrite correctness_comp. unfold exec. reflexivity. Qed. Theorem correctness': forall (code:Tree) (stack:Stack), exec (compiler code) stack = eval code :: stack. Proof. intros code stack. unfold compiler. rewrite correctness_comp'. reflexivity. Qed. Corollary correctness'': forall (code:Tree), exec (compiler code) [] = eval code :: []. Proof. intro code. rewrite correctness'. reflexivity. Qed. Theorem correctness: forall (code:Tree), emulator (compiler code) = eval code :: []. Proof. intro code. unfold emulator. rewrite correctness''. reflexivity. Qed.

Done! Well, almost. We could also prove that the behaviour of the
interpreter `eval`

is correct, i.e., the properties of the addition for
mathematical intergers are fully satisfied.

## Opinionated afterword

A Coq's proof is somehow a compilation. Therefore, similarly as Thompson's Reflections on Trusting Trust, what is the seed we have to trust? How to inspect this seed? What is the size / complexity of this seed?

For sure, realistic compilers usable for critical embedded software using formal verification are around, as CompCert – incredible piece!

Last, whatever the trust we have in the compiler, it runs on the top of a
kernel (operating system). What trust do we have in this binary kernel? How
this binary kernel is it built? Maybe, all the bootstrap chain could starts
from stage0 (Hex0) on some formally verified microkernel as seL4 or on a Linux
kernel compiled by CompCert from GNU Hurd microkernel^{1}.

## Footnotes:

^{1}

Assuming Coq would run Hurd. ;-)