Test = examples or theorem ?
Recently, I had a conversation with a friend about testing. The job of my friend is about the tests of system components from controlled conditions to real environment. Around an incredible meal and a tasty wine, it was the occasion to compare our understanding of testing. Mine is using the lens of maths. Let recap here!
(Is it an excuse for presenting the proof assistant Coq?)
Use-case presentation
The aim is to be didactic, so let a pick really simple use-case. Let consider
an arbitrary list of natural numbers, say [5,4,3,2,1]
, and we want to be
able to:
- insert a number, say 3, i.e., resulting to
[5,4,3,3,2,1]
; - count the occurrences of one number, say 3, i.e., resulting to the answer 2.
For the insertion, let consider that the list is read from left to right and the number is inserted when the next one is smaller. This choice could be flipped.
These functions are used by our fictional application and our goal is to test that its implementation is correct.
Weird C
implementation
Because the syntax of C
is very common, let fix the ideas using the C
programming language. The code we write will compile and run but obviously it
will not be C
-compliant for efficiency (fool the one who uses recursion with
C
!) – the aim is to ease the reading with less common language as Haskell
or Coq.
So, what do we need for the implementation? Easy, the very common linked list
data structure and the functions insert
and count
. In addition, let also
implement some helper functions. A question is the representation of the
natural numbers, let keep it simple and just use integer. The linked list reads,
typedef struct List { Int head; struct List *tail; } List;
where Int
means an alias of int
(typedef int Int;
), using the Haskell
typecase. The function insert
takes a natural number, say n
, and a list,
say lst
and returns another list. An implementation using recursion reads,
List* insert (Int n, List* lst) { if (lst == NULL) { return Cons (n, NULL); } else { Int a = lst -> head; List *tail = lst -> tail; if (a <= n) { return Cons (n, lst); } else { return Cons (a, insert (n, tail)); }}}
where List* Cons (Int n, List* lst)
is a function for helping the
construction of new list. The function count
takes a natural number, say
n
, and a list, say lst
, and returns a natural number. A recursive
implementation reads,
Int count (Int n, List* lst) { if (lst == NULL) { return 0; } else { Int a = lst -> head; List* tail = lst -> tail; Int r = count (n, tail); if (n == a) { return 1 + r; } else { return r; }}}
So far, so good! We can test these two functions individually or we can also test them in combination. Somehow, we are looking for a property… Well, if we insert a number to a list, then the this number appears one more time. Other said, we have the test,
void test (Int n, List* lst) { if (count (n, insert (n, lst)) == 1 + count (n, lst)) { printf("test ok.\n"); } else { printf("test ko!\n"); }}
And now, we needs examples for testing. For instance, we can implement a
function, say generate
, that generates one list, and uses it for one test,
void main() { List *lst = generate (5); test (3, lst); }
Hum?! But how do we know that this test covers well the real application? We have tested only one example, after all. Ok, we could test many more examples and complexify the generator. That's the direction of QuickCheck and other testing frameworks.
Here, we are trying to convince ourselves that many examples could be enough to be confidant. People with maths background know that the examples are often between the definition and the theorem. What acts as definition and as theorem?
Coq as a programming language
Coq is a proof assistant, i.e., an interactive theorem prover. Wow! it makes sense with the paragraph just above. However, we want a pragmatical implementation and test it – not fly in abstract maths. Let stay on track and use Coq for the implementation and the tests.
First thing first, quick presentation of Coq. For instance, it reads,
Require Import PeanoNat. Notation Int := nat. Definition y := 42. Definition f x := x + 1. Compute f y.
where it seems self explanatory; y
is defined as the value 42
and f
is
defined as a function. The last instruction returns,
= 43 : Int
and an attentive reader could remark that the «mathematical» parenthesis as in
\(f(y)\) are omitted; the notation is inspired by ML and derivative. The first
instruction imports utilities for working with natural numbers and we will
discuss the second instruction Notation
in another section.
A recursive function cannot be defined using the keyword Definition
and
instead, we have to use the keyword Fixpoint
. For instance, the factorial
reads,
Fixpoint fac n := match n with 0 => 1 | S n' => (S n') * fac n' end.
where S n'
means the successor of n'
, i.e., S n' = n' + 1
; see the
discussion about Notation
for more details. The important point is the
recursion: fac
is defined using itself, i.e, fac
and thus it is a
Fixpoint
. The keyword match
allows to deconstruct the argument (here n
)
and check its pattern (it could be somehow thought as a if-branch; but the
pattern matching is more powerful). And the evaluation of Compute fac 5.
returns 120
.
What is missing? The ability to define the linked list data structure. It is
done using the keyword Inductive
and it reads,
Inductive List : Type := NULL | Cons (head : Int) (tail : List).
where List
somehow defines a new type with 2 constructors: NULL
without
any argument, and Cons
which is defined recursively by a simple number
(head
) and another List
(tail
). Note that using Coq (and similarly with
OCaml or Haskell), it is possible – not to say strongly recommended – to
pattern match on these « constructors ».
Now, we have defined the datastructure for the linked list, let implement the
insert
function,
Fixpoint insert (n : Int) (lst : List) : List := match lst with NULL => Cons n NULL | Cons a tail => if a <=? n then Cons n lst else Cons a (insert n tail) end.
which takes two arguments, n
of type Int
and lst
of type List
and
returns a List
. The match
acts as the if-branch in the C
implementation. And reading both side by side, they appears similar, right?
The function count
reads,
Fixpoint count (n : Int) (lst : List) : Int := match lst with NULL => 0 | Cons a tail => let r := count n tail in if n =? a then 1 + r else r end.
where all should be almost self explanatory. Well, Coq is a programming language and thus it can be used for programming.
Wait, we were discussing about testing, so what about the tests?
Coq as a proof assistant
Now, let jump into the core point of the post. Having maths in mind, we would
like to compute, i.e., specify a value for n
and for the list lst
, for
instance,
Compute count 5 (insert 5 (Cons 1 (Cons 2 (Cons 3 NULL)))).
which returns 1
as expected and it also 1 + 0
since the value 5
does not
appear in this list of the three values 1, 2 and 3. Let simplify this
comparison and rewrite in Coq the functions generate
and test
,
Fixpoint generate (n : Int) : List := match n with 0 => Cons n NULL | S m => Cons (S m) (generate m) end. Definition test (n : Int) (lst : List) := count n (insert n lst) =? 1 + count n lst.
Nothing fancy in these definitions, right? Again, we can compute,
Compute test 3 (generate 5).
which returns true
. From math in mind, instead we would like to write an
example, for instance,
Example test_3_with_5 : test 3 (generate 5) = true.
however, Coq returns instead,
1 subgoal (ID 4) ============================ test 3 (generate 5) = true
Again, having maths in mind, we would write a Proof
just by computing the
few terms. Similarly, in Coq, we write,
Proof. simpl. reflexivity. Qed.
which computes the left hand side and the right hand side, last comparing that they are correctly equal.
Ok, we have the intuition that this property (specification?) holds: for all natural number and for all list of natural numbers, counting the number of occurrences of one specific number when this specific number is inserted to this list is equal to the number of occurrences of this specific number in this very same list plus one. Perhaps it is a theorem… in Coq, it reads,
Theorem count_incr_by_insert : forall (n : Int) (l : List), count n (insert n l) = 1 + count n l.
and Coq returns,
1 subgoal (ID 6) ============================ forall (n : Int) (l : List), count n (insert n l) = 1 + count n l
which requires a Proof
. It is far beyond this post to expose the Coq’s
tactics for proving. Here a proof could be,
Proof. intros n l; induction l. simpl. (* Search (_ =? _). *) rewrite Nat.eqb_refl. reflexivity. simpl. case (head <=? n). simpl. rewrite Nat.eqb_refl. reflexivity. simpl. case (n =? head). rewrite IHl; reflexivity. rewrite IHl; reflexivity. Qed.
Wow! We show that the both implementations insert
and count
associated
with the custom data structure List
are consistent (using the logic
implemented by Coq).
Extraction
We do not dive into all the extraction features of Coq. For instance, these functions could be converted as Haskell or OCaml and thus compiled to efficient native machine code. Here, for instance, using,
Require Extraction. Extraction Language Haskell. Extraction "Generated.hs" insert count.
then it extracts to the Haskell code,
data List = NULL | Cons Nat List insert :: Nat -> List -> List insert n lst = case lst of { NULL -> Cons n NULL; Cons a tail -> case leb a n of { True -> Cons n lst; False -> Cons a (insert n tail)}} count :: Nat -> List -> Nat count n lst = case lst of { NULL -> O; Cons a tail -> let {r = count n tail} in case eqb n a of { True -> add (S O) r; False -> r}}
where it is not truly an idiomatic Haskell code but the code compiles.
Ah, why does the signature use the type Nat
and not the type Int
? The
question is thus: how to define mathematically a natural number for computing?
A computer is able to represent a finite set of natural numbers, usually named
integer. Outside this range (finite set), the natural numbers cannot be
represented by the computer; to be precise, it is because a computational
efficiency for the processing unit that these integers are used.
From a mathematical point of view, these integers do not satisfy all the rules
of the natural numbers. Instead, the natural numbers are defined as a zero,
noted e.g., O
, and a successor, i.e., the number one is defined as the
successor of zero, noted S 0
, and so on. For instance, the extraction of
the addition reads,
data Nat = O | S Nat add :: Nat -> Nat -> Nat add n m = case n of { O -> m; S p -> S (add p m)}
instead of the builtin n + m
for the Haskell type Int
. The number 5
is
thus S (S (S (S (S 0))))
and such addition implies many computations or
memory. Other said, the definition sacrifices the performance when computing.
The issue also applies for the real numbers and the floating-point numbers;
even probably worse.
Opinionated afterword
Usually, we have the specification, the implementation and the test. Then many efforts are provided to check the consistency of these three components. If we apply the lens of maths, then the components become,
- specification = definition and theorem
- test = proof
- implementation = extraction