# External Axiomatizations: a Trip Into SPARK’s Internals

## by Claire Dross –

There are cases expressing all the specification of a package in SPARK is

- impossible, for example if you need to link them to elements of the mathematical world, like trigonometry functions.
- cumbersome, especially if they require concepts that cannot easily be described using contracts, like transitivity, counting, summation...
- or simply inefficient, for big and complex data structures like containers for example.

In these cases, a user can provide directly a manually written Why3 translation for an Ada package P using a feature named external axiomatizations. The provided translation must mimic automatic translation on every visible entity E of the package, providing all the expected declarations in a single module named P__e. Here is how it can be done on the permutation package used for the proof of a selection sort implementation in our previous post.

To come up with a skeleton of the Why3 translation, it is convenient to use GNATprove to generate automatically the WhyML modules for each entity of our Ada package. First, we only keep in the public part of the specification of the Perm package the declarations needed for the verification of client code. In our case, we move the actual implementation of the permutation function to the private part and we use SPARK_Mode to disallow analysis on this part:

```
package Perm with SPARK_Mode is
subtype Index is Integer range 1 .. 100;
type Nat_Array is array (Index range <>) of Natural;
function Is_Perm (A, B : Nat_Array) return Boolean;
private
pragma SPARK_Mode (Off);
function Is_Perm (A, B : Nat_Array) return Boolean is ...
```

We then launch GNATprove on this package. It produces a WhyML file named perm.mlw. This file is quite big, even for such a small package. It contains approximatively 1000 lines of code and 60 modules. This is mainly due to the fact that GNATprove also generates declarations for entities from the Ada standard library (predefined types in particular, like Integer or String) and other imported Ada packages if any. The modules from the entities declared in Perm can be easily recognized though, as they are prefixed by Perm__.

Now, we need to copy the declarations generated by GNATprove for the visible entities of Perm. For this, we search the perm.mlw file for modules whose name is starting with Perm__. There are 11 such modules in our tiny example

```
module Perm__index
module Perm__nat_array
module Perm__index__axiom
module Perm__nat_array__axiom
module Perm__is_perm__a
module Perm__is_perm__a__axiom
module Perm__is_perm__b
module Perm__is_perm__b__axiom
module Perm__is_perm
module Perm__is_perm__axiom
module Perm__is_perm__subprogram_def
```

The first two are interesting, they are those produced from our two type declarations. Here is what they look like:

```
(* Module for axiomatizing type "index" defined at perm.ads:2, created in Gnat2Why.Types.Translate_Type *)
module Perm__index
use import "_gnatprove_standard".Main
use "_gnatprove_standard".Integer
use import "int".Int
type index
function first :int = 1
function last :int = 100
predicate in_range (x : int) = 1 <= x <= 100
(* Copy of the model of modules for static discrete types *)
clone export "ada__model".Static_Discrete with
type t = index,
function first = first,
function last = last,
predicate in_range = in_range
end
(* Module for axiomatizing type "nat_array" defined at perm.ads:3, created in Gnat2Why.Types.Translate_Type *)
module Perm__nat_array
use import "_gnatprove_standard".Main
use Standard__natural
use Standard__integer
use Perm__index
use "_gnatprove_standard".Integer
use import "int".Int
use "_gnatprove_standard".Array__1
(* Copy of the model of modules for unconstrained array types *)
clone export "ada__model".Unconstr_Array with
type component_type = Standard__natural.natural,
type base_type = Standard__integer.integer,
function to_int = Standard__integer.to_int,
predicate in_range_base = Standard__integer.in_range,
predicate index_dynamic_property = Perm__index.dynamic_property
type nat_array = __t
(* Copy of the axiom for array bitwise comparison *)
clone export "ada__model".Array_Comparison_Axiom with
type component_type = Standard__natural.natural,
function to_int = Standard__natural.to_int
end
```

These modules use templates for type declarations of discrete types and arrays that are instantiated using Why3's clone mechanism with the appropriate elements (the bounds of the type for discrete types and types for indexes and components for arrays). We can see that there are two other modules generated for these types, with suffix __axiom. There is such an additional module for every translated entity. They are used to break circularity of declarations in some cases. For our type declarations, they are empty.

The next four modules are not interesting for us. They declare the parameters of the function Is_Perm and are only necessary to check absence of runtime errors in the declaration of Is_Perm. Then, we have the modules for the declaration of Is_Perm:

```
(* Module for possibly declaring a logic function for "is_perm" *)
module Perm__is_perm
use import "_gnatprove_standard".Main
use import int.Int
use import Perm__nat_array
function is_perm (a b : nat_array) : bool
end
(* Module for declaring a program function (and possibly an axiom) for "is_perm" *)
module Perm__is_perm__axiom
use import "_gnatprove_standard".Main
use import int.Int
use import Perm__nat_array
use Perm__is_perm
val is_perm (a : nat_array) (b : nat_array) : bool
ensures { result = Perm__is_perm.is_perm a b }
end
```

We have two translations for Is_Perm, a logic function and a program function. Indeed, in WhyML, we can only use complete logic functions in assertions, that is, functions without contracts and without read of global variables. The first translation is used for that purpose. In our manual translation, we collapse these two modules into one, named Perm__is_perm. To avoid name collision, we must rename the logic function to is_perm__logic. We do not copy the third module for Is_Perm, named Perm__is_perm__subprogram_def, which is used to check absence of runtime errors in its specification. Here is the resulting module for Is_Perm:

```
module Perm__is_perm
use import "_gnatprove_standard".Main
use import int.Int
use import Standard__natural
use import Perm__nat_array
function is_perm__logic (a b : nat_array) : bool
val is_perm (a : nat_array) (b : nat_array) : bool
ensures { result = is_perm__logic a b }
end
```

Here, we have neither a definition nor a postcondition for Is_Perm in the visible part of the specification. Thus, the automatic translation of Is_Perm is left uninterpreted. If we want to use this axiomatization to prove our selection sort implementation, we will need to give some information about the function is_perm__logic. Like we did in Ada in our previous post, we choose to give this definition using an intermediate function returning the number of occurrences of a variable into an array:

```
module Perm__is_perm
use import "_gnatprove_standard".Main
use import int.Int
use import Standard__natural
use import Perm__nat_array
use import "_gnatprove_standard".Array__1
(* Number of occurrences of v in a *)
function occ (v: natural) (a: nat_array) : int
axiom occ_empty:
forall v: natural, a: nat_array.
last a < first a -> occ v a = 0
axiom occ_set:
forall e v: natural, a : nat_array, i: int.
(* b is the result of updating a at index i with v *)
let b = of_array (set (to_array a) i v) (first a) (last a) in
first a <= i <= last a ->
(get (to_array a) i <> e ->
(v <> e -> occ e b = occ e a) /\
(v = e -> occ e b = (occ e a) + 1)) /\
(get (to_array a) i = e ->
(v <> e -> occ e b = (occ e a) - 1) /\
(v = e -> occ e b = occ e a))
function is_perm__logic "inline" (a b : nat_array) : bool =
forall v: natural. occ v a = occ v b
val is_perm (a : nat_array) (b : nat_array) : bool
ensures { result = is_perm__logic a b }
end
```

We left occ uninterpreted. For our proofs, we only need two additional axioms. The first one states that the number of occurrences of any element in an empty array is 0 and the second one describes how the number of occurrences of a variable in an array is modified when the array is updated.

We store the module Perm__is_perm along with the two automatically generated modules for our type declarations in a file named perm.mlw and we put this file in a subdirectory named _theories in our proof directory (specified in the project file, see SPARK 2014 User's Guide). We then add a line at the top of our Perm package to state that it has a manual translation:

```
pragma Annotate (GNATprove, External_Axiomatization);
```

We are now ready to use the external axiomatization mechanism to prove our selection sort implementation:

```
procedure Swap (Values : in out Nat_Array;
X : in Index;
Y : in Index)
with
Pre => (X in Values'Range and then
Y in Values'Range and then
X /= Y),
Post => Is_Perm (Values'Old, Values)
is
Temp : Natural;
begin
Temp := Values (X);
Values (X) := Values (Y);
Values (Y) := Temp;
end Swap;
procedure Selection_Sort (Values : in out Nat_Array) with
Post => Is_Perm (Values'Old, Values)
is
Smallest : Positive; -- Index of the smallest value in the unsorted part
begin
if Values'Length = 0 then
return;
end if;
for Current in Values'First .. Values'Last - 1 loop
-- Get the index of the smallest value in the unsorted part...
Smallest := Index_Of_Minimum (Values (Current .. Values'Last));
if Smallest /= Current then
Swap (Values => Values,
X => Current,
Y => Smallest);
end if;
-- ...and swap it with the current index.
pragma Loop_Invariant (Is_Perm (Values'Loop_Entry, Values));
end loop;
end Selection_Sort;
```

To prove this file, GNATprove uses the provided translation for the Perm package. Thanks to our axioms, everything is discharged automatically with a timeout of 5 seconds. Compared to the previous attempt we did at verifying this implementation, this is more efficient and requires less manual annotations in the client code. Still, as we have seen, coming up with the axiomatization requires both a knowledge of the WhyML language and a minimal understanding of GNATprove's mechanisms and is therefore reserved to advanced users.

More generally, here are some advantages provided by the external axiomatization feature and some drawbacks of using it:

Advantages:

- Automated proof is in general more efficient with a manual translation than with the automatic one as it can be optimized.
- Using WhyML instead of SPARK as an input language for our axiomatization also has advantages. In particular, if we need additional lemmas for our functions, like the occ_set axiom, we can use axioms to provide them. Compared to lemma subprograms, they require much less manual annotation in the client code as they are used automatically by the underlying solver.
- Finally, existing axiomatization are provided for many common concepts in the Why3 library. Using external axioms, these axiomatizations can sometime be reused, even if they will often need to be adapted to support SPARK 2014 encoding of data structures.

Drawbacks:

- As we have already said, coming up with the axiomatization requires both a knowledge of the WhyML language and an understanding of GNATprove's automatic translation.
- If we are using an external axiomatization for a package, we cannot access in SPARK the underlying implementation of types and subprograms to guide the proof through Assert statements. For example, in our example, we cannot talk about the number of occurrences of a value in an array in SPARK as occ is a Why3 function.
- Finally, this feature must be used with care as it allows to introduce unsoundness in the verification, for example by writing a false axiom in one of our modules. Thus, external axiomatizations must be reviewed with great care during the verification activity.