# Proving Memory Operations - A SPARK Journey

## byQuentin Ochem – Jan 08, 2019

The promise behind the SPARK language is the ability to formally demonstrate properties in your code regardless of the input values that are supplied - as long as those values satisfy specified constraints. As such, this is quite different from static analysis tools such as our CodePeer or the typical offering available for e.g. the C language, which trade completeness for efficiency in the name of pragmatism. Indeed, the problem they’re trying to solve - finding bugs in existing applications - makes it impossible to be complete. Or, if completeness is achieved, then it is at the cost of massive amount of uncertainties (“false alarms”). SPARK takes a different approach. It requires the programmer to stay within the boundaries of a (relatively large) Ada language subset and to annotate the source code with additional information - at the benefit of being able to be complete (or sound) in the verification of certain properties, and without inundating the programmer with false alarms.

With this blog post, I’m going to explore the extent to which SPARK can fulfill this promise when put in the hands of a software developer with no particular affinity to math or formal methods. To make it more interesting, I’m going to work in the context of a small low-level application to gauge how SPARK is applicable to embedded device level code, with some flavor of cyber-security.

# The problem to solve and its properties

As a prelude, even prior to defining any behavior and any custom property on this code, the SPARK language itself defines a default property, the so-called absence of run-time errors. These include out of bounds access to arrays, out of range assignment to variables, divisions by zero, etc. This is one of the most advanced properties that traditional static analyzers can consider. With SPARK, we’re going to go much further than that, and actually describe functional requirements.

Let’s assume that I’m working on a piece of low level device driver whose job is to set and move the boundaries of two secure areas in the memory: a secure stack and a secure heap area. When set, these areas come with specific registers that prevent non-secure code from reading the contents of any storage within these boundaries. This process is guaranteed by the hardware, and I’m not modeling this part. However, when the boundaries are moved, the data that was in the old stack & heap but not in the new one is now accessible. Unless it is erased, it can be read by non-secure code and thus leak confidential information. Moreover, the code that changes the boundaries is to be as efficient as possible, and I need to ensure that I don’t erase memory still within the secure area.

What I described above, informally, is the full functional behavior of this small piece of code. This could be expressed as a boolean expression that looks roughly like: $dataToErase = (oldStack \cup oldHeap) \cap \overline{(newStack \cup newHeap)}$. Or in other words, the data to erase is everything that was in the previous memory ($$oldStack \cup oldHeap$$) and ($$\cap$$) not in the new memory ($$\overline{(newStack \cup newHeap}$$).

Another way to write the same property is to use a quantifier on every byte of memory, and say that on every byte, if this byte is in the old stack or the old heap but not in the new stack or the new heap, it should be erased: $\forall b \in memory, ((isOldStack(b) \lor isOldHeap (b)) \land \neg (isNewStack (b) \lor isNewHeap (b)) \iff isErased (b))$ Which means that for all the bytes in the memory ($$\forall b \in memory$$) what’s in the old regions ($$isOldStack(b) \lor isOldHeap (b)$$) but not in the new ones ($$\neg (isNewStack (b) \lor isNewHeap (b))$$) has to be erased ($$\iff isErased (b)$$).

We will also need to demonstrate that the heap and the stack are disjoint.

Ideally, we’d like to have SPARK make the link between these two ways of expressing things, as one may be easier to express than the other.

When designing the above properties, it became quite clear that I needed some intermediary library with set operations, in order to be able to express unions, intersections and complement operations. This will come with its own set of lower-level properties to prove and demonstrate.

Let’s now look at how to define the specification for this memory information.

# Memory Specification and Operations

The first step is to define some way to track the properties of the memory - that is whether a specific byte is a byte of heap, of stack, and what kind of other properties they can be associated with (like, has it been erased?). The interesting point here is that the final program executable should not have to worry about these values - not only would it be expensive, it wouldn’t be practical either. We can’t easily store and track the status of every single byte. These properties should only be tracked for the purpose of statically verifying the required behavior.

There is a way in SPARK to designate code to be solely used for the purpose of assertion and formal verification, through so-called “ghost code”. This code will not be compiled to binary unless assertions are activated at run-time. But here we’ll push this one step further by writing ghost code that can’t be compiled in the first place. This restriction imposed on us will allow us to write assertions describing the entire memory, which would be impossible to compile or run.

The first step is to model an address. To be as close as possible to the actual way memory is defined, and to have access to Ada’s bitwise operations, we’re going to use a modular type. It turns out that this introduces a few difficulties: a modular type wraps around, so adding one to the last value goes back to the first one. However, in order to prove absence of run-time errors, we want to demonstrate that we never overflow. To do that, we can define a precondition on the “+” and “-” operators, with an initial attempt to define the relevant preconditions:

function "+" (Left, Right : Address_Type) return Address_Type
is (Left + Right)
with Pre => Address_Type'Last - Left >= Right;

is (Left - Right)
with Pre => Left >= Right;

The preconditions  verify that Left + Right doesn’t exceed Address_Type’Last (for “+”), and that Left - Right is above zero (for “-”). Interestingly, we could have been tempted to write the first precondition the following way:

with Pre => Left + Right <= Address_Type'Last; 

However, with wrap-around semantics inside the precondition itself, this would  always be true.

There’s still a problem in the above code, due to the fact that “+” is actually implemented with “+” itself (hence there’s is an infinite recursive call in the above code). The same goes for “-”. To avoid that, we’re going to introduce a new type “Address_Type_Base” to do the computation without contracts, “Address_Type” being the actual type in use. The full code, together with some size constants (assuming 32 bits), then becomes:

Word_Size    : constant := 32;
Memory_Size  : constant := 2 ** Word_Size;
type Address_Type_Base is mod Memory_Size; -- 0 .. 2**Word_Size - 1

with Pre => Address_Type'Last - Left >= Right;

with Pre => Left >= Right;


Armed with the above types, it’s now time to get started on the modeling of actual memory. We’re going to track the status associated with every byte. Namely, whether a given byte is part of the Stack, part of the Heap, or neither; and whether that byte has been Scrubbed (erased). The prover will reason on the entire memory. However, the status tracking will never exist in the program itself - it will just be too costly to keep all this data at run time. Therefore we’re going to declare all of these entities as Ghost (they are here only for the purpose of contracts and assertions), and we will never enable run-time assertions. The code looks like:

type Byte_Property is record
Stack    : Boolean;
Heap     : Boolean;
Scrubbed : Boolean;
end record
with Ghost;

type Memory_Type is array (Address_Type) of Byte_Property
with Ghost;

Memory : Memory_Type
with Ghost;

Here, Memory is a huge array declared as a global ghost variable. We can’t write executable code with it, but we can write contracts. In particular, we can define a contract for a function that sets the heap between two address values. As a precondition for this function, the lower value has to be below or equal to the upper one. As a postcondition, the property of the memory in the range will be set to Heap. The specification looks like this:

procedure Set_Heap (From, To : Address_Type)
with
Pre => To >= From,
Post => (for all B in Address_Type => (if B in From .. To then Memory (B).Heap else not Memory (B).Heap)),
Global => (In_Out => Memory);


Note that I’m also defining a global here which is how Memory is processed. Here it’s modified, so In_Out.

While the above specification is correct, it’s also incomplete. We’re defining what happens for the Heap property, but not the others. What we expect here is that the rest of the memory is unmodified. Another way to say this is that only the range From .. To is updated, the rest is unchanged. This can be modelled through the ‘Update attribute, and turn the postcondition into:

Post => (for all B in Address_Type =>
(if B in From .. To then Memory (B) = Memory'Old (B)'Update (Heap => True)
else Memory (B) = Memory'Old (B)'Update (Heap => False))),


Literally meaning “The new value of memory equals the old value of memory (Memory’Old) with changes (‘Update) being Heap = True from From to To, and False elsewhere“.

Forgetting to mention what doesn’t change in data is a common mistake when defining contracts. It is also a source of difficulties to prove code, so it’s a good rule of the thumb to always consider what’s unchanged when checking these postconditions. Of course, the only relevant entities are those accessed and modified by the subprogram. Any variable not accessed is by definition unchanged.

Let’s now get to the meat of this requirement. We’re going to develop a function that moves the heap and the stack boundaries, and scrubs all that’s necessary and nothing more. The procedure will set the new heap boundaries between Heap_From .. Heap_To, and stack boundaries between Stack_From and Stack_To, and is defined as such:

procedure Move_Heap_And_Stack
(Heap_From, Heap_To, Stack_From, Stack_To : Address_Type)


Now remember the expression of the requirement from the previous section:

$\forall b \in memory, ((isOldStack(b) \lor isOldHeap (b)) \land \neg (isNewStack (b) \lor isNewHeap (b)) \iff isErased (b))$

This happens to be a form that can be easily expressed as a quantifier, on the Memory array described before:

(for all B in Address_Type =>
(((Memory'Old (B).Stack or Memory'Old (B).Heap)
and then not
(Memory (B).Stack or Memory (B).Heap))
= Memory (B).Scrubbed));


The above is literally the transcription of the property above, checking all bytes B in the address range, and then stating that if the old memory is Stack or Heap and the new memory is not, then the new memory is scrubbed, otherwise not. This contract is going to be the postcondition of Move_Heap_And_Stack. It’s the state of the system after the call.

Interestingly, for this to work properly, we’re going to need also to verify the consistency of the values of Heap_From, Heap_To, Stack_From and Stack_To - namely that the From is below the To and that there’s no overlap between heap and stack. This will be the precondition of the call:

((Stack_From <= Stack_To and then Heap_From <= Heap_To) and then
Heap_From not in Stack_From .. Stack_To and then
Heap_To not in Stack_From .. Stack_To and then
Stack_From not in Heap_From .. Heap_To and then
Stack_To not in Heap_From .. Heap_To);

That’s enough for now at this stage of the demonstration. We have specified the full functional property to be demonstrated. Next step is to implement, and prove this implementation.

# Low-level Set Library

In order to implement the service, it would actually be useful to have a lower level library that implements set of ranges, together with Union, Intersect and Complement functions. This library will be called to determine which regions to erase - it will also be at the basis of the proof. Because this is going to be used at run-time, we need a very efficient implementation. A set of ranges is going to be defined as an ordered disjoint list of ranges. Using a record with discriminant, it looks like:

type Area is record
end record
with Predicate => From <= To;

type Area_Array is array(Natural range <>) of Area;

type Set_Base (Max : Natural) is record
Size  : Natural;
Areas : Area_Array (1 .. Max);
end record;


Note that I call this record Set_Base, and not Set. Bear with me.

You may notice above already a first functional predicate. In the area definition, the fields From and To are described such as From is always below To. This is a check very similar to an Ada range check in terms of where it applies - but on a more complicated property. For Set, I’m also going to express the property I described before, that Areas are ordered (which can can be expressed as the fact that the To value of an element N is below the From value of the element N + 1) and disjoint (the From of element N minus the To of element N + 1 is at least 1). There’s another implicit property to be specified which is that the field Size is below or equal to the Max size of the array. Being able to name and manipulate this specific property has some use, so I’m going to name it in an expression function:

function Is_Consistent (S : Set_Base) return Boolean is
(S.Size <= S.Max and then
((for all I in 1 .. S.Size - 1 =>
S.Areas (I).To < S.Areas (I + 1).From and then
S.Areas (I + 1).From - S.Areas (I).To > 1)));

Now comes the predicate. If I were to write:

type Set_Base (Max : Natural) is record
Size  : Natural;
Areas : Area_Array (1 .. Max);
end record
with Predicate => Is_Consistent (Set_Base);


I would have a recursive predicate call. Predicates are checked in particular when passing parameters, so Is_Consistent would check the predicate of Set_Base, which is a call to Is_Consistent, which would then check the predicate and so on. To avoid this, the predicate is actually applied to a subtype:

subtype Set is Set_Base
with Predicate => Is_Consistent (Set);


As it will later turn out, this property is very fundamental to the ability for proving other properties. At this stage, it’s already nice to see some non-trivial property being expressed, namely that the structure is compact (it doesn't waste space by having consecutive areas that could be merged into one, or said otherwise, all areas are separated by at least one excluded byte).

The formal properties expressed in the next steps will be defined in the form of inclusion - if something is included somewhere then it may or may not be included somewhere else. This inclusion property is expressed as a quantifier over all the ranges. It’s not meant to be run by the program, but only for the purpose of property definition and proof. The function defining that a given byte is included in a set of memory ranges can be expressed as follows:

function Includes (B : Address_Type; S : Set) return Boolean
is (for some I in 1 .. S.Size =>
B in S.Areas (I).From .. S.Areas (I).To)
with Ghost;


Which means that for all the areas in the set S, B is included in the set if B is included in at least one (some) of the areas in the set.

I’m now going to declare a constructor “Create” together with three operators, “or”, “and”, “not” which will respectively implement union, intersection and complement. For each of those, I need to provide some expression of the maximum size of the set before and after the operation, as well as the relationship between what’s included in the input and in the output.

The specification of the function Create is straightforward. It takes a range as input, and creates a set where all elements within this range are contained in the resulting set. This reads:

function Create (From, To : Address_Type) return Set
with Pre => From <= To,
Post => Create'Result.Max = 1
and then Create'Result.Size = 1
and then (for all B in Address_Type =>
Includes (B, Create'Result) = (B in From .. To));


Note that interestingly, the internal implementation of the Set isn’t exposed by the property computing the inclusion. I’m only stating what should be included without giving details on how it should be included. Also note that as in many other places, this postconditon isn’t really something we’d like to execute (that would be possibly a long loop to run for large area creation). However, it’s a good way to model our requirement.

Let’s carry on with the “not”. A quick reasoning shows that at worst, the result of “not” is one area bigger than the input. We’ll need a precondition checking that the Size can indeed be incremented (it does not exceed the last value of the type). The postcondition is that this Size has been potentially incremented, that values that were not in the input Set are now in the resulting one and vice-versa. The operator with its postcondition and precondition reads:

function "not" (S : Set) return Set
with
Pre => Positive'Last - S.Size > 0,
Post =>
(for all B in Address_Type =>
Includes (B, "not"'Result) /= Includes (B, S))
and then "not"'Result.Size <= S.Size + 1;


The same reasoning can be applied to “and” and “or”, which leads to the following specifications:

function "or" (S1, S2 : Set) return Set
with
Pre => Positive'Last - S1.Size - S2.Size >= 0,
Post => "or"'Result.Size <= S1.Size + S2.Size
and Is_Consistent ("or"'Result)
and (for all B in Address_Type =>
(Includes (B, "or"'Result)) =
((Includes (B, S1) or Includes (B, S2))));

function "and" (S1, S2 : Set) return Set
with
Pre => Positive'Last - S1.Size - S2.Size >= 0,
Post => "and"'Result.Size <= S1.Size + S2.Size
and (for all B in Address_Type =>
(Includes (B, "and"'Result)) =
((Includes (B, S1) and Includes (B, S2))));


Of course at this point, one might be tempted to first prove the library and then the user code. And indeed I was tempted and fell for it. However, as this turned out to be a more significant endeavor, let’s start by looking at the user code.

# Move_Heap_And_Stack - the “easy” part

Armed with the Set library, implementing the move function is relatively straightforward. We’re using other services to get the heap and stack boundaries, then creating the set, using the proper operators to create the list to scrub, and then scrubbing pieces of memory one by one, then finally setting the new heap and stack pointers.

procedure Move_Heap_And_Stack
(Heap_From, Heap_To, Stack_From, Stack_To : Address_Type)
is
Prev_Heap_From, Prev_Heap_To,
begin
Get_Stack_Boundaries (Prev_Stack_From, Prev_Stack_To);
Get_Heap_Boundaries (Prev_Heap_From, Prev_Heap_To);

declare
Prev : Set := Create (Prev_Heap_From, Prev_Heap_To) or
Create (Prev_Stack_From, Prev_Stack_To);
Next : Set := Create (Heap_From, Heap_To) or
Create (Stack_From, Stack_To);
To_Scrub : Set := Prev and not Next;
begin
for I in 1 .. To_Scrub.Size loop
Scrub (To_Scrub.Areas (I).From, To_Scrub.Areas (I).To);
end loop;

Set_Stack (Stack_From, Stack_To);
Set_Heap (Heap_From, Heap_To);
end;
end Move_Heap_And_Stack;


Now let’s dive into the proof. As a disclaimer, the proofs we’re going to do for now on are hard. One doesn’t need to go this far to take advantage of SPARK. As a matter of fact, defining requirements formally is already taking good advantage of the technology. Most people only prove data flow or absence of run-time errors which is already a huge win. The next level is some functional key properties. We’re going one level up and entirely proving all the functionalities. The advanced topics that are going to be introduced in this section, such as lemma and loop invariants, are mostly needed for these advanced levels.

The first step is to reset the knowledge we have on the scrubbing state of the memory. Remember that all of the memory state is used to track the status of the memory, but it does not correspond to any real code. To reset the flag, we’re going to create a special ghost procedure, whose sole purpose is to set these flags:

procedure Reset_Scrub
with
Post =>
(for all B in Address_Type =>
Memory (B) = Memory'Old(B)'Update (Scrubbed => False)),
Ghost;


In theory, it is not absolutely necessary to provide an implementation to this procedure if it’s not meant to be compiled. Knowing what it’s supposed to do is good enough here. However, it can be useful to provide a ghost implementation to describe how it would operate. The implementation is straightforward:

procedure Reset_Scrub is
begin
Memory (B).Scrubbed := False;
end loop;
end Reset_Scrub;

We’re going to hit now our first advanced proof topic. While extremely trivial, the above code doesn’t prove, and the reason why it doesn’t is because it has a loop. Loops are something difficult for provers and as of today, they need help to break them down into sequential pieces. While the developer sees a loop, SPARK sees three different pieces of code to prove, connected by a so-called loop invariant which summarizes the behavior of the loop:

procedure Reset_Scrub is
begin
Memory (B).Scrubbed := False;
[loop invariant]


Then:

         [loop invariant]
B := B + 1;
Memory (B).Scrubbed := False;
[loop invariant]

And eventually:

         [loop invariant]
end loop;
end;


The difficulty is now about finding this invariant that is true on all these sections of the code, and that ends up proving the postcondition. To establish those, it’s important to look at what needs to be proven at the end of the loop. Here it would be that the entire array has Scrubbed = False, and that the other fields still have the same value as at the entrance of the loop (expressed using the attribute ‘Loop_Entry):

(for all X in Address_Type =>
Memory (X) = Memory'Loop_Entry(X)'Update (Scrubbed => False))


Then to establish the loop invariant, the question is how much of this is true at each step of the loop. The answer here is that this is true up until the value B. The loop invariant then becomes:

(for all X in Address_Type'First .. B =>
Memory (X) = Memory'Loop_Entry(X)'Update (Scrubbed => False))


Which can be inserted in the code:

procedure Reset_Scrub is
begin
Memory (B).Scrubbed := False;
pragma Loop_Invariant
(for all X in Address_Type'First .. B =>
Memory (X) =
Memory'Loop_Entry(X)'Update (Scrubbed => False))
end loop;
end Reset_Scrub;


Back to the main code. We can now insert a call to Reset_Scrub before performing the actual scrubbing. This will not do anything in the actual executable code, but will tell the prover to consider that the ghost values are reset. Next, we have another loop scrubbing subsection:

for I in 1 .. To_Scrub.Size loop
Scrub (To_Scrub.Areas (I).From, To_Scrub.Areas (I).To);
end loop;


Same as before, the question is here what is the property true at the end of the loop, and breaking this property into what’s true at each iteration. The property true at the end is that everything included in the To_Scrub set has been indeed scrubbed:

(for all B in Address_Type =>
Includes (B, To_Scrub) = Memory (B).Scrubbed);

This gives us the following loop with its invariant:

for I in 1 .. To_Scrub.Size loop
Scrub (To_Scrub.Areas (I).From, To_Scrub.Areas (I).To);

-- everything until the current area is properly scrubbed
pragma Loop_Invariant
(for all B in Address_Type'First .. To_Scrub.Areas (I).To =>
Includes (B, To_Scrub) = Memory (B).Scrubbed);
end loop;

So far, this should be enough. Establishing these loop invariants may look a little bit intimidating at first, but with a little bit of practise they become rapidly straightforward.

Unfortunately, this is not enough.

# Implementing Unproven Interaction with Registers

As this is a low level code, some data will not be proven in the SPARK sense. Take for example the calls that read the memory boundaries:

procedure Get_Heap_Boundaries (From, To : out Address_Type)
with Post => (for all B in Address_Type => (Memory (B).Heap = (B in From .. To)))
and then From <= To;

procedure Get_Stack_Boundaries (From, To : out Address_Type)
with Post => (for all B in Address_Type => (Memory (B).Stack = (B in From .. To)))
and then From <= To;


The values From and To are possibly coming from registers. SPARK wouldn’t necessary be able to make the link with the post condition just because the data is outside of its analysis. In this case, it’s perfectly fine to just tell SPARK about a fact without having to prove it. There are two ways to do this, one is to deactivate SPARK from the entire subprogram:

 procedure Get_Heap_Boundaries
with SPARK_Mode => Off
is
begin
-- code
end Get_Heap_Boundaries;

In this case, SPARK will just assume that the postcondition is correct. The issue is that there’s no SPARK analysis on the entire subprogram, which may be too much. An alternative solution is just to state the fact that the postcondition is true at the end of the subprogram:

procedure Get_Heap_Boundaries
is
begin
-- code

pragma Assume (for all B in Address_Type => (Memory (B).Heap = (B in From .. To)));
end Get_Heap_Boundaries;

In this example, to illustrate the above, the registers will be modeled as global variables read and written from C - which is outside of SPARK analysis as registers would be.

# Move_Heap_And_Stack - the “hard” part

Before diving into what’s next and steering away readers from ever doing proof, let’s step back a little. We’re currently set to doing the hardest level of proof - platinum. That is, fully proving a program’s functional behavior. There is a lot of benefit to take from SPARK prior to reaching this stage. The subset of the language alone provides more analyzable code. Flow analysis allows you to easily spot uninitialized data. Run-time errors such as buffer overflow are relatively easy to clear out, and even simple gold property demonstration is reachable by most software engineers after a little bit of training.

Full functional proof - that is, complex property demonstration - is hard. It is also not usually required. But if this is what you want to do, there’s a fundamental shift of mindset required. As it turns out, it took me a good week to understand that. For a week as I was trying to build a proof from bottom to top, adding various assertions left and right, trying to make things fits the SPARK expectations. To absolutely no results.

And then, in the midst of despair, the apple fell on my head.

The prover is less clever than you think. It’s like a kid learning maths. It’s not going to be able to build a complex demonstration to prove the code by itself. Asking it to prove the code is not the right approach. The right approach is to build the demonstration of the code correctness, step by step, and to prove that this demonstration is correct. This demonstration is a program in itself, with its own subprograms, its own data and control flow, its own engineering and architecture. What the prover is going to do is to prove that the demonstration of the code is correct, and as the demonstration is linked to the code, the code happens to be indirectly proven correct as well.

Now reversing the logic, how could I prove this small loop? One way to work that out is to describe scrubbed and unscrubbed areas one by one:

• On the first area to scrub, if it doesn’t start at the beginning of the memory, everything before the start has not been scrubbed.

• When working on any area I beyond the first one, everything between the previous area I - 1 and the current one has not been scrubbed

• At the end of an iteration, everything beyond the current area I is unscrubbed.

• At the end of the last iteration, everything beyond the last area is unscrubbed

The first step to help the prover is to translate all of these into assertions, and see if these steps are small enough for the demonstration to be proven - and for the demonstration to indeed prove the code. At this stage, it’s not a bad idea to describe the assertion in the loop as loop invariants as we want the information to be carried from one loop iteration to the next. This leads to the following code:

for I in 1 .. To_Scrub.Size loop
Scrub (To_Scrub.Areas (I).From, To_Scrub.Areas (I).To);

pragma Loop_Invariant
(if I > 1 then (for all B in Address_Type range To_Scrub.Areas (I - 1).To + 1 .. To_Scrub.Areas (I).From - 1 =>
not Memory (B).Scrubbed));

pragma Loop_Invariant
then (for all B in Address_Type'First .. To_Scrub.Areas (1).From - 1 => not Memory (B).Scrubbed));

pragma Loop_Invariant
(if To_Scrub.Areas (I).To < Address_Type'Last then
(for all B in To_Scrub.Areas (I).To + 1 .. Address_Type'Last => not Memory (B).Scrubbed));

pragma Loop_Invariant
(for all B in Address_Type'First .. To_Scrub.Areas (I).To => Includes (B, To_Scrub) = Memory (B).Scrubbed);
end loop;

pragma Assert
(if To_Scrub.Size >= 1 and then To_Scrub.Areas (To_Scrub.Size).To < Address_Type'Last then
(for all B in To_Scrub.Areas (To_Scrub.Size).To + 1 .. Address_Type'Last => not Memory (B).Scrubbed));


Results are not too bad at first sight. Out of the 5 assertions, only 2 don’t prove. This may mean that they’re wrong - this may also mean that SPARK needs some more help to prove.

Let’s look at the first one in more details:

pragma Loop_Invariant
then (for all B in Address_Type'First .. To_Scrub.Areas (1).From - 1 => not Memory (B).Scrubbed));


Now if we were putting ourselves in the shoes of SPARK. The prover doesn’t believe that there’s nothing scrubbed before the first element. Why would that be the case? None of these bytes are in the To_Scrub area, right? Let’s check. To investigate this, the technique is to add assertions to verify intermediate steps, pretty much like what you’d do with the debugger. Let’s add an assertion before:

pragma Assert
then (for all B in Address_Type'First .. To_Scrub.Areas (1).From - 1 => not Includes (B, To_Scrub)));


That assertion doesn’t prove. But what would this be true? Recall that we have a consistency check for all sets which is supposed to be true at this point, defined as:

function Is_Consistent (S : Set_Base) return Boolean is
(S.Size <= S.Max and then
((for all I in 1 .. S.Size - 1 =>
S.Areas (I).To < S.Areas (I + 1).From and then
S.Areas (I + 1).From - S.Areas (I).To > 1)));


So looking at the above, if all areas are after the first one, there should be nothing before the first one. If Is_Consistent is true for To_Scrub, then the assertion ought to be true. Yet SPARK doesn’t believe us.

When reaching this kind of situation, it’s good practise to factor out the proof. The idea is to create a place where we say “given only these hypotheses, can you prove this conclusion?”. Sometimes, SPARK is getting lost in the wealth of information available, and just reducing the number of hypothesis to consider to a small number is enough for get it to figure out something.

Interestingly, this activity of factoring out a piece of proof is very close to what you’d do for a regular program. It’s also easier for the developer to understand small pieces of code than a large flat program. The prover is no better than that.

These factored out proofs are typically referred to as lemmas. They are Ghost procedures that prove a postcondition from a minimal precondition. For convention, we’ll call them all Lemma something. The Lemma will look like:

procedure Lemma_Nothing_Before_First (S : Set) with
Ghost,
Pre => Is_Consistent (S),
Post =>
(if S.Size = 0 then (for all B in Address_Type => not Includes (B, S))
elsif S.Areas (1).From > Address_Type'First then
(for all B in Address_Type'First .. S.Areas (1).From - 1 => not Includes (B, S)));


Stating that if S is consistent, then either it’s null (nothing is included) or all elements before the first From are not included.

Now let’s see if reducing the scope of the proof is enough. Let’s just add an empty procedure:

procedure Lemma_Nothing_Before_First (S : Set) is
begin
null;
end Lemma_Nothing_Before_First;


Still no good. That was a good try though. Assuming we believe to be correct here (and we are), the game is now to demonstrate to SPARK how to go from the hypotheses to the conclusion.

To do so we need to take into account one limitation of SPARK - it doesn’t do induction. This has a significant impact on what can be deduced from one part of the hypothesis:

(for all I in 1 .. S.Size - 1 =>
S.Areas (I).To < S.Areas (I + 1).From and then
S.Areas (I + 1).From - S.Areas (I).To > 1)


If all elements “I” are below the “I + 1” element, then I would like to be able to check that all “I” are below all the “I + N” elements after it. This ability to jump from proving a one by one property to a whole set is called induction. This happens to be extremely hard to do for state-of-the-art provers. Here lies our key. We’re going to introduce a new lemma that goes from the same premise, and then demonstrates that it means that all the areas after a given one are greater:

procedure Lemma_Order (S : Set) with
Ghost,
Pre => (for all I in 1 .. S.Size - 1 => S.Areas (I).To < S.Areas (I + 1).From),
Post =>
(for all I in 1 .. S.Size - 1 => (for all J in I + 1 .. S.Size => S.Areas (I).To < S.Areas (J).From));


And we’re going to write the demonstration as a program:

procedure Lemma_Order (S : Set)
is
begin
if S.Size = 0 then
return;
end if;

for I in 1 .. S.Size - 1 loop
for J in I + 1 .. S.Size loop
pragma Assert (S.Areas (J - 1).To < S.Areas (J).From);
pragma Loop_Invariant (for all R in I + 1 .. J => S.Areas (I).To < S.Areas (R).From);
end loop;

pragma Loop_Invariant
((for all R in 1 .. I => (for all T in R + 1 .. S.Size => S.Areas (R).To < S.Areas (T).From)));
end loop;
end Lemma_Order;


As you can see here, for each area I, we’re checking that the area I + [1 .. Size] are indeed greater. This happens to prove trivially with SPARK. We can now prove Lemma_Nothing_Before_First by applying the lemma Lemma_Order. To apply a lemma, we just need to call it as a regular function call. Its hypotheses (precondition) will be checked by SPARK, and its conclusion (postcondition) added to the list of hypotheses available to prove:

procedure Lemma_Nothing_Before_First (S : Set) is
begin
Lemma_Order (S);
end Lemma_Nothing_Before_First;


This now proves trivially. Back to the main loop, applying the lemma Lemma_Nothing_Before_First looks like:

for I in 1 .. To_Scrub.Size loop
Scrub (To_Scrub.Areas (I).From, To_Scrub.Areas (I).To);
Lemma_Nothing_Before_First (To_Scrub);

pragma Loop_Invariant
(if I > 1 then
(for all B in Address_Type range To_Scrub.Areas (I - 1).To + 1 .. To_Scrub.Areas (I).From - 1 => not Memory (B).Scrubbed));

pragma Loop_Invariant
then (for all B in Address_Type'First .. To_Scrub.Areas (1).From - 1 => not Memory (B).Scrubbed));

pragma Loop_Invariant
(if To_Scrub.Areas (I).To < Address_Type'Last then
(for all B in To_Scrub.Areas (I).To + 1 .. Address_Type'Last => not Memory (B).Scrubbed));

pragma Loop_Invariant
(for all B in Address_Type'First .. To_Scrub.Areas (I).To => Includes (B, To_Scrub) = Memory (B).Scrubbed);
end loop;

pragma Assert
(if To_Scrub.Size >= 1 and then To_Scrub.Areas (To_Scrub.Size).To < Address_Type'Last then
(for all B in To_Scrub.Areas (To_Scrub.Size).To + 1 .. Address_Type'Last => not Memory (B).Scrubbed));


And voila! One more loop invariant now proving properly.

# More madness, in a nutshell

At this point, it’s probably not worth diving into all the details of this small subprogram - the code is available here. There’s just more of the same.

The size of this small function is relatively reasonable. Now let’s give some insights on a much more difficult problem: the Set library.

A generous implementation brings about 250 lines of code (it could actually be less if condensed, but let’s start with this). That’s a little bit less than a day of work for implementation and basic testing.

For the so called silver level - that is absence of run-time errors, add maybe around 50 lines of assertions and half a day of work. Not too bad.

For gold level, I decided to prove one key property. Is_Consistent, to be true after each operator. Maybe a day of work was needed for that one. Add another 150 lines of assertions maybe. Still reasonable.

Platinum is about completely proving the functionality of my subprogram. And that proved (pun intended) to be a much much more difficult experience. See this link and this link for other similar experiences. As a disclaimer, I am an experienced Ada developer but had relatively little experience with proof. I also selected a problem relatively hard - the quantified properties and the Set structure are quite different, and proving quantifiers is known to be hard for provers to start with.With that in mind, the solution I came up with spreads over almost a thousand lines of code - and consumed about a week and a half of effort.

I’m also linking here a solution that my colleague Claire Dross came up with. She’s one of our most senior expert in formal proof, and within a day could prove the two most complex operators in about 300 lines of code (her implementation is also more compact than mine).

The above raises a question - is it really worth it? Silver absolutely does - it is difficult to bring a case against spending a little bit more effort in exchange for the absolute certainty of never having a buffer overflow or a range check error. There’s no doubt that in the time I spent in proving this, I would have spent much more in debugging either testing, or worse, errors in the later stages should this library be integrated in an actual product. Gold is also a relatively strong case. The fact that I only select key properties allows only concentrating on relatively easy stuff, and the confidence of the fact that they are enforced and will never have to be tested clearly outweighs the effort.

I also want to point out that the platinum effort is well worth it on the user code in this example. While it looks tedious at first sight, getting these properties right is relatively straightforward, and allows gaining confidence on something that can’t be easily tested; that is, a property on the whole memory.

Now the question remains - is it worth the effort on the Set library, to go from maybe two days of code + proof to around a week and a half?

I can discuss it either way but having to write 700 lines of code to demonstrate to the prover that what I wrote is correct keeps haunting me. Did I really have these 700 lines of reasoning in my head when I developed the code? Did I have confidence in the fact that each of those was logically linked to the next? To be fair, I did find errors in the code when writing those, but the code wasn’t fully tested when I started the proof. Would the test have found all the corner cases? How much time would such a corner case take to debug if found in a year? (see this blog post for some insights on hard to find bugs removed by proof).

Some people who safely certify software against e.g. avionics & railway standards end up writing 10 times more lines of tests than code - all the while just verifying samples of potential data. In that situation, provided that the properties under test can be modelled by SPARK assertions and that they fit what the prover knows how to do, going through this level of effort is a very strong case.

Anything less is open for debate. I have to admit, against all odds, it was a lot of fun and I would personally be looking forward to taking the challenge again. Would my boss allow me to is a different question. It all boils down to the cost of a failure versus the effort to prevent said failure. Being able to make an enlightened decision might be the most valuable outcome of having gone through the effort.

Posted in