Quantifying over Elements of a Container
by Claire Dross –
Containers holding several items of the same type such as arrays, lists, or sets are a common occurrence in computer programs. Stating a property over such containers often involves quantifying over the elements they contain.
As seen in a previous post, in SPARK 2014, it is possible to allow quantification over any such container type using an Iterable aspect.
This aspect provides the primitives of a container type that will be used to iterate over its content. For example, if we write:
type Container is private with
Iterable => (First => First,
Next => Next,
Has_Element => Has_Element);
where
function First (S : Set) return Cursor;
function Has_Element (S : Set; C : Cursor) return Boolean;
function Next (S : Set; C : Cursor) return Cursor;
then quantification over containers can be done using the type Cursor. For example, we could state:
(for all C in S => P (Element (S, C)))
to say that S only contains elements for which a property P holds. For execution, this expression is translated as a loop using the provided First, Has_Element, and Next primitives. For proof, it is translated as a logic quantification over every element of type Cursor. To restrict the property to cursors that are actually valid in the container, the provided function Has_Element is used. For example, the property stated above becomes:
(for all C : Cursor => (if Has_Element (S, C) then P (Element (S, C)))
Like for the standard Ada iteration mechanism, it is possible to allow quantification directly over the elements of the container by providing in addition an Element primitive to the Iterable aspect. For example, if we write:
type Container is private with
Iterable => (First => First,
Next => Next,
Has_Element => Has_Element
Element => Element);
where
function Element (S : Set; C : Cursor) return Element_Type;
then quantification over containers can be done directly on its elements. For example, we could rewrite the above property into:
(for all E of S => P (E))
For execution, quantification over elements of a container is translated as a loop over its cursors. In the same way, for proof, quantification over elements of a container is no more than syntactic sugar for quantification over its cursors. For example, the above property is translated using quantification over cursors :
(for all C : Cursor => (if Has_Element (S, C) then P (Element (S, C)))
Depending on the application, this translation may be too low-level and introduce an unnecessary burden on the automatic provers. As an example, let us consider a package for functional sets:
package Sets with SPARK_Mode is
type Cursor is private;
type Set (<>) is private with
Iterable => (First => First,
Next => Next,
Has_Element => Has_Element,
Element => Element);
function Mem (S : Set; E : Element_Type) return Boolean with
Post => Mem'Result = (for some F of S => F = E);
function Intersection (S1, S2 : Set) return Set with
Post => (for all E of Intersection'Result => Mem (S1, E) and Mem (S2, E))
and (for all E of S1 =>(if Mem (S2, E) then Mem (Intersection'Result, E)));
(...)
Sets contain elements of type Element_Type. The most basic operation on sets is membership test, here provided by the Mem subprogram. Every other operation, such as intersection here, is then specified in terms of members. Iteration primitives First, Next, Has_Element, and Element, that take elements of a private type Cursor as an argument, are only provided for the sake of quantification.
Following the scheme described previously, the postcondition of Intersection is translated for proof as:
(for all C : Cursor =>
(if Has_Element (Intersection'Result, C) then
Mem (S1, Element (Intersection'Result, C))
and Mem (S2, Element (Intersection'Result, C))))
and
(for all C1 : Cursor =>
(if Has_Element (S1, C1) then
(if Mem (S2, Element (S1, C1)) then
Mem (Intersection'Result, Element (S1, C1)))))
Using the postcondition of Mem, this can be refined further into:
(for all C : Cursor =>
(if Has_Element (Intersection'Result, C) then
(for some C1 : Cursor =>
Has_Element (S1, C1) and Element (Intersection'Result, C) = Element (S1, C1))
and (for some C2 : Cursor =>
Has_Element (S2, C2) and Element (Intersection'Result, C) = Element (S2, C2)))))
and
(for all C1 : Cursor =>
(if Has_Element (S1, C1) then
(if (for some C2 : Cursor =>
Has_Element (S2, C2) and Element (S1, C1) = Element (S2, C2)))
then(for some C : Cursor =>
Has_Element (Intersection'Result, C)
and Element (Intersection'Result, C) = Element (S1, C1))))))
Though perfectly valid, this translation may produce complicated proofs, especially when verifying complex properties over sets. For SPARK formalization of the new ada-trait-based containers, we introduced a new GNATprove annotation, that can be used to change the way for ... of quantification is translated. It can be used to provide GNATprove with a "Contains" function, that will be used for quantification. For example, on our sets, we could write:
function Mem (S : Set; E : Element_Type) return Boolean;
pragma Annotate (GNATprove, Iterable_For_Proof, "Contains", Mem);
With this annotation, the postcondition of Intersection is translated in a simpler way, using logic quantification directly over elements:
(for all E : Element_Type =>
(if Mem (Intersection'Result, E) then Mem (S1, E) and Mem (S2, E)))
and (for all E : Element_Type =>
(if Mem (S1, E) then
(if Mem (S2, E) then Mem (Intersection'Result, E))))
Note that care should be taken to provide an appropriate function contains, which returns true if and only if the element E is present in S. This assumption will not be verified by GNATprove.
The annotation Iterable_For_Proof can also be used in another case. Operations over complex data structures are sometimes specified using operations over a simpler model type. In this case, it may be more appropriate to translate for ... of quantification as quantification over the model's cursors. As an example, let us consider a package of linked lists that is specified using a sequence that allows accessing the element stored at each position:
package Lists with SPARK_Mode is
type Sequence is private with
Ghost,
Iterable => (...,
Element => Get);
function Length (M : Sequence) return Natural with Ghost;
function Get (M : Sequence; P : Positive) return Element_Type with
Ghost,
Pre => P <= Length (M);
type Cursor is private;
type List is private with
Iterable => (...,
Element => Element);
function Position (L : List; C : Cursor) return Positive with Ghost;
function Model (L : List) return Sequence with
Ghost,
Post => (for all I in 1 .. Length (Model'Result) =>
(for some C in L => Position (L, C) = I));
function Element (L : List; C : Cursor) return Element_Type with
Pre => Has_Element (L, C),
Post => Element'Result = Get (Model (L), Position (L, C));
function Has_Element (L : List; C : Cursor) return Boolean with
Post => Has_Element'Result = (Position (L, C) in 1 .. Length (Model (L)));
procedure Append (L : in out List; E : Element_Type) with
Post => length (Model (L)) = Length (Model (L))'Old + 1
and Get (Model (L), Length (Model (L))) = E
and (for all I in 1 .. Length (Model (L))'Old =>
Get (Model (L), I) = Get (Model (L'Old), I));
function Init (N : Natural; E : Element_Type) return List with
Post => length (Model (Init'Result)) = N
and (for all F of Init'Result => F = E);
(...)
Elements of lists can only be accessed through cursors. To specify easily the effects of position-based operations such as Append, we introduce a ghost type Sequence of unbounded arrays indexed by positive integers that is used to represent logically the content of the linked list in specifications. The sequence associated to a list can be constructed using the Model function. Following the usual translation scheme for quantified expressions, the last line of the postcondition of Init is translated for proof as:
(for all C : Cursor => (if Has_Element (Init'Result, C) then Element (Init'Result, C) = E));
Using the definition of Element and Has_Element, it can then be refined further into:
(for all C : Cursor => (if Position (Init'Result, C) in 1 .. Length (Model (Init'Result))
then Get (Model (Init'Result), Position (Init'Result, C)) = E));
To be able to link this property with other properties specified directly on models, like the postcondition of Append, it needs to be lifted to iterate over positions instead of cursors. This can be done using the postcondition of Model that states that there is a valid cursor in L for each position of of its model. This lifting requires a lot of quantifier reasoning from the prover, thus making proofs more difficult.
The GNATprove Iterable_For_Proof annotation can be used to provide GNATprove with a "Model" function, that will be called to translate quantification on complex containers toward quantification on their model. For example, on our lists, we could write:
function Model (L : List) return Sequence;
pragma Annotate (GNATprove, Iterable_For_Proof, "Model", Entity => Model);
With this annotation, the postcondition of Init is translated directly as a quantification on the elements of the result's model:
(for all I : Positive => (if I in 1 .. Length (Model (Init'Result))
then Get (Model (Init'Result), I) = E));
Like with the previous annotation, care should be taken to define the model function such that it always return a model containing exactly the same elements as L.
On the formal version of ada-trait-based containers, tuning the way quantification is translated for proof is really important. Indeed, their specification is container intensive, as up to three different functional containers are used as models of imperative containers, and also quite complex as it must allow reasoning efficiently on containers at different levels. As an example, a "Contains" Iterable_For_Proof annotation is used to avoid introducing a parasitic type of cursors in the functional model of imperative sets. It is combined with a "Model" annotation on imperative set so that quantification on elements of a set are translated directly as a high level, content based, property on functional sets rather than as a low level, cursor aware quantification. As a result, if S in an imperative set, the property:
(for all E of S => P (E))
is translated for proof as:
(for all E : Element_Type => (if Mem (Model (S), E) then P (E)))
which both makes the work of provers easier and allows a clean separation between low-level, cursor aware properties and high-level, content based specifications.