Why I created spark_unbound #
In this blog entry, I try to cover my experience using Ada/SPARK so far, why I created the
spark_unbound library, and what this library is about.
Discovering Ada/SPARK and formal verification #
I discovered Ada while going through publications of one of my professors at university. Knowing a bit of C and embedded development from my work at B&R, Ada’s syntax seemed to solve a lot of problems I was facing. Features like specific types, in-out parameters, and much more were very helpful. However, with hyped languages like Rust, and Ada’s smaller open source community, it is currently hard for me to choose Ada for hobbyist projects that do not run in safety critical systems or on embedded platforms. AdaCore’s recent announcement about improving the interoperability between Rust and Ada will definitely make it easier to choose Ada in the future.
The big advantage Ada has over other languages, in my opinion, is SPARK. SPARK, being a subset of Ada, allows you to formally verify that your project is free of runtime errors. From what I have found, SPARK is the most convenient way to write formally verified software, and it is really well integrated into Ada and GNAT Studio. One might think it is crazy to recommend SPARK for open source and/or hobbyist projects, given that there are even fewer projects written in SPARK. However, I would argue that a formally verified open source project is more likely to be used by big companies, as formal verification can significantly help to prevent supply chain attacks, assuming that the necessary lower level projects/libraries are formally verified as well. Sure, having lots of tests is good, but writing good tests is hard and tedious (especially if you need a lot of mocks), and they must be kept in sync with the current source code. Otherwise, those tests are mere gimmicks, and I think it is harder to evaluate a project with lots of tests than a formally verified one. Note that I do not consider projects that have neither tests nor proofs, as they are doomed to fail over time. With formal verification as integrated in SPARK, the normal source code you write already takes away the need to write detailed unit tests. This lets you undertake low level changes during development with far more confidence than unit tests could ever give you.
Having pre-/post-conditions and asserts directly embedded in the source code also felt very natural to me while programming, as I always try to narrow down what is allowed at a certain line, and in other languages that just means a lot of if-statements.
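As a small sketch of what this looks like in SPARK (a hypothetical Increment procedure, not taken from any particular project), the contract states directly in the declaration what callers must guarantee and what the body ensures:

```ada
-- Hypothetical sketch: the contract replaces the if-statement checks
-- you would otherwise scatter across call sites. GNATprove verifies
-- the precondition at every call and the postcondition against the body.
procedure Increment (Value : in out Natural)
  with Pre  => Value < Natural'Last,   -- caller must rule out overflow
       Post => Value = Value'Old + 1;  -- provably one larger afterwards

procedure Increment (Value : in out Natural) is
begin
   Value := Value + 1;
end Increment;
```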
The problem of fewer open source projects remains, but since there are so few, you have a lot of room to create your own basic low level project. Bigger projects will have to wait for now, or be prepared to take way longer than expected.
How to unbind SPARK #
Searching through those few open source projects written in SPARK, I discovered that all of them are bounded in at least one place. That seemed odd to me, as some of those projects parse JSON files, which may range from one or two objects to several thousand, including nesting. So why this limitation?
This is probably because heap allocation was not possible in SPARK until 2019, as pointers were not supported before. With the introduction of pointers in SPARK, using the heap for unbounded programs became possible.
Since I could not find an open source project that used these new possibilities to provide unbounded data structures, and since the formal containers should not be used in production due to their performance, I decided to create spark_unbound.
Preparing to handle Storage_Error #
Dynamic allocation in Ada is done using the new keyword, and the same is true in SPARK. At this point you might ask yourself: how does SPARK handle cases where there is not enough memory left to allocate? In short, it doesn’t. Instead, a Storage_Error gets raised.
But I thought SPARK assures absence of runtime errors and does not allow handling exceptions?
SPARK can assure the absence of all runtime errors except Storage_Error, and since exception handling is not included in SPARK, every call to new is tricky to handle, as it might raise this error.
The solution I chose was proposed by Fabien Chouteau in the linked GitHub issue. SPARK allows excluding certain code regions from analysis and in those, it is then possible to handle exceptions.
With that solution in place, SPARK programs are truly free of runtime errors (when additionally using GNATstack to monitor the needed stack size).
Below is the function body template taken from the issue, showing how a Storage_Error may be handled:

```ada
function Alloc return Name is
   pragma SPARK_Mode (Off); -- Only the allocation has to be outside SPARK
begin
   return new Object;
exception
   when Storage_Error =>
      return null;
end Alloc;
```
Current state of spark_unbound #
Overall structure #
All provided packages are generic, and I tried to put as few constraints on user types as possible. At the moment, spark_unbound only provides an Unbound_Array type that mimics the Vector type in Ada.Containers.Vectors, and a Safe_Alloc package that wraps allocation to handle the Storage_Error as discussed above. The Safe_Alloc package provides two child packages for definite and array types. Because exception handling cannot be proven by SPARK, AUnit is used to validate the functionality of Safe_Alloc.
Below is an example of how to use Safe_Alloc:
```ada
with Spark_Unbound.Safe_Alloc;

procedure Test is

   type Alloc_Record is record
      V1 : Integer;
      V2 : Natural;
      V3 : Positive;
   end record;

   type Record_Acc is access Alloc_Record;

   package Record_Alloc is new Spark_Unbound.Safe_Alloc.Definite(T => Alloc_Record, T_Acc => Record_Acc);

   Rec_Acc : Record_Acc;
begin
   Rec_Acc := Record_Alloc.Alloc; -- Note: No `new` is set before
   -- check if Rec_Acc is NOT null and then do something
   Record_Alloc.Free(Rec_Acc);
end Test;
```
Unbound Array #
Unbound_Array in the package Spark_Unbound.Arrays mimics the functionality of the Vector type in Ada.Containers.Vectors. Internally, an array is allocated to fit all appended elements. If this array gets too small, a new one is allocated and all elements are copied over. At the moment, not all procedures and functions that could be proven have been implemented; inserting and deleting elements anywhere other than at the end of the array is not yet possible.
Procedures and functions that cannot be proven have been removed and will not be implemented. For example, the add operator + could fail when the underlying array would need to be resized to fit all elements, but there is not enough contiguous memory.
All available procedures and functions are proven by GNATprove at the platinum level.
Below is an example of how to use Unbound_Array:
```ada
with Spark_Unbound.Arrays;

procedure Test is
   package UA_Integer is new Spark_Unbound.Arrays(Element_Type => Integer, Index_Type => Positive);
   Test_UA : UA_Integer.Unbound_Array := UA_Integer.To_Unbound_Array(Initial_Capacity => 3);
   Success : Boolean;
begin
   -- Fill array
   UA_Integer.Append(Test_UA, 1, Success);
   UA_Integer.Append(Test_UA, 2, Success);
   UA_Integer.Append(Test_UA, 3, Success);

   -- Now Append() needs to resize
   UA_Integer.Append(Test_UA, 4, Success);
end Test;
```
Note: You should check Success after every call to Append(). Otherwise, you won’t be able to prove correct use.
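A minimal sketch of such a check, assuming the UA_Integer instantiation from the example above:

```ada
-- Sketch: only rely on the appended element if Success is True.
UA_Integer.Append(Test_UA, 1, Success);
if not Success then
   return; -- allocation failed; Test_UA is unchanged
end if;
-- From here on, the prover knows the append succeeded.
```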
Repository and workflows #
spark_unbound is available on GitHub, with automatically generated documentation at mhatzl.github.io/spark_unbound/ using GNATdoc.
To get this documentation, the GNATdoc configuration was modified to only treat comments starting with --- as documentation comments. The documentation is generated by a GitHub action and hosted using GitHub Pages.
```ada
package Documentation is
   for Doc_Pattern use "^-";
   -- This considers comments beginning with "---" to be documentation.
   -- Needed to ignore commented Ghost functions that would break GNATdoc.
end Documentation;
```
Besides the documentation, two more workflows are defined:
- Build and prove the crate
- Run the AUnit tests
Next steps #
First, I would like to address the following points, but I am quite stuck on how to solve them:
Making the fields of the Unbound_Array type private: I was not able to write some proofs with the fields being private.
Getting rid of the proof warning “memory accessed through objects of access type”: This is probably because the allocated array is not initialized on allocation. It is not a problem if one only uses the provided functions, but it bothers me.
Further goals are then to
- make the crate thread safe
- implement more procedures for Unbound_Array to get close to Ada.Containers.Vectors
- implement the
- add a B‑Tree structure
I am also considering two changes to the current implementation:
Since the structure mimics the Vector type in Ada.Containers.Vectors, renaming it to Vector feels more natural.
At the moment, Append() tries to double the allocated memory whenever Capacity = Length, and if this doubled amount cannot be allocated, the requested memory is reduced linearly. Therefore, Append() only fails if current capacity + 1 cannot be allocated.
While this squeezes out every last free memory region, it adds a significant performance overhead in those edge cases. My solutions would be either to fail if doubling is not possible, or to reduce the requested memory logarithmically. Any feedback regarding this is greatly appreciated.
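A rough sketch of the logarithmic variant (using a hypothetical Can_Allocate probe, not the current implementation): halve the extra request on every failed attempt, so only O(log n) allocation attempts are needed instead of O(n):

```ada
-- Hypothetical sketch of a halving growth strategy.
function Grown_Capacity (Current : Positive) return Natural is
   Extra : Natural := Current; -- doubling means requesting Current + Current
begin
   while Extra > 0 loop
      if Can_Allocate (Current + Extra) then -- hypothetical allocation probe
         return Current + Extra;
      end if;
      Extra := Extra / 2; -- halve the extra request and retry
   end loop;
   return 0; -- not even one additional element fits
end Grown_Capacity;
```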
Hopefully, spark_unbound helps to bring more people to SPARK, either by being used directly or by serving as a template.