Studies of Contracts in Practice

by Yannick Moy – Mar 31, 2014

Two recent research papers focus on how program contracts are used in practice in open source projects, in three languages that support contracts (Eiffel obviously, Java with JML contracts and C# with Code Contracts).

Spoiler: the conclusions are the same as those drawn by Patrice Chalin in his 2006 research paper Are Practitioners Writing Contracts?. At least for the paper that did it right.

Now, more details. The researcher Patrice Chalin conducted in 2006 a study of how contracts were used in 85 projects in Eiffel. Although most projects were open source, the majority of the code analyzed was proprietary (55%), thanks to the setup of the study which did not require the researcher to have access to the code of projects. Instead, he advertised his study to the community of Eiffel users, and he provided a script for users to run on their code, so that those projects who participated simply sent him the metrics computed by the script on their code. From the remaining 45% of the code, a majority (33%) came from open source projects not related to the two main companies providing tool and library support for Eiffel development (Eiffel Software and GoboSoft), the latter representing a minority of the code analyzed (11%). Chalin separated the three groups of code (proprietary, open source, standard libraries and tools), as he expected the results to be different between the three groups. He said in particular:

We separated out EiffelStudio (libraries and samples) into its own category because we expected it to have the highest proportion of assertions.

The results confirmed his expectations: proprietary code had a ratio of 4.2% lines of assertions compared to lines of code, open source code has a ratio of 5.8%, and standard libraries and tools had the highest ratio of 6.7%. He also found a majority of preconditions (50% of all assertions), followed by postconditions (40%), and only 3% of plain inline assertions. Very simple checks for non-nullness of pointers were the most used assertions (35%).

Now, fast forward to 2014. In the first article Case Studies and Tools for Contract Specifications, Schiller & others have investigated how contracts are used in 90 open source projects in C# using Code Contracts. They found a majority of preconditions (68%), followed by postconditions (26%). Basic checks for the presence of data are the most used assertions (75%), of which the majority are non-nullness checks (66%). To explain why 2/3 of all assertions are non-nullness checks, the authors write:

The large number of nullness contracts relative to the other contract types suggests that nullness contracts may be “crowding out” application-specific contracts — that is, the developers’ limited resources (time, lines of code, etc.) are being consumed by writing nullness contracts.

My own experience is also that it's much more work to use contracts (both for run-time checking or formal verification) in languages like C or Java that lack the strong typing constraints of Ada, like non-nullness of pointers, but also scalar ranges, and now with Ada 2012 the richer predicates and invariants.

The authors make recommendations for those who create languages and tools using contracts, that I found particularly interesting:

We recommend that contract language and tool designers take three complementary actions: (1) introduce tooling to reduce annotation burden, (2) make suggestions an integral part of tooling, and (3) curate best practices by establishing design/specification patterns. Reducing annotation burden is especially important to provide value to developers in the near-term — tools for static checking, refactoring, and testing with contracts are still relatively immature.

In our work on SPARK 2014, point (1) has been at the center of our preoccupations since day 1, influencing the language, the tool architecture, and the tool features. For example, for the language, AdaCore has been key to including expression functions in Ada 2012 (and then SPARK 2014), because they allow giving in a concise form both the implementation and the postcondition of simple query functions; for tool architecture, the first phase of GNATprove is a global analysis of effects, which makes Global contracts facultative; for tool features, we are working currently on inlining of local subprograms, which will make functional contracts optional for these too.

We also have a solution for point (2), provided by our tool CodePeer, which generates preconditions and postconditions on Ada programs:

The article provides interesting ideas for the day when we want to add the possibility to automatically insert part of the contracts generated by CodePeer in the code. In their contract inserter Add-In in Visual Studio, the developer can take one of four actions for each contract, or group of contracts obtained by filtering:

Insert as Code Contract
Insert as documentation (using an English-like output)
Ignore, because the contract is not true
Ignore, because the property is an implementation detail

Regarding point (3), we've also recognized this need early on for loop invariants: with CNAM researchers, we defined a set of loop patterns commonly used, for which we have defined the best possible loop invariant. A colleague of mine is currently working on doumenting these, plus others gathered in the code of industrial projects in SPARK, in the SPARK Toolset User's Guide.

In the second article Contracts in Practice, Estler & others have investigated how contracts are used in 21 open-source projects using Eiffel, Java with JML contracts and C# with Code Contracts. But there is a probleme here with the 21 projets selected, partly recognized by the authors themselves:

A related observation is that the developers of several of the study’s projects are supporters of using formal specifications.

In fact, a quick Web search shows that only 1 project (Schweet) out of the 21 is not developed by an active supporter of formal specifications. Ironically, it's also the smallest one. All 8 projects in Eiffel are part of Eiffel Studio or GoboSoft libraries, what Chalin called the standard libraries and tools. All 6 projects in Java were developed in the context of tools for JML (JavaFE, RCC and ESCJava are all developed by KindSoftware, while DirectVCGen, Umbra and Logging were all developed at ETH Zurich). Of the 7 projects in C#, 4 are developed by researchers from Microsoft Research RiSE group developing Code Contracts (Boogie, CCI, Dafny, QuickGraph), and 2 more (Labs Framework and Rxx) are developed by an independent developer who developed his own version of contracts in .NET before Code Contracts were available.

However interesting these projects may be, they are hardly representative of how projects may use contracts in practice. Hence, the conclusions drawn from these projects (for example, that "preconditions and postconditions are used equally frequently across most projects", or that in a majority of the projects, "1/3 or more of the routines have some specification (pre- or postconditions)") are not likely to apply to the majority of projects, where adding contracts is a means to reach some objective (of quality, or documentation, or something else) instead of an objective in itself.

Posted in #Language #Formal Verification #Contracts

About Yannick Moy

Yannick Moy is Head of the Static Analysis Unit at AdaCore. Yannick contributes to the development of SPARK, a software source code analyzer aiming at verifying safety/security properties of programs. He frequently talks about SPARK in articles, conferences, classes and blogs (in particular blog.adacore.com). Yannick previously worked on source code analyzers for PolySpace (now The MathWorks) and at Université Paris-Sud.