An Expedition into Libadalang
by Martyn Pike
I’ve been telling Ada developers for a while now that Libadalang makes it much easier to write Ada source code analysis tools. (You can read more about Libadalang here and here, and can also access the project on GitHub.)
Along these lines, I recently had a discussion with a customer about whether any tools could detect uses of access types in their code. That got me thinking about possible ways to detect access types in a set of Ada source files.
GNATcheck doesn't currently have a rule that prohibits the use of access types. SPARK 2014 also recently added support for access types, which were previously banned. So while earlier versions of GNATprove could detect them quite effectively, the latest and future versions may not.
I decided to architect a solution to this problem and determined there were several implementation options open to me:
1. Use ‘grep’ on a set of Ada sources to find instances of the "access" Ada keyword
2. Use gnat2xml and then use ‘grep’ on the resulting output to search for certain tags
3. Use gnat2xml and then write an XML-aware search utility to search for certain tags
4. Use Libadalang to write my own Ada static analysis program
Options 1 and 2 just feel too easy and would defeat the purpose of this blog post.
Option 3 is perhaps a good topic for another post related to using XML/Ada. However, I decided to put my money where my mouth is and go with Option 4!
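As an aside, a plain keyword search (Option 1) has a practical limitation beyond being too easy: it cannot tell an access type definition from any other occurrence of the word. The following is a minimal sketch using only the standard library; the procedure name Naive_Scan and the sample lines are made up for illustration.

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with Ada.Strings.Fixed;       use Ada.Strings.Fixed;
with Ada.Characters.Handling; use Ada.Characters.Handling;

--  Hypothetical demo: a case-insensitive substring search for "access".
procedure Naive_Scan is

   --  Report whether the given source line contains the word "access".
   procedure Check (Line : String) is
   begin
      if Index (To_Lower (Line), "access") > 0 then
         Put_Line ("match:    " & Line);
      else
         Put_Line ("no match: " & Line);
      end if;
   end Check;

begin
   Check ("type Ptr is access all Integer;");   --  access type definition
   Check ("--  grants access to the log");      --  comment only
   Check ("P := X'Access;");                    --  attribute reference
   Check ("Put_Line (""access denied"");");     --  string literal
end Naive_Scan;
```

All four lines are reported as matches, although only the first one defines an access type. Working on the syntax tree, as Option 4 does, avoids this kind of false positive.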
While I wrote this program in Ada, I could have written it in Python.
So here is the program:
with Ada.Text_IO; use Ada.Text_IO;
with Libadalang.Analysis; use Libadalang.Analysis;
with Libadalang.Common; use Libadalang.Common;
with Ada.Strings.Fixed;
with Ada.Strings;
procedure ptrfinder1 is
   LAL_CTX : constant Analysis_Context := Create_Context;
begin
   Read_Standard_Input:
   while not End_Of_File(Standard_Input)
   loop
      Process_Ada_Unit:
      declare
         Filename : constant String := Get_Line;
         Unit : constant Analysis_Unit := LAL_CTX.Get_From_File(Filename);
         --  Called for every node; reports access type definitions.
         function Process_Node(Node : Ada_Node'Class) return Visit_Status is
         begin
            if Node.Kind in Ada_Access_Def
                          | Ada_Access_To_Subp_Def_Range
                          | Ada_Base_Type_Access_Def
                          | Ada_Anonymous_Type_Access_Def_Range
                          | Ada_Type_Access_Def_Range
            then
               --  'Img yields a leading space, so trim the line number
               --  itself rather than the whole output string.
               Put_Line(
                  Filename & ":" &
                  Ada.Strings.Fixed.Trim(
                     Source => Node.Sloc_Range.Start_Line'Img,
                     Side => Ada.Strings.Left
                  )
               );
            end if;
            return Into;  --  keep descending into child nodes
         end Process_Node;
      begin
         if not Unit.Has_Diagnostics then
            Unit.Root.Traverse(Process_Node'Access);
         end if;
      end Process_Ada_Unit;
   end loop Read_Standard_Input;
end ptrfinder1;
I designed the program to read a series of fully qualified absolute filenames from standard input and process each of them in turn. This approach made the program much easier to write and test and, as you'll see, allowed the program to be integrated effectively with other tools.
Let's deconstruct the code a little.
For each provided filename, the program creates a Libadalang Analysis_Unit for that filename.
Read_Standard_Input:
while not End_Of_File(Standard_Input)
loop
   Process_Ada_Unit:
   declare
      Filename : constant String := Get_Line;
      Unit : constant Analysis_Unit :=
         LAL_CTX.Get_From_File(Filename);
As long as parsing produced no diagnostics, the Ada unit is traversed and the Process_Node subprogram is executed for each node visited.
if not Unit.Has_Diagnostics then
   Unit.Root.Traverse(Process_Node'Access);
end if;
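Note that a file with diagnostics is skipped silently. If you would rather see why a file was rejected, the sketch below prints each diagnostic to standard error. It assumes the Diagnostics and Format_GNU_Diagnostic primitives of Analysis_Unit as described in the Libadalang Ada API documentation; the procedure name Show_Diagnostics is made up for illustration and is not part of my program.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with Libadalang.Analysis; use Libadalang.Analysis;

--  Hypothetical helper: reads filenames from standard input and reports
--  any parsing diagnostics in the usual GNU "file:line:col: msg" format.
procedure Show_Diagnostics is
   Ctx : constant Analysis_Context := Create_Context;
begin
   while not End_Of_File (Standard_Input) loop
      declare
         Filename : constant String := Get_Line;
         Unit     : constant Analysis_Unit := Ctx.Get_From_File (Filename);
      begin
         if Unit.Has_Diagnostics then
            for D of Unit.Diagnostics loop
               Put_Line (Standard_Error, Unit.Format_GNU_Diagnostic (D));
            end loop;
         end if;
      end;
   end loop;
end Show_Diagnostics;
```

Writing to standard error keeps the messages out of the "filename:line" stream on standard output, so the same loop could be merged into ptrfinder1 without disturbing tools further down a pipe.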
The Process_Node subprogram checks the Kind of its Ada_Node'Class parameter to see if it is any of the access type related nodes. If so, the program outputs the fully qualified filename, a ":" delimiter, and the line number of the detected node.
function Process_Node(Node : Ada_Node'Class) return Visit_Status is
begin
   if Node.Kind in Ada_Access_Def
                 | Ada_Access_To_Subp_Def_Range
                 | Ada_Base_Type_Access_Def
                 | Ada_Anonymous_Type_Access_Def_Range
                 | Ada_Type_Access_Def_Range
   then
      --  'Img yields a leading space, so trim the line number
      --  itself rather than the whole output string.
      Put_Line(
         Filename & ":" &
         Ada.Strings.Fixed.Trim(
            Source => Node.Sloc_Range.Start_Line'Img,
            Side => Ada.Strings.Left
         )
      );
   end if;
   return Into;
end Process_Node;
Process_Node always returns Into, which tells the traversal to continue into the children of the current node.
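Visit_Status has two other values, Over (skip the children of the current node) and Stop (end the traversal). As a sketch of how Stop could be used, the variant below only answers whether a file contains any access type definition and stops at the first hit. The procedure name Has_Access and the Found flag are made up for illustration; the node kinds are the same ones tested by Process_Node.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with Libadalang.Analysis; use Libadalang.Analysis;
with Libadalang.Common;   use Libadalang.Common;

--  Hypothetical variant: prints each input filename that contains at
--  least one access type definition, without listing every occurrence.
procedure Has_Access is
   Ctx : constant Analysis_Context := Create_Context;
begin
   while not End_Of_File (Standard_Input) loop
      declare
         Filename : constant String := Get_Line;
         Unit     : constant Analysis_Unit := Ctx.Get_From_File (Filename);
         Found    : Boolean := False;

         function Find_First (Node : Ada_Node'Class) return Visit_Status is
         begin
            if Node.Kind in Ada_Access_Def
                          | Ada_Access_To_Subp_Def_Range
                          | Ada_Base_Type_Access_Def
                          | Ada_Anonymous_Type_Access_Def_Range
                          | Ada_Type_Access_Def_Range
            then
               Found := True;
               return Stop;  --  one hit is enough, end the traversal
            end if;
            return Into;     --  otherwise keep descending
         end Find_First;
      begin
         if not Unit.Has_Diagnostics then
            Unit.Root.Traverse (Find_First'Access);
            if Found then
               Put_Line (Filename);
            end if;
         end if;
      end;
   end loop;
end Has_Access;
```

For a per-occurrence report like the one ptrfinder1 produces, Into remains the right choice, because Stop would hide any later access types in the same file.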
To make the program a more useful tool within a development environment based on GNAT Pro, I integrated it with the piped output of the 'gprls' program.
gprls is a tool that outputs information about compiled sources. It gives the relationship between objects, unit names, and source files. It can also be used to check source dependencies as well as various other characteristics.
My program can then be invoked as part of a more complex command line:
$ gprls -s -P test.gpr | ./ptrfinder1
Given the following content of test.gpr:
project Test is
   for Languages use ("Ada");
   for Source_Dirs use (".");
   for Object_Dir use "obj";
end Test;
Plus an Ada source code file called inc_ptr1.adb (in the same directory as test.gpr) containing the following:
procedure Inc_Ptr1 is

   type Ptr is access all Integer;
begin
   null;
end Inc_Ptr1;
The resulting output from the integration of gprls and my program is:
/home/pike/Workspace/access-detector/test/inc_ptr1.adb:3
This output correctly identified the access type usage on line 3 of inc_ptr1.adb.
But how do I know that my program or indeed Libadalang has functioned correctly?
I decided to stick in principle to the UNIX philosophy of "Do One Thing and Do it Well" and write a second program to verify the output of my first program using a simple algorithm.
This second program is given a filename and line number and verifies that the keyword "access" appears on the specified line.
Of course, I could also have embedded this verification into the first program, but to illustrate a point about diversity I chose not to.
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Directories; use Ada.Directories;
with Ada.Strings; use Ada.Strings;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;
with Ada.IO_Exceptions;
procedure ptrfinder2 is
begin
   Read_Standard_Input:
   while not End_Of_File(Standard_Input)
   loop
      Process_Standard_Input:
      declare
         --  Each input line has the form "filename:line".
         Std_Input : constant String := Get_Line;
         Delimiter_Position : constant Natural := Index(Std_Input,":");
         Line_Number_As_String : constant String := Std_Input(Delimiter_Position+1..Std_Input'Last);
         Line_Number : constant Integer := Integer'Value(Line_Number_As_String);
         Filename : constant String := Std_Input(Std_Input'First..Delimiter_Position-1);
         The_File : File_Type;
         Verified : Boolean := False;
      begin
         --  Line numbers start at 1, so line 1 is a valid target.
         if Ada.Directories.Exists(Filename) and then Line_Number >= 1
         then
            Open(File => The_File, Mode => In_File, Name => Filename);
            Locate_Line:
            for I in 1..Line_Number loop
               exit Locate_Line when End_Of_File(The_File);
               Read_Line:
               declare
                  Current_Line : constant String := Get_Line(The_File);
               begin
                  --  Only the reported line counts; earlier lines are skipped.
                  if I = Line_Number then
                     Verified := Index(Current_Line," access ") > 0;
                  end if;
               end Read_Line;
            end loop Locate_Line;
            Close(File => The_File);
         end if;
         if Verified then
            Put_Line("Access Type Verified on line #" & Line_Number_As_String & " of " & Filename);
         else
            Put_Line("Suspected Access Type *NOT* Verified on line #" & Line_Number_As_String & " of " & Filename);
         end if;
      end Process_Standard_Input;
   end loop Read_Standard_Input;
end ptrfinder2;
I can then string the first and second program together:
$ gprls -s -P test.gpr | ./ptrfinder1 | ./ptrfinder2
This produces the output:
Access Type Verified on line #3 of /home/pike/Workspace/access-detector/test/inc_ptr1.adb
It goes without saying that a set of Ada sources with no Access Type usage will result in no output from either the first or second program.
This expedition into Libadalang has reminded me how effective Ada can be for writing software development tools.
The two programs described in this blog post were built and tested on 64-bit Ubuntu 19.10 using GNAT Pro and Libadalang. They are also known to build successfully with the 64-bit Linux version of GNAT Community 2019.
The source code can be downloaded and built from GitHub.