Cross-referencing Ada with Libadalang

by Pierre-Marie de Rodat , Raphaël Amiard – Dec 18, 2017

Libadalang has come a long way since the last time we blogged about it. In the past 6 months, we have been working tirelessly on name resolution, a pretty complicated topic in Ada, and it is finally ready enough that we feel ready to blog about it, and encourage people to try it out.

WARNING: While pretty far along, the work is still not finished. It is expected that some statements and declarations are not yet resolved. You might also run into the occasional crash. Feel free to report that on our github!

In our last blog post, we learned how to use Libadalang’s lexical and syntactic analyzers in order to highlight Ada source code. You may know websites that display source code with cross-referencing information: this makes it possible to navigate from references to declarations. For instance elixir, Free Electrons’ Linux source code explorer: go to a random source file and click on an identifier. This kind of tool makes it very easy to explore an unknown code base.

So, we extended our code highlighter to generate cross-references links, as a showcase of Libadalang’s semantic analysis abilities. If you are lazy, or just want to play with the code, you can find a compilable set of source files for it at Libadalang’s repository on GitHub (look for ada2web.adb). If you are interested in how to use name resolution in your own programs, we will use this blog post to show how to use Libadalang’s name resolution to expand our previous code highlighter.

Note that if you haven’t read the previous blog post, we recommend you to read it as below, we assume familiarity with topics from it.

Where are my source files?

Unlike lexical and syntactic analysis, which process source files separately, semantic analysis works on a set of source files, or more precisely on a source files plus all its dependencies. This is logical: in order to understand an object declaration in foo.ads, one needs to know about the corresponding type, and if the type is declared in another source file (say bar.ads), both files are required for analysis.

By default, Libadalang assumes that all source files are in the current directory. That’s enough for toy source files, but not at all for real world projects, which are generally spread over multiple directories in a complex nesting scheme. Libadalang can’t know about the files layout of all Ada projects in the world, so we created an abstraction that enables anyone to tell it how to reach source files: the Libadalang.Analysis.Unit_Provider_Interface interface type. This type has exactly one abstract primitive: Get_Unit which, given a unit name and a unit kind (specification or body?) calls Analysis_Context’s Get_From_File or Get_From_Buffer to create the corresponding analysis unit.

In the context of a source code editor (for instance), this allows Libadalang to query a source file even if this file exists only in memory, not in a real source file, or if it’s more up-to-date in memory. Using a custom unit provider in Libadalang is easy: dynamically allocate a concrete implementation of this interface, then pass it to the Unit_Provider formal in Analysis_Context’s constructor: the Create function. Libadalang will take care of deallocating this object when the context is destroyed.

declare
   UP  : My_Unit_Provider_Access :=
      new My_Unit_Provider_Type …;
   Ctx : Analysis_Context := Create (Unit_Provider => UP);
   --  UP will be queried when performing name resolution
begin
   --  Do useful things, and then when done…
   Destroy (Ctx);
end;

Nowadays, a lot of Ada projects use GPRbuild and thus have a project file. That’s fortunate: project files give us exactly the information Libadalang needs: where are source files, what’s their naming scheme. Because of this, Libadalang provides a tagged type that implements this interface to deal with project files: Project_Unit_Provider_Type, from the Libadalang.Unit_Files.Projects package. In order to do this, one first need to load the project file using GNATCOLL.Projects:

declare
   Project_File : GNATCOLL.VFS.Virtual_File;
   Project      : GNATCOLL.Projects.Project_Tree_Access;
   Env          : GNATCOLL.Projects.Project_Environment_Access;
   UP           : Libadalang.Analysis.Unit_Provider_Access;
   Ctx          : Libadalang.Analysis.Analysis_Context; 
begin
   --  First load the project file
   Project := new Project_Tree;
   Initialize (Env);

   --  Initialize Project_File, set the target, create
   --  scenario variables, …
   Project.Load (Project_File, Env);

   --  Now create the unit provider and the analysis context.
   --  Is_Project_Owner is set to True so that the project
   --  is deallocated when UP is destroyed.
   UP := new Project_Unit_Provider_Type’
     (Create (Project, Env, True));
   Ctx := Create (Unit_Provider => UP);

   --  Do useful things, and then when done…
   Destroy (Ctx);
end;

Now that Libadalang knows where the source files are, we can ask it to resolve names!

Let’s jump to definitions

Just like in the highligther, most of the website generator will consist of asking Libadalang to parse source files (Get_From_File), checking for lexing/parsing errors (Has_Diagnostics, Diagnostics) and then dealing with AST nodes and tokens in analysis units. The new bits here are turning identifiers into hypertext links to redirect to their definition. As for highlighting classes, we do this token annotation with an array and a tree traversal:

Unit : Analysis_Unit := …;
--  Analysis unit to process

Xrefs : array (1 .. Token_Count (Unit)) of Basic_Decl :=
   (others => No_Basic_Decl);
--  For each token, the declaration to which the token should
--  link or No_Basic_Decl for no cross-reference.

function Process_Node
  (Node : Ada_Node’Class) return Visit_Status;
--  Callback for AST traversal. For string literals and
--  identifiers, annotate the corresponding in Xrefs to the
--  designated declaration, if found.

With these declarations, we can do the annotations easily:

Root (Unit).Traverse (Process_Node’Access);

But how Process_Node does its magic? That’s easy too:

function Process_Node
  (Node : Ada_Node’Class) return Visit_Status is
begin
   --  Annotate only tokens for string literals and
   --  identifiers.
   if Node.Kind not in Ada_String_Literal | Ada_Identifier
   then
      return Into;
   end if;

   declare
      Token : constant Token_Type :=
         Node.As_Single_Tok_Node.F_Tok;
      Idx   : constant Natural := Natural (Index (Token));
      Decl  : Basic_Decl renames Xrefs (Idx);
   begin
      Decl := Node.P_Referenced_Decl;
   exception
       when Property_Error => null;
   end;
end Process_Node;

String literal and identifier nodes both inherit from the Single_Tok_Node abstract node, hence the conversion to retrieve the underlying token. Then we locate which cell in the Xrefs array they correspond to. And finally we fill it with the result of the P_Referenced_Decl primitive. This function tries to fetch the declaration corresponding to Node. Easy I said!

What’s the exception handler for, you might ask, though? What we call AST node properties (all functions whose name starts with P_) can raise Property_Error exceptions. These can happen if Libadalang works on invalid Ada sources and cannot find query results. As name resolution is still actively developed, it can happen that this exception is raised even for valid source code: if that happens to you, please report this bug! Note that if a property raises an exception that is not a Property_Error, this is another kind of bug: please report it too!

Bind it all together

Now we have a list of Basic_Decl nodes to create hypertext links, but how can we do that? The trick is to get the name of the source file that contains this declaration, plus its source location:

Decl_Unit : constant Analysis_Unit := Decl.Get_Unit;
Decl_File : constant String := Get_Filename (Decl_Unit);
Decl_Line : constant Langkit_Support.Slocs.Line_Number :=
   Decl.Sloc_Range.Start_Line;

Then you can turn this information into a hypertext link. For example, if you generate X.html for the X source file (foo.ads.html for foo.ads, …) and generate LY HTML anchors for line number Y:

Line_No : constant String :=
   Natural'Image (Natural (Decl_Line));
Href : constant String :=
   Decl_File & ".html#L"
   & Line_No (Line_No'First + 1 .. Line_No'Last);

Some amount of plumbing is still needed to have a complete website generator:

get a list of all source files to process in the loaded project, using GNATCOLL.Projects’ API;
actually output HTML code: code from the previous blog can be reused and updated to do this;
generate an index HTML file as an entry point for navigation.

But as usual, covering all these topics will get out of the scope of this blog post and will make an unreasonable long essay. So thank you once more for reading this post to the end!

Posted in #Libadalang #Ada

About Pierre-Marie de Rodat

Pierre-Marie joined AdaCore in 2013, after he got an engineering degree at EPITA (IT engineering school in Paris). He mainly works on GNATcoverage, GCC, GDB and Libadalang.

About Raphaël Amiard

Raphaël Amiard is a software engineer at AdaCore working on tooling and compiler technologies. He joined AdaCore in 2013, after an internship on AdaCore's IDEs. His main interests are compiler technologies, language design, and sound/music making.