Ada, SPARK, and Mathematics

It’s back!

I wrote a while back about completing a programming competition with the programming language Ada. I’ve also been interested in SPARK, a dialect of Ada where the compiler can prove a program’s formal correctness. Much of the 2014 revision of SPARK is based on Design by Contract (DbC), a programming methodology first described by Bertrand Meyer, designer of the Eiffel programming language.

Eiffel and Design by Contract

Eiffel?

In fact, Eiffel was an earlier love of mine. When I got stuck on my PhD dissertation, I wrote an Eiffel program to explore the problems I was studying, and it was while studying one of its outputs that I discovered an unexpected phenomenon that led to my first paper (and, thus, my PhD). That was back when the SmallEiffel / SmartEiffel was a going thing; alas, its main developers abandoned it, and for various reasons I decided to switch to Python and C++, depending on need.

Design by Contract?

In Eiffel, the idea of DbC is that a feature specifies not only its inputs and output, but also certain “contracts” on said inputs and output.

We see some limited examples of this in more popular programming languages such as Java, Kotlin, and Rust, where the programmer can “decorate” a function’s parameter as @NonNull. This is a kind of contract, where the client can use the function not only if he passes in the correct reference type, but also if that reference is not null. It’s a requirement on the input.

Eiffel made this sort of thing more explicit and more enforceable by introducing the keywords requires and ensures. In the following Eiffel program excerpt, a client can use the feature A only when the parameters are positive. In return, A “promises” that its result will also be positive.

feature { ANY }

   f ( a : INTEGER ; b : INTEGER ) : INTEGER
   require
      a > 0 and then b > 0
   do
      -- imagine some excellent code here
   ensure
      Result > 0
   end

Eiffel has a lot of other nice features, as well, such as provable null safety (“void safety” in Eiffel terminology). I’m reasonably sure that Eiffel had this before Rust came along, but Eiffel doesn’t receive as much attention as Rust.

Two drawbacks

One drawback is that Eiffel checks contracts at run-time, rather than compile-time. The methodology is to run a “debug version” of the system which checks contracts, allowing the development team to find errors. The violation of contracts is a huge help to determining where problems occur. Once the team feels confident that it has worked out the problems, it can release a version with contracts disabled, but the “debug” version will remain very slow as contracts are checked. If I recall correctly, the testers can refine this, enabling and disabling contracts in different parts of the code, which may mitigate this somewhat.

A second drawback is that Eiffel is a “pure” object-oriented language which requires a lot of boilerplate code. Every program must be defined as a class of some sort.

Not only Eiffel!

The idea and elegance of DbC remained impressed on my mind, so when I discovered Ada 2012 a few years back, I was delighted to find that they had added contracts to the language. I’m aware that some other languages also offer some support for DbC, and I’ve experimented with Kotlin’s implementation, but it isn&srquo;t as nice as Eiffel’s or Ada’s, and nowhere near SPARK’s.

Rather notably, Rust, the current “hot” language for “safe and secure programming”, does not implement DbC natively. A crate exists to add on this functionality, but the crate’s implementation is similar to Eiffel’s, in that it explicitly states that contracts are not checked at compile-time.

SPARK’s implementation

In principle, Ada could imitate Eiffel and check contracts at runtime on a “debug” build. I’m not sure it does that, since I haven’t yet gotten around to checking. I’ve been too fascinated with the possibilities offered by SPARK.

A distinctive feature of SPARK’s implementation of DbC, as opposed to Eiffel’s implementation, is that

The checker can prove a program satisfies contracts at compile-time,
or, if not,
indicate where such a proof falters!

It seems that SPARK does not have a “compiler” in the traditional sense. SPARK has a proof checker; you run a separate program on your code to check that the specified contracts are specified. When if the proof checker signs off on the code, you don’t yet have an executable. To compile SPARK code, one uses the usual Ada compiler.

I ran the examples below using AdaCore’s “GPS”, an Ada-capable IDE that also works with other languages, such as C++. It’s relatively to call the proof checker and set its options that way. It’s also possible to use the command line: gnatprove will run the proof checker, and gnatmake will build an executable. I’ll spare the reader the details of project files, which lie beyond the scope of this article.

Example

Here’s an extract of a work-in-progress Ada package that specifies monomial arithemetic via a package (in Ada jargon). It does not implement most features, though; most require a package body). However, Is_Divisible is a simple enough function that we can demonstrate “inline implementation” of “expression functions”. (I think I have the Ada jargon correct there.)

package Monomials with SPARK_Mode => On is
-- we can turn off SPARK_Mode when needed

type Exponent_Vector is array (Natural range <>) of Natural;
-- this will store exponents

type Monomial (Number_of_Variables: Positive) is record
-- a monomial is little more than an exponent vector,
-- but this can be expanded if need be

    Exp: Exponent_Vector(1..Number_of_Variables);
    Is_Initialized: Boolean;

end record;

type Term (Number_Of_Variables: Positive) is record
-- a term is the product of a monomial and a coeff

    Coeff: Natural;
    Mon: Monomial(Number_Of_Variables);

end record;

procedure Init(T: out Monomial; A: in Exponent_Vector) with
      Relaxed_Initialization => T,
      Pre => (
          A'First = 1 and A'First <= A'Last and A'Last = T.Number_of_Variables
      ),
      Post => (
          T.Is_Initialized and then T'Initialized and then A = T.Exp
      )
;
-- initalizes T's exponents to the values in A
-- notice how the postcondition guarantees initialization!

procedure Set_Exponent(T: in out Monomial; I: in Positive; To: Natural) with
   Pre => I <= T.Number_Of_Variables,
   Post => T.Exp(I) = To;
-- sets the ith exponent of T to the value of To

function Is_Divisible(First, Second: in Monomial) return Boolean
is ( for all I in Second.Exp'Range => First.Exp(I) >= Second.Exp(I) )
      with Pre => (
                      First.Is_Initialized and then Second.Is_Initialized and then
                      First.Exp'First = Second.Exp'First and then
                         First.Exp'Last = Second.Exp'Last
                  ),
      Post => (
                  Is_Divisible'Result =
                     (
                            for all I in 1..Second.Number_Of_Variables =>
                               First.Exp(I) >= Second.Exp(I)
                     )
     );
-- returns True if and only if First is divisible by Second
-- notice how the precondition requires initialization!

function "/" (Num: in Monomial; Den: in Monomial) return Monomial with
      Pre => (
                 Num.Is_Initialized and then Den.Is_Initialized and then
                 Num.Exp'First = Den.Exp'First and then
                    Num.Exp'Last = Den.Exp'Last and then
                       Is_Divisible( Num, Den )
      ),
      Post => (
         "/"'Result.Exp'Last = Num.Exp'Last and then
         "/"'Result.Number_Of_Variables = Num.Number_Of_Variables and then
         ( for all I in 1..Den.Number_Of_Variables =>
            "/"'Result.Exp(I) = Num.Exp(I) - Den.Exp(I) )
      )
;
-- returns the quotient of Monomial and Den
-- notice how the precondition requires divisibility and initialization!

procedure Divide_By(T: in out Monomial; U: in Monomial) with
   Pre => (
      T.Is_Initialized and then U.Is_Initialized and then
      T.Number_Of_Variables = U.Number_Of_Variables and then
      T.Exp'First <= T.Exp'Last and then
      Is_Divisible(T, U)
   ),
   Post => (
      T.Number_Of_Variables = T'Old.Number_Of_Variables and then
      ( for all I in 1..T.Number_Of_Variables =>
            T.Exp(I) = T'Old.Exp(I) - U.Exp(I) )
   );
-- divides T by U and stores the result in T
-- notice how the precondition requires divisibility and initialization!

procedure Put(T: in Monomial);

end Monomials;

The full package has a lot more, but already you can see a few interesting points of Ada and SPARK.

Ada allows the programmer to turn SPARK on and off.
Ada requires the programmer to be very specific about types:
- it does not usually convert automatically from one type to another, only from subtypes to supertypes, like Natural to Integer, and
- type compatibility is nominal (by name) rather than structural (by the type’s structure).
Arrays don’t have to start at 0.
- Ada allows the programmer to specify the start and end of every array, which is much more convenient than most other languages. It allows indexing to be appropriate for the problem, rather than convenient for the compiler.
- Ada programmers can index array by enumerations as well as integers. I’ve found this to be very useful in many situations, though here it isn’t terribly helpful.
Preconditions and postconditions can be highly non-trivial, but can also be expressed rather “naturally”. Most of the postconditions above contain a for all expression which verifies everything in the specified range.

I had quite a bit of trouble getting the contracts to pass muster on Is_Divisible, if I recall correctly. I think I had originally stated simply that First’s number of variables had to equal Second’s number of variables, but the proof checker would fail. Let’s look at that function again.

function Is_Divisible(First, Second: in Monomial) return Boolean
is ( for all I in Second.Exp'Range => First.Exp(I) >= Second.Exp(I) )
      with Pre => (
                      First.Is_Initialized and then Second.Is_Initialized and then
                      First.Exp'First = Second.Exp'First and then
                         First.Exp'Last = Second.Exp'Last
                  ),
      Post => (
                  Is_Divisible'Result =
                     (
                            for all I in 1..Second.Number_Of_Variables =>
                               First.Exp(I) >= Second.Exp(I)
                     )
     );

Remember that arrays can start and end anywhere. The function’s implementation loops through Second’s exponents, and checks whether First’s exponent of the same index is no smaller than Second’s. That needed to fail, because if First and Second started and ended in different locations, that loop would have gone outside First’s range.

In short, SPARK saved my bacon here.
Most other languages would not!

Using it

Here’s a short program I wrote to test the code.

with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
with Monomials; use Monomials;

procedure Main with SPARK_Mode => On is

  O: constant Weight_Vector := Grevlex(5);

  T, U: Monomial(5);
  V   : Monomial := (
                     Is_Initialized      => True,
                     Number_Of_Variables => 5,
                     Exp                 => ( 0, 0, 0, 0, 0 )
                    );

  A: constant Exponent_Vector(1..5) := ( 1, 2, 3, 1, 0);
  B: constant Exponent_Vector(1..5) := ( 1 => 1, others => 0 );

begin

   -- initialize T and U to the exponents indicated in A and B
   Init(T, A);
   Init(U, B);
   
   -- let V be the product of T and U
   if ( for all I in T.Exp'Range => Natural'Last - U.Exp(I) >= T.Exp(I) ) then
      V := T * U;
   end if;
   Put(V); New_Line;
   
   -- check whether T is divisible by U
   if Is_Divisible(T, U) then Put( T / U );
   else Put(T); Put(" is not divisible by "); Put(U); New_Line(1);
   end if;
   New_Line;
   
   -- output a line showing that V is indeed the product of T and U
   Put(V); Put(" = "); Put(T); Put(" * "); Put(U); Put("? ");
   New_Line;
   
end Main;

Suppose I comment out the first few lines, where I initialize T and U, and skip directly to testing if T is divisible by U. The proof checker will fail with several errors, the most interesting for my purposes being the following:

rr:cc    medium: precondition might fail
           cannot prove First.Is_Initialized

SPARK detects that the program has failed to satisfy a precondition. Sloppy initialization has been a frequent source of bugs in just about any language, at least for me.

Won’t other solutions handle this problem?

You might wonder if information hiding and constructors suffice. It’s true that I have exposed my types’ components. I could in fact make them private, and with the requirement of default construction that most object-oriented languages impose — as well as many non-object-oriented languages — that would have eliminated this particular error. So, sure, DbC is not strictly necessary for this case.

But that’s an ad hoc solution to a particular instance of the larger problem, which is that developers often don’t think about a subprogram’s requirements, especially subtle cases easily forgotten. (Is that pointer null? could it be dangling? Is that integer you’re about to divide by zero?) In many cases, developers may not even know the the requirements! I’ve seen far too many examples of poorly documented, even non-documented code, to imagine that we don’t need DbC. As far as I can tell, SPARK has the best implementation of DbC, hands down.

I concede that a programmer unlikely to document code is unlikely to write good contracts for the code. However, if I write contracts, and if the build system can check them, then even a bad programmer will have to respect them.

It works for generics, too!

Here’s another example of contracts, this time illustrating how they can handle contracts. The two packages listed specify, then implement the Euclidean algorithm. Notice how preconditions and postconditions appear only in the specification of Gcd. We do not repeat them in the implementation.

generic

   -- this is a “generic” package specification
   -- everything that appears between the keywords “generic”
   -- and “package” specifies parameters the client must specify
   -- in this case, the client must specify “Ring_Element”,
   -- a function to test equality, and a function to computer a remainder

   type Ring_Element is private;
   -- specifies the type of objects for which we want to compute gcd's;
   -- algebraically speaking, this should be a Euclidean Ring.

   with function "=" ( Left, Right: Ring_Element ) return Boolean is <>;
   -- should returns True iff Left and Right have the same value.
   -- the clause “is <>” indicates that the client need not specify “=”
   -- if one is commonly available

   with function "rem" ( Dividend, Divisor: Ring_Element )
                        return Ring_Element is <>;
   -- returns the remainder of dividing the Dividend by the Divisor;
   -- to guarantee termination, the result should be "smaller" than Divisor.
   -- how precisely this is accomplished can depend on the ring,
   -- and need not be specified.

   Zero: Ring_Element;
   -- needs to be the ring's additive identity

package Euclidean is

   function Gcd(A, B: Ring_Element) return Ring_Element
      with
         Pre => ( A /= Zero and B /= Zero ),
         -- The “Pre”condition indicates that the inputs should be nonzero
         Post => ( Gcd'Result /= Zero )
         -- The “Post”condition indicates that the result will be nonzero
   ;
   -- returns a greatest common divisor (gcd) of A and B

end Euclidean;

package body euclidean is

   function Gcd(A, B: Ring_Element) return Ring_Element
   is

   -- initialize local variables
      M: Ring_Element := A;
      N: Ring_Element := B;

   begin

      -- loop until the smaller is zero
      while N /= Zero loop

         declare R: Ring_Element := M rem N;
         begin
            M := N;
            N := R;
         end;

      end loop;

      -- the compiler will give an error if the following statement is missing
      return M;

   end Gcd;

end euclidean;

Here’s an example program that imports our Euclidean package, (with’s it in Ada jargon), instantiates a concrete version of the generic, then uses it to compute the greatest common divisor of two numbers.

-- @summary demonstration of SPARK with a generic package
--
-- @author John Perry
--
-- @description
-- Test program for our generic package.

-- packages we need

with Euclidean;
with Ada.Text_IO;

procedure Main

   with SPARK_Mode => On
   -- setting to On means SPARK will check this procedure when we run GNAT prove

is

   package IO is new Ada.Text_IO.Integer_IO ( Num => Integer );
   -- instantiating the generic Integer_IO package to make integer output

   package Euclidean_ZZ is new Euclidean
      (
       Ring_Element => Integer,
       Zero         => 0
      );
   -- Euclidean_ZZ instantiates our Euclidean package for the integers.
   -- The package requires definition of the ring element type
   -- and a zero element.
   -- It technically requires "=" or "rem" as well, but Ada allows one to define
   -- default values for generic parameters, in which case
   -- the compiler will substitute those values.

begin

   -- print the gcd of -24 and -16, followed by a new line
   IO.Put( Euclidean_ZZ.Gcd( -24, -16 ) );
   Ada.Text_IO.New_Line;

   -- if you uncomment line 46, GNAT prove will report a problem:
   --
   --     medium: precondition might fail
   --       cannot prove B /= 0
   --
   -- IO.Put( Euclidean_ZZ.Gcd( -24,0 ) );

   -- I could avoid fully qualifying the statements above
   -- if I resorted to a "use" statement;
   -- i.e., "use Euclidean_ZZ" and/or "use Ada.Text_IO.Integer_IO"
   -- (the latter defaults to the Integer type).

end Main;

If I un-comment the line IO.Put( Euclidean_ZZ.Gcd( -24, 0 ) ); the checker will fail.

From my perspective as a mathematician and programmer who has often struggled to debug code, that is immensely useful. It took me a while to figure out how to phrase the pre- and postconditions in such a way that SPARK could prove correctness, which might seem odd, since they look relatively straightforward (and obvious, to someone familiar with the material). The difficulty is that I wanted to do this for a generic type, rather than simply for the integers, and the error messages did not make it clear to me how I should specify the generic parameters.

It’s true that the Euclidean algorithm is not a good candidate for this; even if the parameters are both zero, the algorithm I’ve given will terminately correctly even when they are. I have also implemented a package for monomial arithmetic that uses contracts to check for correctness, but that’s a bit too long to copy here.

The fine print

There are some drawbacks to SPARK.

First, it’s not Ada; it’s a dialect of Ada. Some things are disallowed. That hasn’t been a practical problem for me yet, but I haven’t been working with SPARK long, or much.

Second, writing good contracts is hard. I’m reasonably good with logic, but figuring out how to phrase my assumptions in language SPARK understands is not always easy.

Third, SPARK is not a smart as I’d like it to be; if you look at the monomial-testing program above, you’ll notice that I had to wrap a call to the product of two monomials in an if statement. Technically SPARK could figure this out; after all, the values of T’s and U’s exponents are known at compile-time. I get the reason for this in general; to prove the preconditions for an arbitrary monomial, I have wrap the product inside a check. Sure, failing to do that is indeed problematic and a source of bugs that, for instance, I myself have encountered before. But it’s still not strictly necessary here, and I wish the prover could have determined that.

Nevertheless, getting past the obstacles has made me enjoy working with SPARK, and as the community (or at least AdaCore) continues to develop it actively, SPARK could be useful for mathematics, especially in a world where mathematicians sometimes express concern that it is difficult to ascertain the correctness of a computer algebra system.