Ada, SPARK, and Mathematics
It’s back!
I wrote a while back about
completing a programming
competition with the programming language Ada.
I’ve also been interested in
SPARK, a dialect of Ada where
the compiler can prove a program’s formal correctness.
Much of the 2014 revision of SPARK is based on
Design by Contract (DbC),
a programming methodology first described by
Bertrand Meyer,
designer of the
Eiffel
programming language.
Eiffel and Design by Contract
Eiffel?
In fact, Eiffel was an earlier love of mine.
When I got stuck on my PhD dissertation, I wrote an Eiffel program
to explore the problems I was studying,
and it was while studying one of its outputs
that I discovered an unexpected phenomenon that led to my first paper
(and, thus, my PhD).
That was back when the SmallEiffel / SmartEiffel was a going thing;
alas, its main developers abandoned it,
and for various reasons I decided to switch to Python and C++,
depending on need.
Design by Contract?
In Eiffel, the idea of DbC is that a feature specifies
not only its inputs and output,
but also
certain “contracts” on said inputs and output.
We see some limited examples of this in more popular programming languages
such as Java, Kotlin, and Rust, where the programmer can “decorate”
a function’s parameter as
@NonNull
.
This is a kind of contract, where the client can use the function
not only if he passes in the correct reference type,
but also
if that reference is not null. It’s a
requirement on the input.
Eiffel made this sort of thing more explicit
and more enforceable
by introducing the keywords
requires
and
ensures
.
✝(There were others,
including invariant
,
but I don’t want to get too carried away.
In the following Eiffel program excerpt, a client can use the feature
A
only when the parameters are positive.
In return,
A
“promises” that its result
will also be positive.
feature { ANY }
f ( a : INTEGER ; b : INTEGER ) : INTEGER
require
a > 0 and then b > 0
do
-- imagine some excellent code here
ensure
Result > 0
end
Eiffel has a lot of other nice features, as well, such as provable null safety
(“void safety” in Eiffel terminology).
I’m reasonably sure that Eiffel had this before Rust came along,
but Eiffel doesn’t receive as much attention as Rust.
Two drawbacks
One drawback is that Eiffel checks contracts at run-time,
rather than compile-time.
The methodology is to run a “debug version” of the system
which checks contracts, allowing the development team to find errors.
The violation of contracts is a
huge help to determining where problems
occur.
Once the team feels confident that it has worked out the problems,
it can release a version with contracts disabled,
but the “debug” version will remain very slow
as contracts are checked. If I recall correctly, the testers can refine this,
enabling and disabling contracts in different parts of the code,
which may mitigate this somewhat.
A second drawback is that Eiffel is a “pure”
object-oriented language which requires a lot of boilerplate code.
Every program must be defined as a class of some sort.
Not only Eiffel!
The idea and elegance of DbC remained impressed on my mind,
so when I discovered Ada 2012 a few years back, I was delighted to find
that they had added contracts to the language.
I’m aware that some other languages also offer
some support for DbC,
and I’ve experimented with Kotlin’s implementation,
but it isn&srquo;t as nice as Eiffel’s or Ada’s,
and nowhere near SPARK’s.
Rather notably, Rust, the current “hot” language
for “safe and secure programming”, does not implement DbC natively.
A crate exists
to add on this functionality, but the crate’s implementation is similar
to Eiffel’s, in that it explicitly states that contracts are
not
checked at compile-time.
SPARK’s implementation
In principle, Ada could imitate Eiffel and check contracts at runtime
on a “debug” build.
I’m not sure it does that,
since I haven’t yet gotten around to checking.
I’ve been too fascinated with the possibilities offered by SPARK.
A distinctive feature of SPARK’s implementation of DbC,
as opposed to Eiffel’s implementation, is that
The checker can prove a program satisfies contracts at compile-time,
or, if not,
indicate where such a proof falters!
It seems that SPARK does not have a “compiler”
in the traditional sense.
SPARK has a proof checker; you run a separate program on your code
to check that the specified contracts are specified.
When if the proof checker signs off on the code,
you don’t yet have an executable.
To compile SPARK code, one uses the usual Ada compiler.
I ran the examples below using AdaCore’s “GPS”,
an Ada-capable IDE that also works with other languages, such as C++.
It’s relatively to call the proof checker and set its options that way.
It’s also possible to use the command line:
gnatprove
will run the proof checker,
and
gnatmake
will build an executable.
I’ll spare the reader the details of project files,
which lie beyond the scope of this article.
Example
Here’s an extract of a work-in-progress Ada package that specifies
monomial arithemetic via a
package
(in Ada jargon).
It does not implement most features, though; most require a
package body
).
However,
Is_Divisible
is a simple enough function
that we can demonstrate “inline implementation”
of “expression functions”.
(I think I have the Ada jargon correct there.)
package Monomials with SPARK_Mode => On is
-- we can turn off SPARK_Mode when needed
type Exponent_Vector is array (Natural range <>) of Natural;
-- this will store exponents
type Monomial (Number_of_Variables: Positive) is record
-- a monomial is little more than an exponent vector,
-- but this can be expanded if need be
Exp: Exponent_Vector(1..Number_of_Variables);
Is_Initialized: Boolean;
end record;
type Term (Number_Of_Variables: Positive) is record
-- a term is the product of a monomial and a coeff
Coeff: Natural;
Mon: Monomial(Number_Of_Variables);
end record;
procedure Init(T: out Monomial; A: in Exponent_Vector) with
Relaxed_Initialization => T,
Pre => (
A'First = 1 and A'First <= A'Last and A'Last = T.Number_of_Variables
),
Post => (
T.Is_Initialized and then T'Initialized and then A = T.Exp
)
;
-- initalizes T's exponents to the values in A
-- notice how the postcondition guarantees initialization!
procedure Set_Exponent(T: in out Monomial; I: in Positive; To: Natural) with
Pre => I <= T.Number_Of_Variables,
Post => T.Exp(I) = To;
-- sets the ith exponent of T to the value of To
function Is_Divisible(First, Second: in Monomial) return Boolean
is ( for all I in Second.Exp'Range => First.Exp(I) >= Second.Exp(I) )
with Pre => (
First.Is_Initialized and then Second.Is_Initialized and then
First.Exp'First = Second.Exp'First and then
First.Exp'Last = Second.Exp'Last
),
Post => (
Is_Divisible'Result =
(
for all I in 1..Second.Number_Of_Variables =>
First.Exp(I) >= Second.Exp(I)
)
);
-- returns True if and only if First is divisible by Second
-- notice how the precondition requires initialization!
function "/" (Num: in Monomial; Den: in Monomial) return Monomial with
Pre => (
Num.Is_Initialized and then Den.Is_Initialized and then
Num.Exp'First = Den.Exp'First and then
Num.Exp'Last = Den.Exp'Last and then
Is_Divisible( Num, Den )
),
Post => (
"/"'Result.Exp'Last = Num.Exp'Last and then
"/"'Result.Number_Of_Variables = Num.Number_Of_Variables and then
( for all I in 1..Den.Number_Of_Variables =>
"/"'Result.Exp(I) = Num.Exp(I) - Den.Exp(I) )
)
;
-- returns the quotient of Monomial and Den
-- notice how the precondition requires divisibility and initialization!
procedure Divide_By(T: in out Monomial; U: in Monomial) with
Pre => (
T.Is_Initialized and then U.Is_Initialized and then
T.Number_Of_Variables = U.Number_Of_Variables and then
T.Exp'First <= T.Exp'Last and then
Is_Divisible(T, U)
),
Post => (
T.Number_Of_Variables = T'Old.Number_Of_Variables and then
( for all I in 1..T.Number_Of_Variables =>
T.Exp(I) = T'Old.Exp(I) - U.Exp(I) )
);
-- divides T by U and stores the result in T
-- notice how the precondition requires divisibility and initialization!
procedure Put(T: in Monomial);
end Monomials;
The full package has a lot more, but already you can see
a few interesting points of Ada and SPARK.
- Ada allows the programmer to turn SPARK on and off.
- Ada requires the programmer to be very specific about types:
- it does not usually convert automatically from one type to another,
only from subtypes to supertypes, like
Natural
to Integer
, and
- type compatibility is nominal (by name)
rather than structural (by the type’s structure).
- Arrays don’t have to start at 0.
- Ada allows the programmer to specify the start and end of every array,
which is much more convenient than most other languages.
It allows indexing to be appropriate for the problem,
rather than convenient for the compiler.
- Ada programmers can index array by enumerations as well as integers.
I’ve found this to be very useful in many situations,
though here it isn’t terribly helpful.
- Preconditions and postconditions can be highly non-trivial,
but can also be expressed rather “naturally”.
Most of the postconditions above contain a
for all
expression
which verifies everything in the specified range.
I had quite a bit of trouble getting the contracts to pass muster on
Is_Divisible
, if I recall correctly.
I think I had originally stated simply that
First
’s number of variables had to equal
Second
’s number of variables,
but the proof checker would fail.
Let’s look at that function again.
function Is_Divisible(First, Second: in Monomial) return Boolean
is ( for all I in Second.Exp'Range => First.Exp(I) >= Second.Exp(I) )
with Pre => (
First.Is_Initialized and then Second.Is_Initialized and then
First.Exp'First = Second.Exp'First and then
First.Exp'Last = Second.Exp'Last
),
Post => (
Is_Divisible'Result =
(
for all I in 1..Second.Number_Of_Variables =>
First.Exp(I) >= Second.Exp(I)
)
);
Remember that arrays can start and end anywhere.
The function’s implementation loops
through
Second
’s exponents,
and checks whether
First
’s exponent
of the same index is no smaller than
Second
’s.
That
needed to fail, because if
First
and
Second
started and ended in different locations,
that loop would have gone outside
First
’s range.
In short, SPARK saved my bacon here.
Most other languages would not!
Using it
Here’s a short program I wrote to test the code.
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
with Monomials; use Monomials;
procedure Main with SPARK_Mode => On is
O: constant Weight_Vector := Grevlex(5);
T, U: Monomial(5);
V : Monomial := (
Is_Initialized => True,
Number_Of_Variables => 5,
Exp => ( 0, 0, 0, 0, 0 )
);
A: constant Exponent_Vector(1..5) := ( 1, 2, 3, 1, 0);
B: constant Exponent_Vector(1..5) := ( 1 => 1, others => 0 );
begin
-- initialize T and U to the exponents indicated in A and B
Init(T, A);
Init(U, B);
-- let V be the product of T and U
if ( for all I in T.Exp'Range => Natural'Last - U.Exp(I) >= T.Exp(I) ) then
V := T * U;
end if;
Put(V); New_Line;
-- check whether T is divisible by U
if Is_Divisible(T, U) then Put( T / U );
else Put(T); Put(" is not divisible by "); Put(U); New_Line(1);
end if;
New_Line;
-- output a line showing that V is indeed the product of T and U
Put(V); Put(" = "); Put(T); Put(" * "); Put(U); Put("? ");
New_Line;
end Main;
Suppose I comment out the first few lines, where I initialize
T
and
U
,
and skip directly to testing if
T
is divisible by
U
.
The proof checker will fail with several errors,
the most interesting for my purposes being the following:
rr:cc medium: precondition might fail
cannot prove First.Is_Initialized
SPARK detects that the program has failed to satisfy a precondition.
Sloppy initialization has been a frequent source of bugs
in just about any language, at least for me.
Won’t other solutions handle this problem?
You might wonder if information hiding and constructors suffice.
It’s true that I have exposed my types’ components.
I could in fact make them
private
,
and with the requirement of default construction
that most object-oriented languages impose
— as well as many
non-object-oriented languages —
that would have eliminated this particular error.
So, sure, DbC is not strictly necessary for this case.
But that’s an ad hoc solution to a particular instance
of the larger problem, which is that developers often don’t think about
a subprogram’s requirements,
especially subtle cases easily forgotten.
(Is that pointer
null
? could it be dangling?
Is that integer you’re about to divide by zero?)
In many cases,
developers may not even know the the requirements!
I’ve seen far too many examples of poorly documented,
even
non-documented code, to imagine that we don’t need DbC.
As far as I can tell, SPARK has the best implementation of DbC, hands down.
I concede that a programmer unlikely to document code
is unlikely to write good contracts for the code.
However, if
I write contracts, and if the build system can check them,
then even a bad programmer will have to respect them.
It works for generics, too!
Here’s another example of contracts, this time illustrating
how they can handle contracts.
The two packages listed specify, then implement the Euclidean algorithm.
Notice how preconditions and postconditions appear
only in the specification of
Gcd
.
We do not repeat them in the implementation.
generic
-- this is a “generic” package specification
-- everything that appears between the keywords “generic”
-- and “package” specifies parameters the client must specify
-- in this case, the client must specify “Ring_Element”,
-- a function to test equality, and a function to computer a remainder
type Ring_Element is private;
-- specifies the type of objects for which we want to compute gcd's;
-- algebraically speaking, this should be a Euclidean Ring.
with function "=" ( Left, Right: Ring_Element ) return Boolean is <>;
-- should returns True iff Left and Right have the same value.
-- the clause “is <>” indicates that the client need not specify “=”
-- if one is commonly available
with function "rem" ( Dividend, Divisor: Ring_Element )
return Ring_Element is <>;
-- returns the remainder of dividing the Dividend by the Divisor;
-- to guarantee termination, the result should be "smaller" than Divisor.
-- how precisely this is accomplished can depend on the ring,
-- and need not be specified.
Zero: Ring_Element;
-- needs to be the ring's additive identity
package Euclidean is
function Gcd(A, B: Ring_Element) return Ring_Element
with
Pre => ( A /= Zero and B /= Zero ),
-- The “Pre”condition indicates that the inputs should be nonzero
Post => ( Gcd'Result /= Zero )
-- The “Post”condition indicates that the result will be nonzero
;
-- returns a greatest common divisor (gcd) of A and B
end Euclidean;
package body euclidean is
function Gcd(A, B: Ring_Element) return Ring_Element
is
-- initialize local variables
M: Ring_Element := A;
N: Ring_Element := B;
begin
-- loop until the smaller is zero
while N /= Zero loop
declare R: Ring_Element := M rem N;
begin
M := N;
N := R;
end;
end loop;
-- the compiler will give an error if the following statement is missing
return M;
end Gcd;
end euclidean;
Here’s an example program that imports our Euclidean package,
(
with
’s it in
Ada jargon), instantiates a concrete version of the generic,
then uses it to compute the greatest common divisor of two numbers.
-- @summary demonstration of SPARK with a generic package
--
-- @author John Perry
--
-- @description
-- Test program for our generic package.
-- packages we need
with Euclidean;
with Ada.Text_IO;
procedure Main
with SPARK_Mode => On
-- setting to On means SPARK will check this procedure when we run GNAT prove
is
package IO is new Ada.Text_IO.Integer_IO ( Num => Integer );
-- instantiating the generic Integer_IO package to make integer output
package Euclidean_ZZ is new Euclidean
(
Ring_Element => Integer,
Zero => 0
);
-- Euclidean_ZZ instantiates our Euclidean package for the integers.
-- The package requires definition of the ring element type
-- and a zero element.
-- It technically requires "=" or "rem" as well, but Ada allows one to define
-- default values for generic parameters, in which case
-- the compiler will substitute those values.
begin
-- print the gcd of -24 and -16, followed by a new line
IO.Put( Euclidean_ZZ.Gcd( -24, -16 ) );
Ada.Text_IO.New_Line;
-- if you uncomment line 46, GNAT prove will report a problem:
--
-- medium: precondition might fail
-- cannot prove B /= 0
--
-- IO.Put( Euclidean_ZZ.Gcd( -24,0 ) );
-- I could avoid fully qualifying the statements above
-- if I resorted to a "use" statement;
-- i.e., "use Euclidean_ZZ" and/or "use Ada.Text_IO.Integer_IO"
-- (the latter defaults to the Integer type).
end Main;
If I un-comment the line
IO.Put( Euclidean_ZZ.Gcd( -24, 0 ) );
the checker will fail.
From my perspective as a mathematician and programmer who has often struggled
to debug code,
that is immensely useful.
It took me a while to figure out how to phrase the pre- and postconditions
in such a way that SPARK could prove correctness, which might seem odd,
since they look relatively straightforward (and obvious, to someone familiar
with the material).
The difficulty is that I wanted to do this for a generic type,
rather than simply for the integers,
and the error messages did not make it clear to me how I should specify
the generic parameters.
It’s true that the Euclidean algorithm is not a good candidate for this;
even if the parameters are both zero,
the algorithm I’ve given will terminately correctly even when they are.
I have also implemented a package for monomial arithmetic that uses contracts
to check for correctness, but that’s a bit too long to copy here.
The fine print
There are some drawbacks to SPARK.
First, it’s not Ada; it’s a
dialect of Ada.
Some things are disallowed.
That hasn’t been a practical problem for me yet,
but I haven’t been working with SPARK long, or much.
Second, writing good contracts is
hard.
I’m reasonably good with logic, but
figuring out how to phrase my assumptions in language SPARK understands
is not always easy.
Third, SPARK is not a smart as I’d like it to be;
if you look at the monomial-testing program above, you’ll notice that
I had to wrap a call to the product of two monomials in an
if
statement.
Technically SPARK
could figure this out; after all, the values of
T
’s and
U
’s
exponents are known at compile-time.
I get the reason for this
in general; to prove the preconditions
for an arbitrary monomial, I have wrap the product inside a check.
Sure, failing to do that is indeed problematic and a source of bugs
that, for instance, I myself have encountered before.
But it’s still not strictly necessary here,
and I wish the prover could have determined that.
Nevertheless, getting past the obstacles has made me enjoy working with SPARK,
and as the community (or at least AdaCore)
continues to develop it actively,
SPARK could be useful for mathematics,
especially in a world where mathematicians sometimes express concern
that it is difficult to ascertain the correctness of a computer algebra system.