C++ `struct` carelessness

One in a possibly never-ending series of reflections on what makes C++ a terrible choice for programming.

At work I recently updated a bridge between our Rust project and a C++ API. The code includes extensive automated tests on both the Rust and C++ sides, as well as an example C++ program that illustrates feature usage. Everything in Rust built and passed its tests, and everything in C++ built and seemed to run correctly, but for some mysterious reason one test didn’t pass.

Eventually I realized the problem came down to this:

```cpp
// in api.h
struct Thing {
   Type1 first_field;
   Type2 second_field;
   Type3 third_field;
   Type4 fourth_field;
};

// in tests.cpp
Thing one {
   first_field_value,
   second_field_value,
   third_field_value
};
```

The precise definitions of Type1, Type2, etc. don’t matter; what matters here is that:

fourth_field was never given an explicit value, because…
lines 10-14 had existed in tests.cpp since time immemorial;
the latest API change added line 6 (fourth_field);
lines 10-14 are part of the setup for a web server integration test;
said test required every field, including fourth_field, to have a particular value;
since lines 10-14 did not mention fourth_field, C++ silently filled it with a default value (zero, for built-in types), which was not what the server expected, so the test failed.
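
The failure mode can be reproduced in a few lines. The sketch below is a minimal stand-in, not the real code: it assumes all four field types are plain `int` (the actual Type1 through Type4 are not shown here), and the helpers `make_stale_thing` and `describe` are hypothetical names invented for illustration. It mimics an initializer written before fourth_field existed.

```cpp
#include <string>

// Stand-in for the struct in api.h; plain ints are an assumption,
// chosen only to keep the sketch self-contained.
struct Thing {
    int first_field;
    int second_field;
    int third_field;
    int fourth_field;  // the newly added field
};

// Hypothetical helper mimicking the stale initializer in tests.cpp:
// it was written when Thing had three fields and never updated.
Thing make_stale_thing() {
    Thing one{1, 2, 3};  // compiles; fourth_field is value-initialized to 0
    return one;
}

// Render the fields so the silently supplied zero is visible.
std::string describe(const Thing& t) {
    return std::to_string(t.first_field) + ", " +
           std::to_string(t.second_field) + ", " +
           std::to_string(t.third_field) + ", " +
           std::to_string(t.fourth_field);
}
```

Here `describe(make_stale_thing())` yields `1, 2, 3, 0`. GCC and Clang can flag the short initializer with `-Wmissing-field-initializers` (part of `-Wextra`), but that is an opt-in warning; nothing in the language itself rejects the code.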

On the one hand, that’s a good thing: the test should fail when you feed it bad data. Tracking the failure down in a test reminds the programmer using this API to hunt down every instance of Thing in the codebase and properly initialize fourth_field where needed.

On the other hand,

It wasn’t supposed to fail.
It’s a rather subtle bug. Sure, it stands out here, but only because I haven’t surrounded it with hundreds or thousands of lines of code, much of it “delicate” enough to have broken in the past. In the real codebase I had to check lots of other wires before I finally came to this one, and hunting it down wasted the better part of an hour.
That’s one hour I could have spent doing something else.
If we didn’t have such a strict testing policy, the programmer using our API might well not think to go fix this, leading to bugs on the client’s end that are hard to debug and may well seem like bugs on the server’s end. (After all, we just updated our code, while their code hasn’t changed!)

The average C/C++ programmer may think that this is how programming should work: there could be cases where you don’t need to initialize every field of a struct (say, you only need some fields in certain cases, but not all), and initializing things you don’t need wastes a few precious nanoseconds of CPU time. Per the C++ motto,

You don’t pay for things you don’t use.

The trouble with this philosophy is that most programs are read and modified many times, not written once and then forgotten. As a result, all too often in C++,

You pay — hard — for things you forget to use.

As Donald Knuth might say,

Premature optimization is the root of all evil (in programming).

Is there a safer way?

Of course there is, especially if you’re willing to trust the optimizer to remove things you don’t need, and focus on solving the problem reliably.

Ada

Let’s look at what happens when I try this in Ada.

```ada
with Ada.Text_IO;

procedure Test_Ada is
   type Thing is record
      First, Second: Integer;
   end record;
   One: Thing := ( First => 4 );
begin
   Ada.Text_IO.Put_Line(One.First'Image & ", " & One.Second'Image);
end Test_Ada;
```

This fails at compile time:

```
$ gnatmake test_ada.adb
gcc -c test_ada.adb
test_ada.adb:7:18: error: no value supplied for component "Second"
```

This forces you to define the Second field. If you don’t care what value it has, just initialize it to the default:

```ada
   One: Thing := ( First => 4, Second => <> );
```

(<> is Ada shorthand for “default value”, and is called “the box” on account of its looks.) This now compiles and runs as expected. (Interestingly, the output on my machine depends on the optimization level: Integer has no default value, so Second is simply left uninitialized. That does not happen with C++, which value-initializes omitted members and so always spits out 0 for the second field.)

Rust

The equivalent Rust code would be:

```rust
#[derive(Debug)]
struct Thing {
   first: isize,
   second: isize,
}

fn main() {
   let one: Thing = Thing { first: 4 };
   println!("{:?}", one);
}
```

Like Ada, Rust refuses to compile this. Unlike Ada, Rust gives a characteristically verbose error message:

```
$ rustc test_rust.rs
error[E0063]: missing field `second` in initializer of `Thing`
 --> test_rust.rs:8:21
  |
8 |    let one: Thing = Thing { first: 4 };
  |                     ^^^^^ missing `second`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0063`.
```

Getting this to compile when you don’t care about second’s value is a little harder in Rust, but not much. You’ll need to derive the Default trait, then call Thing::default() explicitly with struct update syntax. Modify the derive line and the initializer as follows:

```rust
#[derive(Debug, Default)]

   let one: Thing = Thing { first: 4, ..Thing::default() };
```

Again, this compiles and runs as expected.

Summary

Requiring a struct’s user to define all its fields, as Ada and Rust do, prevents the bugs that creep in when fields are added during an API revision. It may carry a very small run-time cost, but if you really don’t need a field every time, then perhaps you don’t so much need a struct as a union or, if you want to use a safer, more modern C++, a std::variant. Unfortunately, the C++ language committee must not like that approach, as it went out of its way to make std::variant incredibly painful to use, a topic I’ll visit at some point in the future.
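
To make the std::variant suggestion a little more concrete, here is a minimal sketch, assuming C++17. Every name in it (`Basic`, `Extended`, `Request`, `Describe`, `describe`) is hypothetical and invented for illustration. Instead of one struct carrying a field that only some cases need, each case gets its own struct, and the variant holds exactly one of them.

```cpp
#include <string>
#include <variant>

// Hypothetical types for illustration only: one case needs an
// extra field, the other does not.
struct Basic    { int id; };
struct Extended { int id; int extra; };

// A Request holds exactly one of the two cases.
using Request = std::variant<Basic, Extended>;

// Visitor with one overload per alternative; std::visit will not
// compile if an alternative has no matching overload.
struct Describe {
    std::string operator()(const Basic& b) const {
        return "basic " + std::to_string(b.id);
    }
    std::string operator()(const Extended& e) const {
        return "extended " + std::to_string(e.id) + " " +
               std::to_string(e.extra);
    }
};

// Dispatch on whichever alternative the Request currently holds.
std::string describe(const Request& r) {
    return std::visit(Describe{}, r);
}
```

With these definitions, `describe(Request{Basic{7}})` returns `basic 7` and `describe(Request{Extended{7, 9}})` returns `extended 7 9`. Each alternative is still an aggregate, so an omitted member inside one of them is silently defaulted just as before; the variant only makes “this field isn’t needed here” explicit in the type.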

C++ struct carelessness

Is there a safer way?

Ada

Rust

Summary

C++ `struct` carelessness