C++ std::visit carelessness

The second in a possibly never-ending series of reflections on what makes C++ a terrible choice for programming.

i C++ what you did there…

The problem

It’s sometimes useful to list all possible values for some object in your problem. Here are two examples that might arise when navigating a maze: the directions in which a player can move, and the objects that can occupy a location. A straightforward implementation represents the maze as an array, where each entry’s value indicates the object in that location. As a player navigates the maze, the program inspects the value.

The safe, reliable solution: enumerated types

Ada

In Ada one describes this situation with an enumerated type:
type Directions is (North, South, East, West);
type Object is (Empty, Orc, Gnome, Pit, Gem);

--  to test the location
--  Ada infers the variant
case Maze (Row, Col) is
   when Empty       => Keep_Moving;
   when Orc | Gnome => Flee;
   when Pit         => Die;
   when Gem         =>
      Pick_It_Up;
      Keep_Moving;
end case;

Rust

Rust also obliges with a simple way of doing this:
enum Directions {
    North,
    South,
    East,
    West,
}

enum Object {
    Empty,
    Orc,
    Gnome,
    Pit,
    Gem
}

// to test the location
// Rust requires you to qualify the variant or to import it with `use`
match maze[row][col] {
    Object::Empty => keep_moving(),
    Object::Orc | Object::Gnome => flee(),
    Object::Pit => die(),
    Object::Gem => {
        pick_it_up();
        keep_moving();
    }
}

Results

Both are readable and, importantly, both are safe: if you later add a new object (Sidekick, say), then you must add logic for the new variant to every case/match. If you don’t, the program simply won’t compile.

For example, suppose I had forgotten to consider the Gem type. The GNAT Ada compiler complains:
test_std_visit.adb:20:01: error: missing case value: "Gem"
Whereas rustc complains:
error[E0004]: non-exhaustive patterns: `Object::Gem` not covered
  --> test_std_visit.rs:20:10
   |
20 |    match maze[row][col] {
   |          ^^^^^^^^^^^^^^ pattern `Object::Gem` not covered
   |
note: `Object` defined here
  --> test_std_visit.rs:7:4
   |
2  | enum Object {
   |      ------
...
7  |    Gem,
   |    ^^^ not covered
   = note: the matched value is of type `Object`
help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern or an explicit pattern as shown
   |
23 ~       Object::Pit => die(),
24 ~       Object::Gem => todo!(),
   |
The Ada error is one line long and tells you everything you need to know. The Rust error is longer, but is quite detailed and offers you a solution.

The various C++ approaches

How does C++ address this need? Unsurprisingly, it follows the primary C++ design criterion:
You pay for what you forget to use. Dearly.
Every iteration of C++ provides some solution that fails, hard, in at least one respect.

Pre-C++98

Before C++98, C++ programmers fell back on the C approach to this problem, which was to use the preprocessor. For the example above:
#define NORTH 0
#define SOUTH 1
#define EAST 2
#define WEST 3

#define EMPTY 0
#define ORC 1
#define GNOME 2
#define PIT 3
#define GEM 4

// to test the location
// if you're a *really* good boy, maze will be an int * and
// you’ll see pointer arithmetic instead of indices on the line below
switch (maze[row][col]) {
    case EMPTY: keep_moving(); break;
    case ORC: case GNOME: flee(); break;
    case PIT: die(); break;
    case GEM: pick_it_up(); keep_moving(); break;
}
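Problems with this approach include, but are not limited to: every value has to be numbered by hand; the names are plain integers, so nothing stops a direction such as NORTH from being stored in the maze; and the compiler cannot tell you when a switch misses a case.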
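One of these problems is easy to reproduce. The sketch below is only an illustration under my own assumptions: the helpers `is_empty` and `demo` are hypothetical, and only three of the macros are repeated. Because the macros expand to bare integers, a direction and an object with the same number cannot be told apart, and a number that names no object at all is accepted too.

```cpp
#include <cassert>

// Same scheme as above; only the macros needed for the demo are repeated.
#define NORTH 0
#define EMPTY 0
#define GEM 4

// Hypothetical helper: reports whether a cell holds "nothing".
bool is_empty(int cell) { return cell == EMPTY; }

// Hypothetical demo: both suspicious assignments compile without complaint.
void demo() {
    int maze[2][2] = {};
    maze[0][0] = NORTH;              // a direction stored as an object
    assert(is_empty(maze[0][0]));    // ...and indistinguishable from EMPTY
    maze[1][1] = 42;                 // no object has this value
    assert(!is_empty(maze[1][1]));
}
```

Nothing in the type system separates the two groups of names, so mistakes like these can only be caught by reading the code or by running it.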

C++98

With C++98, developers could take advantage of an enum type:
enum Directions { north, south, east, west };
enum Object { empty, orc, gnome, pit, gem };

// to test the location
switch (maze[row][col]) {
   case empty: keep_moving(); break;
   case orc: case gnome: flee(); break;
   case pit: die(); break;
   case gem: pick_it_up(); keep_moving(); break;
}
This has the merit of making the code somewhat more readable. In addition, you no longer have to define integer values for the variants if you don’t need them. If you define maze to be of type Object maze[10][20];, it will even catch, at compile time, the attempt to assign a non-Object type; that is, maze[5][7] = north; will generate an error. Heck, to the untrained eye it’s indistinguishable from the Ada and Rust!

In reality, it’s still unsafe and unreliable. Indeed, it introduces a completely new problem!
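Two weaknesses of plain enums are easy to reproduce. The sketch below is an illustration under my own assumptions, not an exhaustive list: `as_int`, `next_index`, `demo` and the commented-out `Bottle` enum are hypothetical names. It shows that enumerators still convert silently to int, and that they spill into the enclosing scope.

```cpp
#include <cassert>

enum Directions { north, south, east, west };
enum Object { empty, orc, gnome, pit, gem };

// An unscoped enumerator converts to int without a cast.
int as_int(Object o) { return o; }

// Any function that takes an int therefore accepts either enum.
int next_index(int i) { return i + 1; }

void demo() {
    assert(as_int(orc) == 1);
    assert(next_index(south) == 2);  // a direction used as a plain number
    assert(next_index(gem) == 5);    // an object used as a plain number

    // The enumerators also live in the enclosing scope, so a second enum
    // declared alongside these two could not reuse any of their names:
    //   enum Bottle { empty, full };  // would not compile: 'empty' is taken
}
```

The assignment check mentioned above works in one direction only: an int cannot become an Object without a cast, but an Object becomes an int whenever one is wanted.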

C++11

C++11 added an enum class type:
enum class Directions { north, south, east, west };
enum class Object { empty, orc, gnome, pit, gem };

// to test the location
switch (maze[row][col]) {
   case Object::empty: keep_moving(); break;
   case Object::orc: case Object::gnome: flee(); break;
   case Object::pit: die(); break;
   case Object::gem: pick_it_up(); keep_moving(); break;
}
Things have improved! The symbol orc must now be namescoped or imported, as in Rust, and we can no longer use int or Directions as potential values for Objects.

However, we still can’t rely on the compiler to verify that we have handled every case correctly: if we add a sidekick variant, or if we comment out the last case statement, the compiler chugs along happily, not informing us that we’ve overlooked a variant.
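Here is a minimal sketch of that failure mode, under my own assumptions: `react` is a hypothetical function, and it returns strings only so that the result can be checked. The new `sidekick` variant has no case label, yet the code is valid C++. GCC and Clang can flag the omission with their `-Wswitch` warning, but that is a warning, and as far as I know the language does not require any diagnostic.

```cpp
#include <string>

enum class Object { empty, orc, gnome, pit, gem, sidekick };

// Hypothetical handler: describes the reaction to each object.
std::string react(Object o) {
    switch (o) {
        case Object::empty: return "keep moving";
        case Object::orc:
        case Object::gnome: return "flee";
        case Object::pit:   return "die";
        case Object::gem:   return "pick it up";
        // no case for Object::sidekick, and no default
    }
    return "unhandled";  // reached silently for the forgotten variant
}
```

Calling `react(Object::sidekick)` skips the whole switch and lands on the final return, which is exactly the kind of quiet fall-through that the Ada and Rust compilers reject.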

C++17/C++20

The C++ committee added a new tool we can use for this problem in C++17, and streamlined it slightly in C++20. To attack this problem, a C++ developer must turn each variant into a type of its own, bundle those types into a std::variant, and dispatch on the result with std::visit. Here’s how we would solve our problem in a C++20 program:
#include <variant>

// this next line is for some reason load-bearing, is always the same,
// and is immensely useful... yet the standard library does not define it...
// it's possible to get around it, but prepare for more pain
// and before you complain: until C++20 you had to add another line, too!
template<class... Ts> struct overload : Ts... { using Ts::operator()...; };

// why yes, your enum variants are now structs; did you not want that?
// you can also use a different type, but it gets... even more complicated
struct empty {};
struct orc {};
struct gnome {};
struct pit {};
struct gem {};

using Object = std::variant<empty, orc, gnome, pit, gem>;

// to test the location
std::visit(
    overload {
        [](const empty arg) { keep_moving(); },
        [](const orc arg) { flee(); },    // if there's a way to put these
        [](const gnome arg) { flee(); },  // in the same case, i'm unaware
        [](const pit arg) { die(); },
        [](const gem arg) { pick_it_up(); keep_moving(); },
    },
    maze[row][col]    // why is this at the end?
);
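What do we notice from this? On the plus side, the compiler finally refuses to build the program when a variant goes unhandled. On the other hand, every variant is now a separate type, the visitor needs boilerplate that the standard library doesn’t supply, and the error messages for a missed case are confusing.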
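For what it’s worth, one way to fold the orc and gnome handlers into a single case is a generic lambda that is enabled only for those two types. The sketch below is a self-contained illustration under my own assumptions, not a recommendation: `is_one_of` and `react` are hypothetical names, the handlers return strings so the result can be checked, and the explicit deduction guide is included so that the code also builds as C++17.

```cpp
#include <string>
#include <type_traits>
#include <variant>

template <class... Ts>
struct overload : Ts... { using Ts::operator()...; };
// Deduction guide: required in C++17, redundant but harmless in C++20.
template <class... Ts>
overload(Ts...) -> overload<Ts...>;

struct empty {};
struct orc {};
struct gnome {};
struct pit {};
struct gem {};

using Object = std::variant<empty, orc, gnome, pit, gem>;

// Hypothetical helper: true when T is one of the listed types.
template <class T, class... Us>
inline constexpr bool is_one_of = (std::is_same_v<T, Us> || ...);

std::string react(const Object& cell) {
    return std::visit(
        overload{
            [](empty) { return std::string("keep moving"); },
            // One generic handler, removed by SFINAE for every type
            // other than orc and gnome.
            [](auto monster) -> std::enable_if_t<
                is_one_of<decltype(monster), orc, gnome>, std::string> {
                return "flee";
            },
            [](pit) { return std::string("die"); },
            [](gem) { return std::string("pick it up, keep moving"); },
        },
        cell);
}
```

With this in place, `react(orc{})` and `react(gnome{})` both yield "flee". Deleting any one handler makes the std::visit call ill-formed, so the exhaustiveness check survives, at the cost of yet more template machinery.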

Conclusion

For a problem this simple and fundamental, I’d’a thunk the C++ committee could simply extend the switch keyword for an enum class argument — or even for a plain enum argument! — to check that every variant is covered. Instead of doing that, they provide std::visit, as if to prove that there’s no problem so simple that templates can’t wreck it.

After 40-odd years of existence, C++ still doesn’t offer an ergonomic tool to handle something as fundamental as checking every variant of an enumerated type. Lots of languages offered proper enumeration types before C++ came along. (Pascal, Modula-2, and Ada come to mind; I doubt they were alone.) So why didn’t C++ offer it? I honestly don’t know. But the current “best” approach is so bad that, judging from the compiler’s confused error messages, even the C++ compilers can’t figure it out!