C++ type promotion carelessness
The third in a possibly never-ending series of reflections on what makes C++
a terrible choice for programming.
The problem
We recently lost several hours at work trying to figure out
the cause of a problem where a client who works primarily in C++
was using our C++ bridge to link to our Rust library
but getting unexpected results.
I knew the code well enough to recognize that the unexpected results
were entirely expected when you were
testing certain functionality.
That led me to believe that he was calling the wrong constructor,
and passing an argument meant to test said functionality.
Except he wasn’t, or at least, not on purpose.
Here’s an abstraction of our API:
#include <string>

const bool TESTING = true;

class A {
public:
    A(bool = false);   // false initializes with a default value;
                       // true gives you testing mode
    A(std::string);    // initializes with a custom value
};
Here’s how he was calling the code:
A a("test value");
He’s passing a string, right?
So he should end up in the second constructor, right?
WRONG.
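A minimal reproduction makes the surprise easy to check.
The sketch below mirrors the API abstraction above;
the `Picked` enum and the `picked` member are hypothetical additions,
there only to record which constructor the compiler selected.

```cpp
#include <string>

// Hypothetical instrumentation: records which constructor ran.
enum class Picked { Bool, String };

class A {
public:
    // false initializes with a default value; true gives testing mode
    constexpr A(bool = false) : picked(Picked::Bool) {}
    // initializes with a custom value
    A(std::string) : picked(Picked::String) {}

    Picked picked;
};

// The literal is not a std::string, and the compiler prefers the
// built-in conversion to bool, so the bool constructor is selected.
static_assert(A("test value").picked == Picked::Bool,
              "a string literal selects A(bool)");

// Spelling out the type reaches the intended constructor.
inline Picked explicit_string() {
    return A(std::string("test value")).picked;  // Picked::String
}
```

If this compiles, the `static_assert` has confirmed that the call with a bare literal
lands in the `bool` constructor.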
While he’s passing a string, he’s not passing a `string`.
The different typeface, if you can perceive it, means that
he’s not passing the C++ `string` type.
Why not? C++’s designers strive for compatibility with C whenever possible,
even when that defies common sense.
In this instance,
the C language considers the string `"test value"`
to be an array of characters that decays to a pointer
as soon as you pass it to anything,
and in C++ that pointer has type `const char *`,
which for the mere mortals reading this, means
“a pointer to unchangeable `char`acters.”
So that is what C++ sees,
even though it has a “native” `string` type.
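The type of the literal can be checked directly.
This is a small sketch using only standard type traits;
the helper name `reseat` is made up for illustration.
It also shows what the `const` in `const char *` applies to:
the characters, not the pointer.

```cpp
#include <type_traits>

// "test value" is 10 characters plus the terminating '\0'.
static_assert(std::is_same<decltype("test value"), const char (&)[11]>::value,
              "a string literal is an array of const char");

// Passing it to a function decays the array to a pointer to its first element.
static_assert(std::is_same<std::decay<decltype("test value")>::type,
                           const char *>::value,
              "the array decays to const char *");

// Pointer to unchangeable characters: the pointer may be reseated,
// but the characters may not be modified through it.
inline const char *reseat() {
    const char *p = "test value";
    p = "another value";   // fine: the pointer itself is not const
    // p[0] = 'A';         // would not compile: the characters are const
    return p;
}
```

Nowhere in this chain does `std::string` appear unless someone asks for it by name.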
There’s nothing inherently bad about that,
but the API above lacks a constructor that accepts
the type `const char *` as an argument.
To handle this, a language has essentially three options at compile-time:
1. Consider it an error.
2. Promote/convert/whatever the supplied data to a sensible type.
3. Promote/convert/whatever the supplied data to a type that makes no sense
   except as a very limited hack for programmer convenience.
The designers of C did not include a proper type for strings,✝
so they were left with options 1 and 3.
✝ They couldn’t be bothered even to include
a proper type for arrays,
which is why you get a pointer to unchangeable characters
instead of an unchangeable array of characters,
the way you would in a sanely-designed language
like Pascal, Modula-2, Rust, …
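Naturally, they chose option 3:
the compiler promotes/converts/whatevers
anything of type `const char *` to `bool`,
automatically, even though
that makes no sense at all at any level other than
hey, it saves the programmer a few keystrokes.
C++ does have a sensible type to convert to,
but turning a `const char *` into a `string` means calling one of `string`’s constructors,
and overload resolution ranks that kind of library-defined conversion
below any built-in one.
So the `bool` constructor wins.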
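One way to see why the `bool` overload wins:
overload resolution prefers a built-in conversion over one that goes through a class constructor,
and prefers an exact match over both.
The sketch below uses declaration-only free functions and tag types, all hypothetical names,
so that the chosen overload shows up in the return type.
The `patched` set adds a `const char *` overload, which is an assumption for illustration
and not something the API described here actually has.

```cpp
#include <string>
#include <type_traits>

// Hypothetical tag types naming the overload that was chosen.
struct ViaBool {};
struct ViaString {};
struct ViaPointer {};

// The same pair of parameter types as the constructors of A.
// Declarations only: they are used inside decltype and never called.
ViaBool   original(bool);
ViaString original(std::string);

// The same pair plus a hypothetical overload taking const char *.
ViaBool    patched(bool);
ViaString  patched(std::string);
ViaPointer patched(const char *);

// Built-in pointer-to-bool conversion beats the conversion through
// std::string's constructor.
static_assert(std::is_same<decltype(original("test value")), ViaBool>::value,
              "a literal picks the bool overload");

// An explicit std::string is an exact match for the string overload.
static_assert(std::is_same<decltype(original(std::string("test value"))),
                           ViaString>::value,
              "std::string picks the string overload");

// With a const char * overload present, the literal matches it exactly.
static_assert(std::is_same<decltype(patched("test value")), ViaPointer>::value,
              "a literal picks the const char * overload");
```

In a real class, an overload like the hypothetical `const char *` one could forward to the `string` constructor,
though whether that suits a given API is a separate design question.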
To wit, the way a C programmer decides whether a pointer points at anything
is by testing whether it is the null pointer.
In the logic of C, where everything is an integer,
- If the pointer is null, then
  - the compiler treats it as the integer value 0,
    and then
  - the compiler promotes/converts/whatevers that 0 to
    the `bool`ean value `false`.
- If the pointer is not null,
  but points anywhere at all, then
  - the compiler promotes/converts/whatevers it
    to `true`.
Note what the compiler does not do: look at the characters.
A C programmer decides if a string is empty
by testing whether its first character is the null character,
typically denoted `\0`, the first entry in the ASCII encoding,
but that test never happens here.
A string literal always lives somewhere, so its pointer is never null,
and even `A a("")` ends up in the first constructor with `true`:
testing mode.
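The two tests are easy to confuse, so here is a short sketch that separates them.
The function names `as_bool` and `is_empty` are made up for illustration.
The first performs the implicit pointer-to-`bool` conversion that happens at the constructor call;
the second is the C idiom for checking whether a string is empty.

```cpp
// The implicit conversion looks only at the pointer, never at the characters.
constexpr bool as_bool(const char *p) {
    return p;  // pointer-to-bool conversion: null -> false, anything else -> true
}

static_assert(as_bool("test value"), "a non-null pointer converts to true");
static_assert(as_bool(""), "an empty literal is still a non-null pointer");
static_assert(!as_bool(nullptr), "only a null pointer converts to false");

// The C idiom for "is this string empty?" inspects the first character.
constexpr bool is_empty(const char *s) {
    return *s == '\0';
}

static_assert(is_empty(""), "the first character is the terminator");
static_assert(!is_empty("test value"), "the first character is 't'");
```

Since no string literal is ever a null pointer, every literal passed where a `bool` is expected arrives as `true`.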
Worse, this automatic conversion is considered a
language feature,
not an oversight, let alone a flaw, so like all C promotions/conversions/whatevers
it happens
silently and automatically.
By default, you typically don’t even get a compiler warning about it, because…
well, why would you?
It’s a
language feature, after all, so why would you care
if you were possibly, inadvertently, introducing a bug that has,
over the last few decades, been one of the many sources of security exploits,
never mind programmer confusion while debugging?
And since C++ mindlessly follows C’s design decisions whenever possible,
our client was calling the first constructor without realizing it.
And despite being a long-time C++ developer, even
he required
a couple of hours to work out the problem.
It likely would have taken him longer, had I not had the insight
that he was inadvertently calling the wrong constructor.
Unsurprisingly,
it follows one of the primary C++ design criteria:
“You pay for what you forget you use. Dearly.”