More C++ type promotion carelessness
The fourth in a possibly never-ending series of reflections on what makes C++
a terrible choice for programming.
Happy Independence Day! If only we could declare our independence from C++.
Today’s article is a riff on the phenomenon encountered in this article.
It shows that C++ sometimes compels you to use type promotion when…
- a sane person wouldn’t think to use it,
- a sane person wouldn’t want to use it, and most importantly,
- it makes no sense to use it.
Setup
Consider the following code snippet.
#include <cstdint>
#include <iostream>

const uint8_t values[] {
     79, 110, 101,  32, 114, 105, 110, 103,
     32, 116, 111,  32, 114, 117, 108, 101,
     32, 116, 104, 101, 109,  32,  97, 108,
    108,  33
};

int main() {
    for (uint8_t value : values) {
        std::cout << value;
    }
    std::cout << std::endl;
    return 0;
}
Question
What output do you expect from this program?
The sane answer (which is of course wrong)
You expect to see 26 numbers, starting with 79 and ending with 33.
After all, the type of the elements in the array values is uint8_t, which according to the C++ standard means, and I quote,
using uint8_t = unsigned integer type; // optional
Well, that’s not very helpful, let’s look at what
cppreference.com tells us:
unsigned integer type with width of exactly 8… bits…
(provided if and only if the implementation directly supports the type)
Thus, you’d expect to see
7911010132114105110103321161113211411710810132116104101109329710810833
But, as I say, you won’t.
(Nor do you expect to see spaces, because
- I deliberately left out the spaces, so that…
- there will be some entertainment when all is revealed.)
The insane answer (which is of course correct)
The actual output is…
One ring to rule them all!
(Formatting and cool image not included. This is C++, after all.)
What is going on?
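Apparently uint8_t is a synonym for char. (More precisely, mainstream compilers make it an alias for unsigned char, and the stream's operator<< prints every character type as a character.)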
Of course, that isn’t specified in the standard, and there’s not even a warning that this might not do what you want.✝ So I doubt anyone on the language design committee or the compiler development team thought very much about this at all.
✝ Sure, failing to warn you about unexpected behavior is par for the course in C++, but they’ve gotten a little better about that over the last couple of decades. What’s that? You scoff? Come on, I said a little.
How do I get around it?
Use a sane programming language like Ada or Rust.
Not an option? OK, use a cast:✞
std::cout << static_cast<short>(value); // or uint16_t or whatever, just so long as you waste at least one byte
This forces the compiler to
- see that what you’ve already plainly labeled as an integer really ought to be output as an integer; and
- waste at least one byte of space, since a uint16_t is one byte larger.
After all, the primary C++ design criterion is that
You pay for what you forget you use. Dearly.
✞ Casting has its place, but this really isn’t it. And even if there’s a better way, this is precisely what most C++ developers will do, with the exception that they’ll probably use (uint16_t)value rather than the more verbose static_cast<uint16_t>. If you don’t believe me, check out the answers to a similar question on Stack Overflow, where the least insane workarounds are typecasts, and one solution advances an entirely new namespace. I have a hard time believing that guy kept a straight face while submitting that solution, but not only do many people defend it, it apparently solves quite a few other problems. Why can’t the compiler be bothered to do this for you? …welcome to C++, kindly leave your sanity at the door.