Borrowing in Rust

I wrote previously that I was working a lot in Rust at work, and hinted that while I like it, certain design decisions disappoint me. One of these has to do with Rust’s most notorious difficulty: the borrow checker.

It is not my intention to attempt a full explanation of the borrow checker. Suffice it to say that data in Rust is either owned or borrowed: a variable either owns its data or borrows it from another variable.

This concept turns out to be quite powerful, and effective. That being the case, let me pose the following question:

Is the following code correct?

let mut result = 0;
for row in matrix.iter() {
   for element in row.iter() {
      result += element;
   }
}

The reader is correct to suspect that this is a trick question. Follow-up question: what about the following?

let mut result = i32::MIN;
for row in matrix.iter() {
   for element in row.iter() {
      result = result.max(element);
   }
}

Why is this a trick question?

The reader unfamiliar with Rust, but familiar with type-checking, may wonder whether it has something to do with the types of result and element. You would be correct! In fact, you would be more correct than at least a few moderately experienced Rustaceans I’ve observed, whose stated approach is to just type and let the IDE’s language plugin correct them when they’re wrong.

I could be wrong, but relying on the IDE as a crutch strikes me as suggesting the presence of unneeded conceptual complexity. I’m not dealing with morons here; these are very, very smart people.

The experienced Rustacean reader hopefully knows which code is incorrect, and how to correct it:

let mut result = i32::MIN;
for row in matrix.iter() {
   for element in row.iter() {
      result = result.max(*element);
   }
}

If you don’t see the difference between the second and third examples, look carefully. It’s there.

See it yet? It’s the absence (or presence) of an asterisk right before element in line 4. Whether you put one there depends, somewhat perversely, on whether the function you call borrows or moves its data.
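For anyone who wants to try this at home, here is a minimal sketch of the first and third examples as complete functions. The snippets above never say how matrix is declared, so I am assuming rows of i32 here; the names sum_all and max_all are made up for illustration.

```rust
// Assumption: the matrix is a Vec<Vec<i32>>, passed in as a borrowed slice of rows.
fn sum_all(matrix: &[Vec<i32>]) -> i32 {
    let mut result = 0;
    for row in matrix.iter() {
        for element in row.iter() {
            // element is a &i32. This compiles without an asterisk because
            // the standard library implements AddAssign<&i32> for i32.
            result += element;
        }
    }
    result
}

fn max_all(matrix: &[Vec<i32>]) -> i32 {
    let mut result = i32::MIN;
    for row in matrix.iter() {
        for element in row.iter() {
            // Ord::max takes its argument by value, so the &i32 must be
            // dereferenced. Writing result.max(element) is a type mismatch.
            result = result.max(*element);
        }
    }
    result
}

fn main() {
    let matrix = vec![vec![1, -2, 3], vec![4, 5, -6]];
    println!("sum = {}, max = {}", sum_all(&matrix), max_all(&matrix));
}
```

The loops are identical in both functions; only the operation applied to element decides whether the asterisk is required.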

To illustrate this problem further, the following code’s correctness depends entirely on whether matrix itself borrows or owns its data:

let mut result = f32::MIN;
for row in matrix.into_iter() {
   for element in row.into_iter() {
      result = result.max(*element);
   }
}

For example, if matrix is of type &Vec<Vec<f32>>, you need an asterisk: it’s borrowed, which means element is a reference, and you have to dereference it to have the same type as result. But if matrix is of type Vec<Vec<f32>>, no asterisk is needed (in fact, the compiler rejects one): it’s owned, so element is already a plain f32 and there’s nothing to dereference.
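To make the two cases concrete, here is a sketch with two hypothetical helpers, max_borrowed and max_owned, that differ only in whether they receive the matrix by reference or by value. The names and the sample data are mine, not part of any real API.

```rust
// Assumption: hypothetical helpers, written only to contrast the two cases.
fn max_borrowed(matrix: &Vec<Vec<f32>>) -> f32 {
    let mut result = f32::MIN;
    // into_iter() on a &Vec<Vec<f32>> yields &Vec<f32>,
    // and into_iter() on a &Vec<f32> yields &f32.
    for row in matrix.into_iter() {
        for element in row.into_iter() {
            result = result.max(*element); // element: &f32, asterisk required
        }
    }
    result
}

fn max_owned(matrix: Vec<Vec<f32>>) -> f32 {
    let mut result = f32::MIN;
    // into_iter() on an owned Vec consumes it and yields the elements themselves.
    for row in matrix.into_iter() {
        for element in row.into_iter() {
            result = result.max(element); // element: f32, an asterisk would not compile
        }
    }
    result
}

fn main() {
    let matrix = vec![vec![1.5, -2.0], vec![0.25, 3.0]];
    println!("borrowed: {}", max_borrowed(&matrix));
    println!("owned: {}", max_owned(matrix)); // matrix is moved here and is gone afterwards
}
```

With .iter() in place of .into_iter(), element is a reference in both functions, so the asterisk would be needed either way.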

Clear?

Apparently it isn’t, since, as I say, I routinely see moderately experienced Rustaceans let the IDE or compiler correct them. The effect is such that when I asked a very experienced senior developer the difference between iter() and into_iter(), he got it wrong: he simply tended to use .into_iter() unless the IDE or compiler told him it was incorrect. The reality is exactly backwards, though to be fair the Rust manual goes out of its way to tell you in symbols, rather than words. That’s characteristic of Rust: like C, it prefers that you memorize the meanings of & and *, but unlike C, &* is necessary, because…

Well, that gets us to disfavored design decision #2, which is related to the convoluted explanation I had to give above:

Rust notation conflates concepts with implementation

Rather than use the & to indicate the concept of borrowed data, Rust’s designers used it to indicate the implementation: You have to use it to borrow data, but it’s not a designation of the borrowing concept; it’s a designation of the tool used to borrow: references. Both the Rust manual and many Rustaceans will tell you that the two are one and the same, but we see above that this is not in fact the case. In a similar way, * doesn’t mean “use borrowed data”; it means “dereference referenced data.” The necessity of a * when you want to use borrowed data depends entirely on whether the client is willing to use a borrowed copy, or insists on having the data moved to it.

And if the client insists on having the data moved to it, then God help you, because the only way out of that is a .clone() — assuming you have one available.
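A small sketch of that predicament, using a hypothetical consume function that insists on owning a String while the caller only holds a borrow:

```rust
// Hypothetical client that takes its argument by value, i.e. by move.
fn consume(text: String) -> usize {
    text.len()
}

// We only hold a borrow here, so we cannot hand the String itself over.
// Cloning produces a fresh owned copy that consume is allowed to take.
fn measure(borrowed: &String) -> usize {
    consume(borrowed.clone())
}

fn main() {
    let greeting = String::from("hello");
    let length = measure(&greeting);
    // greeting is still usable because only the clone was moved.
    println!("{} has {} bytes", greeting, length);
}
```

This only works because String implements Clone; for a type that doesn’t, measure as written has no way to call consume.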

That brings us to disfavored design decision #3: the default action in Rust is to move data, rather than to borrow it. According to one of the Rust designers’ answers on Reddit, this was a deliberate choice made to suit their preferred programming style. Very well, but there are an awful lot of &’s littering most Rust code I’ve seen; in most cases I work with, I use an & to pass the parameter, because borrowing is more or less painless: I almost never need to “drop” data, to use Rust parlance, and one of Rust’s strengths lies precisely in that the compiler can always figure out when it can drop the data. So there’s no practical reason for me to move data, and I suspect that’s the case for many others, as well.

To conclude, I’d like to compare this to one of Ada’s design decisions that I really appreciate: procedures accept parameters in one of three modes:

in, which means the parameter is to be read but not modified;
out, which means the parameter is to be assigned but not read; and
in out, which means what you hopefully expect: the parameter can be both read and assigned.

A common question one sees in Ada forums is,

How do I tell the compiler I want to pass the parameter by reference?

The typical answer is: You don’t. That’s not your job. The programmer’s job is to specify the solution to a problem at a conceptual level, and not to micromanage the details.

There is some not entirely invalid criticism of this, in that if you look at the following Ada code, it’s not clear what mode Do_Something_With accepts for Thing:

declare Thing: Integer;
begin
   Do_Something_With(Thing);
end;

But that’s beside the point, I think. Perhaps we can agree that we should praise Rust’s designers for deciding that the corresponding Rust code would be

let mut thing = 0;
do_something_with(&mut thing);

…if we want to change thing, or

let thing = 0;
do_something_with(&thing);

…if we don’t; or even

let thing = 0;
do_something_with(thing);

…if we never intend to use thing again. I do appreciate that sort of explicit designation.

Though I do wish they’d used borrow instead of that ampersand, or perhaps move with a different design… but you go to code with the language you have, not the language you wish you had… and I suppose that if I work with Rust long enough, I may come to appreciate the decision to make moving the default.