A first Rust macro

A while back, I left the university for a job in software development. I’m glad I made the change, and will write more about it later, though the basic reasoning can be found here.

In my new job I am nominally a Java programmer, but mainly I’ve been writing Rust. In general, I like it, but some of the design decisions irk me, and Ada remains my favorite.

That discussion can wait for later. For now, I want to show off my first Rust macro. I’ll explain what that is in a moment.

The problem

In languages like Python and Ada, we can initialize a list or array in one line. For instance, in Python we’d write:
l = [ i * i for i in range(n) ]
…while in Ada 2022 (not in earlier standards of Ada, not even Ada 2012) we’d declare:
L: array ( 0 .. N ) of Integer := ( for I in 0 .. N => I * I );
The current edition of Rust has no comprehension syntax for this; written with a loop, it takes two steps:
let mut l = vec![];
for i in 0..n { l.push(i*i); }
// or you could do this
let mut l = Vec::with_capacity(n);
for i in 0..n { l.push(i*i); }
There are a few ways to write the loop, but they all require at least two lines: one to initialize the vector (typically with 0’s), one to fill it with the desired values. (So agreeth StackOverflow, no less.) Some Rust-familiar readers may object that one can, in fact, initialize some vectors in just one line. For instance: let l = vec!( 0, 1, 4, 9, 16 );. This is correct, so long as we know exactly how many elements we need at compile time, and don’t mind typing out their values exhaustively. Otherwise, the loop-based approach needs at least two lines.
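A caveat worth noting: the standard library's iterator adapters can build the same vector in a single expression, so the two-line minimum applies to the explicit-loop style. A minimal sketch (the function name `squares` is made up for illustration):

```rust
/// Builds the vector [0*0, 1*1, ..., (n-1)*(n-1)] in a single expression.
fn squares(n: usize) -> Vec<usize> {
    // `map` applies the formula to each index lazily;
    // `collect` allocates the vector and fills it.
    (0..n).map(|i| i * i).collect()
}
```

Whether this reads as naturally as the Python or Ada one-liners is a matter of taste, which is part of what makes a macro an interesting exercise.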

The tool: Macros

Unlike Ada or Python, Rust offers a facility to “extend” the language, so to speak, via macros. You can think of a macro as a script that the compiler executes to replace some source code with other source code. If you’re familiar with C, then yes, it’s a lot like the C preprocessor, only safer and more powerful, just like almost everything in Rust is safer than C. In particular, a procedural macro receives the relevant part of the source code as a stream of tokens, which you can parse into a syntax tree and mess with to your heart’s content.
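To make the comparison with the C preprocessor concrete, here is a minimal sketch using a declarative macro. The names `square` and `square_of_sum` are made up for illustration and appear nowhere else in this post.

```rust
// A tiny declarative macro: the compiler replaces `square!(e)` with `e * e`.
// Because `$x` is matched as a whole expression (`expr`), it stays grouped
// as one unit when it is substituted.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

/// Squares the expression `1 + 2` through the macro.
fn square_of_sum() -> i32 {
    // Behaves like `(1 + 2) * (1 + 2)`, so the result is 9.
    // A textual C macro `#define SQUARE(x) x * x` would instead
    // produce `1 + 2 * 1 + 2`, which is 5.
    square!(1 + 2)
}
```

This is one small sense in which Rust macros are safer: they substitute syntax fragments rather than raw text.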

I needed to learn how to write macros, but the examples I found online were inadequate to the task: either too trivial to be useful (looking at you, Rust By Example! and Rust manual!) or too complicated to help with the problem I wanted to solve.

At first I tried attacking it with a declarative macro. I had trouble with it for a while, then concluded that I had to use a procedural macro. That conclusion was wrong, as it turns out; a colleague worked out a declarative version while I was struggling with procedural macros. It wasn’t quite what I wanted, so together we got it to where I wanted it.

I won’t show that result here, in part because I’m more interested in procedural macros, which are more flexible than declarative macros. Instead, I’ll show the third version of the procedural macro solution.
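For readers curious what the declarative route can look like, here is a minimal sketch. It is not the version described above, which this post does not show; the macro name `form_vec_decl` and the helper `first_squares` are made up for illustration.

```rust
// Hypothetical declarative counterpart, invoked as
//    form_vec_decl!( size, | var | formula )
macro_rules! form_vec_decl {
    ($size:expr, | $var:ident | $formula:expr) => {{
        // evaluate the size expression exactly once
        let size: usize = $size;
        // reserve room for all the elements up front
        let mut tmp = Vec::with_capacity(size);
        // apply the formula to each index and append the result
        for $var in 0..size {
            tmp.push($formula);
        }
        tmp
    }};
}

/// Example use: the first `n` squares.
fn first_squares(n: usize) -> Vec<usize> {
    form_vec_decl!(n, |x| x * x)
}
```

The pattern only accepts a single plain identifier between the bars, which is one place where a procedural macro, parsing a full closure, has more room to grow.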

This may look complicated, but it really isn’t. I’ve included some comments to explain what’s going on, and will add some commentary as well.

The solution

This comes from a crate I’m writing called macros. First, the all-important macros/Cargo.toml:
[package]
name = "macros"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

[dependencies]
syn = { version = "1.0.92", features = [ "full" ] }
quote = "1.0.18"
Next, in macros/src/lib.rs, the macro definition:
// formula-defined vectors

// necessary to define procedural macros
extern crate proc_macro;

// importing other crates' definitions
use proc_macro::TokenStream;
use quote::quote;
use syn::{
    Expr,
    ExprClosure,
    parse::{
        Parse,
        ParseStream,
    },
    Token,
};

// this struct helps keep the information we extract from the syntax tree
struct FormVec {
    size: Expr,           // the vector's desired size
    formula: ExprClosure, // the formula defining the vector's elements
}

// this implements the `Parse` trait by implementing its required
// `parse` function; as a result, we can parse a formula vector
// in a natural, Rust-like idiom:
//    form_vec!( <size>, | <variable> | <formula> )
// This construct appears in several other places in the language; for
// instance, the `fold` function expects arguments of the form
//    .fold( <initial value>, | <accumulator>, <item> | <expression> )
impl Parse for FormVec {

    fn parse(input: ParseStream) -> syn::Result<Self> {
        let size: Expr = input.parse()?;
        input.parse::<Token![,]>()?;
        let formula: ExprClosure = input.parse()?;
        Ok(FormVec { size, formula } )
    }

}

#[proc_macro]
/// Read the form
///    form_vec!( <size>, | <variable> | <formula> )
/// and create a vector of given size according to the given formula.
/// # Examples
/// ```
/// use macros::form_vec;
/// let vec = form_vec!( 5, | x | x * x );
/// assert_eq!( vec, vec!( 0, 1, 4, 9, 16 ) );
/// ```
pub fn form_vec(input: TokenStream) -> TokenStream {

    // parse `input` according to the `parse` function defined above
    let fv: FormVec = syn::parse(input).unwrap();
    // get the information
    let size = fv.size;
    let closure = fv.formula;
    let var = closure.inputs.first().unwrap();
    let var_type = closure.output;
    let formula = closure.body;

    // replace the code
    let result = TokenStream::from( quote!(
        {
            // initialize a new vector
            let mut tmp = Vec::<#var_type>::with_capacity(#size);
            // avoid initializing with useless data
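            // `unsafe` is needed for `set_len`; the new length stays within
            // the capacity #size, but the entries remain uninitialized until
            // the loop below assigns them
            unsafe {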
                tmp.set_len(#size);
            }
            // apply formula to the entries
            for #var in 0..#size {
                tmp[#var] = #formula
            }
            tmp
        }
    ) );

    result

}
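
The generated code relies on `set_len` followed by indexed assignment, which assumes the element type has no destructor (as with the integers in the doc example); for a type such as `String`, assigning into a not-yet-initialized slot would drop garbage. As a point of comparison, here is a minimal sketch of the same computation with `push`, written as an ordinary function rather than a macro. The name `form_vec_fn` is hypothetical and not part of the crate.

```rust
/// Function-shaped stand-in for what a `form_vec!` call computes.
fn form_vec_fn<T>(size: usize, mut formula: impl FnMut(usize) -> T) -> Vec<T> {
    // reserve room for every element; the vector's length is still 0
    let mut tmp = Vec::with_capacity(size);
    for i in 0..size {
        // `push` writes into the reserved space and bumps the length,
        // so no slot is read or dropped before it is initialized
        tmp.push(formula(i));
    }
    tmp
}
```

Because the capacity is reserved up front, `push` never reallocates here, so this still avoids filling the vector with throwaway values first.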

The test code

This comes from a crate called macro_testing. Again we start with macro_testing/Cargo.toml.
[package]
name = "macro_testing"
version = "0.1.0"
edition = "2021"

[dependencies]
macros = { path = "../macros" }
And now the main program in macro_testing/src/main.rs:
use macros::form_vec;

fn main() {
    let n = 10;
    let result = form_vec!( n, | x | x * x );
    assert_eq!( result, vec!( 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 ) );
    println!("assertions pass!");
}
When we build and run with cargo run --release, we see this output:
$ cargo run --release
   Compiling proc-macro2 v1.0.39
   Compiling unicode-ident v1.0.0
   Compiling syn v1.0.95
   Compiling quote v1.0.18
   Compiling macros v0.1.0 (/home/cantanima/common/rust/macros)
   Compiling macro_testing v0.1.0 (/home/cantanima/common/rust/macro_testing)
    Finished release [optimized] target(s) in 9.06s
     Running `target/release/macro_testing`
assertions pass!
In addition, cargo clippy offers not one complaint. Time to celebrate!

Additional commentary

Some additional comments: