Series: The Rust Annals

Vol. I Issue 46 nlopes.dev

Announcing Rust 1.45.0

Fixes a long-standing compiler unsoundness in floating-point to integer casts and stabilizes widely requested std library APIs like char ranges and pointer methods.

Fixing unsoundness in casts

Issue 10184 was originally opened back in October of 2013, a year and a half before Rust 1.0. As you may know, rustc uses LLVM as a compiler backend. When you write code like this:

pub fn cast(x: f32) -> u8 {
    x as u8
}

The Rust compiler in Rust 1.44.0 and before would produce LLVM-IR that looks like this:

define i8 @_ZN10playground4cast17h1bdf307357423fcfE(float %x) unnamed_addr #0 {
start:
  %0 = fptoui float %x to i8
  ret i8 %0
}

That fptoui implements the cast; it is short for “floating point to unsigned integer.”

But there’s a problem here. From the docs:

The ‘fptoui’ instruction converts its floating-point operand into the nearest (rounding towards zero) unsigned integer value. If the value cannot fit in ty2, the result is a poison value.

Now, unless you happen to dig into the depths of compilers regularly, you may not understand what that means. It’s full of jargon, but there’s a simpler explanation: if you cast a floating point number that’s large to an integer that’s small, you get undefined behavior.

That means that this, for example, was not well-defined:

fn cast(x: f32) -> u8 {
    x as u8
}

fn main() {
    let f = 300.0;

    let x = cast(f);

    println!("x: {}", x);
}

On Rust 1.44.0, this happens to print “x: 0” on my machine. But it could print anything, or do anything: this is undefined behavior, even though the unsafe keyword is not used anywhere in this block of code. This is what we call a “soundness” bug: a bug where the compiler does the wrong thing. We tag these bugs as I-unsound on our issue tracker, and take them very seriously.

This bug took a long time to resolve, though. The reason is that it was very unclear what the correct path forward was.

In the end, the decision was made to do this:

  • The as operator would perform a “saturating cast”.
  • A new unsafe cast would be added if you wanted to skip the checks.

This is very similar to array access, for example:

  • array[i] will check to make sure that array has at least i + 1 elements.
  • You can use unsafe { array.get_unchecked(i) } to skip the check.
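As a rough illustration of the array analogy, the sketch below puts the checked and unchecked access paths side by side. The function names get_checked and get_after_manual_check are hypothetical and chosen only for this example; slice::get and slice::get_unchecked are the standard library methods.

```rust
// Checked access: `get` returns None instead of panicking when `i` is out of bounds.
fn get_checked(xs: &[u8], i: usize) -> Option<u8> {
    xs.get(i).copied()
}

// Unchecked access guarded by an explicit bounds test (hypothetical helper).
fn get_after_manual_check(xs: &[u8], i: usize) -> Option<u8> {
    if i < xs.len() {
        // SAFETY: `i < xs.len()` was verified on the line above.
        Some(unsafe { *xs.get_unchecked(i) })
    } else {
        None
    }
}

fn main() {
    let xs = [10u8, 20, 30];
    // Both paths agree for in-bounds and out-of-bounds indices.
    assert_eq!(get_checked(&xs, 1), Some(20));
    assert_eq!(get_after_manual_check(&xs, 1), Some(20));
    assert_eq!(get_checked(&xs, 3), None);
    assert_eq!(get_after_manual_check(&xs, 3), None);
}
```

The unsafe call is only sound because the caller has already established the bound; the same division of responsibility applies to the unchecked cast.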

So, what’s a saturating cast? Let’s look at a slightly modified example:

fn cast(x: f32) -> u8 {
    x as u8
}

fn main() {
    let too_big = 300.0;
    let too_small = -100.0;
    let nan = f32::NAN;

    println!("too_big_casted = {}", cast(too_big));
    println!("too_small_casted = {}", cast(too_small));
    println!("not_a_number_casted = {}", cast(nan));
}

This will print:

too_big_casted = 255
too_small_casted = 0
not_a_number_casted = 0

That is, numbers that are too big turn into the largest possible value. Numbers that are too small produce the smallest possible value (which is zero). NaN produces zero.
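The same rules apply to signed and wider integer targets, and to infinities. The following is a small sketch, assuming Rust 1.45.0 or later; the helper names to_i8 and to_u16 are made up for this example.

```rust
// Hypothetical helpers; only the `as` casts themselves come from the language.
fn to_i8(x: f64) -> i8 {
    x as i8
}

fn to_u16(x: f32) -> u16 {
    x as u16
}

fn main() {
    // Out-of-range values clamp to the bounds of the target type.
    assert_eq!(to_i8(1000.0), i8::MAX);
    assert_eq!(to_i8(-1000.0), i8::MIN);
    // Infinities saturate in the same way.
    assert_eq!(to_u16(f32::INFINITY), u16::MAX);
    assert_eq!(to_u16(f32::NEG_INFINITY), 0);
    // NaN becomes zero.
    assert_eq!(to_i8(f64::NAN), 0);
    // In-range values still round toward zero, as before.
    assert_eq!(to_i8(-1.9), -1);
    assert_eq!(to_u16(2.7), 2);
}
```

For a signed target, the smallest possible value is the type's minimum rather than zero.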

The new API to cast in an unsafe manner is:

let x: f32 = 1.0;
let y: u8 = unsafe { x.to_int_unchecked() };

But as always, you should only use this method as a last resort. Just like with array access, the compiler can often optimize the checks away, so the safe and unsafe versions compile to equivalent code whenever it can prove the checks are unnecessary.
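If the unchecked cast is needed, one possible pattern is to establish the range yourself before calling it. The sketch below is an illustration only: f32_to_u8_checked is a hypothetical helper, and the bounds assume the documented requirement that the value is not NaN, not infinite, and fits in the target type after its fractional part is truncated.

```rust
// Hypothetical helper: returns None instead of saturating when `x` does not fit.
fn f32_to_u8_checked(x: f32) -> Option<u8> {
    // Truncation toward zero lands in 0..=255 exactly when -1 < x < 256.
    // Comparisons with NaN are false, so NaN is rejected here as well.
    if x > -1.0 && x < 256.0 {
        // SAFETY: `x` is finite, not NaN, and its truncated value fits in a u8.
        Some(unsafe { x.to_int_unchecked::<u8>() })
    } else {
        None
    }
}

fn main() {
    assert_eq!(f32_to_u8_checked(1.0), Some(1));
    assert_eq!(f32_to_u8_checked(255.9), Some(255));
    assert_eq!(f32_to_u8_checked(-0.5), Some(0));
    assert_eq!(f32_to_u8_checked(256.0), None);
    assert_eq!(f32_to_u8_checked(-1.0), None);
    assert_eq!(f32_to_u8_checked(f32::NAN), None);
}
```

Unlike as, this helper reports out-of-range input to the caller rather than clamping it, which may or may not be what a given program wants.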

Library changes

In Rust 1.45.0, the following APIs were stabilized:

Additionally, you can use char with ranges, to iterate over codepoints:

for ch in 'a'..='z' {
    print!("{}", ch);
}
println!();
// Prints "abcdefghijklmnopqrstuvwxyz"
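Because a char range is an ordinary iterator, the usual adapters work on it too. A short sketch follows; the helper name letters is hypothetical. The last assertion relies on char ranges skipping the surrogate code points, which are not valid char values.

```rust
// Hypothetical helper: collects an inclusive char range into a String.
fn letters(start: char, end: char) -> String {
    (start..=end).collect()
}

fn main() {
    assert_eq!(letters('a', 'e'), "abcde");

    // Char ranges can also be walked backwards.
    let reversed: String = ('a'..='e').rev().collect();
    assert_eq!(reversed, "edcba");

    // The range steps over the surrogate gap U+D800..=U+DFFF,
    // so only the two endpoints are yielded here.
    assert_eq!(('\u{D7FF}'..='\u{E000}').count(), 2);
}
```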

For a full list of changes, see the full release notes.

2013 contributors to this release.

Reproduced from the Rust blog under its publication licence. Typeset in Literata.