Float

#208 Jun 2026

208. f64::mul_add — One Rounding, One Instruction, Better Accuracy

a * b + c rounds twice and may compile to two separate operations. a.mul_add(b, c) computes the whole thing at full precision, rounds once, and on a CPU with FMA folds into a single instruction.

Two roundings vs. one

Floating-point math rounds after every operation. Writing a * b + c first rounds the product a * b to the nearest f64, then rounds again after adding c. That intermediate rounding throws away bits before they ever reach the sum:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
let a = 1.0_f64;
let b = 2.0_f64;
let c = 3.0_f64;

// Two roundings: (a*b) rounded, then (+ c) rounded
let separate = a * b + c;

// One rounding: a*b + c evaluated at full precision, rounded once
let fused = a.mul_add(b, c);

assert_eq!(separate, 5.0);
assert_eq!(fused, 5.0);

For tidy values they agree. The difference shows up when the product needs more bits than an f64 can hold and the addition would have recovered them — exactly the catastrophic-cancellation cases that wreck numeric code.

Where it pays off: Horner’s method

Polynomial evaluation is built out of multiply-then-add, so it’s the textbook home for mul_add. To evaluate 2x² + 3x + 4 with Horner’s method you nest the operations, and each step is one fused step:

1
2
3
4
5
6
fn poly(x: f64) -> f64 {
    // ((2*x) + 3)*x + 4
    2.0_f64.mul_add(x, 3.0).mul_add(x, 4.0)
}

assert_eq!(poly(2.0), 18.0); // 2*4 + 3*2 + 4

Each mul_add keeps full precision through the chain instead of rounding at every * and +. The same pattern carries dot products, which are a sum of products:

1
2
3
4
5
6
7
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .fold(0.0, |acc, (&x, &y)| x.mul_add(y, acc))
}

assert_eq!(dot(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0]), 32.0);

The honest caveat

mul_add is faster only when the target CPU has a hardware fused-multiply-add instruction (modern x86-64 with FMA, AArch64, most others you’ll deploy to). Where it doesn’t, the standard library has to emulate the exact single-rounding semantics in software, and that emulation is slower than a plain a * b + c. So this is a hot-path tool: reach for it in tight numeric loops on hardware you control (or behind target-feature/target-cpu flags), and lean on it freely when you need the extra accuracy regardless of speed.

The bottom line

mul_add gives you a single rounding step and, on capable hardware, a single instruction for a * b + c. Use it in polynomial evaluation, dot products, and any multiply-accumulate loop where precision or throughput matters — and remember it can regress on FMA-less targets, so measure when speed is the goal.

#068 Apr 2026

float math precision std

68. f64::next_up — Walk the Floating Point Number Line

Ever wondered what the next representable floating point number after 1.0 is? Since Rust 1.86, f64::next_up and f64::next_down let you step through the number line one float at a time.

The problem

Floating point numbers aren’t evenly spaced — the gap between representable values grows as the magnitude increases. Before next_up / next_down, figuring out the next neighbor required bit-level manipulation of the IEEE 754 representation:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
fn main() {
    // The hard way (before 1.86): manually decode bits
    let x: f64 = 1.0;
    let bits = x.to_bits();
    let next_bits = bits + 1;
    let next = f64::from_bits(next_bits);

    assert!(next > x);
    assert_eq!(next, 1.0000000000000002);
}

Error-prone, unreadable, and doesn’t handle edge cases like negative numbers, zero, or special values.

The clean way

next_up returns the smallest f64 greater than self. next_down returns the largest f64 less than self:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
fn main() {
    let x: f64 = 1.0;

    let up = x.next_up();
    let down = x.next_down();

    assert!(up > x);
    assert!(down < x);
    assert_eq!(up, 1.0000000000000002);
    assert_eq!(down, 0.9999999999999998);

    // There's no float between x and its neighbors
    assert_eq!(up.next_down(), x);
    assert_eq!(down.next_up(), x);
}

They handle all the edge cases you’d rather not think about — negative numbers, subnormals, and infinity:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
fn main() {
    // Works across zero
    assert_eq!(0.0_f64.next_up(), 5e-324);   // smallest positive f64
    assert_eq!(0.0_f64.next_down(), -5e-324); // largest negative f64

    // Infinity is the boundary
    assert_eq!(f64::MAX.next_up(), f64::INFINITY);
    assert_eq!(f64::INFINITY.next_up(), f64::INFINITY);

    // NaN stays NaN
    assert!(f64::NAN.next_up().is_nan());
}

Practical use: precision-aware comparisons

The gap between adjacent floats is called an ULP (unit in the last place). next_up lets you build tolerance-aware comparisons without guessing at epsilon values:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
fn almost_equal(a: f64, b: f64, max_ulps: u32) -> bool {
    if a == b { return true; }

    let mut current = a;
    for _ in 0..max_ulps {
        current = if a < b { current.next_up() } else { current.next_down() };
        if current == b { return true; }
    }
    false
}

fn main() {
    let a = 0.1 + 0.2;
    let b = 0.3;

    // They're not equal...
    assert_ne!(a, b);

    // ...but they're within 1 ULP of each other
    assert!(almost_equal(a, b, 1));
}

Also available on f32 with the same API. These methods are const fn, so you can use them in const contexts too.