Hints

#090 Apr 2026

90. black_box — Stop the Compiler From Erasing Your Benchmarks

Your benchmark ran in 0 nanoseconds? Congratulations — the compiler optimised away the code you were trying to measure. std::hint::black_box prevents that by hiding values from the optimiser.

The problem: the optimiser is too smart

The Rust compiler aggressively eliminates dead code. If it can prove a result is never used, it simply removes the computation:

fn sum_range(n: u64) -> u64 {
    (0..n).sum()
}

fn main() {
    let start = std::time::Instant::now();
    let result = sum_range(10_000_000);
    let elapsed = start.elapsed();

    // Without using `result`, the compiler may skip the entire computation
    println!("took: {elapsed:?}");

    // Force the result to be "used" so the above isn't optimised out
    assert!(result > 0);
}

In release mode, the compiler can see through this and may still optimise the loop away — or even compute the answer at compile time. Your benchmark reports near-zero time, and you learn nothing.

Enter `black_box`

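std::hint::black_box takes a value and returns it unchanged, but the compiler is asked to treat the call as opaque: it has to assume the value could be read or changed in any way, so it can’t constant-fold the input or discard whatever produced the output. The standard library documents this as a best-effort hint rather than a hard guarantee, so use it for benchmarks and not for anything correctness-critical: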

use std::hint::black_box;

fn sum_range(n: u64) -> u64 {
    (0..n).sum()
}

fn main() {
    let start = std::time::Instant::now();
    let result = sum_range(black_box(10_000_000));
    let _ = black_box(result);
    let elapsed = start.elapsed();

    println!("sum = {result}, took: {elapsed:?}");
}

Two black_box calls do the trick:

1. Wrap the input: prevents the compiler from constant-folding the argument
2. Wrap the output: prevents dead-code elimination of the computation
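
The two wrapping steps can also be folded into one small helper. The sketch below is illustrative only: time_once is a hypothetical name, not something from the standard library, and it takes a single sample, so it only shows where the two black_box calls go.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `f(input)` once and returns the result
/// together with the elapsed wall-clock time.
fn time_once<T, R>(input: T, f: impl FnOnce(T) -> R) -> (R, Duration) {
    let start = Instant::now();
    // Inner black_box hides the input (limits constant folding);
    // outer black_box marks the output as used (limits dead-code elimination).
    let output = black_box(f(black_box(input)));
    (output, start.elapsed())
}

fn sum_range(n: u64) -> u64 {
    (0..n).sum()
}

fn main() {
    let (sum, elapsed) = time_once(10_000_000_u64, sum_range);
    println!("sum = {sum}, took: {elapsed:?}");
}
```

Keeping both calls in one place makes it harder to forget one of them when you time a different function.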

Before and after

Without black_box (release mode):

sum = 49999995000000, took: 83ns     ← suspiciously fast

With black_box (release mode):

sum = 49999995000000, took: 5.612ms  ← actual work

It works on any type

black_box is generic — it works on integers, strings, structs, references, whatever:

use std::hint::black_box;

fn main() {
    // Hide a vector from the optimiser
    let data: Vec<i32> = black_box(vec![1, 2, 3, 4, 5]);
    let total: i32 = data.iter().sum();
    let total = black_box(total);

    assert_eq!(total, 15);
}
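
The same idea applies to references and user-defined types. In this small sketch, Point and norm_squared are made-up names for illustration; passing black_box(&p) hands back the same reference, so the value is hidden from the optimiser without being moved.

```rust
use std::hint::black_box;

// Hypothetical type, used only to show black_box on a struct reference.
#[derive(Debug)]
struct Point {
    x: f64,
    y: f64,
}

fn norm_squared(p: &Point) -> f64 {
    p.x * p.x + p.y * p.y
}

fn main() {
    let p = Point { x: 3.0, y: 4.0 };
    // black_box(&p) returns the same reference; `p` itself is not moved.
    let n = black_box(norm_squared(black_box(&p)));
    println!("{p:?} has squared norm {n}");

    // Owned values pass through unchanged as well.
    let s = black_box(String::from("hello"));
    println!("{} has {} bytes", s, black_box(s.len()));
}
```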

Micro-benchmark recipe

Here’s a minimal pattern for quick-and-dirty micro-benchmarks:

use std::hint::black_box;
use std::time::Instant;

fn fibonacci(n: u32) -> u64 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn main() {
    let iterations = 100;
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(fibonacci(black_box(30)));
    }
    let elapsed = start.elapsed();
    println!("{iterations} runs in {elapsed:?} ({:?}/iter)", elapsed / iterations);
}

Without black_box, the compiler could hoist the pure function call out of the loop or eliminate it entirely. With it, each iteration does real work.

When to use it

Reach for black_box whenever you’re timing code and the results look suspiciously fast. Benchmarking frameworks such as criterion and the nightly-only built-in #[bench] harness rely on the same kind of optimisation barrier under the hood.

It’s not a full benchmarking harness — for serious measurement you still want warmup, statistics, and outlier detection. But when you need a quick sanity check, black_box + Instant gets the job done.
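
To give a rough idea of what "warmup and statistics" can mean, here is a minimal sketch that discards a few warm-up runs and reports the median of several samples. The helper names (bench, median) and the run counts are assumptions chosen for illustration; this is not how criterion works internally, and it is no substitute for a real harness.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

fn fibonacci(n: u32) -> u64 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

/// Middle element after sorting (the upper median for even lengths).
/// Panics if `samples` is empty.
fn median(samples: &mut [Duration]) -> Duration {
    samples.sort();
    samples[samples.len() / 2]
}

/// Hypothetical helper: runs `warmup` untimed calls, then returns the
/// median of `samples` individually timed calls.
fn bench(n: u32, warmup: usize, samples: usize) -> Duration {
    for _ in 0..warmup {
        black_box(fibonacci(black_box(n)));
    }
    let mut times: Vec<Duration> = (0..samples)
        .map(|_| {
            let start = Instant::now();
            black_box(fibonacci(black_box(n)));
            start.elapsed()
        })
        .collect();
    median(&mut times)
}

fn main() {
    let typical = bench(25, 3, 11);
    println!("fibonacci(25): median {typical:?} over 11 samples");
}
```

The median is used here because a single slow sample (a context switch, a cold cache) skews a mean far more than it skews the middle value.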

Available since Rust 1.66 on stable.

#089 Apr 2026

performance hints std rust-1.95

89. cold_path — Tell the Compiler Which Branch Won't Happen

Your error-handling branch fires once in a million calls, but the compiler doesn’t know that. std::hint::cold_path (also reachable as core::hint::cold_path) lets you mark unlikely branches so the optimiser can focus on the hot path.

Why branch layout matters

Modern CPUs predict which way a branch will go and speculatively execute instructions ahead of time. When the prediction is right, execution flies. When it’s wrong, the speculative work is thrown away and the pipeline has to refill.

Compilers already try to guess which branches are hot, but they don’t always get it right — especially when both sides of an if look equally plausible from a static analysis perspective. That’s where cold_path comes in.

The basics

Call cold_path() at the start of a branch that is rarely taken:

use std::hint::cold_path;

fn process(value: Option<u64>) -> u64 {
    if let Some(v) = value {
        v * 2
    } else {
        cold_path();
        log_miss();
        0
    }
}

fn log_miss() {
    // imagine logging or metrics here
}

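The compiler now knows the else arm is unlikely. It can lay out the machine code so the hot path (the Some arm) runs straight through as fall-through code while the cold arm is moved out of the way, which keeps the hot instructions dense in the instruction cache. As with any hint, the optimiser is free to ignore it.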
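
If you are on a toolchain that predates cold_path, a related and long-stable tool is the #[cold] attribute on a function. Note that this is a different mechanism from the one this post describes: it marks a whole function as rarely called instead of marking a branch. The sketch below reworks the process example under that assumption; how much it changes the generated code depends on the optimiser.

```rust
// Stable alternative shown for comparison: the `#[cold]` attribute,
// not `cold_path`. It tells the compiler this function is rarely called.
#[cold]
#[inline(never)]
fn log_miss() {
    // imagine logging or metrics here
}

fn process(value: Option<u64>) -> u64 {
    match value {
        Some(v) => v * 2,
        None => {
            // The call to a #[cold] function marks this arm as unlikely.
            log_miss();
            0
        }
    }
}

fn main() {
    println!("{}", process(Some(21)));
    println!("{}", process(None));
}
```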

In match expressions

cold_path works well in match arms too — mark the rare variants:

use std::hint::cold_path;

fn handle_status(code: u16) -> &'static str {
    match code {
        200 => "ok",
        301 => "moved",
        404 => { cold_path(); "not found" }
        500 => { cold_path(); "server error" }
        _   => { cold_path(); "unknown" }
    }
}

Only the branches you expect to be common stay on the fast track.

Building `likely` and `unlikely` helpers

If you’ve used C/C++, you might miss __builtin_expect. With cold_path you can build the same thing:

use std::hint::cold_path;

#[inline(always)]
const fn likely(b: bool) -> bool {
    if !b { cold_path(); }
    b
}

#[inline(always)]
const fn unlikely(b: bool) -> bool {
    if b { cold_path(); }
    b
}

fn check_bounds(index: usize, len: usize) -> bool {
    if unlikely(index >= len) {
        panic!("out of bounds: {} >= {}", index, len);
    }
    true
}

Now you can annotate conditions directly instead of marking individual branches.

A word of caution

Misusing cold_path on a branch that actually runs often can hurt performance — the compiler will deprioritise it, and you’ll get more pipeline stalls, not fewer. Always benchmark before sprinkling hints around. Profile first, hint second.
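
One way to act on that advice is a quick A/B timing of the hinted and unhinted versions on a realistic input mix. The sketch below is an assumption-laden illustration: the function names, the one-miss-per-thousand workload, and the single timed pass are all made up, and it uses the stable #[cold] attribute in place of cold_path so it builds on older toolchains. Treat any difference it shows as a prompt for proper profiling, not as a result.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Rare path, marked with the stable #[cold] attribute (stand-in for cold_path).
#[cold]
fn miss() -> u64 {
    0
}

fn process_plain(v: Option<u64>) -> u64 {
    match v {
        Some(x) => x.wrapping_mul(2),
        None => 0,
    }
}

fn process_hinted(v: Option<u64>) -> u64 {
    match v {
        Some(x) => x.wrapping_mul(2),
        None => miss(),
    }
}

/// Hypothetical helper: runs `f` over every input and returns a checksum
/// (so the results are used) plus the total elapsed time.
fn time_over(inputs: &[Option<u64>], f: fn(Option<u64>) -> u64) -> (u64, Duration) {
    let start = Instant::now();
    let mut checksum = 0_u64;
    for &v in inputs {
        checksum = checksum.wrapping_add(black_box(f(black_box(v))));
    }
    (checksum, start.elapsed())
}

fn main() {
    // Mostly-hit workload: one miss per 1000 inputs (made-up ratio).
    let inputs: Vec<Option<u64>> = (0..1_000_000_u64)
        .map(|i| if i % 1000 == 0 { None } else { Some(i) })
        .collect();

    let (a, plain) = time_over(&inputs, process_plain);
    let (b, hinted) = time_over(&inputs, process_hinted);
    assert_eq!(a, b); // both variants must compute the same thing
    println!("plain: {plain:?}, hinted: {hinted:?}");
}
```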

The bottom line

cold_path is a zero-cost, zero-argument function that tells the optimiser what you already know: this branch is the exception, not the rule. It’s a small tool, but in hot loops and latency-sensitive code, it can make a measurable difference.

Stabilised in Rust 1.95 (April 2026).