#150 May 2026

150. Vec::spare_capacity_mut — Fill a Vec From a Callback Without Zeroing It First

You reserve 4 KiB to read from a socket, and to hand the buffer over you first… write 4096 zeros. Vec::spare_capacity_mut exposes the reserved-but-uninitialized tail as &mut [MaybeUninit<u8>] so the callback writes straight into the allocation.

The pain: paying to overwrite

The intuitive fill-a-buffer pattern resizes the Vec first so the slice exists:

let mut buf: Vec<u8> = Vec::with_capacity(8);
buf.resize(8, 0); // writes 8 zeros we're about to clobber

// Pretend this is `read(fd, buf.as_mut_ptr(), buf.len())`.
fill(&mut buf);

assert_eq!(buf, b"rustbite");

fn fill(out: &mut [u8]) {
    out.copy_from_slice(b"rustbite");
}

It works, but resize walks the whole tail writing zeros that the next line overwrites — a measurable cost for big reads, and pointless for types where “zero” isn’t even a valid value.

The fix: write into the uninitialized tail

Vec::spare_capacity_mut(&mut self) -> &mut [MaybeUninit<T>] hands you a slice covering exactly capacity - len slots. Write into them, then call set_len to tell the Vec they’re now initialized:

use std::mem::MaybeUninit;

let mut buf: Vec<u8> = Vec::with_capacity(8);

// View the reserved-but-uninitialized tail.
let spare: &mut [MaybeUninit<u8>] = buf.spare_capacity_mut();
assert_eq!(spare.len(), 8);

// Write through the MaybeUninit pointer — no zeroing first.
for (slot, byte) in spare.iter_mut().zip(b"rustbite") {
    slot.write(*byte);
}

// Promise the Vec those 8 slots are now valid `u8`s.
unsafe { buf.set_len(8); }

assert_eq!(buf, b"rustbite");

spare_capacity_mut itself is safe: MaybeUninit<T> is the type that lets you hold “maybe garbage” without UB. The only unsafe step is the set_len call, where you assert that the slots you are adding to len really were initialized.

Pairing with a real fill API

The standard pattern is “reserve, write, set_len”:

use std::mem::MaybeUninit;

fn read_bytes(buf: &mut Vec<u8>, extra: usize, source: &[u8]) {
    buf.reserve(extra);

    let spare = buf.spare_capacity_mut();
    let n = extra.min(source.len()).min(spare.len());

    // Initialize the prefix we actually wrote.
    for i in 0..n {
        spare[i].write(source[i]);
    }

    // Only extend by what's been initialized.
    unsafe { buf.set_len(buf.len() + n); }
}

let mut v = vec![b'>', b' '];
read_bytes(&mut v, 8, b"rustbite");
assert_eq!(v, b"> rustbite");

The same shape works with read-style callbacks: cast the MaybeUninit<u8> slice to a raw pointer, hand it to C, and only extend len by the byte count the call returned. The bytes you didn’t write stay MaybeUninit — never read them.
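As a rough sketch of that read-style shape, the example below uses a hypothetical fake_read function standing in for a C call such as read(fd, ptr, cap). Its name, signature, and fixed payload are assumptions made for illustration, not a real API.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a C fill function such as `read(fd, ptr, cap)`:
/// copies up to `cap` bytes into `dst` and returns how many it wrote.
unsafe fn fake_read(dst: *mut u8, cap: usize) -> usize {
    const PAYLOAD: &[u8] = b"rust";
    let n = PAYLOAD.len().min(cap);
    // SAFETY: the caller guarantees `dst` is valid for `cap` writes, and `n <= cap`.
    unsafe { std::ptr::copy_nonoverlapping(PAYLOAD.as_ptr(), dst, n) };
    n
}

/// Reserve `extra` bytes, let the callback fill the spare tail, and extend
/// `len` only by the byte count the callback reports.
fn read_into(buf: &mut Vec<u8>, extra: usize) -> usize {
    buf.reserve(extra);
    let spare: &mut [MaybeUninit<u8>] = buf.spare_capacity_mut();
    let ptr = spare.as_mut_ptr().cast::<u8>();
    let cap = spare.len();
    // SAFETY: `ptr` points at `cap` writable bytes owned by `buf`.
    let n = unsafe { fake_read(ptr, cap) };
    assert!(n <= cap, "callback reported more bytes than it was given");
    // SAFETY: the first `n` spare bytes were just initialized by `fake_read`.
    unsafe { buf.set_len(buf.len() + n) };
    n
}

fn main() {
    let mut buf = vec![b'>', b' '];
    let n = read_into(&mut buf, 8);
    assert_eq!(n, 4);
    assert_eq!(buf, b"> rust");
}
```

The check on n before set_len is a defensive choice: a callback that over-reports would otherwise turn into reads of uninitialized memory.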

When to reach for it

Reading from sockets, files, or FFI fill-style APIs into a Vec<u8> is the headline use case, and async I/O stacks such as tokio have to solve the same uninitialized-buffer problem in their read paths. It’s also useful for element types where a placeholder value is either meaningless or just wasted work: image decoders writing Vec<Pixel>, audio decoders writing Vec<f32>, parser arenas writing Vec<Node>.

If you don’t need the spare capacity view — you’re building up element-by-element — Vec::push (or Vec::push_mut from bite 88) is still the right call. spare_capacity_mut is the tool for the moment you have an external writer that wants a flat buffer and you’d rather not pay to zero it first.

149. OnceLock::wait — Block a Thread Until Another One Initializes the Value

You have one thread loading a config and a handful of workers that can’t start until it’s ready. OnceLock::wait blocks until the value lands — no Condvar, no Mutex, no spin loop.

OnceLock<T> is a write-once cell: any number of threads can race to set it, but only the first wins. The usual reader API is get, which returns None until something has been stored:

use std::sync::OnceLock;

let cell: OnceLock<u32> = OnceLock::new();
assert_eq!(cell.get(), None);
cell.set(42).unwrap();
assert_eq!(cell.get(), Some(&42));

That’s fine when the reader can keep working without the value. But what if the reader genuinely needs it now? Before Rust 1.86 you’d reach for a Mutex<Option<T>> plus a Condvar, or spin in a loop calling get — both more code and more bugs than the problem deserves.
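For comparison, here is a minimal sketch of what the Mutex<Option<T>> plus Condvar version might look like. The WaitCell type and its method names are hypothetical, invented here to show the amount of machinery involved.

```rust
use std::sync::{Condvar, Mutex};
use std::thread;

/// Hypothetical hand-rolled "wait until set" cell.
struct WaitCell<T> {
    slot: Mutex<Option<T>>,
    ready: Condvar,
}

impl<T: Clone> WaitCell<T> {
    fn new() -> Self {
        Self { slot: Mutex::new(None), ready: Condvar::new() }
    }

    /// Store the value (first write wins) and wake every waiter.
    fn set(&self, value: T) {
        let mut slot = self.slot.lock().unwrap();
        if slot.is_none() {
            *slot = Some(value);
        }
        self.ready.notify_all();
    }

    /// Block until a value is present, then return a clone of it.
    fn wait(&self) -> T {
        let guard = self.slot.lock().unwrap();
        // `wait_while` re-checks the condition after every wakeup,
        // which also covers spurious wakeups.
        let guard = self.ready.wait_while(guard, |slot| slot.is_none()).unwrap();
        (*guard).clone().expect("value is present once wait_while returns")
    }
}

fn main() {
    let cell = WaitCell::new();
    thread::scope(|s| {
        s.spawn(|| cell.set(String::from("rustbites=on")));
        s.spawn(|| assert_eq!(cell.wait(), "rustbites=on"));
    });
}
```

Note that this sketch has to clone the value out, because a reference cannot outlive the mutex guard.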

OnceLock::wait parks the calling thread until the cell is initialized, then hands back &T:

use std::sync::OnceLock;
use std::thread;
use std::time::Duration;

static CONFIG: OnceLock<String> = OnceLock::new();

thread::scope(|s| {
    // Producer: pretend this is loading config from disk.
    s.spawn(|| {
        thread::sleep(Duration::from_millis(20));
        CONFIG.set("rustbites=on".into()).unwrap();
    });

    // Consumers: block until the producer is done, then read.
    for _ in 0..3 {
        s.spawn(|| {
            let cfg: &String = CONFIG.wait();
            assert_eq!(cfg, "rustbites=on");
        });
    }
});

Every consumer gets back the same &String. OnceLock only ever holds one value, so the borrow is shared and lives as long as the cell does. No cloning, no Arc wrapping.

wait plays nicely with the existing init helpers. If you have an initializer that some threads might run while others just want to wait for the result, mix get_or_init with wait:

use std::sync::OnceLock;
use std::thread;

static GREETING: OnceLock<String> = OnceLock::new();

thread::scope(|s| {
    s.spawn(|| {
        // First thread here pays the cost; others get the cached value.
        GREETING.get_or_init(|| "hello, bites".into());
    });
    s.spawn(|| {
        // This thread doesn't care who initialized — just wants the value.
        assert_eq!(GREETING.wait(), "hello, bites");
    });
});

A few things worth knowing:

  • wait blocks forever if no thread ever calls set or completes a get_or_init. It has no timeout, so only wait on a cell whose producer you control and know will run.
  • It’s &self, so any number of threads can wait on the same cell at once.
  • OnceLock<T> requires T: Send + Sync to be shared across threads, same as Arc.

For lazy-init that runs on first read, LazyLock is still the right tool. But when initialization happens elsewhere and other threads need to pause until it’s done, wait turns a Condvar dance into one method call.

148. Result::is_ok_and — Test the Variant and the Value in One Call

Checking “is this Ok, and is the inner value positive?” usually means a match or an if let. Result::is_ok_and collapses both questions into one line.

The pattern crops up everywhere: you have a Result, and you want a bool that says “yes, it succeeded and the value passes some test”. Without help, you write this:

let parsed: Result<i32, &str> = "42".parse().map_err(|_| "bad input");

let big_enough = match &parsed {
    Ok(n) => *n > 10,
    Err(_) => false,
};
assert!(big_enough);

It works, but a four-line match for what reads as one question is a lot. Result::is_ok_and takes a closure that runs only on the Ok value, and returns false for any Err:

let parsed: Result<i32, &str> = "42".parse().map_err(|_| "bad input");
assert!(parsed.is_ok_and(|n| n > 10));

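is_ok_and takes self by value and passes the Ok value to the closure by move. When you only want to inspect the value, call as_ref first so the closure receives a reference and the original Result stays usable: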

let res: Result<String, ()> = Ok(String::from("rustbites"));
assert!(res.as_ref().is_ok_and(|s| s.starts_with("rust")));

There’s a mirror method for the failure path. Result::is_err_and runs the predicate only on the Err value:

let res: Result<i32, &str> = Err("missing field: name");
assert!(res.is_err_and(|e| e.contains("name")));

let ok: Result<i32, &str> = Ok(1);
assert!(!ok.is_err_and(|_| true));  // Ok short-circuits to false

Both methods short-circuit on the wrong variant without ever calling the closure, so you can put expensive checks in the predicate without worrying about wasted work on the unhappy path.
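One way to see the short-circuit is to count how often the predicate runs. The check helper below is a hypothetical name used only for this illustration.

```rust
/// Returns (outcome of the check, number of times the predicate ran).
fn check(res: Result<i32, &str>) -> (bool, u32) {
    let mut calls = 0u32;
    let passed = res.is_ok_and(|n| {
        calls += 1; // only reached for the Ok variant
        n > 10
    });
    (passed, calls)
}

fn main() {
    assert_eq!(check(Ok(42)), (true, 1));
    assert_eq!(check(Ok(3)), (false, 1));
    // On Err the closure is never invoked, so the counter stays at zero.
    assert_eq!(check(Err("bad input")), (false, 0));
}
```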

A nice place this shines is filtering an iterator of Results:

let inputs = ["12", "hello", "7", "0", "99"];
let count = inputs
    .iter()
    .filter(|s| s.parse::<i32>().is_ok_and(|n| n > 5))
    .count();
assert_eq!(count, 3);  // "12", "7", and "99"

Same trick exists on Option as is_some_and and is_none_or — small surface area, big readability win.
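A small sketch of the Option pair, using two hypothetical port-validation helpers. The function names and the 1024 threshold are assumptions chosen for the example.

```rust
/// True only when a port is present and below 1024.
fn is_privileged(port: Option<u16>) -> bool {
    port.is_some_and(|p| p < 1024)
}

/// True when no port was given, or when the given port is unprivileged.
fn is_acceptable(port: Option<u16>) -> bool {
    port.is_none_or(|p| p >= 1024)
}

fn main() {
    assert!(is_privileged(Some(80)));
    assert!(!is_privileged(None));

    assert!(is_acceptable(None));
    assert!(is_acceptable(Some(8080)));
    assert!(!is_acceptable(Some(80)));
}
```

The two differ only in what the missing case means: is_some_and treats None as a failure, is_none_or treats it as a pass.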

147. Cell::as_array_of_cells — Mutate One Slot of a Cell-Wrapped Array

You have a Cell<[i32; 4]> and you want to bump element [2]. cell.get(), mutate the copy, cell.set(...) the whole thing back — for one slot? Cell::as_array_of_cells hands you &[Cell<i32>; 4] so each slot is its own little Cell.

The setup

Cell<T> gives you interior mutability for Copy types: &Cell<T> lets you swap the inner value through a shared reference. That’s lovely for a scalar, but the moment T is an array it becomes awkward — Cell only exposes get() and set() for the entire T:

use std::cell::Cell;

fn main() {
    let scores = Cell::new([10, 20, 30, 40]);

    // Want to bump index 2. The natural reach...
    // scores[2] += 1;            // can't index through Cell
    // scores.get()[2] += 1;       // mutating a temporary copy — does nothing

    // The "real" old way: copy out, mutate, copy back.
    let mut arr = scores.get();
    arr[2] += 1;
    scores.set(arr);

    assert_eq!(scores.get(), [10, 20, 31, 40]);
}

Three lines and a full-array copy in each direction — just to add 1. It also doesn’t compose: if you wanted to hand a single slot to another function, you’d have to pass the whole Cell<[i32; 4]> plus an index, and trust the callee to put the array back.

Enter as_array_of_cells

Stabilized in Rust 1.91, Cell::as_array_of_cells reinterprets &Cell<[T; N]> as &[Cell<T>; N]:

use std::cell::Cell;

fn main() {
    let scores = Cell::new([10, 20, 30, 40]);
    let slots: &[Cell<i32>; 4] = scores.as_array_of_cells();

    // Each element is now its own Cell — mutate one without touching the others.
    slots[2].set(slots[2].get() + 1);

    assert_eq!(scores.get(), [10, 20, 31, 40]);
}

No copy, no set of the whole array. The cast is free at runtime — Cell<T> is #[repr(transparent)] over T, so a Cell<[T; N]> and a [Cell<T>; N] have identical layout. The standard library just gives you the safe view of that fact.
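A quick sanity check of that layout claim, comparing size and alignment of the two types. The same_layout helper is a hypothetical name, and matching size and alignment is a consistency check rather than a proof of identical layout.

```rust
use std::cell::Cell;
use std::mem::{align_of, size_of};

/// Compare size and alignment of `Cell<[T; N]>` and `[Cell<T>; N]`.
fn same_layout<T, const N: usize>() -> bool {
    size_of::<Cell<[T; N]>>() == size_of::<[Cell<T>; N]>()
        && align_of::<Cell<[T; N]>>() == align_of::<[Cell<T>; N]>()
}

fn main() {
    assert!(same_layout::<i32, 4>());
    assert!(same_layout::<u8, 5>());
    // The Cell wrapper adds no bytes over the plain array either.
    assert_eq!(size_of::<Cell<[i32; 4]>>(), size_of::<[i32; 4]>());
}
```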

Pair it with Cell::update

Cell::update is the obvious dance partner — read-modify-write in one call, on a single slot:

use std::cell::Cell;

fn main() {
    let scores = Cell::new([10, 20, 30, 40]);

    for slot in scores.as_array_of_cells() {
        slot.update(|n| n * 2);
    }

    assert_eq!(scores.get(), [20, 40, 60, 80]);
}

That’s the loop you actually wanted. No RefCell, no runtime borrow check, no panic risk.

Hand out a single slot

Because each element is a real &Cell<T>, you can pass one slot to another function and let it mutate just that slot — the rest of the array is untouched and the caller keeps full access:

use std::cell::Cell;

fn bump(slot: &Cell<i32>) {
    slot.update(|n| n + 100);
}

fn main() {
    let scores = Cell::new([10, 20, 30, 40]);
    let slots = scores.as_array_of_cells();

    bump(&slots[1]);
    bump(&slots[3]);

    assert_eq!(scores.get(), [10, 120, 30, 140]);
}

Try expressing that with cell.get() / cell.set() — you can’t, not without rebuilding the array on every call.

Slices too

There’s a sibling for slices: Cell::as_slice_of_cells turns &Cell<[T]> into &[Cell<T>]. Useful when the length isn’t known at compile time:

use std::cell::Cell;

fn zero_out(buf: &Cell<[u8]>) {
    for slot in buf.as_slice_of_cells() {
        slot.set(0);
    }
}

fn main() {
    let buf: Cell<[u8; 5]> = Cell::new([1, 2, 3, 4, 5]);
    // &Cell<[u8; 5]> coerces to &Cell<[u8]> at the call site.
    zero_out(&buf);
    assert_eq!(buf.get(), [0, 0, 0, 0, 0]);
}

And as of Rust 1.95, Cell<[T; N]> and Cell<[T]> also implement AsRef<[Cell<T>]>, so generic code can take impl AsRef<[Cell<T>]> and accept either form.

The signatures

impl<T, const N: usize> Cell<[T; N]> {
    pub const fn as_array_of_cells(&self) -> &[Cell<T>; N];
}

impl<T> Cell<[T]> {
    pub fn as_slice_of_cells(&self) -> &[Cell<T>];
}

Both are zero-cost reinterpretations — pure type-system moves, no copying. Reach for them any time you find yourself doing the get / mutate / set two-step on a Cell that wraps a collection.

#146 May 2026

146. char::MAX_LEN_UTF8 — Size UTF-8 Buffers Without Magic Numbers

Every time you’ve called char::encode_utf8, you’ve written [0u8; 4] from memory. Rust 1.93 stabilises char::MAX_LEN_UTF8 so you don’t have to keep that magic number in your head.

The magic number you keep typing

encode_utf8 writes the UTF-8 bytes of a char into a &mut [u8] and returns a &mut str pointing at the written portion. The slice has to be big enough — which means knowing that the worst-case UTF-8 encoding is 4 bytes:

let mut buf = [0u8; 4]; // why 4? because UTF-8, that's why
let s = '🦀'.encode_utf8(&mut buf);
assert_eq!(s, "🦀");

That 4 is correct but unexplained. Anyone reading your code has to either trust you or go re-derive the UTF-8 spec.
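For readers who do want to see where the 4 comes from, char::len_utf8 reports the encoded width of any char. The sketch below checks one char from each width class; the max_utf8_len helper is a hypothetical name added for the example.

```rust
/// Widest UTF-8 encoding, in bytes, among the chars of `s` (0 for an empty string).
fn max_utf8_len(s: &str) -> usize {
    s.chars().map(char::len_utf8).max().unwrap_or(0)
}

fn main() {
    assert_eq!('a'.len_utf8(), 1);         // U+0061, ASCII
    assert_eq!('\u{e9}'.len_utf8(), 2);    // U+00E9, e with acute accent
    assert_eq!('\u{20ac}'.len_utf8(), 3);  // U+20AC, euro sign
    assert_eq!('\u{1f980}'.len_utf8(), 4); // U+1F980, crab emoji
    assert_eq!(char::MAX.len_utf8(), 4);   // U+10FFFF, the largest scalar value

    assert_eq!(max_utf8_len("h\u{e9}llo"), 2);
}
```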

The named version

Rust 1.93 stabilises two constants on char:

assert_eq!(char::MAX_LEN_UTF8, 4);
assert_eq!(char::MAX_LEN_UTF16, 2);

MAX_LEN_UTF8 is the maximum number of u8s encode_utf8 can ever write. MAX_LEN_UTF16 is the same for encode_utf16 (a surrogate pair = 2 u16s). Drop them straight into your buffer declarations:

let mut buf = [0u8; char::MAX_LEN_UTF8];
let s = '🦀'.encode_utf8(&mut buf);
assert_eq!(s, "🦀");
assert_eq!(s.len(), 4);

let mut wide = [0u16; char::MAX_LEN_UTF16];
let w = '🦀'.encode_utf16(&mut wide);
assert_eq!(w.len(), 2);

Same behaviour, but the intent is self-documenting — the buffer is sized to hold exactly one char, by definition.

Sizing a buffer for N chars

Where this really pays off is when you’re computing a buffer for several chars on the stack:

const N: usize = 8;
let mut buf = [0u8; N * char::MAX_LEN_UTF8];

let mut pos = 0;
for c in ['h', 'é', 'l', 'l', 'o'] {
    let s = c.encode_utf8(&mut buf[pos..]);
    pos += s.len();
}

assert_eq!(&buf[..pos], "héllo".as_bytes());

Now if Unicode ever expanded its scalar value range and MAX_LEN_UTF8 grew, your code would still be correct. With a hardcoded 4, encode_utf8 would panic on the first char that no longer fits, the day someone bumps the constant.

Why bother?

It’s a small change: one constant, no new behaviour. But it removes a real source of undersized-buffer bugs (people writing [0u8; 3] because they “only handle the BMP”, then panicking on the first emoji) and makes UTF-8 buffer code legible at a glance. Available since Rust 1.93 (January 2026).

#145 May 2026

145. Duration::from_nanos_u128 — Round-Trip Nanoseconds Without the u64 Cast

Duration::as_nanos() hands you a u128. Duration::from_nanos() takes a u64. You feed one into the other and the compiler yells at you — or worse, you cast and quietly truncate at 584 years. Rust 1.93 closed the loop with from_nanos_u128.

The mismatched-types papercut

The old API was asymmetric. Going from Duration to nanos was 128-bit:

use std::time::Duration;

let d = Duration::new(7, 250);
let n: u128 = d.as_nanos();
assert_eq!(n, 7_000_000_250);

Coming back, though, you only got from_nanos(_: u64) — so the round-trip needed a cast:

use std::time::Duration;

let n: u128 = Duration::new(7, 250).as_nanos();
let back = Duration::from_nanos(n as u64); // narrowing cast, fingers crossed
assert_eq!(back, Duration::new(7, 250));

That as u64 silently truncates anything past u64::MAX — and u64::MAX nanoseconds is roughly 584 years. Inside a calendar app you’ll never notice. Inside a scientific or simulation context, you absolutely will.
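To make the truncation concrete, the sketch below narrows a value one past u64::MAX both ways. The narrow_nanos helper is a hypothetical name used for illustration.

```rust
/// Narrow a nanosecond count, reporting overflow instead of wrapping.
fn narrow_nanos(n: u128) -> Option<u64> {
    u64::try_from(n).ok()
}

fn main() {
    let too_big: u128 = u64::MAX as u128 + 1;

    // `as` keeps only the low 64 bits, so this value wraps to zero.
    assert_eq!(too_big as u64, 0);

    // `try_from` surfaces the problem instead of hiding it.
    assert_eq!(narrow_nanos(too_big), None);
    assert_eq!(narrow_nanos(7_000_000_250), Some(7_000_000_250));
}
```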

from_nanos_u128 matches as_nanos

Rust 1.93 stabilised Duration::from_nanos_u128, a const fn that takes the full 128-bit value:

use std::time::Duration;

let n: u128 = Duration::new(7, 250).as_nanos();
let back = Duration::from_nanos_u128(n);
assert_eq!(back, Duration::new(7, 250));

Same shape on both sides. No cast, no truncation, no silent wraparound.

Past the 584-year ceiling

Where the new constructor actually earns its keep is when you have nanoseconds counts that wouldn’t fit in a u64:

use std::time::Duration;

// 10^24 ns is ~31.7 million years — well past u64::MAX nanos
let nanos: u128 = 10_u128.pow(24) + 321;
let d = Duration::from_nanos_u128(nanos);

assert_eq!(d.as_secs(), 10_u64.pow(15));
assert_eq!(d.subsec_nanos(), 321);
assert_eq!(d.as_nanos(), nanos); // exact round-trip

Duration itself stores (u64 seconds, u32 nanos), so it has plenty of room — the old from_nanos was just bottlenecked by its argument type.

One thing to watch

from_nanos_u128 panics if you hand it more than Duration::MAX worth of nanoseconds. If you’re pulling values from user input or untrusted sources, guard the upper bound yourself — there isn’t a checked_from_nanos_u128 (yet).
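One possible guard is sketched below. It assumes you are happy to split the value by hand with Duration::new instead of calling from_nanos_u128, which means it has no panic path and also builds on older toolchains. The checked_duration_from_nanos name is hypothetical.

```rust
use std::time::Duration;

const NANOS_PER_SEC: u128 = 1_000_000_000;

/// Hypothetical checked constructor: `None` when `nanos` exceeds `Duration::MAX`.
fn checked_duration_from_nanos(nanos: u128) -> Option<Duration> {
    // Duration stores u64 seconds, so the seconds part must fit in a u64.
    let secs = u64::try_from(nanos / NANOS_PER_SEC).ok()?;
    // The remainder is always below 1_000_000_000, so it fits in a u32.
    let subsec = (nanos % NANOS_PER_SEC) as u32;
    Some(Duration::new(secs, subsec))
}

fn main() {
    let ok = checked_duration_from_nanos(10_u128.pow(24) + 321).unwrap();
    assert_eq!(ok.as_secs(), 10_u64.pow(15));
    assert_eq!(ok.subsec_nanos(), 321);

    let max = Duration::MAX.as_nanos();
    assert_eq!(checked_duration_from_nanos(max), Some(Duration::MAX));
    assert_eq!(checked_duration_from_nanos(max + 1), None);
}
```

An alternative with the same effect is to compare the input against Duration::MAX.as_nanos() first and only then call from_nanos_u128.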

When to reach for it

Use from_nanos_u128 whenever you already have a u128 of nanoseconds — typically because it came out of as_nanos, an arithmetic accumulator, or a high-precision external clock. Stick with the plain from_nanos(_: u64) for short-lived timeouts and durations measured in milliseconds or seconds; the u64 is plenty.

Stabilised in Rust 1.93 (January 2026). Available as const fn, so it works in const contexts too.

#143 May 2026

143. Vec::dedup_by_key — Collapse Consecutive Duplicates by a Derived Key

Vec::dedup() only collapses runs that are exactly equal. When you care about a derived attribute — the minute on a timestamp, the domain in an email, the first letter of a word — reach for dedup_by_key.

The plain dedup() is strict: two adjacent elements are only merged if == says so.

let mut nums = vec![1, 1, 2, 3, 3, 3, 2, 5];
nums.dedup();
assert_eq!(nums, vec![1, 2, 3, 2, 5]);

But often “duplicate” really means “shares some property with its neighbour.” dedup_by_key takes a closure that maps each element to a key, then keeps the first of every consecutive run whose keys match:

let mut nums = vec![1, 3, 5, 2, 4, 7, 6, 8];
nums.dedup_by_key(|n| *n % 2);
assert_eq!(nums, vec![1, 2, 7, 6]);

1, 3, 5 all have key 1 → keep 1. Then 2, 4 have key 0 → keep 2. Then 7 has key 1 → keep it. Then 6, 8 have key 0 → keep 6.

The practical case: log lines already in time order, and you want one representative per minute.

let mut lines = vec![
    "12:01 GET /a".to_string(),
    "12:01 GET /b".to_string(),
    "12:01 POST /c".to_string(),
    "12:02 GET /d".to_string(),
    "12:02 GET /e".to_string(),
    "12:03 PATCH /f".to_string(),
];

lines.dedup_by_key(|line| line[..5].to_string());

assert_eq!(lines.len(), 3);
assert_eq!(lines[0], "12:01 GET /a");
assert_eq!(lines[1], "12:02 GET /d");
assert_eq!(lines[2], "12:03 PATCH /f");

Two things worth remembering. First, it only looks at adjacent pairs — if you need full uniqueness, pair it with sort_by_key first. Second, if your equivalence isn’t expressible as a key (e.g. “values within 0.1 of each other”), there’s a sibling dedup_by that takes |a, b| -> bool directly. All three are in-place, allocation-free, and run in linear time.
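Both points can be sketched in a few lines. The helper names dedup_close and one_per_initial, and the sample data, are made up for this example.

```rust
/// Collapse runs of readings that sit within `tol` of the last kept reading.
fn dedup_close(readings: &mut Vec<f64>, tol: f64) {
    // `a` is the candidate for removal, `b` is the element kept before it.
    readings.dedup_by(|a, b| (*a - *b).abs() < tol);
}

/// Keep one word per first letter, regardless of the original order.
fn one_per_initial(words: &mut Vec<&str>) {
    // sort_by_key is stable, so the earliest word for each letter survives.
    words.sort_by_key(|w| w.chars().next());
    words.dedup_by_key(|w| w.chars().next());
}

fn main() {
    let mut readings = vec![1.0, 1.05, 1.2, 1.25, 3.0];
    dedup_close(&mut readings, 0.1);
    assert_eq!(readings, [1.0, 1.2, 3.0]);

    let mut words = vec!["apple", "bob", "avocado", "cherry", "banana"];
    one_per_initial(&mut words);
    assert_eq!(words, ["apple", "bob", "cherry"]);
}
```

Note the argument order in dedup_by: the closure receives the later element first, and returning true removes that later element.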

#144 May 2026

144. Vec::into_raw_parts — Hand a Vec to C Without the ManuallyDrop Dance

You want to give a Rust-allocated buffer to C and re-take it later. That means handing over (ptr, len, capacity) — and historically, prying those three out of a Vec without freeing the allocation meant wrapping the vector in ManuallyDrop first. Rust 1.93 stabilises Vec::into_raw_parts, a single safe call that returns the triple and consumes the Vec for you.

The pain: extracting parts while suppressing drop

The classic recipe suppresses the Vec’s destructor on purpose so the C side can own the memory. You need three reads and a ManuallyDrop guard to keep Drop from freeing the allocation:

use std::mem::ManuallyDrop;

let v: Vec<u32> = vec![10, 20, 30];

let mut me = ManuallyDrop::new(v);
let ptr = me.as_mut_ptr();
let len = me.len();
let cap = me.capacity();

assert_eq!(unsafe { *ptr.add(1) }, 20);
assert_eq!((len, cap), (3, 3));

// Hand (ptr, len, cap) to C here.
// Reclaim it later with Vec::from_raw_parts to free the allocation.
let _reclaimed = unsafe { Vec::from_raw_parts(ptr, len, cap) };

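It works, but the ManuallyDrop wrapper exists only to keep the destructor from running. Leave it out and the Vec frees the buffer at the end of scope while C still holds the pointer (a use-after-free). Reach for mem::forget(v) instead and you have to read all three fields before the forget, because the call moves v. And if nothing ever calls from_raw_parts, the allocation simply leaks.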

The fix: one safe call, three return values

Vec::into_raw_parts(self) -> (*mut T, usize, usize) consumes the Vec, hands you the pointer-length-capacity triple, and leaves the allocation alive for you to manage:

let v: Vec<u32> = vec![10, 20, 30];
let (ptr, len, cap) = v.into_raw_parts();

assert_eq!((len, cap), (3, 3));
assert_eq!(unsafe { *ptr.add(1) }, 20);

// Reclaim and free at the end (or hand to C and have C call back).
let reclaimed = unsafe { Vec::from_raw_parts(ptr, len, cap) };
assert_eq!(reclaimed, vec![10, 20, 30]);

No wrapper, no separate field reads, no chance of accidentally calling a &self method after the move. The method is const, too.

String::into_raw_parts follows the same shape

String gets the same treatment in 1.93. The triple is (*mut u8, usize, usize), which is what String::from_raw_parts wants back:

let s = String::from("hello");
let (ptr, len, cap) = s.into_raw_parts();

assert_eq!((len, cap), (5, 5));

let rebuilt = unsafe { String::from_raw_parts(ptr, len, cap) };
assert_eq!(rebuilt, "hello");

The pairing is the point: into_raw_parts is safe (the Vec/String is gone, no aliasing exists yet), and from_raw_parts is unsafe (you’re asserting the triple came from a matching allocator with the right layout). The split keeps the unsafety where it actually lives.

When to reach for it

Any FFI boundary where the C side will hold the buffer for a while: graphics buffers, codec frames, command queues, anything with an extern "C" fn free_my_thing(ptr, len, cap) callback. Also handy when you’re building your own typed handles around a raw allocation — Box::into_raw covers the single-value case; into_raw_parts covers the variable-length one.
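A sketch of what such a free callback might look like on the Rust side. The give_to_c helper and the exact free_my_thing signature are assumptions for illustration, and the hand-off uses the older ManuallyDrop form only so the sketch also builds on toolchains before 1.93.

```rust
use std::mem::ManuallyDrop;

/// Give up ownership of the Vec and return its raw parts.
fn give_to_c(v: Vec<u32>) -> (*mut u32, usize, usize) {
    // On Rust 1.93+ this body collapses into `v.into_raw_parts()`.
    let mut v = ManuallyDrop::new(v);
    (v.as_mut_ptr(), v.len(), v.capacity())
}

/// Hypothetical callback the C side invokes when it is done with the buffer.
///
/// # Safety
/// The triple must come from `give_to_c` and must not be used again afterwards.
unsafe extern "C" fn free_my_thing(ptr: *mut u32, len: usize, cap: usize) {
    // SAFETY: per the contract above, the triple describes a live Vec<u32> allocation.
    drop(unsafe { Vec::from_raw_parts(ptr, len, cap) });
}

fn main() {
    let (ptr, len, cap) = give_to_c(vec![10, 20, 30]);
    assert_eq!(len, 3);
    // SAFETY: `ptr` is valid for `len` reads until the buffer is freed.
    assert_eq!(unsafe { *ptr.add(1) }, 20);
    // SAFETY: the triple came from `give_to_c` and is not used after this call.
    unsafe { free_my_thing(ptr, len, cap) };
}
```

Passing the capacity back matters because from_raw_parts needs it to describe the allocation to the allocator.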

If you only need the pointer and nothing will ever reclaim the allocation, Vec::leak is still the shorter call. Reach for into_raw_parts the moment the capacity matters — i.e. anyone, anywhere, might want to give the memory back.

142. path::absolute — Make a Path Absolute Without Touching the Filesystem

Need an absolute path for a log line, an error message, or a “files will land here” preview — but the file might not exist yet? fs::canonicalize will refuse. std::path::absolute (stable since Rust 1.79) gives you the absolute form without ever opening the disk.

The canonicalize trap

The instinctive choice for “turn this into a full path” is fs::canonicalize. It works — until it doesn’t:

use std::fs;

let p = fs::canonicalize("does_not_exist.toml");
assert!(p.is_err()); // canonicalize requires the path to exist

It also resolves symlinks and walks every .. component against the real directory tree. That’s the right behaviour for finding a file. It’s wrong for printing one back to the user before you’ve written it.

path::absolute does the syntactic thing

std::path::absolute joins a relative path with the current working directory and normalises the result. No syscalls beyond looking up the CWD; the file doesn’t have to exist:

use std::path::absolute;

let p = absolute("config/app.toml").unwrap();
assert!(p.is_absolute());
// e.g. "/work/config/app.toml" — without ever opening anything

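If the path is already absolute it’s left alone (modulo platform-specific normalisation). Handling of .. is platform-specific: on Unix-like systems .. components are kept as written, because removing them without consulting the filesystem could change which file the path names when symlinks are involved; on Windows they are resolved lexically. Either way the filesystem is never consulted.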
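A side-by-side sketch of the two functions on a file that does not exist. The file name is an assumption (it is presumed absent from the current directory), and preview is a hypothetical wrapper.

```rust
use std::fs;
use std::io;
use std::path::{absolute, PathBuf};

/// Absolute form of `name`, without requiring the file to exist.
fn preview(name: &str) -> io::Result<PathBuf> {
    absolute(name)
}

fn main() -> io::Result<()> {
    // Assumed not to exist in the current directory.
    let name = "surely_missing_rustbites.toml";

    // canonicalize has to find the file, so it fails.
    assert!(fs::canonicalize(name).is_err());

    // absolute only needs the current directory.
    let p = preview(name)?;
    assert!(p.is_absolute());
    assert!(p.ends_with(name));

    // An empty path is the one input it rejects outright.
    assert!(preview("").is_err());
    Ok(())
}
```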

Useful for nicely-formatted output

use std::path::{absolute, PathBuf};

fn describe(relative: &str) -> String {
    let abs: PathBuf = absolute(relative).unwrap();
    format!("writing to {}", abs.display())
}

let msg = describe("logs/today.log");
assert!(msg.contains("logs/today.log"));
assert!(msg.starts_with("writing to "));

When you’re echoing the user’s choices back to them, or building helpful error messages, this is usually what you want — the path they meant, not whatever the filesystem turned it into.

When to reach for it

Use path::absolute for log lines, config previews, default-location calculations, or any “this is where it will go” message about a file that might not exist yet. Stick with fs::canonicalize when you actually want to follow symlinks and prove the file exists — that’s its job.

Stabilised in Rust 1.79 (June 2024).

141. BinaryHeap::into_sorted_vec — Heapsort in One Call

You stuffed everything into a BinaryHeap to keep “biggest first” cheap, but at the end of the day you want a sorted Vec to hand to the next stage. The pop-loop you almost wrote is built into the type — into_sorted_vec consumes the heap and gives you the ascending-order Vec for free.

The pop-loop

The naive shape: drain the heap with pop and push into a fresh Vec. Pops come out largest-first, so to get ascending order you have to either reverse at the end or push to the front — both add steps for no reason.

use std::collections::BinaryHeap;

let heap = BinaryHeap::from([3, 1, 4, 1, 5, 9, 2, 6, 5]);

let mut out = Vec::with_capacity(heap.len());
let mut heap = heap;
while let Some(x) = heap.pop() {
    out.push(x);
}
out.reverse(); // pops came out descending — flip them

assert_eq!(out, [1, 1, 2, 3, 4, 5, 5, 6, 9]);

Five lines and a temporary out for what is, in the end, “sort this thing.” The heap is already a heap — you’ve paid for the structure, now you’re throwing it away.

The one-liner

BinaryHeap::into_sorted_vec(self) -> Vec<T> does exactly the drain-and-sort the heap was built for, in O(n log n), and reuses the heap’s allocation as the output Vec. No reverse(), no spare buffer.

use std::collections::BinaryHeap;

let heap = BinaryHeap::from([3, 1, 4, 1, 5, 9, 2, 6, 5]);
let sorted = heap.into_sorted_vec();

assert_eq!(sorted, [1, 1, 2, 3, 4, 5, 5, 6, 9]);

Ascending order, because that’s almost always what the next consumer wants. BinaryHeap is a max-heap, so internally into_sorted_vec repeatedly sifts the max to the end of the backing buffer — the same in-place heapsort you’d write by hand.
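For reference, a textbook in-place heapsort over a slice of i32 might look like the sketch below. It illustrates the idea only and is not the standard library's source; sift_down and heapsort are names chosen for this example.

```rust
/// Restore the max-heap property for the subtree rooted at `root`,
/// looking only at the first `end` elements.
fn sift_down(v: &mut [i32], mut root: usize, end: usize) {
    loop {
        let mut child = 2 * root + 1;
        if child >= end {
            break;
        }
        // Pick the larger of the two children.
        if child + 1 < end && v[child + 1] > v[child] {
            child += 1;
        }
        if v[root] >= v[child] {
            break;
        }
        v.swap(root, child);
        root = child;
    }
}

/// In-place heapsort, ascending.
fn heapsort(v: &mut [i32]) {
    let n = v.len();
    // Build a max-heap bottom-up.
    for root in (0..n / 2).rev() {
        sift_down(v, root, n);
    }
    // Repeatedly move the current max to the end and shrink the heap by one.
    for end in (1..n).rev() {
        v.swap(0, end);
        sift_down(v, 0, end);
    }
}

fn main() {
    let mut data = [3, 1, 4, 1, 5, 9, 2, 6, 5];
    heapsort(&mut data);
    assert_eq!(data, [1, 1, 2, 3, 4, 5, 5, 6, 9]);
}
```

A BinaryHeap has already done the build phase, so into_sorted_vec only needs the second loop.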

Top-k without sorting the whole input

Where this really pays off: “I want the largest k of n items.” Push everything into the heap with Reverse to make it a min-heap-of-size-k, then call into_sorted_vec once at the end:

use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn top_k(items: impl IntoIterator<Item = i32>, k: usize) -> Vec<i32> {
    let mut heap: BinaryHeap<Reverse<i32>> = BinaryHeap::with_capacity(k);
    for x in items {
        if heap.len() < k {
            heap.push(Reverse(x));
        } else if let Some(mut min) = heap.peek_mut() {
            if x > min.0 { *min = Reverse(x); }
        }
    }
    heap.into_sorted_vec()
        .into_iter()
        .map(|Reverse(x)| x)
        .collect()
}

assert_eq!(top_k([7, 3, 9, 1, 8, 2, 6], 3), [9, 8, 7]);

into_sorted_vec returns the Reverse-wrapped items in ascending Reverse order, which is descending by inner value — strip the wrapper with map and the largest of the top-k comes out first, exactly the order a “top” list wants.

When to reach for it

Any time the loop you’re about to write is “pop until empty, collect into Vec.” into_sorted_vec is the same algorithm — heapsort — with one fewer allocation and one fewer reverse. The heap was already half of a sort; let it finish the job.