#204 Jun 2026

204. take_while / skip_while — Act on the Leading Run, Not Every Match

Reaching for .filter() to drop leading blank lines? It’ll drop the blank lines in the middle too. take_while and skip_while work on the leading run and stop the moment the predicate flips.

The Problem

You want everything after the leading comment/blank lines in a config — but filter has no concept of “leading”:

let lines = ["# header", "", "key = 1", "", "key = 2"];

// Wrong: this also eats the blank line between the two keys
let body: Vec<&str> = lines
    .iter()
    .copied()
    .filter(|l| !l.is_empty() && !l.starts_with('#'))
    .collect();
assert_eq!(body, ["key = 1", "key = 2"]); // lost the structure

filter tests every element independently, so it strips matches wherever they appear. That’s rarely what “skip the header” means.

The Fix: skip_while

skip_while discards elements while the predicate holds, then yields the rest untouched — including later elements that would have matched:

let lines = ["# header", "", "key = 1", "", "key = 2"];

let body: Vec<&str> = lines
    .iter()
    .copied()
    .skip_while(|l| l.is_empty() || l.starts_with('#'))
    .collect();
assert_eq!(body, ["key = 1", "", "key = 2"]); // blank line kept

The blank line between the keys survives because skip_while already stopped skipping at "key = 1".
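
The same idea can be wrapped in a small helper. This is a minimal sketch: `strip_header` is a hypothetical name, and treating blank lines and `#` lines as the header is an assumption carried over from the example above.

```rust
/// Hypothetical helper: drop the leading run of blank or comment lines,
/// then keep everything after it, including later blanks and comments.
fn strip_header(text: &str) -> Vec<&str> {
    text.lines()
        .skip_while(|l| l.is_empty() || l.starts_with('#'))
        .collect()
}
```

A comment that appears after the first real line is kept, because skip_while never re-checks the predicate once it has flipped.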

Its Mirror: take_while

take_while yields the leading run and stops at the first non-match — perfect for parsing a prefix:

let input = "42px";

let digits: String = input.chars().take_while(|c| c.is_ascii_digit()).collect();
let unit: String = input.chars().skip_while(|c| c.is_ascii_digit()).collect();

assert_eq!(digits, "42");
assert_eq!(unit, "px");

take_while halts at 'p', so even a trailing "9" in the unit wouldn’t sneak back into digits. Unlike filter, both adapters care about position: they describe the boundary between a leading run and everything after it.
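
One caveat worth sketching: take_while has to pull the boundary element out of the iterator in order to test it. The example above sidesteps this by starting two independent passes over the input. If both halves share a single iterator through by_ref(), the first non-matching element is lost. The function names below are hypothetical.

```rust
/// Splits with one shared iterator. The boundary character disappears:
/// take_while consumed it while deciding to stop.
fn split_shared(input: &str) -> (String, String) {
    let mut chars = input.chars();
    let digits: String = chars.by_ref().take_while(|c| c.is_ascii_digit()).collect();
    let rest: String = chars.collect(); // starts AFTER the boundary char
    (digits, rest)
}

/// Splits with two independent passes, so nothing is lost.
fn split_two_pass(input: &str) -> (String, String) {
    let digits: String = input.chars().take_while(|c| c.is_ascii_digit()).collect();
    let rest: String = input.chars().skip_while(|c| c.is_ascii_digit()).collect();
    (digits, rest)
}
```

For "42px", the shared version returns ("42", "x") while the two-pass version returns ("42", "px"). When a single pass matters, a Peekable iterator is the usual way to look at the boundary without consuming it.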

203. Peekable::next_if_map — Consume a Token Only If It Parses, Transform in One Step

next_if only answers yes/no, so when you also need the converted value you end up peeking, computing, and calling next() by hand. Rust 1.94 stabilized Peekable::next_if_map — conditionally consume the next item and transform it in a single call, putting the item back if it doesn’t match.

The trap: the peek / compute / advance dance

Hand-rolled lexers are full of this pattern — look at the next item, decide whether it’s the kind you want, and only then consume it. With next_if you can express the decide part, but next_if hands you back the original item, so you have to redo the conversion afterward. Most people skip it and drop down to a manual peek() + next():

use std::iter::Peekable;
use std::str::Chars;

// Peek, compute the digit, THEN remember to advance. Three steps,
// and it's easy to forget the next() and loop forever.
fn take_digit_manual(it: &mut Peekable<Chars>) -> Option<u32> {
    let &c = it.peek()?;
    let d = c.to_digit(10)?;
    it.next();
    Some(d)
}

The conversion (to_digit) and the consumption (next) are split across separate lines, and the iterator only advances as a side effect. Forget the it.next() and you’ve written an infinite loop.
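
For comparison, here is a sketch of the next_if route mentioned above. The name `take_digit_next_if` is hypothetical. It avoids the forgotten-next() bug, but the digit is effectively checked twice: once to decide, once to convert.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// next_if decides whether to consume, but it returns the original char,
/// so the conversion has to run again on the result.
fn take_digit_next_if(it: &mut Peekable<Chars<'_>>) -> Option<u32> {
    let c = it.next_if(|c| c.is_ascii_digit())?; // test: is it a digit?
    c.to_digit(10) // convert: parse the same char a second time
}
```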

The fix: decide and transform in one call

next_if_map takes the next item by value and hands it to a closure returning Result<R, Item>. Return Ok(value) and the item is consumed, giving you Some(value); return Err(item) and the item is pushed back, giving you None. The classic conversion-or-give-back is just .ok_or(c):

use std::iter::Peekable;
use std::str::Chars;

fn take_digit(it: &mut Peekable<Chars>) -> Option<u32> {
    it.next_if_map(|c| c.to_digit(10).ok_or(c))
}

One line, no manual next(), and the “advance only on success” rule is enforced by the method instead of by you remembering to call it.

It really does put the item back

When the closure returns Err, the iterator is left exactly where it was — the next read still sees that item:

use std::iter::Peekable;
use std::str::Chars;

fn take_digit(it: &mut Peekable<Chars>) -> Option<u32> {
    it.next_if_map(|c| c.to_digit(10).ok_or(c))
}

let mut it = "px".chars().peekable();

assert_eq!(take_digit(&mut it), None); // 'p' isn't a digit...
assert_eq!(it.next(), Some('p'));      // ...so it's still here

That give-it-back guarantee is what makes it safe to chain in a loop: each call either makes progress or leaves the stream untouched for the next rule to try.

Where it shines: tokenizing

A digit-run parser becomes a tight while let that stops cleanly at the first non-digit, leaving the rest of the input for whatever comes next:

fn parse_number(s: &str) -> (u64, String) {
    let mut it = s.chars().peekable();
    let mut n = 0u64;
    while let Some(d) = it.next_if_map(|c| c.to_digit(10).ok_or(c)) {
        n = n * 10 + d as u64;
    }
    (n, it.collect()) // leftover chars, untouched
}

assert_eq!(parse_number("42px"), (42, "px".to_string()));
assert_eq!(parse_number("2026"), (2026, String::new()));

There’s also next_if_map_mut, which passes &mut Item and takes a closure returning Option<R> — handy when the item is expensive to move or you want to mutate it in place rather than hand it back.
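
On a toolchain that predates these methods, the same "advance only on success" contract can be approximated with peek() and next(). The sketch below is an assumption-laden stand-in, not the standard library's implementation: `next_if_some` and `parse_number_compat` are hypothetical names, and the closure receives the item by shared reference and returns an Option, which differs from both std signatures.

```rust
use std::iter::Peekable;

/// Hypothetical stand-in: the closure inspects the peeked item by reference,
/// and the iterator advances only when the closure returns Some.
fn next_if_some<I, R>(it: &mut Peekable<I>, f: impl FnOnce(&I::Item) -> Option<R>) -> Option<R>
where
    I: Iterator,
{
    let mapped = f(it.peek()?)?; // a None here leaves the item in place
    it.next(); // consume only after a successful conversion
    Some(mapped)
}

/// The digit-run parser from above, written against the stand-in.
fn parse_number_compat(s: &str) -> (u64, String) {
    let mut it = s.chars().peekable();
    let mut n = 0u64;
    while let Some(d) = next_if_some(&mut it, |c| c.to_digit(10)) {
        n = n * 10 + u64::from(d);
    }
    (n, it.collect()) // leftover chars, untouched
}
```

Keeping the peek and the next inside one helper means the "forgot to advance" bug can only be written once, not at every call site.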

The bottom line

When you only want the next item if it converts to something useful, reach for next_if_map instead of the peek / compute / next shuffle. It folds the test and the transform into one call and guarantees the iterator only advances when the conversion succeeds — exactly the invariant hand-written lexers keep getting wrong.

202. Arc::clone Is a Refcount Bump, Not a Deep Copy — Share Big Data, Don't Duplicate It

big.clone() on a 50MB lookup table allocates 50MB every time a worker needs a copy. Wrap it in an Arc once and Arc::clone is just an atomic +1 on a counter — every owner reads the same bytes.

This closes out performance week, the afternoon pair to the morning’s entry() bite: that one was about not building a default you’ll throw away, this one is about not copying a payload you only ever read.

The trap: .clone() deep-copies the payload

When several owners each need “their own” handle to a large immutable value, the obvious move is to clone it. But Clone on a Vec/String/HashMap walks the data and allocates a fresh copy:

let table: Vec<u64> = (0..1_000_000).collect();

// Three clones plus the original: four ~8MB heap buffers, ~24MB copied.
let a = table.clone();
let b = table.clone();
let c = table.clone();

assert_eq!(a.len(), 1_000_000);
assert_eq!(b[500_000], 500_000);
assert_eq!(c.last(), Some(&999_999));

Nobody mutates the data, they just read it, yet memory now holds four independent copies of the same 8MB. That’s pure waste.

The fix: one allocation, shared by reference count

Put the value behind an Arc<T> once. Now Arc::clone doesn’t touch the payload at all — it bumps an atomic reference count and hands back another pointer to the same allocation:

use std::sync::Arc;

let table: Arc<Vec<u64>> = Arc::new((0..1_000_000).collect());

let a = Arc::clone(&table);
let b = Arc::clone(&table);
let c = Arc::clone(&table);

// All four point at the SAME bytes — no data was copied.
assert_eq!(Arc::strong_count(&table), 4);
assert_eq!(a[500_000], 500_000);

// Proof it's one allocation, not four:
assert!(Arc::ptr_eq(&a, &b));
assert!(Arc::ptr_eq(&b, &c));

The Arc::clone(&x) spelling (rather than x.clone()) is a convention worth keeping: at the call site it reads as “bump the counter,” so a reviewer knows a cheap pointer copy happened, not an 8MB memcpy.
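
In practice the handles usually live inside other structs. A minimal sketch, assuming a hypothetical `Worker` type and `make_workers` helper that are not part of the original example:

```rust
use std::sync::Arc;

/// Hypothetical worker that holds a counted handle to a shared table.
struct Worker {
    id: usize,
    table: Arc<Vec<u64>>,
}

impl Worker {
    /// Reads through the handle; read-only access needs no locking.
    fn lookup(&self, offset: usize) -> Option<u64> {
        self.table.get(self.id + offset).copied()
    }
}

/// Each worker costs one refcount bump, regardless of the table's size.
fn make_workers(table: &Arc<Vec<u64>>, n: usize) -> Vec<Worker> {
    (0..n)
        .map(|id| Worker { id, table: Arc::clone(table) })
        .collect()
}
```

Dropping the workers decrements the count again, and the table itself is freed only when the last handle goes away.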

This is what makes it cheap to send to threads

The same property is why Arc is the building block for sharing immutable data across threads — each thread gets a counted handle, all reading one copy:

use std::sync::Arc;
use std::thread;

let config: Arc<Vec<u64>> = Arc::new((0..1_000).collect());

let handles: Vec<_> = (0..4)
    .map(|_| {
        let c = Arc::clone(&config); // refcount bump, then moved into the thread
        thread::spawn(move || c.iter().sum::<u64>())
    })
    .collect();

let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
assert_eq!(total, 499_500 * 4);

Four threads, one underlying Vec. Cloning the payload into each thread would have been four allocations; Arc::clone is four counter bumps.

When not to reach for it

Arc shines for data that’s large, shared, and read-only after construction. It’s not free: every clone and drop is an atomic operation, and the payload lives until the last handle goes away. For small Copy types (bite-200) a plain copy is cheaper than the atomic traffic. And if you need to mutate shared state, you want Arc<Mutex<T>> or Arc<RwLock<T>> (bite-160) — or Arc::make_mut (bite-113) for copy-on-write.
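
The copy-on-write option can be sketched briefly. `push_cow` is a hypothetical helper name. Arc::make_mut clones the payload only when another handle is still alive, and mutates in place otherwise.

```rust
use std::sync::Arc;

/// Appends through Arc::make_mut: the Vec is cloned only if other
/// handles to the same allocation still exist.
fn push_cow(shared: &mut Arc<Vec<u64>>, value: u64) {
    Arc::make_mut(shared).push(value);
}
```

With a second handle alive, the writer ends up with its own copy and the other handle keeps seeing the old data. With a single owner, no new Arc allocation is made.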

The bottom line

If you’re cloning a big value just to hand read-only access to several owners or threads, you’re copying bytes nobody changes. Wrap it in Arc once: every Arc::clone after that is an atomic increment over a shared allocation, not a deep copy.

201. or_insert vs or_insert_with — Don't Build a Default You'll Throw Away

map.entry(k).or_insert(expensive()) builds expensive() on every call — even when the key is already there and the value gets dropped on the floor. Reach for or_insert_with and the default is computed only when it’s actually needed.

The entry API already saves you the contains_key-then-insert double lookup. But there’s a second, quieter cost hiding in or_insert: its argument is an ordinary value, so it’s evaluated before the call, regardless of whether the slot is occupied or vacant.

use std::collections::HashMap;

let mut cache: HashMap<&str, String> = HashMap::new();
cache.insert("hit", "already here".to_string());

// "expensive default".to_string() allocates a fresh String here...
// ...then gets immediately discarded because "hit" is occupied.
cache.entry("hit").or_insert("expensive default".to_string());
assert_eq!(cache["hit"], "already here");

That throwaway allocation happens on every hit. In a hot loop over a mostly-populated map, you’re paying to construct defaults you never store.

or_insert_with takes a closure instead of a value, so the work is deferred until the entry is genuinely vacant:

use std::collections::HashMap;

let mut map: HashMap<&str, String> = HashMap::new();
map.insert("a", "existing".to_string());

let mut builds = 0;

// "a" is present, so the closure never runs.
map.entry("a").or_insert_with(|| { builds += 1; "default".to_string() });
assert_eq!(builds, 0);

// "b" is vacant, so the closure runs exactly once.
map.entry("b").or_insert_with(|| { builds += 1; "default".to_string() });
assert_eq!(builds, 1);

assert_eq!(map["a"], "existing");
assert_eq!(map["b"], "default");

The rule of thumb: if the default is a plain literal or a cheap Copy value (0, false, None), or_insert is fine and reads cleaner. The moment the default allocates or computes — a Vec::new(), a String, a hash of the key, a database handle — switch to or_insert_with.
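
To make the difference measurable, the sketch below counts how many defaults each method builds for the same sequence of keys. `count_default_builds` is a hypothetical helper, and the `build` closure is a stand-in for an expensive default.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Returns (defaults built via or_insert, defaults built via or_insert_with)
/// after feeding the same keys into two separate maps.
fn count_default_builds(keys: &[&str]) -> (usize, usize) {
    let eager_builds = Cell::new(0usize);
    let lazy_builds = Cell::new(0usize);

    // Stand-in for an expensive default that records each construction.
    let build = |counter: &Cell<usize>| {
        counter.set(counter.get() + 1);
        String::from("default")
    };

    let mut eager: HashMap<&str, String> = HashMap::new();
    let mut lazy: HashMap<&str, String> = HashMap::new();
    for &key in keys {
        eager.entry(key).or_insert(build(&eager_builds)); // built on every call
        lazy.entry(key).or_insert_with(|| build(&lazy_builds)); // built on misses only
    }
    (eager_builds.get(), lazy_builds.get())
}
```

For five lookups over two distinct keys, the eager map builds five defaults and stores two, while the lazy map builds exactly two.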

When the default depends on the key itself, or_insert_with_key hands the key to the closure so you don’t have to capture it:

use std::collections::HashMap;

let mut sizes: HashMap<String, usize> = HashMap::new();

let n = sizes
    .entry("hello".to_string())
    .or_insert_with_key(|k| k.len());

assert_eq!(*n, 5);

All three still cost a single hash and a single lookup — the entry lands on the slot once and hands you a &mut V. The only thing you’re choosing is when the default gets built: always, or only when it’s needed.

#200 Jun 2026

200. #[derive(Copy)] and #[inline] — Make Small Types Free to Pass Around

A two-field struct that you .clone() everywhere, behind a function the optimizer won’t inline across crates — that’s two small taxes you can stop paying. Copy deletes the move/drop bookkeeping, #[inline] lets the body fold into the caller.

This is the afternoon half of a pair with the morning’s static-dispatch bite: both are about handing the optimizer a body it can actually see through and fold into the caller.

Copy: tiny types shouldn’t need moving

For a small, plain-data struct — a couple of integers, a pair of floats — a move is just a memcpy. But without Copy, the value is moved out of its binding when you pass it, so you can’t use it again afterward, and the compiler tracks drop state for it. Derive Copy and it’s duplicated bit-for-bit instead, no move semantics, no drop glue:

#[derive(Copy, Clone, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn shift(p: Point, dx: i32) -> Point {
    Point { x: p.x + dx, y: p.y }
}

let a = Point { x: 1, y: 2 };
let b = shift(a, 10);
// `a` is still usable — it was copied, not moved
assert_eq!(a, Point { x: 1, y: 2 });
assert_eq!(b, Point { x: 11, y: 2 });

Without Copy that shift(a, ..) would move a, and the later assert_eq!(a, ..) wouldn’t compile. You’d reach for .clone() or & borrows to work around it — friction for a type that’s cheaper to copy than a pointer.

The rule of thumb: derive Copy when the type is small (fits in a register or two) and has no heap-owning fields. String, Vec, and Box can’t be Copy — only Clone — because duplicating them bit-for-bit would alias an owned allocation.
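
A small sketch of where that line falls, using hypothetical `Rgb` and `Label` types: the all-integer struct derives Copy, while the struct that owns a String stops at Clone.

```rust
/// Plain data, three bytes: a good Copy candidate.
#[derive(Copy, Clone, Debug, PartialEq)]
struct Rgb {
    r: u8,
    g: u8,
    b: u8,
}

/// Owns a heap allocation through `String`, so it can only be Clone.
/// Adding `Copy` to this derive list would be a compile error.
#[derive(Clone, Debug, PartialEq)]
struct Label {
    text: String,
    color: Rgb,
}

/// Takes the color by value; the caller's copy stays usable.
fn brighten(c: Rgb, by: u8) -> Rgb {
    Rgb {
        r: c.r.saturating_add(by),
        g: c.g.saturating_add(by),
        b: c.b.saturating_add(by),
    }
}

/// The String has to be cloned explicitly; the Rgb is simply copied in.
fn recolor(label: &Label, color: Rgb) -> Label {
    Label { text: label.text.clone(), color }
}
```

The explicit .clone() on the String is the useful signal here: it marks the one spot where a heap allocation is duplicated.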

#[inline]: the optimizer can’t see across crate walls

Within a single crate the compiler inlines freely based on its own cost model. The catch is the crate boundary: a normal (non-generic) function compiled into your library is just a symbol other crates call into. Without LTO, the calling crate only sees the signature, not the body — so it emits a real call and the optimizer can’t fold a one-line helper into the loop around it.

#[inline] ships the function’s body in the crate metadata so downstream crates can inline it:

#[inline]
pub fn lerp(a: f32, b: f32, t: f32) -> f32 {
    a + (b - a) * t
}

assert_eq!(lerp(0.0, 10.0, 0.5), 5.0);

This matters for exactly the small, hot, public functions where the call overhead rivals the work: accessors, newtype getters, math helpers, Iterator glue. Generic functions are already inlinable across crates (they are monomorphized in the crate that calls them), so you mostly need #[inline] for concrete, non-generic ones.

#[inline(always)] is the stronger hammer: it overrides the cost model. Reach for it rarely, only for trivial wrappers you’ve measured, because over-inlining bloats code size and puts pressure on the instruction cache.

Putting both on a newtype wrapper

The combination shows up constantly on zero-cost newtypes — make the value free to pass and free to call through:

#[derive(Copy, Clone, Debug, PartialEq)]
struct Meters(f64);

impl Meters {
    #[inline]
    fn as_feet(self) -> f64 {
        self.0 * 3.28084
    }
}

let d = Meters(100.0);
let ft = d.as_feet();
// `d` is Copy, so taking `self` by value didn't consume it
assert_eq!(d, Meters(100.0));
assert!((ft - 328.084).abs() < 1e-9);

as_feet takes self by value with zero guilt because Meters is Copy, and #[inline] means a downstream crate calling .as_feet() in a tight loop gets the multiply folded in directly instead of a function call.

The bottom line

For small plain-data types, #[derive(Copy)] removes move/drop overhead and the ergonomic friction that pushes you toward needless .clone()s. For small public functions, #[inline] gives downstream crates the body to fold into their own code. Both are about the same thing as static dispatch: never make the optimizer guess at something you could just hand it. Profile first — but these two are cheap wins on the hot path.

199. Static Dispatch — Generics Beat Box<dyn Trait> When You Can Afford the Code

Box<dyn Trait> is the reflex when a function “takes something that implements a trait.” But every call through it pays for a vtable hop the compiler can’t see past. Swap it for a generic and the optimizer inlines the whole thing.

This is the morning half of a pair with this afternoon’s #[inline] & Copy bite: both are about giving the optimizer a body it can actually fold into the caller.

The cost: Box<dyn Trait> hides the call from the optimizer

When you accept Box<dyn Fn(i32) -> i32> (or any dyn Trait), the concrete type is erased. At each call site the program loads a function pointer from a vtable and jumps through it. The compiler has no idea what’s on the other end, so it can’t inline the body, can’t constant-fold through it, can’t vectorize a loop around it:

fn apply_all(f: Box<dyn Fn(i32) -> i32>, xs: &[i32]) -> Vec<i32> {
    xs.iter().map(|&x| f(x)).collect() // indirect call every iteration
}

There’s also a heap allocation just to hold the closure, and every call in the hot loop is an indirect jump through a pointer instead of straight-line code.

The fix: a generic parameter monomorphizes to the real type

Take impl Fn (sugar for a generic) instead. The compiler stamps out a specialized copy of apply_all for each concrete f you pass — monomorphization. Inside that copy the closure’s body is fully visible, so it gets inlined and the map loop optimizes as if you’d written the arithmetic by hand:

fn apply_all<F: Fn(i32) -> i32>(f: F, xs: &[i32]) -> Vec<i32> {
    xs.iter().map(|&x| f(x)).collect() // direct call, inlinable
}

No box, no vtable, no allocation. impl Trait in argument position is the same thing with less typing:

fn apply_all(f: impl Fn(i32) -> i32, xs: &[i32]) -> Vec<i32> {
    xs.iter().map(|&x| f(x)).collect()
}
let doubled = apply_all(|x| x * 2, &[1, 2, 3]);
assert_eq!(doubled, vec![2, 4, 6]);

Both generic versions compile to a tight loop with the multiply spliced straight in.
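
To compare the two styles in one file, they need distinct names. The sketch below uses the hypothetical names `apply_all_dyn` and `apply_all_static`. The dynamic version borrows a trait object rather than boxing it, which removes the allocation but keeps the indirect call.

```rust
/// Dynamic dispatch through a borrowed trait object: no Box, but still
/// one indirect call per element.
fn apply_all_dyn(f: &dyn Fn(i32) -> i32, xs: &[i32]) -> Vec<i32> {
    xs.iter().map(|&x| f(x)).collect()
}

/// Static dispatch: the compiler generates one specialized copy per closure type.
fn apply_all_static(f: impl Fn(i32) -> i32, xs: &[i32]) -> Vec<i32> {
    xs.iter().map(|&x| f(x)).collect()
}
```

Both return the same values; the difference is only in what the optimizer is allowed to see. Whether that shows up in a benchmark depends on how much work the closure does per call.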

The same trick for returns: impl Trait instead of Box<dyn>

Returning a dyn value forces a box too. If the function only ever returns one concrete type, impl Trait keeps it static — the caller gets the real type and can inline through it:

fn adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n          // one concrete closure type, no Box
}

let add5 = adder(5);
assert_eq!(add5(10), 15);
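
The "one concrete type" condition is the part that bites. As a sketch with hypothetical function names: when two branches produce different closures, impl Fn cannot name a single return type and a Box becomes necessary, unless the branching can be moved inside one closure.

```rust
/// Two different closure types come out of the branches, so `impl Fn`
/// cannot describe the return; boxing erases the difference.
fn make_op(double: bool, n: i32) -> Box<dyn Fn(i32) -> i32> {
    if double {
        Box::new(|x: i32| x * 2)
    } else {
        Box::new(move |x: i32| x + n)
    }
}

/// One closure handles both cases, so the return type stays static.
fn make_op_static(double: bool, n: i32) -> impl Fn(i32) -> i32 {
    move |x| if double { x * 2 } else { x + n }
}
```

The static version trades the vtable hop for a branch inside the closure body, which may or may not be the better deal for a given call site.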

When dyn is still the right call

Static dispatch trades code size for speed: each instantiation is a fresh copy, so monomorphizing over many types bloats the binary. And generics can’t do heterogeneous collections — Vec<Box<dyn Draw>> holding circles and squares genuinely needs dynamic dispatch, because the element type varies at runtime. Reach for dyn when you need a uniform type for mixed values, want to shrink compile times and binary size, or the call isn’t hot enough to matter. Reach for generics / impl Trait when the call sits in a loop and you want the optimizer to see through it.

// heterogeneous → dyn is correct
let shapes: Vec<Box<dyn Fn() -> &'static str>> =
    vec![Box::new(|| "circle"), Box::new(|| "square")];
let names: Vec<&str> = shapes.iter().map(|s| s()).collect();
assert_eq!(names, vec!["circle", "square"]);

Rule of thumb: default to a generic, and downgrade to dyn only when you have a reason — mixed types, code-size pressure, or a cold path where the indirection is free.
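
The circles-and-squares case can be sketched with a real trait. Everything here is hypothetical: the `Draw` trait, its methods, and the two shape types are illustrative stand-ins for whatever your domain uses.

```rust
/// Hypothetical trait standing in for the `Draw` mentioned above.
trait Draw {
    fn name(&self) -> &'static str;
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Draw for Circle {
    fn name(&self) -> &'static str {
        "circle"
    }
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Draw for Square {
    fn name(&self) -> &'static str {
        "square"
    }
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

/// Mixed shapes: the element type varies at runtime, so dyn is required.
fn total_area_dyn(shapes: &[Box<dyn Draw>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Uniform shapes: a generic keeps every call static.
fn total_area_static<T: Draw>(shapes: &[T]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

The generic version only accepts a slice where every element has the same type, which is exactly the restriction that pushes mixed collections toward dyn.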

198. into_iter() to Transform — Move Owned Items Instead of cloned()

You have a Vec you’re about to throw away, and you want a transformed one. Reaching for iter().cloned() (or iter().map(|x| x.clone())) duplicates every element on the way out — but you owned them already. into_iter() moves them straight through.

This is the afternoon half of this morning’s mem::replace bite: both are about moving owned data forward instead of copying it. There, it was an enum behind &mut self; here, it’s the elements of a collection.

The trap: iter().cloned() on a collection you’re discarding

You want to uppercase a list of names. The list is a local you won’t touch again:

fn shout(names: Vec<String>) -> Vec<String> {
    names
        .iter()                       // yields &String
        .map(|s| s.to_uppercase())    // to_uppercase already allocates a fresh String...
        .collect()
}

That one isn’t actually wasteful: to_uppercase takes &str and builds a new String regardless, so borrowing costs nothing extra. The real waste shows up when the transform keeps the value and you clone just to own it:

fn tag(names: Vec<String>) -> Vec<(String, usize)> {
    names
        .iter()                                  // &String
        .enumerate()
        .map(|(i, s)| (s.clone(), i))            // clone purely to own it
        .collect()
}

Every s.clone() heap-allocates a duplicate of a string you were about to drop. The original names gets freed on return — you paid to copy bytes that were headed for the incinerator.

The fix: into_iter() consumes the collection and hands you owned items

into_iter() on a Vec<String> yields String by value, not &String. The transform now moves each element — no clone, no second allocation:

fn tag(names: Vec<String>) -> Vec<(String, usize)> {
    names
        .into_iter()                 // yields String, owned
        .enumerate()
        .map(|(i, s)| (s, i))        // move s straight in
        .collect()
}
let names = vec!["ada".to_string(), "linus".to_string()];
let tagged = tag(names);
assert_eq!(tagged, vec![("ada".to_string(), 0), ("linus".to_string(), 1)]);

Each String’s heap buffer is handed over by pointer: only the small (pointer, length, capacity) header moves, and no string bytes are copied.
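
One way to check this is to compare heap-buffer addresses before and after. The sketch below uses hypothetical helper names and relies on the fact that moving a String does not relocate its heap buffer.

```rust
/// Clone-based version: every element gets a second heap buffer.
fn tag_cloned(names: &[String]) -> Vec<(String, usize)> {
    names.iter().enumerate().map(|(i, s)| (s.clone(), i)).collect()
}

/// Move-based version: each String's existing buffer just changes owner.
fn tag_moved(names: Vec<String>) -> Vec<(String, usize)> {
    names.into_iter().enumerate().map(|(i, s)| (s, i)).collect()
}

/// Addresses of the heap buffers behind each string.
fn buffer_addrs<'a>(strings: impl Iterator<Item = &'a String>) -> Vec<*const u8> {
    strings.map(|s| s.as_ptr()).collect()
}
```

Comparing buffer_addrs before and after tag_moved gives identical addresses, while tag_cloned yields different ones because the originals are still alive when the copies are allocated.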

The rule of thumb

If you still need the collection afterward, iter() (borrow) is correct — you can’t move out of something you’re keeping. But the moment the collection is yours to consume and you don’t need it again, into_iter() skips a copy of every element. A for x in v loop already does this (it’s into_iter under the hood); the win is remembering that .map, .filter, and friends can start from into_iter() too.

// keep the source → borrow
let total: usize = names.iter().map(|s| s.len()).sum();
println!("{}", names.len()); // still usable

// done with the source → move
let owned: Vec<String> = names.into_iter().filter(|s| s.len() > 3).collect();

cloned() earns its keep when you genuinely need both the original and a copy. When you don’t, it’s a tax on data you’re about to free.

197. Advance a State Machine with mem::replace — Move the Enum Out, No Clone

Transitioning an enum state behind &mut self looks impossible: you can’t move the old variant’s owned data into the new one without the borrow checker stopping you — so people reach for .clone(). mem::replace lets you move the whole state out, leaving a cheap placeholder behind.

This closes out the performance week. Earlier bites covered mem::take, mem::replace, and mem::swap as primitives. Here’s the pattern they were built for: a state machine that moves owned data forward through its transitions.

The setup

A job that walks Queued → Running → Done, carrying an owned String payload from one state into the next:

#[derive(Debug, PartialEq)]
enum Stage {
    Queued { payload: String },
    Running { payload: String, worker: u32 },
    Done { result: String },
}

struct Job {
    stage: Stage,
}

The trap: matching a borrow forces a clone

You only have &mut self, so the obvious move is to match on &self.stage. But that gives you a borrow of payload — to put it in the next state you have to clone it:

fn advance(&mut self) {
    self.stage = match &self.stage {
        Stage::Queued { payload } => Stage::Running {
            payload: payload.clone(), // borrowed, so clone to reuse
            worker: 7,
        },
        // ...
    };
}

Matching on self.stage by value would move out of &mut self — the borrow checker rejects it outright. So clone feels like the only way out. It isn’t.

The fix: replace the whole state, then match by value

mem::replace swaps in a cheap placeholder and hands you the real state by value. Now the match owns payload and can move it straight into the next variant — zero clones:

use std::mem;

impl Job {
    fn advance(&mut self) {
        self.stage = match mem::replace(&mut self.stage, Stage::Done { result: String::new() }) {
            Stage::Queued { payload } => Stage::Running { payload, worker: 7 },
            Stage::Running { payload, worker } => {
                Stage::Done { result: format!("{payload}@{worker}") }
            }
            done => done, // terminal state stays put
        };
    }
}

The placeholder (Done { result: String::new() }) is free — an empty String allocates nothing — and it lives for only the instant before you overwrite self.stage with the match result.

let mut job = Job { stage: Stage::Queued { payload: "build".into() } };

job.advance();
assert_eq!(job.stage, Stage::Running { payload: "build".into(), worker: 7 });

job.advance();
assert_eq!(job.stage, Stage::Done { result: "build@7".into() });

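The payload’s heap buffer is allocated once and moved from Queued into Running by pointer, with no copy of the bytes. Only the final transition builds a new String, because format! has to append the worker id. The clone-based version would have duplicated the payload on the way into Running and then dropped the original.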
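
When the enum has a natural "empty" variant, mem::take can do the same job without spelling out a placeholder. This is a sketch under that assumption; the `Ticket` enum and its variants are hypothetical and not part of the Job example.

```rust
use std::mem;

/// Hypothetical state enum whose default variant doubles as the placeholder.
#[derive(Debug, Default, PartialEq)]
enum Ticket {
    #[default]
    Empty,
    Open { title: String },
    Assigned { title: String, owner: u32 },
}

impl Ticket {
    fn advance(&mut self) {
        // mem::take leaves Ticket::Empty behind and hands back the real state.
        *self = match mem::take(self) {
            Ticket::Open { title } => Ticket::Assigned { title, owner: 7 }, // title is moved
            other => other, // Empty and Assigned stay as they are
        };
    }
}
```

The trade-off is that the default variant becomes a reachable state of the type, so this fits best when an empty or idle state already makes sense in the design.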

196. Return impl Iterator, Not Vec — Let the Caller Decide What to Do

Returning a Vec from a helper allocates eagerly, every time — even when the caller only wants the first match or a running sum. Return impl Iterator instead and the allocation simply never happens unless the caller asks for it.

This is the function-boundary version of yesterday’s bite-195: chaining adapters avoids temporary Vecs inside a pipeline; returning impl Iterator avoids forcing one across a function call.

The eager version

A helper that builds and returns a Vec commits to a heap allocation and a full pass over the data before the caller has said what they want:

fn evens_doubled(nums: &[i32]) -> Vec<i32> {
    nums.iter().filter(|&&n| n % 2 == 0).map(|&n| n * 2).collect()
}

If the caller just wants the first result, they still pay for the whole Vec:

let data = [1, 2, 3, 4, 5, 6];
let first = evens_doubled(&data).into_iter().next(); // allocated all 3, used 1
assert_eq!(first, Some(4));

Hand back the iterator instead

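Drop the .collect() and return the lazy iterator. The + '_ ties its lifetime to the borrowed slice (on the 2024 edition, a return-position impl Trait captures that lifetime automatically, so the bound is redundant there but harmless):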

fn evens_doubled(nums: &[i32]) -> impl Iterator<Item = i32> + '_ {
    nums.iter().filter(|&&n| n % 2 == 0).map(|&n| n * 2)
}

Now nothing runs until the caller pulls values through — and they pick the consumer:

let data = [1, 2, 3, 4, 5, 6];

let v: Vec<i32> = evens_doubled(&data).collect(); // collect if you want to
assert_eq!(v, [4, 8, 12]);

let total: i32 = evens_doubled(&data).sum();       // or fold straight to a number
assert_eq!(total, 24);

let first_big = evens_doubled(&data).find(|&n| n > 5); // or short-circuit
assert_eq!(first_big, Some(8));                    // stops at 8, never doubles 6

The find call never allocates and never touches the last two elements. The Vec-returning version couldn’t do that: collect() always drains the whole thing first.

The one rule: don’t borrow a local

The iterator you return can borrow your parameters, but not data you created inside the function — that data is dropped when the function ends. Iterators over owned values (like a Range) carry no borrow, so they just work:

fn squares(n: u32) -> impl Iterator<Item = u32> {
    (1..=n).map(|x| x * x)
}

let sq: Vec<u32> = squares(4).collect();
assert_eq!(sq, [1, 4, 9, 16]);

If you must produce owned data inside the function and stream it out, move it into the iterator (e.g. vec.into_iter() or a move closure) rather than returning a borrow of a local.
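
Both options can be sketched with hypothetical helpers. Note that the first one still builds a Vec internally; the point it illustrates is only that the Vec is moved into the returned iterator rather than borrowed from.

```rust
/// Owned data built inside the function is moved into the iterator.
fn upper_words(text: &str) -> impl Iterator<Item = String> {
    let words: Vec<String> = text.split_whitespace().map(str::to_uppercase).collect();
    words.into_iter() // the Vec now lives inside the returned iterator
}

/// A `move` closure carries `step` along by value, so nothing is borrowed.
fn multiples_of(step: u32, count: usize) -> impl Iterator<Item = u32> {
    (1u32..).map(move |k| k * step).take(count)
}
```

Without the move keyword, the closure in multiples_of would borrow the local `step` and the function would not compile, which is the rule above enforced by the compiler.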

195. Chain Iterator Adapters — Don't collect() Between Every Step

Every collect::<Vec<_>>() in the middle of a pipeline is a heap allocation and a full pass over your data. Adapters like map and filter are lazy and fuse together — chain them and the whole transformation runs in one pass with zero temporary Vecs.

A Vec between every step

It’s tempting to do one transformation at a time, binding each result to a variable. Every collect() allocates a throwaway Vec and walks the entire sequence before the next step even starts:

let nums = [1, 2, 3, 4, 5, 6];

let doubled: Vec<i32> = nums.iter().map(|&n| n * 2).collect();
let big: Vec<i32> = doubled.into_iter().filter(|&n| n % 4 == 0).collect();
let sum: i32 = big.iter().sum();

assert_eq!(sum, 24);

Two intermediate Vecs, two extra allocations, three separate passes — all to compute a single number.

One chain, one pass, no temporaries

The adapters compose directly. Nothing is materialized until the final consumer (sum) pulls values through, so there are no intermediate collections at all:

let nums = [1, 2, 3, 4, 5, 6];

let sum: i32 = nums
    .iter()
    .map(|&n| n * 2)        // 2, 4, 6, 8, 10, 12
    .filter(|&n| n % 4 == 0) // 4, 8, 12
    .sum();                  // 24

assert_eq!(sum, 24);

Each element flows through map then filter then into the sum, one at a time. No buffer is ever allocated.

Laziness means short-circuiting works

Because nothing runs until pulled, a chain only does the work it needs. Add a take(2) and the pipeline stops after producing two results — the elements past that point are never touched:

use std::cell::Cell;

let visited = Cell::new(0);
let nums = [1, 2, 3, 4, 5, 6, 7, 8];

let first_two: Vec<i32> = nums
    .iter()
    .inspect(|_| visited.set(visited.get() + 1))
    .map(|&n| n * 10)
    .filter(|&n| n > 20)
    .take(2)
    .collect();

assert_eq!(first_two, [30, 40]);
assert_eq!(visited.get(), 4); // stopped early — never looked at 5..8

The intermediate-collect version can’t do this: collect() always drains the whole iterator, so it would have visited all eight elements before take ever saw one.
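
The two behaviors can be put side by side by counting visits in each. The function names are hypothetical; the pipeline is the same one as above.

```rust
use std::cell::Cell;

/// Chained: take(2) stops the pipeline once two items pass the filter.
fn visits_chained(nums: &[i32]) -> usize {
    let visited = Cell::new(0usize);
    let _first_two: Vec<i32> = nums
        .iter()
        .inspect(|_| visited.set(visited.get() + 1))
        .map(|&n| n * 10)
        .filter(|&n| n > 20)
        .take(2)
        .collect();
    visited.get()
}

/// Collecting after map drains the source before filter and take ever run.
fn visits_with_intermediate_collect(nums: &[i32]) -> usize {
    let visited = Cell::new(0usize);
    let scaled: Vec<i32> = nums
        .iter()
        .inspect(|_| visited.set(visited.get() + 1))
        .map(|&n| n * 10)
        .collect();
    let _first_two: Vec<i32> = scaled.into_iter().filter(|&n| n > 20).take(2).collect();
    visited.get()
}
```

For the eight-element input used above, the chained version visits four elements and the intermediate-collect version visits all eight.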

When you genuinely do need a Vec

The point isn’t “never collect” — it’s “don’t collect between steps.” Collect once, at the end, when you actually need an owned, reusable collection:

let words = ["fast", "lazy", "fused", "iter"];

let shouted: Vec<String> = words
    .iter()
    .filter(|w| w.len() == 4)
    .map(|w| w.to_uppercase())
    .collect();

assert_eq!(shouted, ["FAST", "LAZY", "ITER"]);

One collect, at the end, when the Vec is the actual result. Everything before it stays lazy and allocation-free.