Strings

233. str::split_terminator — Split Without the Trailing Empty String

"a.b.c.".split('.') hands you a phantom empty string at the end. split_terminator reads the separator as ending each piece, not delimiting it — so the trailing one just vanishes.

Split counts the region after the final separator, even when it’s empty. Feed it text that ends with the separator and you get a ghost element:

let parts: Vec<&str> = "a.b.c.".split('.').collect();
assert_eq!(parts, ["a", "b", "c", ""]);

That trailing "" is a landmine downstream: an empty record, a blank line, a zero-length field you didn’t ask for.

split_terminator treats the separator as terminating each piece, so a trailing separator produces no empty tail:

let parts: Vec<&str> = "a.b.c.".split_terminator('.').collect();
assert_eq!(parts, ["a", "b", "c"]);

It only drops the empty piece caused by a trailing separator — interior empties are still real data and still show up:

let parts: Vec<&str> = "a..b.".split_terminator('.').collect();
assert_eq!(parts, ["a", "", "b"]);

And with no trailing separator it behaves exactly like split:

let parts: Vec<&str> = "a.b.c".split_terminator('.').collect();
assert_eq!(parts, ["a", "b", "c"]);

Reach for it when parsing records that end with their delimiter — newline-terminated logs, ;-terminated statements, trailing-comma lists — and skip the .filter(|s| !s.is_empty()) cleanup.
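
As a minimal sketch of the ;-terminated case (the `statements` helper and its sample input are made up for illustration), split_terminator yields one piece per statement with no cleanup pass:

```rust
/// Hypothetical helper: split a `;`-terminated script into trimmed statements.
fn statements(src: &str) -> Vec<&str> {
    // The final `;` terminates the last statement, so no empty tail appears.
    src.split_terminator(';').map(str::trim).collect()
}
```

Called on "SELECT 1; SELECT 2;" this gives the two statements and nothing else, while an interior empty statement, as in "a;;b;", still comes through as "".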

#227 Jun 2026

227. trim_matches — Strip the Same Char Off Both Ends, However Many There Are

trim() only knows about whitespace, and strip_prefix peels off one occurrence. When you need to shave every leading and trailing . (or 0, or quote) off a string, reach for trim_matches.

The hand-rolled trim

You’ve got a string padded with some character and want it gone from both ends — but only the ends:

let s = "***heading***";

// strip_prefix only removes one, and only the front
let once = s.strip_prefix('*').unwrap_or(s);
assert_eq!(once, "**heading***");

Looping strip_prefix/strip_suffix until they stop matching works, but it’s a chore. trim_matches does exactly that for you — it removes all consecutive matches from both ends and leaves the middle alone:

let s = "***heading***";
assert_eq!(s.trim_matches('*'), "heading");

// only the ends — interior matches stay put
assert_eq!("0x00ff00".trim_matches('0'), "x00ff");

One end at a time

There are directional versions when you only care about one side:

assert_eq!("--verbose".trim_start_matches("--"), "verbose");
assert_eq!("file.txt.bak".trim_end_matches(".bak"), "file.txt");

The pattern can be a closure or a set of chars

The argument is a Pattern, so you’re not limited to a single char. Pass a closure to trim by predicate, or an array of chars to trim any of them:

// strip leading digits
assert_eq!("12abc34".trim_start_matches(|c: char| c.is_numeric()), "abc34");

// trim any of several characters
assert_eq!("(value)".trim_matches(['(', ')']), "value");

Note trim_end_matches("--") strips the whole substring repeatedly, not a set of chars — that’s the difference between passing "--" and passing ['-'].
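
The substring-versus-set distinction is easiest to see on an odd number of dashes. A small sketch, with both helper names invented for the comparison:

```rust
/// Strips the two-character substring "--" for as long as it matches.
fn strip_double_dashes(s: &str) -> &str {
    s.trim_start_matches("--")
}

/// Strips every leading '-' character, one char at a time.
fn strip_any_dashes(s: &str) -> &str {
    s.trim_start_matches(['-'])
}
```

On "---x" the substring form removes one "--" and leaves "-x", because the remaining single dash is not a full match; the char-set form leaves "x".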

#224 Jun 2026

224. String::from_utf8_lossy — Returns a Cow, So Valid Bytes Cost Zero

from_utf8_lossy doesn’t always allocate. It hands back a Cow<str> that borrows your bytes when they’re already valid UTF-8 — you only pay for a String when there’s an invalid byte to replace.

The assumption that costs allocations

It’s easy to read this and assume every call builds a fresh String:

let text = String::from_utf8_lossy(bytes);

It doesn’t. The return type is Cow<'_, str> — clone-on-write. If bytes is valid UTF-8 (the common case for most files, headers, and protocol fields), you get back a Cow::Borrowed that points straight at your slice. No copy, no heap allocation.

use std::borrow::Cow;

fn read_name(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

let valid = b"hello world";
let s = read_name(valid);
assert!(matches!(s, Cow::Borrowed(_))); // borrowed — nothing allocated
assert_eq!(s, "hello world");

You only pay on the rare path

The allocation happens only when there’s an invalid byte to swap for the replacement character U+FFFD. Then — and only then — it builds an owned String:

let invalid = &[b'c', b'a', b'f', 0xFF];
let s2 = read_name(invalid);
assert!(matches!(s2, Cow::Owned(_))); // owned — had to fix a bad byte
assert_eq!(s2, "caf\u{fffd}");

So the cost scales with how messy your input is, not with how often you call it.

Don’t undo it with a reflexive .to_string()

The anti-pattern is forcing an allocation right back on:

let owned = String::from_utf8_lossy(bytes).to_string(); // ⚠️ always allocates

Keep the Cow for as long as you’re only reading. If a caller genuinely needs ownership, into_owned() allocates on the borrowed path but reuses the buffer on the owned path — no double allocation:

let owned: String = read_name(b"abc").into_owned();
assert_eq!(owned, "abc");

When you’re decoding bytes you’ll mostly just inspect, let from_utf8_lossy stay a Cow. Valid input — the usual case — flows through without touching the heap.
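
To make "the cost scales with how messy your input is" concrete, here is a rough sketch that decodes a batch of fields and counts how many actually needed an owned String. Both `decode_all` and `owned_count` are hypothetical names, not part of any library:

```rust
use std::borrow::Cow;

/// Decode every field lossily; valid fields stay borrowed from the input.
fn decode_all<'a>(fields: &[&'a [u8]]) -> Vec<Cow<'a, str>> {
    fields.iter().map(|&f| String::from_utf8_lossy(f)).collect()
}

/// Count the fields that had to allocate because of an invalid byte.
fn owned_count(decoded: &[Cow<'_, str>]) -> usize {
    decoded.iter().filter(|c| matches!(c, Cow::Owned(_))).count()
}
```

With one clean field and one containing 0xFF, only the second ends up as Cow::Owned; the Vec itself still allocates, but the clean string data is never copied.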

#218 Jun 2026

218. str::match_indices — Find Every Match and Its Position in One Pass

Hand-rolling a find loop to locate every occurrence of a substring means juggling a running offset and remembering to skip past each match. match_indices hands you each hit and its byte position as an iterator — no bookkeeping.

The classic way to collect every position of a needle is a loop over find, slicing the remainder each time and adding the offset back by hand:

let log = "ERROR x ERROR y WARN z ERROR";

let mut positions = Vec::new();
let mut start = 0;
while let Some(i) = log[start..].find("ERROR") {
    let idx = start + i;
    positions.push(idx);
    start = idx + "ERROR".len(); // easy to get this advance wrong
}

assert_eq!(positions, vec![0, 8, 23]);

It works, but the offset arithmetic is exactly the kind of thing you fumble at 5pm. match_indices walks the string for you and yields (byte_index, matched_str) pairs:

let log = "ERROR x ERROR y WARN z ERROR";

let positions: Vec<(usize, &str)> = log.match_indices("ERROR").collect();

assert_eq!(positions, vec![(0, "ERROR"), (8, "ERROR"), (23, "ERROR")]);

It’s lazy and composes like any other iterator, so you can map down to just the indices. When all you need is a count, the sibling method matches drops the positions entirely and nothing gets collected:

let log = "ERROR x ERROR y WARN z ERROR";

let count = log.matches("ERROR").count();
assert_eq!(count, 3);
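
Mapping down to just the indices is a one-line adapter. A minimal sketch, with `positions` as a hypothetical helper name:

```rust
/// Byte offsets of every non-overlapping occurrence of `needle`.
fn positions(haystack: &str, needle: &str) -> Vec<usize> {
    haystack.match_indices(needle).map(|(i, _)| i).collect()
}
```

This reproduces the hand-rolled loop from the start of the bite, and it shows the non-overlapping rule too: searching "aaaa" for "aa" reports offsets 0 and 2, not 0, 1, and 2.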

The pattern argument is the full Pattern family (a &str, a char, a closure, or a slice of chars), so you can find every character of a given class without a regex. A char predicate matches one character at a time, so a run of digits shows up as separate hits:

let s = "a1b22c333";

let digits: Vec<(usize, &str)> = s.match_indices(char::is_numeric).collect();
assert_eq!(digits, vec![(1, "1"), (3, "2"), (4, "2"), (6, "3"), (7, "3"), (8, "3")]);

Matches are non-overlapping and scanned left to right; when you need the last hit first, rmatch_indices walks right to left:

let s = "aXbXc";

let last = s.rmatch_indices('X').next();
assert_eq!(last, Some((3, "X")));

The byte indices are real &str offsets, so they slot straight into slicing and replacement logic without any conversion. When you only care about whether something matches, reach for contains; when you want each location, match_indices already did the bookkeeping.

#217 Jun 2026

217. eq_ignore_ascii_case — Case-Insensitive Compare Without the Allocation

a.to_lowercase() == b.to_lowercase() allocates two fresh Strings just to throw them away after the comparison. For ASCII text — headers, extensions, keywords — eq_ignore_ascii_case does it byte-by-byte with zero allocations.

The reflexive way to compare strings case-insensitively is to lowercase both sides and check for equality:

let header = "Content-Type";
let wanted = "content-type";

// Two heap allocations, then immediately dropped
assert!(header.to_lowercase() == wanted.to_lowercase());

That’s two Strings built on the heap for a question that’s really just “are these equal if we ignore case?” The standard library answers it directly, walking both byte sequences and folding A-Z against a-z as it goes, with no allocation, and it bails early on the first mismatch:

let header = "Content-Type";

assert!(header.eq_ignore_ascii_case("content-type"));

It’s defined on [u8] too, which is handy when you’re matching protocol tokens straight out of a buffer without first validating UTF-8:

assert!(b"GET".eq_ignore_ascii_case(b"get"));

The one thing to know: it folds only ASCII letters. Non-ASCII bytes must match exactly, so it won’t treat Ä and ä as equal, and it won’t expand ß:

assert!(!"Ä".eq_ignore_ascii_case("ä"));
assert!(!"Straße".eq_ignore_ascii_case("STRASSE"));

For human-facing, multilingual text you still want full Unicode case folding. But for the things you actually compare case-insensitively in systems code — HTTP methods and header names, file extensions, config keys, command names — the input is ASCII by definition, and reaching for eq_ignore_ascii_case is both faster and clearer:

fn is_jpeg(ext: &str) -> bool {
    ext.eq_ignore_ascii_case("jpg") || ext.eq_ignore_ascii_case("jpeg")
}

assert!(is_jpeg("JPG"));
assert!(is_jpeg("Jpeg"));
assert!(!is_jpeg("png"));

If you need to fold case in place rather than compare, make_ascii_lowercase mutates a &mut str or &mut [u8] without allocating either — same ASCII-only rule applies.
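
A short sketch of the in-place variant, assuming a hypothetical `lowercase_key` helper that normalizes a config key before it is stored:

```rust
/// Lowercase the ASCII letters of `key` in place; no new buffer is created.
fn lowercase_key(key: &mut str) {
    // Non-ASCII characters are left exactly as they were.
    key.make_ascii_lowercase();
}
```

Passing &mut String works through deref coercion, so "Content-TYPE" becomes "content-type" in the same buffer, while "ÄB" becomes "Äb" because only the ASCII letter is folded.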

#205 Jun 2026

205. strip_prefix / strip_suffix — Remove a Prefix Once, Not Every Repeat

Reaching for trim_start_matches to peel off a "--" or a leading slash? It strips every repeated match and silently does nothing when there’s no match. strip_prefix removes exactly one and tells you whether it hit.

The Problem

trim_start_matches keeps eating as long as the pattern matches, which is rarely what “remove the prefix” means:

let path = "/////etc";

// Strips ALL leading slashes — not just one
assert_eq!(path.trim_start_matches('/'), "etc");

// And with a repeated substring it does the same
assert_eq!("foofoobar".trim_start_matches("foo"), "bar");

It also can’t tell you whether anything was removed — a non-match returns the string unchanged, so you can’t branch on it.

The Fix: strip_prefix

strip_prefix removes one occurrence and returns an Option: Some(rest) on a hit, None when the prefix isn’t there.

let path = "/////etc";

assert_eq!(path.strip_prefix('/'), Some("////etc")); // exactly one
assert_eq!("foofoobar".strip_prefix("foo"), Some("foobar"));
assert_eq!("foobar".strip_prefix("xyz"), None);       // no match, told you so

The Option is the real win: it doubles as a “did this start with the prefix?” test. Parsing a CLI flag becomes a one-liner:

fn flag_value(arg: &str) -> Option<&str> {
    arg.strip_prefix("--")
}

assert_eq!(flag_value("--verbose"), Some("verbose"));
assert_eq!(flag_value("positional"), None);
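
Because each call returns an Option, several prefixes can be tried in order. The following is only a sketch; the `Arg` enum and `classify` function are illustrative names, not a real argument parser:

```rust
#[derive(Debug, PartialEq)]
enum Arg<'a> {
    Long(&'a str),
    Short(&'a str),
    Positional(&'a str),
}

/// Try the longer prefix first so "--name" is not mistaken for a short flag.
fn classify(arg: &str) -> Arg<'_> {
    if let Some(name) = arg.strip_prefix("--") {
        Arg::Long(name)
    } else if let Some(name) = arg.strip_prefix('-') {
        Arg::Short(name)
    } else {
        Arg::Positional(arg)
    }
}
```

Since exactly one "--" is removed, an argument like "---x" classifies as Long("-x") rather than losing all three dashes.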

Its Mirror: strip_suffix

Same deal at the other end — perfect for trimming a known extension or unit without slicing indices by hand:

fn without_ext(name: &str) -> &str {
    name.strip_suffix(".rs").unwrap_or(name)
}

assert_eq!(without_ext("main.rs"), "main");
assert_eq!(without_ext("README"), "README"); // unchanged, no panic

Use trim_start_matches / trim_end_matches only when you genuinely want to collapse a run of repeats. For peeling one known prefix or suffix — and knowing if it was there — strip_prefix and strip_suffix say exactly what you mean.

194. Reuse One Buffer with .clear() — Allocate Once, Loop Many Times

with_capacity (bite 193) buys a buffer once instead of growing it repeatedly. But if you allocate a fresh String or Vec inside a loop, you throw that buffer away every iteration. .clear() resets the length to zero while keeping the capacity — so one allocation serves the whole loop.

A fresh allocation every iteration

It’s easy to declare the working buffer inside the loop. Each pass allocates a new heap buffer and drops it at the end of the iteration:

let lines = ["alpha", "beta", "gamma"];
let mut out = Vec::new();

for line in lines {
    let mut buf = String::new();   // fresh buffer, every iteration
    buf.push_str(line);            // first push allocates on the heap
    buf.make_ascii_uppercase();
    out.push(buf.clone());
}

assert_eq!(out, ["ALPHA", "BETA", "GAMMA"]);

Three iterations, three allocate-then-free cycles for the scratch buffer. Scale that to a million lines and it’s a million wasted allocations.

.clear() keeps the capacity

Hoist the buffer out of the loop and clear() it at the top of each pass. clear() sets the length to 0 but leaves the allocated capacity in place, so once the buffer has grown to fit the longest line it never reallocates:

let lines = ["alpha", "beta", "gamma"];
let mut out = Vec::new();
let mut buf = String::new();       // one buffer for the whole loop

for line in lines {
    buf.clear();                   // len -> 0, capacity untouched
    buf.push_str(line);            // allocates only if the buffer must grow
    buf.make_ascii_uppercase();
    out.push(buf.clone());         // the kept copy allocates; the scratch buffer does not
}

assert_eq!(out, ["ALPHA", "BETA", "GAMMA"]);

The contract is the whole point — clear drops the contents but not the buffer:

let mut s = String::with_capacity(64);
s.push_str("hello");
let cap = s.capacity();

s.clear();
assert_eq!(s.len(), 0);            // empty again
assert_eq!(s.capacity(), cap);     // ...but the buffer is still there

The read-into-a-reused-buffer pattern

This shows up constantly when reading input. BufRead::read_line appends to the buffer you give it, so the idiomatic loop clears one String each pass instead of allocating a new one per line:

use std::io::BufRead;

let input = "12\n34\n56\n";
let mut reader = std::io::BufReader::new(input.as_bytes());

let mut line = String::new();      // one buffer for every line
let mut sum = 0i64;

loop {
    line.clear();                  // required — read_line appends
    let n = reader.read_line(&mut line).unwrap();
    if n == 0 {
        break;                     // 0 bytes read == EOF
    }
    sum += line.trim().parse::<i64>().unwrap();
}

assert_eq!(sum, 102);

The same trick works for any scratch Vec: clear() it at the top of the loop and reuse the capacity for the next batch:

let mut scratch: Vec<u8> = Vec::new();
let mut total = 0;

for chunk in [&[1u8, 2, 3][..], &[4, 5], &[6]] {
    scratch.clear();
    scratch.extend_from_slice(chunk);
    total += scratch.iter().map(|&b| b as u32).sum::<u32>();
}

assert_eq!(total, 21);

Reach for a fresh Vec/String only when you actually need to keep each result. When the buffer is just scratch space, allocate it once, clear() it, and let the loop run free.
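
When nothing needs to be kept, the per-line clone disappears as well. Here is a sketch under that assumption, streaming uppercased lines into any std::io::Write sink; `write_upper` is a hypothetical name:

```rust
use std::io::Write;

/// Uppercase each line through one reused scratch buffer and write it out.
fn write_upper<W: Write>(lines: &[&str], sink: &mut W) -> std::io::Result<()> {
    let mut buf = String::new(); // scratch space shared by every iteration
    for line in lines {
        buf.clear(); // keep the capacity, drop the old contents
        buf.push_str(line);
        buf.make_ascii_uppercase();
        sink.write_all(buf.as_bytes())?;
        sink.write_all(b"\n")?;
    }
    Ok(())
}
```

The sink can be a Vec<u8>, a file, or stdout; the scratch String grows to the longest line and is then reused for the rest of the input.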

#190 Jun 2026

190. Return Cow<str> — Allocate Only When You Actually Change Something

An escaping or normalizing function usually has nothing to do — the input is already clean. Returning String forces an allocation anyway. Return Cow<str> and the common path stays a borrow.

The wasteful version

A function that escapes HTML returns String, so every caller pays for an allocation — even the overwhelming majority whose input contains nothing to escape:

fn escape_html(input: &str) -> String {
    input
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

"hello world" has no special characters, yet replace still walks the string three times and hands back a fresh String. In a template renderer or a parser running this over thousands of fields, that’s thousands of pointless heap allocations.

Borrow on the fast path

Cow<str> lets one return type be either a borrow or an owned String. Check first; only allocate when there’s real work:

use std::borrow::Cow;

fn escape_html(input: &str) -> Cow<'_, str> {
    // Fast path: nothing to escape, hand back the original borrow.
    if !input.contains(['&', '<', '>']) {
        return Cow::Borrowed(input);
    }

    // Slow path: build the escaped String exactly once.
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(c),
        }
    }
    Cow::Owned(out)
}

The clean input never touches the heap; the dirty input allocates once instead of three times:

let clean = escape_html("hello world");
assert!(matches!(clean, Cow::Borrowed(_))); // zero allocation

let dirty = escape_html("a < b & c");
assert!(matches!(dirty, Cow::Owned(_)));
assert_eq!(dirty, "a &lt; b &amp; c");

Callers don’t notice

Cow<str> derefs to &str, so anything that reads the result just works — no .unwrap(), no matching:

fn render(field: &str) -> usize {
    escape_html(field).len() // Cow derefs to &str
}

assert_eq!(render("plain"), 5);

And when a caller genuinely needs ownership, .into_owned() allocates only if it’s still borrowed:

let owned: String = escape_html("safe").into_owned();
assert_eq!(owned, "safe");

The rule: any function that might return its input unchanged — escaping, trimming, normalizing, path canonicalization — should return Cow<str>, not String. The signature tells the caller “I’ll borrow when I can,” and the body only reaches for the heap on the path that earns it.
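
The same shape fits a normalizer. As a sketch (the `normalize_newlines` name is made up here), converting CRLF line endings only costs an allocation when a CRLF is actually present:

```rust
use std::borrow::Cow;

/// Convert "\r\n" to "\n", borrowing the input when there is nothing to change.
fn normalize_newlines(input: &str) -> Cow<'_, str> {
    if input.contains("\r\n") {
        Cow::Owned(input.replace("\r\n", "\n"))
    } else {
        Cow::Borrowed(input)
    }
}
```

This variant scans the string twice on the slow path (once for contains, once for replace), which keeps the code short; a single-pass builder like the escape_html loop is the alternative when the slow path is hot.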

#189 Jun 2026

189. str::char_indices — Slice a String Without Panicking on Non-ASCII

chars().enumerate() hands you a character count, but &s[..] wants a byte offset. Mix them up and one accented letter blows your program apart.

Say you want everything from the underscore onward. The enumerate version looks right and works fine in tests full of ASCII:

let s = "café_table"; // 'é' is two bytes in UTF-8

let idx = s
    .chars()
    .enumerate()
    .find(|(_, c)| *c == '_')
    .map(|(i, _)| i)
    .unwrap();

let rest = &s[idx..]; // idx == 4 (char count), but '_' starts at byte 5

idx is 4, the character position. Byte 4 lands in the middle of é, so the slice panics: byte index 4 is not a char boundary.

char_indices yields the real byte offset of each character, which is exactly what slicing expects:

let idx = s
    .char_indices()
    .find(|(_, c)| *c == '_')
    .map(|(i, _)| i)
    .unwrap();

assert_eq!(idx, 5);
assert_eq!(&s[idx..], "_table"); // no panic, correct slice

Each item is (byte_offset, char) instead of enumerate’s (count, char). char_indices is also a DoubleEndedIterator, so next_back gives you the last character and where it begins:

let (last_off, last_ch) = s.char_indices().next_back().unwrap();
assert_eq!((last_off, last_ch), (10, 'e'));

Rule of thumb: the moment a character index touches &s[..], .split_at(), or any byte-indexed API, reach for char_indices — not enumerate.
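
A common application is "give me the first n characters". A minimal sketch, with `first_chars` as a hypothetical helper:

```rust
/// Return the prefix of `s` holding at most `n` characters (not bytes).
fn first_chars(s: &str, n: usize) -> &str {
    // The byte offset of character number `n` is exactly where the prefix ends.
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // fewer than n + 1 characters: keep everything
    }
}
```

For "café_table" and n = 4 the cut lands at byte 5, after the two-byte é, so the result is "café" with no panic.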

#187 Jun 2026

187. fmt::Write — Stop Allocating a Temp String Just to Append It

out.push_str(&format!("{name}: {score}")) builds a brand-new String, copies it into out, then throws it away — every single iteration. One use std::fmt::Write; and write! formats straight into your buffer instead.

The double-allocation habit

This pattern is everywhere, and it allocates a temporary String per call just to immediately copy and drop it:

let scores = [("ferris", 100), ("hermit", 42)];

let mut out = String::new();
for (name, score) in scores {
    out.push_str(&format!("{name}: {score}\n")); // temp String, copy, drop
}
assert_eq!(out, "ferris: 100\nhermit: 42\n");

Clippy even has a lint for it: format_push_string.

write! into the String directly

String implements std::fmt::Write, so the same write!/writeln! macros you use in Display impls work on it. The formatted output lands directly in the existing buffer — no intermediate allocation:

use std::fmt::Write; // bring the trait into scope

let scores = [("ferris", 100), ("hermit", 42)];

let mut out = String::new();
for (name, score) in scores {
    writeln!(out, "{name}: {score}").unwrap();
}
assert_eq!(out, "ferris: 100\nhermit: 42\n");

The .unwrap() looks scary but isn’t: write! returns fmt::Result because the trait allows failure, yet writing into a String can never fail — it just grows. let _ = writeln!(...) works too if you prefer.

Why it matters

The format! version allocates N temporary strings for N iterations. The write! version allocates only when out needs to grow — amortized, that’s a handful of reallocations total. In hot loops building large strings (reports, codegen, SQL), the difference shows up in profiles.

One gotcha: std::fmt::Write is for UTF-8 sinks (String); std::io::Write is for byte sinks (files, stdout). Same macro, different trait — if write!(out, ...) complains about no method named write_fmt, you imported the wrong one.
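
If one module needs both sinks, both traits can be imported anonymously and the macro picks the right one from the target type. A sketch with two illustrative functions, `id_text` and `id_bytes`:

```rust
use std::fmt::Write as _; // lets write! target a String
use std::io::Write as _; // lets write! target a Vec<u8>, a file, stdout

/// Format into a String through fmt::Write.
fn id_text(id: u32) -> String {
    let mut s = String::new();
    write!(s, "id={id}").expect("writing to a String does not fail");
    s
}

/// Format into a byte buffer through io::Write.
fn id_bytes(id: u32) -> std::io::Result<Vec<u8>> {
    let mut buf: Vec<u8> = Vec::new();
    write!(buf, "id={id}")?;
    Ok(buf)
}
```

Importing with `as _` brings the trait methods into scope without binding the name Write twice, so the two imports do not collide.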

163. Cow::to_mut — Lazy In-Place Mutation Through Cow

Cow<str> is the type everyone reaches for when a function might need to modify its input. Cow::Borrowed and Cow::Owned are the constructors that get the spotlight; to_mut is the third piece, and it’s the one that actually pays off the laziness.

What to_mut does

to_mut takes &mut Cow<str> and hands back &mut String:

  • If the Cow is already Owned, you get a direct &mut to the inner String.
  • If it’s Borrowed, to_mut clones the slice into a fresh String, swaps the Cow over to Owned, and then hands you the mutable reference.

That asymmetry is the whole point. Many callers borrow and never touch to_mut — they never allocate. The ones that do call it pay the allocation cost exactly once, on first write.

A walking-the-string example

Expand \t into two spaces, but only allocate if the input actually contains a tab:

use std::borrow::Cow;

fn expand_tabs(s: &str) -> Cow<'_, str> {
    let mut out: Cow<'_, str> = Cow::Borrowed(s);
    if let Some(i) = s.find('\t') {
        // First write — `to_mut` clones the slice into a String, then we
        // rebuild from byte `i` onwards.
        let buf = out.to_mut();
        buf.truncate(i);
        for c in s[i..].chars() {
            if c == '\t' {
                buf.push_str("  ");
            } else {
                buf.push(c);
            }
        }
    }
    out
}

The happy path — input has no tab — never enters the if, never allocates, and returns the original slice wrapped in Cow::Borrowed. The unhappy path allocates exactly once.

Composing transformations

to_mut really earns its keep when you chain several optional mutations. The first one that fires flips the Cow to Owned; every following mutation sees an already-owned buffer and reuses it:

use std::borrow::Cow;

fn apply_rules<'a>(s: &'a str, rules: &[(char, &str)]) -> Cow<'a, str> {
    let mut out: Cow<'a, str> = Cow::Borrowed(s);
    for &(from, to) in rules {
        if out.contains(from) {
            let replaced = out.replace(from, to);
            *out.to_mut() = replaced;
        }
    }
    out
}

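Three things worth pointing at. First, out.contains(from) works because Cow<str> derefs to str. Second, the assignment *out.to_mut() = replaced replaces the inner String, not the Cow itself. Third, once the first rule fires, every later to_mut call just hands back the existing &mut String with no extra clone. One caveat: replace builds a new String each time a rule fires, and the first to_mut also clones the borrowed slice right before it gets overwritten, so writing out = Cow::Owned(replaced) would skip that one redundant copy.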
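
When the mutations can be done in place, the chain reuses a single buffer. A sketch with two optional steps; `normalize_line` and its two rules are assumptions chosen for illustration:

```rust
use std::borrow::Cow;

/// Lowercase ASCII letters and ensure a trailing newline, allocating at most once.
fn normalize_line(line: &str) -> Cow<'_, str> {
    let mut out = Cow::Borrowed(line);

    // Step 1: only take ownership if there is an uppercase ASCII letter to fold.
    if out.bytes().any(|b| b.is_ascii_uppercase()) {
        out.to_mut().make_ascii_lowercase();
    }

    // Step 2: if step 1 already fired, this to_mut reuses the same String.
    if !out.ends_with('\n') {
        out.to_mut().push('\n');
    }

    out
}
```

Input that is already lowercase and newline-terminated comes back as Cow::Borrowed; anything else pays for one clone, after which both steps write into the same String (the push may still grow it).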

Pitfall: to_mut always commits

There’s no “preview, then maybe commit” mode. Calling to_mut on a borrowed Cow clones immediately, even if you never end up writing through the returned reference. So this is a trap:

if !out.is_empty() {
    let _ = out.to_mut();  // allocates even though we may not change anything
    // ... maybe mutate, maybe not
}

Guard the call with the actual condition that means “I’m about to write,” not the condition that means “I might.” The mental shortcut: to_mut is the moment you trade your &str for a String. Reach for it lazily, but commit completely.

#135 May 2026

135. str::strip_prefix — Trim a Prefix Without Slicing by Hand

Reaching for if s.starts_with("foo") { &s[3..] } to drop a prefix? That’s an off-by-one waiting to happen, and a wrong slice or a panic the moment the hard-coded 3 drifts out of sync with the prefix. str::strip_prefix returns Option<&str> and gets it right by construction.

The Problem

You want the part of a string after a known prefix:

let s = "Bearer abc123";

let token = if s.starts_with("Bearer ") {
    &s[7..]
} else {
    s
};
assert_eq!(token, "abc123");

Two things wrong here: the literal 7 has to stay in sync with the literal "Bearer ", and if it ever drifts, the slice either cuts at the wrong place or panics when the offset lands mid-codepoint or past the end. Using prefix.len() fixes the arithmetic, but you still name the prefix twice and do the test and the slice as separate steps.

The Fix: strip_prefix

let s = "Bearer abc123";

let token = s.strip_prefix("Bearer ").unwrap_or(s);
assert_eq!(token, "abc123");

strip_prefix returns Some(&str) if the prefix matched (giving you the rest), or None if it didn’t. No magic numbers, no slicing, no UTF-8 footguns — the prefix length comes from the prefix itself.

Pattern Matching, Not Just Strings

The argument is anything implementing Pattern, so a char, a closure, or even an array of chars all work:

assert_eq!("-x".strip_prefix('-'), Some("x"));
assert_eq!("x".strip_prefix('-'), None);

// Trim any leading whitespace character
let s = "\t  hello".strip_prefix(|c: char| c.is_whitespace());
assert_eq!(s, Some("  hello"));

Note this only strips one match — the char form doesn’t loop. For “strip every leading space,” reach for trim_start_matches.

The Twin: strip_suffix

Same shape, other end:

let filename = "report.tar.gz";

let stem = filename.strip_suffix(".gz").unwrap_or(filename);
assert_eq!(stem, "report.tar");

Together they replace half the manual &s[..s.len() - 3] arithmetic you’d otherwise write — and the Option return makes “did it actually have the prefix?” a value, not a separate starts_with call.
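
The two compose with the ? operator when both ends must match. A minimal sketch, using a hypothetical `unquote` helper:

```rust
/// Return the text between a leading and a trailing double quote, if both exist.
fn unquote(s: &str) -> Option<&str> {
    // `?` bails out with None when the opening quote is missing.
    s.strip_prefix('"')?.strip_suffix('"')
}
```

A lone quote character yields None, because once the prefix is stripped there is nothing left to serve as the closing quote.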

70. Iterator::intersperse — Join Elements Without Collecting First

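Tired of collecting into a Vec just to call .join(",")? intersperse inserts a separator between every pair of elements, lazily, right inside the iterator chain. Check the std docs for its stability status first: it has long lived behind the nightly-only iter_intersperse feature gate, and itertools ships an intersperse of its own for stable.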

The problem

You have an iterator of strings and want to join them with a separator. The classic approach forces you to collect first:

fn main() {
    let words = vec!["hello", "world", "from", "rust"];

    // Works, but allocates an intermediate Vec<&str> just to join it
    let sentence = words.iter().copied().collect::<Vec<_>>().join(" ");

    assert_eq!(sentence, "hello world from rust");
}

It gets the job done, but that intermediate Vec allocation is wasteful — you’re collecting just to immediately consume it again.

The clean way

intersperse inserts a separator value between every adjacent pair of elements, returning a new iterator:

fn main() {
    let words = vec!["hello", "world", "from", "rust"];

    let sentence: String = words
        .iter()
        .copied()
        .intersperse(" ")
        .collect();

    assert_eq!(sentence, "hello world from rust");
}

No intermediate Vec. The separator is lazily inserted as you iterate, and collect builds the final String directly.

It works with any type

intersperse isn’t just for strings — it works with any iterator where the element type implements Clone:

fn main() {
    let numbers = vec![1, 2, 3, 4];

    let with_zeros: Vec<i32> = numbers
        .iter()
        .copied()
        .intersperse(0)
        .collect();

    assert_eq!(with_zeros, vec![1, 0, 2, 0, 3, 0, 4]);
}

This is handy for building sequences with delimiters, padding, or sentinel values between real data.

When the separator is expensive to create

If your separator is costly to clone, use intersperse_with — it takes a closure that produces the separator on demand:

fn main() {
    let parts = vec!["one", "two", "three"];

    let result: String = parts
        .iter()
        .copied()
        .intersperse_with(|| " | ")
        .collect();

    assert_eq!(result, "one | two | three");
}

The closure is only called when a separator is actually needed, so you pay zero cost for single-element or empty iterators.

Edge cases

intersperse handles the corners gracefully — empty iterators stay empty, and single-element iterators pass through unchanged:

fn main() {
    let empty: Vec<&str> = Vec::new();
    let result: String = empty.iter().copied().intersperse(", ").collect();
    assert_eq!(result, "");

    let single = vec!["alone"];
    let result: String = single.iter().copied().intersperse(", ").collect();
    assert_eq!(result, "alone");
}

Next time you reach for .collect::<Vec<_>>().join(...), try intersperse instead — it’s one less allocation and reads just as clearly.
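
If intersperse is not available on your toolchain, the same "no intermediate Vec" result can be had with a plain loop. This is a stand-in sketch, not the std method; `join_with` is a hypothetical name:

```rust
/// Join string slices with `sep`, pushing straight into one String.
fn join_with<'a, I>(parts: I, sep: &str) -> String
where
    I: IntoIterator<Item = &'a str>,
{
    let mut out = String::new();
    for (i, part) in parts.into_iter().enumerate() {
        if i > 0 {
            out.push_str(sep); // separator goes between elements only
        }
        out.push_str(part);
    }
    out
}
```

It handles the same corners: an empty iterator gives an empty String and a single element passes through without a separator.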

#055 Apr 2026

55. floor_char_boundary — Truncate Strings Without Breaking UTF-8

Ever tried to truncate a string to a byte limit and got a panic because you sliced in the middle of a multi-byte character? floor_char_boundary fixes that.

The Problem

Slicing a string at an arbitrary byte index panics if that index lands inside a multi-byte UTF-8 character:

let s = "Héllo 🦀 world";
let ok = &s[..5]; // fine here: 'é' spans bytes 1..3, so byte 5 is a boundary
assert_eq!(ok, "Héll");

// But what if we don't know the content?
let s = "🦀🦀🦀"; // each crab is 4 bytes
// &s[..5] would panic: byte 5 is inside the second crab!

You could scan backward byte-by-byte checking is_char_boundary(), but that’s tedious and easy to get wrong.

The Fix: floor_char_boundary

str::floor_char_boundary(index) returns the largest byte position at or before index that sits on a valid character boundary. Its counterpart ceil_char_boundary gives you the smallest position at or after the index.

fn main() {
    let s = "🦀🦀🦀"; // each 🦀 is 4 bytes, total 12 bytes

    // We want ~6 bytes, but byte 6 is inside the second crab
    let i = s.floor_char_boundary(6);
    assert_eq!(i, 4); // rounds down to end of first 🦀
    assert_eq!(&s[..i], "🦀");

    // ceil_char_boundary rounds up instead
    let j = s.ceil_char_boundary(6);
    assert_eq!(j, 8); // rounds up to end of second 🦀
    assert_eq!(&s[..j], "🦀🦀");
}

Real-World Use: Safe Truncation

Here’s a practical helper that truncates a string to fit a byte budget, adding an ellipsis if it was shortened:

fn truncate(s: &str, max_bytes: usize) -> String {
    if s.len() <= max_bytes {
        return s.to_string();
    }
    // Reserve 3 bytes for the ellipsis, then round down to a char boundary.
    let end = s.floor_char_boundary(max_bytes.saturating_sub(3));
    format!("{}...", &s[..end])
}

fn main() {
    let bio = "I love Rust 🦀 and crabs!";

    // 16 - 3 = 13, but byte 13 is inside 🦀 (bytes 12..16), so the cut rounds down to 12
    assert_eq!(truncate(bio, 16), "I love Rust ...");

    // 19 - 3 = 16 is exactly the end of 🦀: 16 bytes + "..." = 19 total
    assert_eq!(truncate(bio, 19), "I love Rust 🦀...");

    // Short strings pass through unchanged
    assert_eq!(truncate("hi", 10), "hi");
}

No more manual boundary scanning — these two methods handle the UTF-8 dance for you.
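
If your toolchain predates floor_char_boundary, the backward scan is short enough to write by hand. A sketch of that fallback; `floor_boundary` is a stand-in name, not the std method:

```rust
/// Largest char boundary at or before `index`, clamped to the string length.
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Byte 0 is always a boundary, so this loop cannot underflow.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}
```

A UTF-8 character is at most 4 bytes, so the loop steps back at most three times.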

#044 Mar 2026

44. split_once — Split a String Exactly Once

When you need to split a string on the first occurrence of a delimiter, split_once is cleaner than anything you’d write by hand. Stable since Rust 1.52.

Parsing key=value pairs, HTTP headers, file paths — almost everywhere you split a string, you only care about the first separator. Before split_once, you’d reach for .find() plus index arithmetic:

The old way

let s = "Content-Type: application/json; charset=utf-8";

let colon = s.find(':').unwrap();
let header = &s[..colon];
let value = s[colon + 1..].trim();

assert_eq!(header, "Content-Type");
assert_eq!(value, "application/json; charset=utf-8");

Works, but it’s three lines of noise. The index arithmetic is easy to get wrong (colon + 1 silently assumes a one-byte delimiter), and .trim() is a separate step.

With split_once

let s = "Content-Type: application/json; charset=utf-8";

let (header, value) = s.split_once(": ").unwrap();

assert_eq!(header, "Content-Type");
assert_eq!(value, "application/json; charset=utf-8");

One line. The delimiter is consumed, both sides are returned, and you pattern-match directly into named bindings.
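
One caveat worth sketching: split_once(": ") and the find(':') plus trim() version are not strictly equivalent, because the two-character delimiter only matches when the space is present. If the input may omit it, splitting on ':' and trimming both halves is more forgiving. The parse_header helper below is a hypothetical name used for illustration only.

```rust
// Hypothetical helper for illustration: split on ':' alone, then trim both sides.
fn parse_header(line: &str) -> Option<(&str, &str)> {
    let (name, value) = line.split_once(':')?;
    Some((name.trim(), value.trim()))
}

fn main() {
    // The two-character delimiter needs the space to be present...
    assert_eq!("Content-Type:text/html".split_once(": "), None);

    // ...while splitting on ':' and trimming accepts both spellings.
    assert_eq!(
        parse_header("Content-Type:text/html"),
        Some(("Content-Type", "text/html"))
    );
    assert_eq!(
        parse_header("Content-Type:   text/html"),
        Some(("Content-Type", "text/html"))
    );
    assert_eq!(parse_header("no colon here"), None);
}
```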

Handling missing delimiters

split_once returns Option<(&str, &str)>, which is None if the delimiter isn’t found. This makes it composable with ? or if let:

fn parse_env_var(s: &str) -> Option<(&str, &str)> {
    s.split_once('=')
}

assert_eq!(parse_env_var("HOME=/root"), Some(("HOME", "/root")));
assert_eq!(parse_env_var("NOVALUE"), None);
assert_eq!(parse_env_var("KEY=a=b=c"), Some(("KEY", "a=b=c")));

Note the last case: split_once stops at the first =. The rest of the string — a=b=c — is kept intact in the second half. That’s usually exactly what you want.
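
As a rough sketch of the if let style, here is one way to collect KEY=value lines while skipping malformed ones. parse_env is a hypothetical helper, and silently dropping lines without an = is an assumption of this example rather than a recommendation.

```rust
use std::collections::HashMap;

// Hypothetical helper: collect KEY=value lines, skipping lines without '='.
fn parse_env(text: &str) -> HashMap<&str, &str> {
    let mut vars = HashMap::new();
    for line in text.lines() {
        if let Some((key, value)) = line.split_once('=') {
            vars.insert(key, value);
        }
    }
    vars
}

fn main() {
    let vars = parse_env("HOME=/root\nNOVALUE\nKEY=a=b=c");
    assert_eq!(vars.len(), 2);
    assert_eq!(vars.get("HOME"), Some(&"/root"));
    assert_eq!(vars.get("KEY"), Some(&"a=b=c"));
    assert!(!vars.contains_key("NOVALUE"));
}
```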

rsplit_once — split from the right

When you need the last delimiter instead of the first, rsplit_once has you covered:

let path = "/home/martin/projects/rustbites/content/posts/bite-044.md";

let (dir, filename) = path.rsplit_once('/').unwrap();

assert_eq!(dir, "/home/martin/projects/rustbites/content/posts");
assert_eq!(filename, "bite-044.md");
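
The difference between the two directions shows up as soon as the delimiter occurs more than once. A small sketch, where extension is a hypothetical helper written for this example (for real filesystem paths, std::path::Path::extension is usually the better tool):

```rust
// Hypothetical helper: the text after the last '.', if there is one.
fn extension(file_name: &str) -> Option<&str> {
    file_name.rsplit_once('.').map(|(_, ext)| ext)
}

fn main() {
    // Same input, same delimiter, different split point.
    assert_eq!("archive.tar.gz".split_once('.'), Some(("archive", "tar.gz")));
    assert_eq!("archive.tar.gz".rsplit_once('.'), Some(("archive.tar", "gz")));

    assert_eq!(extension("archive.tar.gz"), Some("gz"));
    assert_eq!(extension("Makefile"), None); // no '.' at all
}
```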

Multi-char delimiters work too

The delimiter can be any pattern — a char, a &str, or even a closure:

let record = "alice::42::engineer";

let (name, rest) = record.split_once("::").unwrap();
let (age_str, role) = rest.split_once("::").unwrap();

assert_eq!(name, "alice");
assert_eq!(age_str, "42");
assert_eq!(role, "engineer");
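
The closure form isn’t shown above, so here is a short sketch of it. The input strings are made up for illustration; note that the single character matched by the closure is consumed, just like a char delimiter.

```rust
fn main() {
    // A closure pattern: split at the first ASCII digit (that digit is consumed).
    let (prefix, rest) = "bite044.md"
        .split_once(|c: char| c.is_ascii_digit())
        .unwrap();
    assert_eq!(prefix, "bite");
    assert_eq!(rest, "44.md");

    // A function item works as a pattern too: split at the first whitespace char.
    assert_eq!(
        "key   value".split_once(char::is_whitespace),
        Some(("key", "  value"))
    );
}
```

Only the first whitespace character is removed in the second case, so the remaining padding stays in the right half; follow up with trim_start if that isn’t wanted.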

Whenever you reach for .splitn(2, ...) just to grab two halves, replace it with split_once — the intent is clearer and the return type is more ergonomic.
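
For comparison, a minimal sketch of the two approaches side by side, using the same KEY=a=b=c input as earlier:

```rust
fn main() {
    let s = "KEY=a=b=c";

    // splitn(2, ..) yields an iterator that has to be drained by hand.
    let mut parts = s.splitn(2, '=');
    let via_splitn = match (parts.next(), parts.next()) {
        (Some(key), Some(value)) => Some((key, value)),
        _ => None,
    };

    // split_once returns the same pair directly.
    assert_eq!(via_splitn, s.split_once('='));
    assert_eq!(s.split_once('='), Some(("KEY", "a=b=c")));
}
```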

36. Cow<str> — Clone on Write

Stop cloning strings “just in case” — Cow<str> lets you borrow when you can and clone only when you must.

The problem

You’re writing a function that sometimes needs to modify a string and sometimes doesn’t. The easy fix? Clone every time:

fn ensure_greeting(name: &str) -> String {
    if name.starts_with("Hello") {
        name.to_string() // unnecessary clone!
    } else {
        format!("Hello, {name}!")
    }
}

This works, but that first branch allocates a brand-new String even though name is already perfect as-is. In a hot loop, those wasted allocations add up.

Enter Cow<str>

Cow stands for Clone on Write. It holds either a borrowed reference or an owned value, and only clones when you actually need to mutate or take ownership:

use std::borrow::Cow;

fn ensure_greeting(name: &str) -> Cow<'_, str> {
    if name.starts_with("Hello") {
        Cow::Borrowed(name) // zero-cost: just wraps the reference
    } else {
        Cow::Owned(format!("Hello, {name}!"))
    }
}

Now the happy path (name already starts with “Hello”) does zero allocation. The caller gets a Cow<str> that derefs to &str transparently — most code won’t even notice the difference.

Using Cow values

Because Cow<str> implements Deref<Target = str>, you can use it anywhere a &str is expected:

use std::borrow::Cow;

fn ensure_greeting(name: &str) -> Cow<'_, str> {
    if name.starts_with("Hello") {
        Cow::Borrowed(name)
    } else {
        Cow::Owned(format!("Hello, {name}!"))
    }
}

fn main() {
    let greeting = ensure_greeting("Hello, world!");
    assert_eq!(&*greeting, "Hello, world!");

    // Call &str methods directly on Cow
    assert!(greeting.contains("world"));

    // Only clone into String when you truly need ownership
    let _owned: String = greeting.into_owned();

    let greeting2 = ensure_greeting("Rust");
    assert_eq!(&*greeting2, "Hello, Rust!");
}

When to reach for Cow

Cow shines in these situations:

  • Conditional transformations — functions that modify input only sometimes (normalization, trimming, escaping)
  • Config/lookup values — return a static default or a dynamically built string
  • Parser outputs — most tokens are slices of the input, but some need unescaping

The Cow type works with any ToOwned pair, not just strings. You can use Cow<[u8]>, Cow<Path>, or Cow<[T]> the same way.
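
A minimal sketch of the conditional-transformation case: expand_tabs is a hypothetical helper that only allocates when the input actually contains a tab. Replacing each tab with a single space is an arbitrary choice made for the example.

```rust
use std::borrow::Cow;

// Hypothetical helper: replace tabs with single spaces,
// allocating only if a tab is present.
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', " "))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    // Clean input is handed back as a borrow: no allocation.
    assert!(matches!(expand_tabs("no tabs"), Cow::Borrowed(_)));

    // Input with a tab produces a freshly built String.
    let fixed = expand_tabs("a\tb");
    assert!(matches!(fixed, Cow::Owned(_)));
    assert_eq!(&*fixed, "a b");
}
```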

Quick reference

Operation        | Cost
-----------------|------------------------------------------
Cow::Borrowed(s) | Free (wraps a reference)
Cow::Owned(s)    | Whatever creating the owned value costs
*cow (deref)     | Free
cow.into_owned() | Free if already owned, clones if borrowed
cow.to_mut()     | Clones if borrowed, then gives &mut access
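
To make the to_mut() entry concrete, here is a small sketch. shout is a hypothetical helper that appends a '!' only when one is missing, so the clone happens only on the path that mutates.

```rust
use std::borrow::Cow;

// Hypothetical helper: append '!' only when it is missing.
fn shout(mut text: Cow<'_, str>) -> Cow<'_, str> {
    if !text.ends_with('!') {
        // to_mut() clones a borrowed value into a String, then hands out &mut String.
        text.to_mut().push('!');
    }
    text
}

fn main() {
    // Already ends with '!': the value stays borrowed, nothing is cloned.
    assert!(matches!(shout(Cow::Borrowed("hi!")), Cow::Borrowed(_)));

    // Needs a change: to_mut() switches the Cow to the Owned variant.
    let louder = shout(Cow::Borrowed("hi"));
    assert!(matches!(louder, Cow::Owned(_)));
    assert_eq!(&*louder, "hi!");
}
```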