Strings

#263 Jul 2026

263. String::replace_range — Swap a Span In Place, No Rebuild

Swapping one piece of a String by slicing and re-format!-ing rebuilds the whole thing. replace_range splices the new text straight into the buffer you already own.

The trap

You need to change one segment in the middle of a String. The slice-and-rebuild reflex kicks in:

let mut path = String::from("/api/v1/users");

// new allocation, three copies, old String dropped
path = format!("{}{}{}", &path[..5], "v2", &path[7..]);
assert_eq!(path, "/api/v2/users");

A fresh allocation and three copies to change two bytes. Bite 262’s insert_str covers inserting at a point — but here you want to replace a span.

The fix

String::replace_range takes a byte range and splices the replacement in, in place:

let mut path = String::from("/api/v1/users");

path.replace_range(5..7, "v2");
assert_eq!(path, "/api/v2/users");

The replacement doesn’t have to match the range’s length — the tail shifts to fit:

let mut greet = String::from("Hello, world!");

greet.replace_range(7..12, "rustbites");
assert_eq!(greet, "Hello, rustbites!");

And any range form works. An empty replacement deletes the span — no drain iterator to throw away:

let mut line = String::from("DEBUG: cache miss");

line.replace_range(..7, "");
assert_eq!(line, "cache miss");

Together with insert_str (bite 262) you get the full in-place toolkit: insert at a point, replace a span, delete a range — all without touching the allocator when capacity allows.

One caveat

Same rules as insert_str: the range is in bytes and both ends must land on char boundaries, or it panics — mid-emoji is a runtime error, not a compile error. And because the tail may shift, it’s O(n) per call: fine for a targeted splice, wrong as the workhorse of a text editor’s inner loop.
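
The boundary rule can be checked up front instead of discovered as a panic. A minimal sketch, assuming a hypothetical helper named try_replace_range (not part of std) that refuses the splice when either end is not a char boundary:

```rust
use std::ops::Range;

/// Hypothetical helper (not in std): splice only when the range is valid.
/// Returns false and leaves `s` untouched if replace_range would panic.
fn try_replace_range(s: &mut String, range: Range<usize>, with: &str) -> bool {
    // is_char_boundary returns false for indices past the end,
    // so it also covers the out-of-bounds case.
    let ok = range.start <= range.end
        && s.is_char_boundary(range.start)
        && s.is_char_boundary(range.end);
    if ok {
        s.replace_range(range, with);
    }
    ok
}
```

In "hi 🦀!" the crab occupies bytes 3..7, so a call with 4..7 should return false and leave the string alone, while 3..7 goes through.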

#262 Jul 2026

strings allocation

262. String::insert_str — Prepend or Splice Text Without Rebuilding the String

Adding a prefix with format!("{prefix}{s}") builds a brand-new String and throws the old one away. insert_str splices the text into the buffer you already have.

The trap

You have a String and need to stick something in front of it — a log level, a scheme, a marker. The reflex is format!:

let mut msg = String::from("connection lost");

// allocates a second String, copies both halves,
// drops the original
msg = format!("[ERROR] {msg}");
assert_eq!(msg, "[ERROR] connection lost");

That’s a full new allocation and two copies just to add eight bytes at the front. push_str only helps at the end — there’s no push_front for strings.

The fix

String::insert_str shifts the existing bytes over and copies the new text in, reusing the allocation when capacity allows:

let mut msg = String::from("connection lost");

msg.insert_str(0, "[ERROR] ");
assert_eq!(msg, "[ERROR] connection lost");

And it’s not just for prepending — the index can be anywhere, which makes splicing into the middle a one-liner:

let mut name = String::from("report_final.txt");

if let Some(dot) = name.rfind('.') {
    name.insert_str(dot, "_v2");
}
assert_eq!(name, "report_final_v2.txt");

For a single character there’s the sibling String::insert(idx, char).

One caveat

The index is a byte offset and must land on a char boundary — mid-emoji it panics, same rule as slicing. And since the tail gets shifted each call, insert_str is O(n): perfect for the occasional splice, wrong for building a string front-to-back in a loop. If you’re prepending repeatedly, collect the pieces and join them once instead.
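
For the repeated-prepend case, one way to follow that advice is to gather the pieces as &str slices and concatenate once. A sketch, assuming a hypothetical prepend_all helper; the name and signature are illustrative only:

```rust
/// Hypothetical helper: put every prefix in front of `body` in one pass,
/// instead of shifting the tail once per insert_str(0, ..) call.
fn prepend_all(prefixes: &[&str], body: &str) -> String {
    let mut pieces: Vec<&str> = Vec::with_capacity(prefixes.len() + 1);
    pieces.extend_from_slice(prefixes);
    pieces.push(body);
    // concat copies each piece once into the resulting String.
    pieces.concat()
}
```

The prefixes come out in the order given, so the first slice ends up at the very front.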

#261 Jul 2026

strings performance

261. String::retain — Delete Characters In Place, No New Allocation

Stripping characters with replace("-", "") builds a brand-new String just to throw characters away. retain deletes them in the buffer you already own.

The trap

This morning’s bite (260) covered replacen — but both replace and replacen always return a fresh String, even when the “replacement” is deleting. Same story with the iterator route:

let phone = String::from("+49 (0)30 901820");

// both of these allocate a whole new String
let a = phone.replace(|c: char| !c.is_ascii_digit(), "");
let b: String = phone.chars().filter(char::is_ascii_digit).collect();
assert_eq!(a, "49030901820");
assert_eq!(b, a);

If you’re cleaning strings in a loop, that’s one allocation per string, per pass — for data you already had in a perfectly good buffer.

The fix

String::retain keeps every char the closure approves and shifts the rest out, in place, in one O(n) pass:

let mut phone = String::from("+49 (0)30 901820");
let cap = phone.capacity();

phone.retain(|c| c.is_ascii_digit());

assert_eq!(phone, "49030901820");
assert_eq!(phone.capacity(), cap); // same buffer

No new allocation, and the capacity stays put — ready for push_str later. It reads as intent, too: “keep digits” instead of “replace non-digits with nothing”.

One caveat

The closure sees chars in order, exactly once, and retain keeps what returns true — it’s a keep-list, not a kill-list. To delete matches, negate:

let mut s = String::from("no_more_underscores");
s.retain(|c| c != '_');
assert_eq!(s, "nomoreunderscores");
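
Because the closure sees each char in order, exactly once, it can carry state from one call to the next. A sketch, assuming a hypothetical collapse_spaces helper that keeps only the first space of every run:

```rust
/// Hypothetical helper: squeeze runs of ' ' down to a single space, in place.
fn collapse_spaces(s: &mut String) {
    // Remembers whether the previous char was a space.
    let mut prev_space = false;
    s.retain(|c| {
        let keep = !(c == ' ' && prev_space);
        prev_space = c == ' ';
        keep
    });
}
```

Only the plain ASCII space is handled here; tabs and other whitespace pass through unchanged.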

Vec<T> and VecDeque<T> have the same method, so the pattern transfers. When you need to substitute text, replace/replacen earn their allocation — but when you’re only deleting, retain does it where the string already lives.

#260 Jul 2026

strings

260. str::replacen — Replace the First N Matches, Not Every Single One

replace is all-or-nothing: it rewrites every occurrence, whether you wanted that or not. replacen lets you say how many.

The trap

str::replace has no off switch — it replaces every match in the string. The moment your pattern appears somewhere you didn’t expect, it happily rewrites that too:

let line = "user=admin role=admin";

// demote the user, keep the role... oops
assert_eq!(
    line.replace("admin", "guest"),
    "user=guest role=guest"
);

The usual workaround is find + manual slicing — index math, an allocation, and an edge case when the pattern is missing.

The fix

replacen takes a third argument: the maximum number of replacements, counted from the left. Everything after the Nth match is left alone:

assert_eq!(
    line.replacen("admin", "guest", 1),
    "user=guest role=admin"
);

Fewer matches than n is fine — it just replaces what’s there. And like replace, the pattern can be a char, a &str, or a closure over char:

let csv = "a-b-c-d";
assert_eq!(csv.replacen('-', "+", 2), "a+b+c-d");
assert_eq!(csv.replacen('-', "+", 0), "a-b-c-d"); // n = 0: copy, untouched

One caveat

Counting is strictly left-to-right — there’s no rreplacen for “just the last one”. For that, reach for rfind and slice, or rsplit_once if you’re splitting anyway.
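
The rfind route for “just the last one” could look like the sketch below. It assumes a hypothetical replace_last helper, and it splices with replace_range (bite 263) rather than slicing by hand:

```rust
/// Hypothetical helper: replace only the last occurrence of `from`.
/// Returns an unchanged copy when `from` does not occur.
fn replace_last(s: &str, from: &str, to: &str) -> String {
    let mut out = s.to_owned();
    if let Some(i) = s.rfind(from) {
        // rfind returns the byte offset where the last match starts.
        out.replace_range(i..i + from.len(), to);
    }
    out
}
```

Like replacen, it returns a fresh String and leaves the input untouched.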

Both replace and replacen return a fresh String and leave the original untouched. If you only need to check the first match, find is cheaper — but when you need “replace the first one and stop”, replacen(pat, to, 1) says exactly that.

#259 Jul 2026

strings parsing

259. str::lines — Split Into Lines Without Dragging \r Along

split('\n') works fine — until a Windows-saved file hands you lines that all end in an invisible \r. lines() was built for exactly this.

The trap

Same story as bite 258: split takes your separator literally. A file saved on Windows uses \r\n line endings, so every “line” keeps a carriage return — and the trailing newline produces a bonus empty string:

let text = "alpha\r\nbeta\r\ngamma\r\n";

let naive: Vec<&str> = text.split('\n').collect();
assert_eq!(
    naive,
    ["alpha\r", "beta\r", "gamma\r", ""]
);

That stray \r is invisible in most debug output, so it surfaces as "gamma" != "gamma" mysteries: failed comparisons, HashMap misses, parse errors on the last field.

The fix

lines() splits on \n and strips a trailing \r if one is there — so the same code handles Unix and Windows files:

let lines: Vec<&str> = text.lines().collect();
assert_eq!(lines, ["alpha", "beta", "gamma"]);

No trailing empty string either — like split_terminator (bite 233), the final newline is treated as a terminator, not a separator.

One caveat

Only \r\n and \n count as line endings. A lone \r (classic Mac OS, some protocol payloads) does not split:

let old_mac = "alpha\rbeta";
let lines: Vec<&str> = old_mac.lines().collect();
assert_eq!(lines, ["alpha\rbeta"]);

And a \r in the middle of a line stays untouched — only one directly before the \n is stripped.

Reading a file? BufRead::lines() applies the same line-ending rules and yields each line as an io::Result<String>. Either way: for “give me the lines”, it’s lines() every time. Save split('\n') for when you truly mean raw bytes-between-newlines.
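
A short sketch of the BufRead route, assuming a hypothetical read_lines helper that collects every line and stops at the first I/O error:

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: collect all lines from any buffered reader.
/// Each item from lines() is an io::Result<String>, so collecting into
/// io::Result<Vec<String>> returns the first error, if any.
fn read_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    reader.lines().collect()
}
```

A byte slice implements BufRead, which makes this easy to try without a file: read_lines("alpha\r\nbeta\n".as_bytes()) should yield "alpha" and "beta" with no \r attached.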

#258 Jul 2026

strings parsing

258. split_whitespace — Split on Runs, Not on Every Single Space

split(' ') hands you empty strings for every doubled space — and silently ignores tabs. split_whitespace is what you actually meant.

The trap

User input is messy: leading spaces, double spaces, a stray tab. split(' ') takes all of that literally:

let line = "  alpha\tbeta   gamma ";

let naive: Vec<&str> = line.split(' ').collect();
assert_eq!(
    naive,
    ["", "", "alpha\tbeta", "", "", "gamma", ""]
);

Two bugs in one line: every consecutive-space pair produces an empty string, and "alpha\tbeta" sails through as a single “word” because a tab isn’t a space.

The fix

split_whitespace splits on runs of any whitespace and never yields empty strings:

let words: Vec<&str> = line.split_whitespace().collect();
assert_eq!(words, ["alpha", "beta", "gamma"]);

Leading and trailing whitespace disappear too — no trim() needed first.

Unicode-aware, with an ASCII fast path

“Whitespace” here means the Unicode White_Space property, so a non-breaking space (\u{00A0}) splits words just like a regular one:

let fancy = "alpha\u{00A0}beta";
let words: Vec<&str> = fancy.split_whitespace().collect();
assert_eq!(words, ["alpha", "beta"]);

If your input is guaranteed ASCII (log files, protocol lines), split_ascii_whitespace does the same thing with a cheaper per-byte check — same no-empty-strings guarantee:

let words: Vec<&str> =
    " 42  7\t9 ".split_ascii_whitespace().collect();
assert_eq!(words, ["42", "7", "9"]);

Keep split(' ') for formats where empty fields are meaningful (CSV-like, fixed positions). For “give me the words”, it’s split_whitespace every time.
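
Since no empty strings come out, the words can go straight into a parser. A sketch, assuming a hypothetical parse_numbers helper for a line of whitespace-separated integers:

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parse every whitespace-separated token as an i32.
/// The first token that fails to parse becomes the returned error.
fn parse_numbers(line: &str) -> Result<Vec<i32>, ParseIntError> {
    line.split_whitespace().map(str::parse::<i32>).collect()
}
```

With split(' ') the same code would trip over the empty strings between doubled spaces, because "" does not parse as a number.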

#233 Jul 2026

strings iterators parsing

233. str::split_terminator — Split Without the Trailing Empty String

"a.b.c.".split('.') hands you a phantom empty string at the end. split_terminator reads the separator as ending each piece, not delimiting it — so the trailing one just vanishes.

Split counts the region after the final separator, even when it’s empty. Feed it text that ends with the separator and you get a ghost element:

let parts: Vec<&str> = "a.b.c.".split('.').collect();
assert_eq!(parts, ["a", "b", "c", ""]);

That trailing "" is a landmine downstream: an empty record, a blank line, a zero-length field you didn’t ask for.

split_terminator treats the separator as terminating each piece, so a trailing separator produces no empty tail:

let parts: Vec<&str> = "a.b.c.".split_terminator('.').collect();
assert_eq!(parts, ["a", "b", "c"]);

It only drops the empty piece caused by a trailing separator — interior empties are still real data and still show up:

let parts: Vec<&str> = "a..b.".split_terminator('.').collect();
assert_eq!(parts, ["a", "", "b"]);

And with no trailing separator it behaves exactly like split:

let parts: Vec<&str> = "a.b.c".split_terminator('.').collect();
assert_eq!(parts, ["a", "b", "c"]);

Reach for it when parsing records that end with their delimiter — newline-terminated logs, ;-terminated statements, trailing-comma lists — and skip the .filter(|s| !s.is_empty()) cleanup.
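
For the ;-terminated case, a sketch assuming a hypothetical statements helper (the input format is invented for illustration):

```rust
/// Hypothetical helper: split a script into statements that each end in ';'.
/// The final ';' produces no empty tail; each piece is trimmed of the
/// whitespace that followed the previous separator.
fn statements(script: &str) -> Vec<&str> {
    script.split_terminator(';').map(str::trim).collect()
}
```

Note that whitespace after the last ';' still counts as a piece of its own, so input with a trailing newline would need a trim() first.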

#227 Jun 2026

strings idioms

227. trim_matches — Strip the Same Char Off Both Ends, However Many There Are

trim() only knows about whitespace, and strip_prefix peels off one occurrence. When you need to shave every leading and trailing . (or 0, or quote) off a string, reach for trim_matches.

The hand-rolled trim

You’ve got a string padded with some character and want it gone from both ends — but only the ends:

let s = "***heading***";

// strip_prefix only removes one, and only the front
let once = s.strip_prefix('*').unwrap_or(s);
assert_eq!(once, "**heading***");

Looping strip_prefix/strip_suffix until they stop matching works, but it’s a chore. trim_matches does exactly that for you — it removes all consecutive matches from both ends and leaves the middle alone:

let s = "***heading***";
assert_eq!(s.trim_matches('*'), "heading");

// only the ends — interior matches stay put
assert_eq!("0x00ff00".trim_matches('0'), "x00ff");

One end at a time

There are directional versions when you only care about one side:

assert_eq!("--verbose".trim_start_matches("--"), "verbose");
assert_eq!("file.txt.bak".trim_end_matches(".bak"), "file.txt");

The pattern can be a closure or a set of chars

The argument is a Pattern, so you’re not limited to a single char. Pass a closure to trim by predicate, or an array of chars to trim any of them:

// strip leading digits
assert_eq!("12abc34".trim_start_matches(|c: char| c.is_numeric()), "abc34");

// trim any of several characters
assert_eq!("(value)".trim_matches(['(', ')']), "value");

Note that trim_start_matches("--") strips the whole substring "--" repeatedly, not a set of chars: "---x" becomes "-x". That’s the difference between passing "--" and passing ['-'], which would strip every leading dash.

#224 Jun 2026

strings allocation cow

224. String::from_utf8_lossy — Returns a Cow, So Valid Bytes Cost Zero

from_utf8_lossy doesn’t always allocate. It hands back a Cow<str> that borrows your bytes when they’re already valid UTF-8 — you only pay for a String when there’s an invalid byte to replace.

The assumption that costs allocations

It’s easy to read this and assume every call builds a fresh String:

let text = String::from_utf8_lossy(bytes);

It doesn’t. The return type is Cow<'_, str> — clone-on-write. If bytes is valid UTF-8 (the common case for most files, headers, and protocol fields), you get back a Cow::Borrowed that points straight at your slice. No copy, no heap allocation.

use std::borrow::Cow;

fn read_name(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

let valid = b"hello world";
let s = read_name(valid);
assert!(matches!(s, Cow::Borrowed(_))); // borrowed — nothing allocated
assert_eq!(s, "hello world");

You only pay on the rare path

The allocation happens only when there’s an invalid byte to swap for the replacement character U+FFFD. Then — and only then — it builds an owned String:

let invalid = &[b'c', b'a', b'f', 0xFF];
let s2 = read_name(invalid);
assert!(matches!(s2, Cow::Owned(_))); // owned — had to fix a bad byte
assert_eq!(s2, "caf\u{fffd}");

So the cost scales with how messy your input is, not with how often you call it.

Don’t undo it with a reflexive .to_string()

The anti-pattern is forcing an allocation right back on:

let owned = String::from_utf8_lossy(bytes).to_string(); // ⚠️ always allocates

Keep the Cow for as long as you’re only reading. If a caller genuinely needs ownership, into_owned() allocates on the borrowed path but reuses the buffer on the owned path — no double allocation:

let owned: String = read_name(b"abc").into_owned();
assert_eq!(owned, "abc");

When you’re decoding bytes you’ll mostly just inspect, let from_utf8_lossy stay a Cow. Valid input — the usual case — flows through without touching the heap.
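
The variant also doubles as a cheap “was anything replaced?” signal. A sketch, assuming a hypothetical decode helper that returns the text together with that flag:

```rust
use std::borrow::Cow;

/// Hypothetical helper: decode bytes and report whether any invalid
/// sequence had to be swapped for U+FFFD.
fn decode(bytes: &[u8]) -> (Cow<'_, str>, bool) {
    let text = String::from_utf8_lossy(bytes);
    // Owned means from_utf8_lossy had to build a repaired String.
    let was_lossy = matches!(text, Cow::Owned(_));
    (text, was_lossy)
}
```

A caller could log or count the lossy cases while still passing the Cow along unchanged.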

#218 Jun 2026

strings iterators

218. str::match_indices — Find Every Match and Its Position in One Pass

Hand-rolling a find loop to locate every occurrence of a substring means juggling a running offset and remembering to skip past each match. match_indices hands you each hit and its byte position as an iterator — no bookkeeping.

The classic way to collect every position of a needle is a loop over find, slicing the remainder each time and adding the offset back by hand:

let log = "ERROR x ERROR y WARN z ERROR";

let mut positions = Vec::new();
let mut start = 0;
while let Some(i) = log[start..].find("ERROR") {
    let idx = start + i;
    positions.push(idx);
    start = idx + "ERROR".len(); // easy to get this advance wrong
}

assert_eq!(positions, vec![0, 8, 23]);

It works, but the offset arithmetic is exactly the kind of thing you fumble at 5pm. match_indices walks the string for you and yields (byte_index, matched_str) pairs:

let log = "ERROR x ERROR y WARN z ERROR";

let positions: Vec<(usize, &str)> = log.match_indices("ERROR").collect();

assert_eq!(positions, vec![(0, "ERROR"), (8, "ERROR"), (23, "ERROR")]);

It’s lazy and composes like any other iterator, so you can map down to just the indices. And when you only need a count, the sibling matches iterator skips the positions and counts without allocating at all:

let log = "ERROR x ERROR y WARN z ERROR";

let count = log.matches("ERROR").count();
assert_eq!(count, 3);
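
Mapping down to just the indices might look like this sketch, which assumes a hypothetical positions helper standing in for the hand-rolled find loop:

```rust
/// Hypothetical helper: byte offsets of every non-overlapping match.
fn positions(haystack: &str, needle: &str) -> Vec<usize> {
    haystack
        .match_indices(needle)
        .map(|(i, _)| i) // keep the offset, drop the matched slice
        .collect()
}
```

For the log line above it should return the same three offsets as the manual loop.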

The pattern argument is the full Pattern family (a &str, a char, a closure, or a slice of chars), so you can locate every character of a given class without a regex. With a char predicate, each matching char is reported on its own, not as a run:

let s = "a1b22c333";

let digits: Vec<(usize, &str)> = s.match_indices(char::is_numeric).collect();
assert_eq!(digits, vec![(1, "1"), (3, "2"), (4, "2"), (6, "3"), (7, "3"), (8, "3")]);

Matches are non-overlapping and scanned left to right; when you need the last hit first, rmatch_indices walks right to left:

let s = "aXbXc";

let last = s.rmatch_indices('X').next();
assert_eq!(last, Some((3, "X")));

The byte indices are real &str offsets, so they slot straight into slicing and replacement logic without any conversion. When you only care about whether something matches, reach for contains; when you want each location, match_indices already did the bookkeeping.

#217 Jun 2026

strings performance

217. eq_ignore_ascii_case — Case-Insensitive Compare Without the Allocation

a.to_lowercase() == b.to_lowercase() allocates two fresh Strings just to throw them away after the comparison. For ASCII text — headers, extensions, keywords — eq_ignore_ascii_case does it byte-by-byte with zero allocations.

The reflexive way to compare strings case-insensitively is to lowercase both sides and check for equality:

let header = "Content-Type";
let wanted = "content-type";

// Two heap allocations, then immediately dropped
assert!(header.to_lowercase() == wanted.to_lowercase());

That’s two Strings built on the heap for a question that’s really just “are these equal if we ignore case?” The standard library answers it directly, walking both byte sequences and folding A–Z against a–z as it goes — no allocation, and it bails early on the first mismatch:

let header = "Content-Type";

assert!(header.eq_ignore_ascii_case("content-type"));

It’s defined on [u8] too, which is handy when you’re matching protocol tokens straight out of a buffer without first validating UTF-8:

assert!(b"GET".eq_ignore_ascii_case(b"get"));

The one thing to know: it folds only ASCII letters. Non-ASCII bytes must match exactly, so it won’t treat Ä and ä as equal, and it won’t expand ß:

assert!(!"Ä".eq_ignore_ascii_case("ä"));
assert!(!"Straße".eq_ignore_ascii_case("STRASSE"));

For human-facing, multilingual text you still want full Unicode case folding. But the things you actually compare case-insensitively in systems code (HTTP methods and header names, file extensions, config keys, command names) are ASCII in practice, and for those eq_ignore_ascii_case is both faster and clearer:

fn is_jpeg(ext: &str) -> bool {
    ext.eq_ignore_ascii_case("jpg") || ext.eq_ignore_ascii_case("jpeg")
}

assert!(is_jpeg("JPG"));
assert!(is_jpeg("Jpeg"));
assert!(!is_jpeg("png"));

If you need to fold case in place rather than compare, make_ascii_lowercase mutates a &mut str or &mut [u8] without allocating either — same ASCII-only rule applies.
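
The in-place variant, sketched with a hypothetical normalize_key helper:

```rust
/// Hypothetical helper: lowercase the ASCII letters of a key in place.
/// Non-ASCII characters are left exactly as they were.
fn normalize_key(key: &mut str) {
    key.make_ascii_lowercase();
}
```

Taking &mut str lets callers pass a &mut String as well, since String derefs to str.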

#205 Jun 2026

strings std

205. strip_prefix / strip_suffix — Remove a Prefix Once, Not Every Repeat

Reaching for trim_start_matches to peel off a "--" or a leading slash? It strips every repeated match and silently does nothing when there’s no match. strip_prefix removes exactly one and tells you whether it hit.

The Problem

trim_start_matches keeps eating as long as the pattern matches, which is rarely what “remove the prefix” means:

let path = "/////etc";

// Strips ALL leading slashes — not just one
assert_eq!(path.trim_start_matches('/'), "etc");

// And with a repeated substring it does the same
assert_eq!("foofoobar".trim_start_matches("foo"), "bar");

It also can’t tell you whether anything was removed — a non-match returns the string unchanged, so you can’t branch on it.

The Fix: `strip_prefix`

strip_prefix removes one occurrence and returns an Option: Some(rest) on a hit, None when the prefix isn’t there.

let path = "/////etc";

assert_eq!(path.strip_prefix('/'), Some("////etc")); // exactly one
assert_eq!("foofoobar".strip_prefix("foo"), Some("foobar"));
assert_eq!("foobar".strip_prefix("xyz"), None);       // no match, told you so

The Option is the real win: it doubles as a “did this start with the prefix?” test. Parsing a CLI flag becomes a one-liner:

fn flag_value(arg: &str) -> Option<&str> {
    arg.strip_prefix("--")
}

assert_eq!(flag_value("--verbose"), Some("verbose"));
assert_eq!(flag_value("positional"), None);

Its Mirror: `strip_suffix`

Same deal at the other end — perfect for trimming a known extension or unit without slicing indices by hand:

fn without_ext(name: &str) -> &str {
    name.strip_suffix(".rs").unwrap_or(name)
}

assert_eq!(without_ext("main.rs"), "main");
assert_eq!(without_ext("README"), "README"); // unchanged, no panic

Use trim_start_matches / trim_end_matches only when you genuinely want to collapse a run of repeats. For peeling one known prefix or suffix — and knowing if it was there — strip_prefix and strip_suffix say exactly what you mean.
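
Because strip_prefix returns an Option, it chains with ? and with other Option-returning methods. A sketch, assuming a hypothetical parse_flag helper for arguments shaped like --key or --key=value:

```rust
/// Hypothetical helper: split "--key=value" into ("key", Some("value")),
/// "--key" into ("key", None), and reject anything without the "--" prefix.
fn parse_flag(arg: &str) -> Option<(&str, Option<&str>)> {
    // Exactly one "--" is removed; no prefix means no flag.
    let rest = arg.strip_prefix("--")?;
    Some(match rest.split_once('=') {
        Some((key, value)) => (key, Some(value)),
        None => (rest, None),
    })
}
```

The flag syntax here is invented for the example; a real CLI would normally go through an argument-parsing crate.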

#194 Jun 2026

performance allocation strings

194. Reuse One Buffer with .clear() — Allocate Once, Loop Many Times

with_capacity (bite 193) buys a buffer once instead of growing it repeatedly. But if you allocate a fresh String or Vec inside a loop, you throw that buffer away every iteration. .clear() resets the length to zero while keeping the capacity — so one allocation serves the whole loop.

A fresh allocation every iteration

It’s easy to declare the working buffer inside the loop. Each pass allocates a new heap buffer and drops it at the end of the iteration:

let lines = ["alpha", "beta", "gamma"];
let mut out = Vec::new();

for line in lines {
    let mut buf = String::new();   // fresh buffer; the push_str below allocates, every iteration
    buf.push_str(line);
    buf.make_ascii_uppercase();
    out.push(buf.clone());
}

assert_eq!(out, ["ALPHA", "BETA", "GAMMA"]);

Three iterations, three allocate-then-free cycles for the scratch buffer. Scale that to a million lines and it’s a million wasted allocations.

`.clear()` keeps the capacity

Hoist the buffer out of the loop and clear() it at the top of each pass. clear() sets the length to 0 but leaves the allocated capacity in place, so the buffer only reallocates when a line is longer than any it has held before:

let lines = ["alpha", "beta", "gamma"];
let mut out = Vec::new();
let mut buf = String::new();       // one buffer, reused by every iteration

for line in lines {
    buf.clear();                   // len -> 0, capacity untouched
    buf.push_str(line);
    buf.make_ascii_uppercase();
    out.push(buf.clone());
}

assert_eq!(out, ["ALPHA", "BETA", "GAMMA"]);

The contract is the whole point — clear drops the contents but not the buffer:

let mut s = String::with_capacity(64);
s.push_str("hello");
let cap = s.capacity();

s.clear();
assert_eq!(s.len(), 0);            // empty again
assert_eq!(s.capacity(), cap);     // ...but the buffer is still there

The read-into-a-reused-buffer pattern

This shows up constantly when reading input. BufRead::read_line appends to the buffer you give it, so the idiomatic loop clears one String each pass instead of allocating a new one per line:

use std::io::BufRead;

let input = "12\n34\n56\n";
let mut reader = std::io::BufReader::new(input.as_bytes());

let mut line = String::new();      // one buffer for every line
let mut sum = 0i64;

loop {
    line.clear();                  // required — read_line appends
    let n = reader.read_line(&mut line).unwrap();
    if n == 0 {
        break;                     // 0 bytes read == EOF
    }
    sum += line.trim().parse::<i64>().unwrap();
}

assert_eq!(sum, 102);

The same trick works for any scratch Vec — clear() it at the top of the loop and reuse the capacity for the next batch:

let mut scratch: Vec<u8> = Vec::new();
let mut total = 0;

for chunk in [&[1u8, 2, 3][..], &[4, 5], &[6]] {
    scratch.clear();
    scratch.extend_from_slice(chunk);
    total += scratch.iter().map(|&b| b as u32).sum::<u32>();
}

assert_eq!(total, 21);

Reach for a fresh Vec/String only when you actually need to keep each result. When the buffer is just scratch space, allocate it once, clear() it, and let the loop run free.

#190 Jun 2026

cow strings performance

190. Return Cow<str> — Allocate Only When You Actually Change Something

An escaping or normalizing function usually has nothing to do — the input is already clean. Returning String forces an allocation anyway. Return Cow<str> and the common path stays a borrow.

The wasteful version

A function that escapes HTML returns String, so every caller pays for an allocation — even the overwhelming majority whose input contains nothing to escape:

fn escape_html(input: &str) -> String {
    input
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

"hello world" has no special characters, yet replace still walks the string three times and hands back a fresh String. In a template renderer or a parser running this over thousands of fields, that’s thousands of pointless heap allocations.

Borrow on the fast path

Cow<str> lets one return type be either a borrow or an owned String. Check first; only allocate when there’s real work:

use std::borrow::Cow;

fn escape_html(input: &str) -> Cow<'_, str> {
    // Fast path: nothing to escape, hand back the original borrow.
    if !input.contains(['&', '<', '>']) {
        return Cow::Borrowed(input);
    }

    // Slow path: build the escaped String exactly once.
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(c),
        }
    }
    Cow::Owned(out)
}

The clean input never touches the heap; the dirty input builds one String instead of three:

let clean = escape_html("hello world");
assert!(matches!(clean, Cow::Borrowed(_))); // zero allocation

let dirty = escape_html("a < b & c");
assert!(matches!(dirty, Cow::Owned(_)));
assert_eq!(dirty, "a &lt; b &amp; c");

Callers don’t notice

Cow<str> derefs to &str, so anything that reads the result just works — no .unwrap(), no matching:

fn render(field: &str) -> usize {
    escape_html(field).len() // Cow derefs to &str
}

assert_eq!(render("plain"), 5);

And when a caller genuinely needs ownership, .into_owned() allocates only if it’s still borrowed:

let owned: String = escape_html("safe").into_owned();
assert_eq!(owned, "safe");

The rule: any function that might return its input unchanged — escaping, trimming, normalizing, path canonicalization — should return Cow<str>, not String. The signature tells the caller “I’ll borrow when I can,” and the body only reaches for the heap on the path that earns it.
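
The same shape carries over to a normalizing function. A sketch, assuming a hypothetical normalize_newlines helper that converts \r\n to \n and borrows when there is nothing to convert:

```rust
use std::borrow::Cow;

/// Hypothetical helper: turn Windows line endings into "\n".
/// Input without "\r\n" is returned as a borrow, with no allocation.
fn normalize_newlines(input: &str) -> Cow<'_, str> {
    if input.contains("\r\n") {
        Cow::Owned(input.replace("\r\n", "\n"))
    } else {
        Cow::Borrowed(input)
    }
}
```

Here a single replace is enough on the slow path, so there is no need for the manual char loop.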

#189 Jun 2026

strings iterators utf-8

189. str::char_indices — Slice a String Without Panicking on Non-ASCII

chars().enumerate() hands you a character count, but &s[..] wants a byte offset. Mix them up and one accented letter blows your program apart.

Say you want everything from the underscore onward. The enumerate version looks right and works fine in tests full of ASCII:

let s = "café_table"; // 'é' is two bytes in UTF-8

let idx = s
    .chars()
    .enumerate()
    .find(|(_, c)| *c == '_')
    .map(|(i, _)| i)
    .unwrap();

let rest = &s[idx..]; // idx == 4 (char count), but '_' starts at byte 5

idx is 4, the character position. Byte 4 lands in the middle of é, so the slice panics: byte index 4 is not a char boundary.

char_indices yields the real byte offset of each character, which is exactly what slicing expects:

let idx = s
    .char_indices()
    .find(|(_, c)| *c == '_')
    .map(|(i, _)| i)
    .unwrap();

assert_eq!(idx, 5);
assert_eq!(&s[idx..], "_table"); // no panic, correct slice

Each item is (byte_offset, char) instead of enumerate’s (count, char). It’s also a DoubleEndedIterator, so next_back gives you the last character and where it begins:

let (last_off, last_ch) = s.char_indices().next_back().unwrap();
assert_eq!((last_off, last_ch), (10, 'e'));

Rule of thumb: the moment a character index touches &s[..], .split_at(), or any byte-indexed API, reach for char_indices — not enumerate.
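
A common place this comes up is cutting a string to a maximum number of characters. A sketch, assuming a hypothetical truncate_chars helper:

```rust
/// Hypothetical helper: keep at most `max_chars` characters of `s`.
/// The cut point is a byte offset taken from char_indices, so the
/// slice always lands on a char boundary.
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        // Byte offset of the first character that should be dropped.
        Some((byte_idx, _)) => &s[..byte_idx],
        // Fewer than max_chars + 1 characters: nothing to cut.
        None => s,
    }
}
```

This counts chars, not grapheme clusters, so a combining accent can still be separated from its base letter.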

#187 Jun 2026

strings fmt std

187. fmt::Write — Stop Allocating a Temp String Just to Append It

out.push_str(&format!("{name}: {score}")) builds a brand-new String, copies it into out, then throws it away — every single iteration. One use std::fmt::Write; and write! formats straight into your buffer instead.

The double-allocation habit

This pattern is everywhere, and it allocates a temporary String per call just to immediately copy and drop it:

let scores = [("ferris", 100), ("hermit", 42)];

let mut out = String::new();
for (name, score) in scores {
    out.push_str(&format!("{name}: {score}\n")); // temp String, copy, drop
}
assert_eq!(out, "ferris: 100\nhermit: 42\n");

Clippy even has a lint for it: format_push_string.

`write!` into the `String` directly

String implements std::fmt::Write, so the same write!/writeln! macros you use in Display impls work on it. The formatted output lands directly in the existing buffer — no intermediate allocation:

use std::fmt::Write; // bring the trait into scope

let scores = [("ferris", 100), ("hermit", 42)];

let mut out = String::new();
for (name, score) in scores {
    writeln!(out, "{name}: {score}").unwrap();
}
assert_eq!(out, "ferris: 100\nhermit: 42\n");

The .unwrap() looks scary but isn’t: write! returns fmt::Result because the trait allows failure, yet String’s own write_str never returns an error; it just grows the buffer. The only way to see an Err here is a Display impl that returns one itself. let _ = writeln!(...) works too if you prefer.

Why it matters

The format! version allocates N temporary strings for N iterations. The write! version allocates only when out needs to grow — amortized, that’s a handful of reallocations total. In hot loops building large strings (reports, codegen, SQL), the difference shows up in profiles.

One gotcha: std::fmt::Write is for UTF-8 sinks (String); std::io::Write is for byte sinks (files, stdout). Same macro, different trait. If write!(out, ...) complains about no method named write_fmt, the right trait isn’t in scope: you imported the wrong one, or none at all.
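
Because write! only needs the trait, the same loop can target any fmt::Write sink and pass errors up with ? instead of unwrapping. A sketch, assuming a hypothetical write_scores function:

```rust
use std::fmt::{self, Write};

/// Hypothetical helper: format each (name, score) pair on its own line
/// into any fmt::Write sink, propagating a formatting error if one occurs.
fn write_scores<W: Write>(out: &mut W, scores: &[(&str, u32)]) -> fmt::Result {
    for (name, score) in scores {
        writeln!(out, "{name}: {score}")?;
    }
    Ok(())
}
```

Called with a &mut String, it appends to whatever the buffer already holds.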

#163 May 2026

cow smart-pointers strings performance

163. Cow::to_mut — Lazy In-Place Mutation Through Cow

Cow<str> is the type everyone reaches for when a function might need to modify its input. Cow::Borrowed and Cow::Owned are the constructors that get the spotlight; to_mut is the third piece, and it’s the one that actually pays off the laziness.

What `to_mut` does

to_mut takes &mut Cow<str> and hands back &mut String:

If the Cow is already Owned, you get a direct &mut to the inner String.
If it’s Borrowed, to_mut clones the slice into a fresh String, swaps the Cow over to Owned, and then hands you the mutable reference.

That asymmetry is the whole point. Many callers borrow and never touch to_mut — they never allocate. The ones that do call it pay the allocation cost exactly once, on first write.
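
The flip from Borrowed to Owned can be observed directly. A minimal sketch, with ensure_bang as a hypothetical helper invented for this illustration:

```rust
use std::borrow::Cow;

/// Hypothetical helper: appends '!' only when the text does not already end with one.
fn ensure_bang(s: &str) -> Cow<'_, str> {
    let mut out = Cow::Borrowed(s);
    if !s.ends_with('!') {
        // First write: to_mut clones the slice into a String and
        // switches the Cow to Owned before handing out &mut String.
        out.to_mut().push('!');
    }
    out
}
```

Calling it with "hi!" should leave the result as Cow::Borrowed, while "hi" comes back as Cow::Owned holding "hi!".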

A walking-the-string example

Expand \t into two spaces, but only allocate if the input actually contains a tab:

use std::borrow::Cow;

fn expand_tabs(s: &str) -> Cow<'_, str> {
    let mut out: Cow<'_, str> = Cow::Borrowed(s);
    if let Some(i) = s.find('\t') {
        // First write — `to_mut` clones the slice into a String, then we
        // rebuild from byte `i` onwards.
        let buf = out.to_mut();
        buf.truncate(i);
        for c in s[i..].chars() {
            if c == '\t' {
                buf.push_str("  ");
            } else {
                buf.push(c);
            }
        }
    }
    out
}

The happy path — input has no tab — never enters the if, never allocates, and returns the original slice wrapped in Cow::Borrowed. The unhappy path allocates exactly once.

Composing transformations

to_mut really earns its keep when you chain several optional mutations. The first one that fires flips the Cow to Owned; every following mutation sees an already-owned buffer and reuses it:

use std::borrow::Cow;

fn apply_rules<'a>(s: &'a str, rules: &[(char, &str)]) -> Cow<'a, str> {
    let mut out: Cow<'a, str> = Cow::Borrowed(s);
    for &(from, to) in rules {
        if out.contains(from) {
            let replaced = out.replace(from, to);
            *out.to_mut() = replaced;
        }
    }
    out
}

Three things worth pointing at. First, out.contains(from) works because Cow<str> derefs to str. Second, the assignment *out.to_mut() = replaced replaces the inner String, not the Cow itself. Third, once the first rule fires, all subsequent to_mut calls are a no-op &mut String — no extra clones.
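
One detail worth weighing in that listing: when the first rule fires, to_mut clones the borrowed slice and the assignment then immediately overwrites that clone, while str::replace allocates its own String either way. A possible variant, sketched here under the hypothetical name apply_rules_owned, stores the replaced String directly and skips the throwaway clone:

```rust
use std::borrow::Cow;

/// Hypothetical variant: same contract as apply_rules, but the result of
/// str::replace becomes the new Cow::Owned instead of going through to_mut.
fn apply_rules_owned<'a>(s: &'a str, rules: &[(char, &str)]) -> Cow<'a, str> {
    let mut out: Cow<'a, str> = Cow::Borrowed(s);
    for &(from, to) in rules {
        if out.contains(from) {
            // replace already returns a fresh String; keep it as the owned value.
            out = Cow::Owned(out.replace(from, to));
        }
    }
    out
}
```

The no-match path is unchanged: if no rule fires, the input comes back borrowed and nothing is allocated.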

Pitfall: `to_mut` always commits

There’s no “preview, then maybe commit” mode. Calling to_mut on a borrowed Cow clones immediately, even if you never end up writing through the returned reference. So this is a trap:

if !out.is_empty() {
    let _ = out.to_mut();  // allocates even though we may not change anything
    // ... maybe mutate, maybe not
}

Guard the call with the actual condition that means “I’m about to write,” not the condition that means “I might.” The mental shortcut: to_mut is the moment you trade your &str for a String. Reach for it lazily, but commit completely.
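
A guarded version might look like the following sketch. lowercase_ascii is a hypothetical helper, and the guard is exactly the condition under which a write will happen:

```rust
use std::borrow::Cow;

/// Hypothetical helper: lowercases ASCII letters, allocating only when
/// at least one uppercase ASCII byte is actually present.
fn lowercase_ascii(s: &str) -> Cow<'_, str> {
    let mut out = Cow::Borrowed(s);
    if s.bytes().any(|b| b.is_ascii_uppercase()) {
        // We know a write is coming, so committing to an owned String is justified.
        out.to_mut().make_ascii_lowercase();
    }
    out
}
```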

#135 May 2026

strings std

135. str::strip_prefix — Trim a Prefix Without Slicing by Hand

Reaching for if s.starts_with("foo") { &s[3..] } to drop a prefix? That’s an off-by-one waiting to happen, and a panic as soon as the hard-coded offset drifts onto the middle of a multi-byte character. str::strip_prefix returns Option<&str> and gets it right by construction.

The Problem

You want the part of a string after a known prefix:

let s = "Bearer abc123";

let token = if s.starts_with("Bearer ") {
    &s[7..]
} else {
    s
};
assert_eq!(token, "abc123");

Two things wrong here: the literal 7 has to stay in sync with the literal "Bearer ", and once the two drift apart the slice either returns the wrong text or panics because the offset lands mid-codepoint. Using prefix.len() removes the magic number, but the check and the slice are still two separate steps that have to agree.

The Fix: `strip_prefix`

let s = "Bearer abc123";

let token = s.strip_prefix("Bearer ").unwrap_or(s);
assert_eq!(token, "abc123");

strip_prefix returns Some(&str) if the prefix matched (giving you the rest), or None if it didn’t. No magic numbers, no slicing, no UTF-8 footguns — the prefix length comes from the prefix itself.
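
Because the result is an Option, it also composes with ? inside a parser. A small sketch, where bearer_token and its non-empty rule are assumptions made for the example:

```rust
/// Hypothetical helper: returns the token after "Bearer ", or None when the
/// prefix is missing or nothing follows it.
fn bearer_token(header: &str) -> Option<&str> {
    // Early return with None if the prefix does not match.
    let token = header.strip_prefix("Bearer ")?;
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}
```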

Pattern Matching, Not Just Strings

The argument is anything implementing Pattern, so a char, a closure, or even an array of chars all work:

assert_eq!("-x".strip_prefix('-'), Some("x"));
assert_eq!("x".strip_prefix('-'), None);

// Trim any leading whitespace character
let s = "\t  hello".strip_prefix(|c: char| c.is_whitespace());
assert_eq!(s, Some("  hello"));

Note this only strips one match — the char form doesn’t loop. For “strip every leading space,” reach for trim_start_matches.
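
The difference between the two is easiest to see next to each other. In this sketch both function names are made up for the illustration; the first also shows the array-of-chars pattern:

```rust
/// Strips at most one leading '+' or '-' (array-of-chars pattern).
fn strip_sign(s: &str) -> &str {
    s.strip_prefix(['+', '-']).unwrap_or(s)
}

/// Strips every leading '-', however many there are.
fn strip_all_dashes(s: &str) -> &str {
    s.trim_start_matches('-')
}
```

Given "--verbose", strip_sign leaves "-verbose" while strip_all_dashes leaves "verbose".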

The Twin: `strip_suffix`

Same shape, other end:

let filename = "report.tar.gz";

let stem = filename.strip_suffix(".gz").unwrap_or(filename);
assert_eq!(stem, "report.tar");

Together they replace half the manual &s[..s.len() - 3] arithmetic you’d otherwise write — and the Option return makes “did it actually have the prefix?” a value, not a separate starts_with call.
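
The two also chain. As a sketch, a hypothetical unquote helper that only succeeds when both the opening and the closing quote are present:

```rust
/// Hypothetical helper: removes one pair of surrounding double quotes,
/// returning None unless both ends are quoted.
fn unquote(s: &str) -> Option<&str> {
    s.strip_prefix('"')?.strip_suffix('"')
}
```

A lone quote character yields None here, because after the prefix is stripped there is nothing left for strip_suffix to match.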

#070 Apr 2026

iterators intersperse strings functional

70. Iterator::intersperse — Join Elements Without Collecting First

Tired of collecting into a Vec just to call .join(",")? intersperse inserts a separator between every pair of elements — lazily, right inside the iterator chain.

The problem

You have an iterator of strings and want to join them with a separator. The classic approach forces you to collect first:

fn main() {
    let words = vec!["hello", "world", "from", "rust"];

    // Works, but allocates an intermediate Vec<&str> just to join it
    let sentence = words.iter().copied().collect::<Vec<_>>().join(" ");

    assert_eq!(sentence, "hello world from rust");
}

It gets the job done, but that intermediate Vec allocation is wasteful — you’re collecting just to immediately consume it again.

The clean way

intersperse inserts a separator value between every adjacent pair of elements, returning a new iterator:

fn main() {
    let words = vec!["hello", "world", "from", "rust"];

    let sentence: String = words
        .iter()
        .copied()
        .intersperse(" ")
        .collect();

    assert_eq!(sentence, "hello world from rust");
}

No intermediate Vec. The separator is lazily inserted as you iterate, and collect builds the final String directly.

It works with any type

intersperse isn’t just for strings — it works with any iterator where the element type implements Clone:

fn main() {
    let numbers = vec![1, 2, 3, 4];

    let with_zeros: Vec<i32> = numbers
        .iter()
        .copied()
        .intersperse(0)
        .collect();

    assert_eq!(with_zeros, vec![1, 0, 2, 0, 3, 0, 4]);
}

This is handy for building sequences with delimiters, padding, or sentinel values between real data.

When the separator is expensive to create

If your separator is costly to clone, use intersperse_with — it takes a closure that produces the separator on demand:

fn main() {
    let parts = vec!["one", "two", "three"];

    let result: String = parts
        .iter()
        .copied()
        .intersperse_with(|| " | ")
        .collect();

    assert_eq!(result, "one | two | three");
}

The closure is only called when a separator is actually needed, so you pay zero cost for single-element or empty iterators.

Edge cases

intersperse handles the corners gracefully — empty iterators stay empty, and single-element iterators pass through unchanged:

fn main() {
    let empty: Vec<&str> = Vec::new();
    let result: String = empty.iter().copied().intersperse(", ").collect();
    assert_eq!(result, "");

    let single = vec!["alone"];
    let result: String = single.iter().copied().intersperse(", ").collect();
    assert_eq!(result, "alone");
}

Next time you reach for .collect::<Vec<_>>().join(...), try intersperse instead — it’s one less allocation and reads just as clearly.
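
Iterator::intersperse has long been gated behind the nightly iter_intersperse feature, so it may not be available on your toolchain. For that case, here is a hand-rolled sketch with the same result and still no intermediate Vec; join_with is a hypothetical helper, not a std function:

```rust
/// Hypothetical helper: joins string slices with a separator, pushing
/// straight into the output String as the iterator is consumed.
fn join_with<'a, I>(items: I, sep: &str) -> String
where
    I: IntoIterator<Item = &'a str>,
{
    let mut out = String::new();
    for (i, item) in items.into_iter().enumerate() {
        // The separator goes between elements, so skip it before the first one.
        if i > 0 {
            out.push_str(sep);
        }
        out.push_str(item);
    }
    out
}
```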

#055 Apr 2026

strings utf-8 std

55. floor_char_boundary — Truncate Strings Without Breaking UTF-8

Ever tried to truncate a string to a byte limit and got a panic because you sliced in the middle of a multi-byte character? floor_char_boundary fixes that.

The Problem

Slicing a string at an arbitrary byte index panics if that index lands inside a multi-byte UTF-8 character:

let s = "Héllo 🦀 world";
// &s[..5] happens to work here: 'é' spans bytes 1..3, so byte 5 is a boundary.
// But what if we don't know the content?
let s = "🦀🦀🦀"; // each crab is 4 bytes
// &s[..5] would panic: byte 5 is inside the second crab!

You could scan backward byte-by-byte checking is_char_boundary(), but that’s tedious and easy to get wrong.
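
For comparison, the manual scan could look roughly like this. It is only a sketch of the hand-written approach, under the hypothetical name floor_boundary_manual, using nothing but is_char_boundary:

```rust
/// Hypothetical helper: largest char boundary at or before `index`,
/// found by stepping backward one byte at a time.
fn floor_boundary_manual(s: &str, index: usize) -> usize {
    // Anything at or past the end clamps to the string length.
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Byte 0 is always a boundary, so this loop cannot underflow.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}
```

It is short, but the clamp and the termination argument are exactly the details that are easy to forget when writing it inline.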

The Fix: `floor_char_boundary`

str::floor_char_boundary(index) returns the largest byte position at or before index that sits on a valid character boundary. Its counterpart ceil_char_boundary gives you the smallest position at or after the index.

fn main() {
    let s = "🦀🦀🦀"; // each 🦀 is 4 bytes, total 12 bytes

    // We want ~6 bytes, but byte 6 is inside the second crab
    let i = s.floor_char_boundary(6);
    assert_eq!(i, 4); // rounds down to end of first 🦀
    assert_eq!(&s[..i], "🦀");

    // ceil_char_boundary rounds up instead
    let j = s.ceil_char_boundary(6);
    assert_eq!(j, 8); // rounds up to end of second 🦀
    assert_eq!(&s[..j], "🦀🦀");
}

Real-World Use: Safe Truncation

Here’s a practical helper that truncates a string to fit a byte budget, adding an ellipsis if it was shortened:

fn truncate(s: &str, max_bytes: usize) -> String {
    if s.len() <= max_bytes {
        return s.to_string();
    }
    let end = s.floor_char_boundary(max_bytes.saturating_sub(3));
    format!("{}...", &s[..end])
}

fn main() {
    let bio = "I love Rust 🦀 and crabs!";
    let short = truncate(bio, 16);
    assert_eq!(short, "I love Rust ...");
    // The 🦀 spans bytes 12..16, so the cut at byte 13 (16 - 3) rounds down to 12:
    // 12 bytes + "..." = 15 total, within the 16-byte budget.
    // Safe! No panics, no broken characters.

    // Short strings pass through unchanged
    assert_eq!(truncate("hi", 10), "hi");
}

No more manual boundary scanning — these two methods handle the UTF-8 dance for you.

#044 Mar 2026

strings parsing stdlib

44. split_once — Split a String Exactly Once

When you need to split a string on the first occurrence of a delimiter, split_once is cleaner than anything you’d write by hand. Stable since Rust 1.52.

Parsing key=value pairs, HTTP headers, file paths — almost everywhere you split a string, you only care about the first separator. Before split_once, you’d reach for .find() plus index arithmetic:

The old way

let s = "Content-Type: application/json; charset=utf-8";

let colon = s.find(':').unwrap();
let header = &s[..colon];
let value = s[colon + 1..].trim();

assert_eq!(header, "Content-Type");
assert_eq!(value, "application/json; charset=utf-8");

Works, but it’s three lines of bookkeeping. The index arithmetic is easy to get wrong, and .trim() is a separate step.

With split_once

let s = "Content-Type: application/json; charset=utf-8";

let (header, value) = s.split_once(": ").unwrap();

assert_eq!(header, "Content-Type");
assert_eq!(value, "application/json; charset=utf-8");

One line. The delimiter is consumed, both sides are returned, and you pattern-match directly into named bindings.

Handling missing delimiters

split_once returns Option<(&str, &str)> — None if the delimiter isn’t found. This makes it composable with ? or if let:

fn parse_env_var(s: &str) -> Option<(&str, &str)> {
    s.split_once('=')
}

assert_eq!(parse_env_var("HOME=/root"), Some(("HOME", "/root")));
assert_eq!(parse_env_var("NOVALUE"), None);
assert_eq!(parse_env_var("KEY=a=b=c"), Some(("KEY", "a=b=c")));

Note the last case: split_once stops at the first =. The rest of the string — a=b=c — is kept intact in the second half. That’s usually exactly what you want.
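
The ? composition mentioned above could look like this sketch, which splits a host:port pair and parses the port in the same pass. parse_host_port is a hypothetical helper for illustration:

```rust
/// Hypothetical helper: splits "host:port" and parses the port as u16.
/// Returns None if the colon is missing or the port is not a valid u16.
fn parse_host_port(s: &str) -> Option<(&str, u16)> {
    let (host, port) = s.split_once(':')?;
    // parse() yields a Result; ok() turns it into an Option so `?` applies.
    let port = port.parse().ok()?;
    Some((host, port))
}
```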

rsplit_once — split from the right

When you need the last delimiter instead of the first, rsplit_once has you covered:

let path = "/home/martin/projects/rustbites/content/posts/bite-044.md";

let (dir, filename) = path.rsplit_once('/').unwrap();

assert_eq!(dir, "/home/martin/projects/rustbites/content/posts");
assert_eq!(filename, "bite-044.md");
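
The same call is a natural fit for file extensions. In this sketch, split_extension is a hypothetical helper, and treating dotfiles such as ".bashrc" as having no extension is an assumption of the example rather than a rule from std:

```rust
/// Hypothetical helper: splits a file name at its last '.', returning the
/// stem and the extension if there is one.
fn split_extension(name: &str) -> (&str, Option<&str>) {
    match name.rsplit_once('.') {
        // A leading dot with an empty stem is treated as part of the name.
        Some((stem, ext)) if !stem.is_empty() => (stem, Some(ext)),
        _ => (name, None),
    }
}
```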

Multi-char delimiters work too

The delimiter can be any pattern — a char, a &str, or even a closure:

let record = "alice::42::engineer";

let (name, rest) = record.split_once("::").unwrap();
let (age_str, role) = rest.split_once("::").unwrap();

assert_eq!(name, "alice");
assert_eq!(age_str, "42");
assert_eq!(role, "engineer");

Whenever you reach for .splitn(2, ...) just to grab two halves, replace it with split_once — the intent is clearer and the return type is more ergonomic.
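
The closure form of the pattern is handy when the separator is "any whitespace" rather than one fixed character. A minimal sketch, with split_command as a made-up helper name:

```rust
/// Hypothetical helper: splits a command line into the command word and
/// the remaining arguments, at the first whitespace character.
fn split_command(line: &str) -> (&str, &str) {
    match line.split_once(char::is_whitespace) {
        // Only one whitespace char is consumed, so trim any extra ones.
        Some((cmd, rest)) => (cmd, rest.trim_start()),
        None => (line, ""),
    }
}
```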

#036 Mar 2026

cow smart-pointers strings performance

36. Cow<str> — Clone on Write

Stop cloning strings “just in case” — Cow<str> lets you borrow when you can and clone only when you must.

The problem

You’re writing a function that sometimes needs to modify a string and sometimes doesn’t. The easy fix? Clone every time:

fn ensure_greeting(name: &str) -> String {
    if name.starts_with("Hello") {
        name.to_string() // unnecessary clone!
    } else {
        format!("Hello, {name}!")
    }
}

This works, but that first branch allocates a brand-new String even though name is already perfect as-is. In a hot loop, those wasted allocations add up.

Enter `Cow<str>`

Cow stands for Clone on Write. It holds either a borrowed reference or an owned value, and only clones when you actually need to mutate or take ownership:

use std::borrow::Cow;

fn ensure_greeting(name: &str) -> Cow<'_, str> {
    if name.starts_with("Hello") {
        Cow::Borrowed(name) // zero-cost: just wraps the reference
    } else {
        Cow::Owned(format!("Hello, {name}!"))
    }
}

Now the happy path (name already starts with “Hello”) does zero allocation. The caller gets a Cow<str> that derefs to &str transparently — most code won’t even notice the difference.

Using `Cow` values

Because Cow<str> implements Deref<Target = str>, you can use it anywhere a &str is expected:

use std::borrow::Cow;

fn ensure_greeting(name: &str) -> Cow<'_, str> {
    if name.starts_with("Hello") {
        Cow::Borrowed(name)
    } else {
        Cow::Owned(format!("Hello, {name}!"))
    }
}

fn main() {
    let greeting = ensure_greeting("Hello, world!");
    assert_eq!(&*greeting, "Hello, world!");

    // Call &str methods directly on Cow
    assert!(greeting.contains("world"));

    // Only clone into String when you truly need ownership
    let _owned: String = greeting.into_owned();

    let greeting2 = ensure_greeting("Rust");
    assert_eq!(&*greeting2, "Hello, Rust!");
}

When to reach for `Cow`

Cow shines in these situations:

Conditional transformations — functions that modify input only sometimes (normalization, trimming, escaping)
Config/lookup values — return a static default or a dynamically built string
Parser outputs — most tokens are slices of the input, but some need unescaping

The Cow type works with any ToOwned pair, not just strings. You can use Cow<[u8]>, Cow<Path>, or Cow<[T]> the same way.
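
As an illustration of the config/lookup case from the list above, here is a sketch that returns a static default or a freshly built string. The label function and the "user-" naming scheme are assumptions invented for the example:

```rust
use std::borrow::Cow;

/// Hypothetical helper: a display label that borrows a static default
/// when there is no id, and builds a String only when there is one.
fn label(id: Option<u32>) -> Cow<'static, str> {
    match id {
        Some(n) => Cow::Owned(format!("user-{n}")),
        // A string literal is &'static str, so no allocation is needed.
        None => Cow::Borrowed("anonymous"),
    }
}
```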

Quick reference

Operation	Cost
`Cow::Borrowed(s)`	Free — wraps a reference
`Cow::Owned(s)`	Whatever creating the owned value costs
`*cow` (deref)	Free
`cow.into_owned()`	Free if already owned, clones if borrowed
`cow.to_mut()`	Clones if borrowed, then gives `&mut` access
