Atomics

#277 Jul 2026

277. Release/Acquire — Publish Data Through a Flag, Not Just the Flag

This morning’s swap bite ended with a warning: once the winning thread writes data that other threads read, Relaxed stops being enough. Here’s the two-line fix.

The classic pattern: one thread computes a result, then raises a flag so others know it’s ready. With Relaxed everywhere, that’s broken:

use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

static RESULT: AtomicU64 = AtomicU64::new(0);
static READY: AtomicBool = AtomicBool::new(false);

// writer thread
RESULT.store(42, Ordering::Relaxed);
READY.store(true, Ordering::Relaxed); // may be reordered!

// reader thread
if READY.load(Ordering::Relaxed) {
    // can legally observe RESULT == 0 here
    let r = RESULT.load(Ordering::Relaxed);
}

Relaxed only makes each individual operation atomic — it says nothing about the order two different atomics become visible in. The compiler or CPU may commit the READY store before the RESULT store, so a reader sees the flag up but the data stale.

The fix is a pair, one ordering on each side of the flag:

// writer: Release on the store that publishes
RESULT.store(42, Ordering::Relaxed);
READY.store(true, Ordering::Release);

// reader: Acquire on the load that checks
while !READY.load(Ordering::Acquire) {
    std::hint::spin_loop();
}
assert_eq!(RESULT.load(Ordering::Relaxed), 42); // guaranteed

When an Acquire load sees the value written by a Release store, everything the writer did before the store is visible to the reader after the load. The flag becomes a one-way gate for all the writes behind it — note RESULT itself can stay Relaxed; the flag pair carries the synchronization.

Full picture, verified with a scoped thread:

std::thread::scope(|s| {
    s.spawn(|| {
        RESULT.store(42, Ordering::Relaxed);
        READY.store(true, Ordering::Release);
    });
    s.spawn(|| {
        while !READY.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        assert_eq!(RESULT.load(Ordering::Relaxed), 42);
    });
});

Two things worth knowing. First, the pairing only kicks in when the Acquire load actually sees the Released value — that’s why the reader spins. Second, you won’t catch the Relaxed bug on your x86 laptop: the hardware is strongly ordered and hides it. Your ARM build server is not so forgiving, and the compiler is allowed to reorder either way.

Rule of thumb: Release on the store that publishes, Acquire on the load that consumes, Relaxed for lone counters and flags that guard nothing. This pair is exactly what Mutex unlock/lock and OnceLock do for you under the hood — reach for those first; reach for the raw pair when you can name what’s being published.
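As a rough illustration of the "reach for those first" advice, here is a minimal sketch of the same publish-then-read idea built on OnceLock instead of a hand-written flag. The names RESULT, CALLS, compute, read_from_two_threads and compute_calls are made up for this example; only OnceLock::get_or_init and the atomic calls come from std.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::OnceLock;
use std::thread;

// Hypothetical names for illustration only.
static RESULT: OnceLock<u64> = OnceLock::new();
static CALLS: AtomicU32 = AtomicU32::new(0);

/// Stand-in for the expensive computation; counts how often it runs.
fn compute() -> u64 {
    CALLS.fetch_add(1, Ordering::Relaxed);
    42
}

/// Two threads race to initialize RESULT; both return the published value.
pub fn read_from_two_threads() -> (u64, u64) {
    thread::scope(|s| {
        // Whichever thread arrives first runs `compute`; the other waits
        // and then reads the already-published value.
        let a = s.spawn(|| *RESULT.get_or_init(compute));
        let b = s.spawn(|| *RESULT.get_or_init(compute));
        (a.join().unwrap(), b.join().unwrap())
    })
}

/// How many times `compute` actually ran.
pub fn compute_calls() -> u32 {
    CALLS.load(Ordering::Relaxed)
}
```

Neither thread names an Ordering here: the synchronization that the Release/Acquire pair provides by hand is handled inside OnceLock.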

Stable since Rust 1.0, on every Atomic* type.

#276 Jul 2026

atomics concurrency

276. AtomicBool::swap — Let Exactly One Thread Claim the Job

“Check the flag, then set it” has a gap where two threads both see false — and your run-once code runs twice. swap sets and reads the flag in one atomic step.

Say only one thread should print a deprecation warning, spawn the background worker, or run cleanup. The obvious flag check is broken:

use std::sync::atomic::{AtomicBool, Ordering};

static WARNED: AtomicBool = AtomicBool::new(false);

fn warn_once() {
    // RACE: both threads can see false
    if !WARNED.load(Ordering::Relaxed) {
        WARNED.store(true, Ordering::Relaxed);
        eprintln!("legacy config detected");
    }
}

Same check-then-act race as yesterday’s fetch_max story: two threads load false, both pass the if, both warn. The fix is again a single atomic call:

fn warn_once() {
    // one atomic op: first caller wins
    if !WARNED.swap(true, Ordering::Relaxed) {
        eprintln!("legacy config detected");
    }
}

swap stores the new value and returns the previous one as a single atomic operation. Only the first caller gets false back — everyone else sees true and skips the block. No mutex, no window to slip through:

use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

static CLAIMED: AtomicBool = AtomicBool::new(false);
static RUNS: AtomicU32 = AtomicU32::new(0);

std::thread::scope(|s| {
    for _ in 0..8 {
        s.spawn(|| {
            if !CLAIMED.swap(true, Ordering::Relaxed) {
                RUNS.fetch_add(1, Ordering::Relaxed);
            }
        });
    }
});
assert_eq!(RUNS.load(Ordering::Relaxed), 1);

Relaxed is fine while the flag itself is the only shared state. The moment the winning thread writes data that losers will read, you need Acquire/Release ordering — or just reach for OnceLock, which handles that (and blocking until init finishes) for you.

Where swap beats Once/OnceLock: there is no value to store, it never blocks the losers, and you can re-arm it, since store(false, ...) resets the flag for the next round. Think “claim ticket”, not “lazy init”.
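The re-arm idea can be sketched as a tiny wrapper type. Ticket, try_claim and reset are hypothetical names chosen for this example, not a std API, and Relaxed is only appropriate under the assumption stated earlier that the flag is the only shared state.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical "claim ticket" wrapper; the name and methods are illustrative.
pub struct Ticket(AtomicBool);

impl Ticket {
    pub const fn new() -> Self {
        Ticket(AtomicBool::new(false))
    }

    /// Returns true for exactly one caller per round.
    pub fn try_claim(&self) -> bool {
        // swap returns the previous value: false means we got here first.
        !self.0.swap(true, Ordering::Relaxed)
    }

    /// Re-arms the ticket so the next round can be claimed again.
    pub fn reset(&self) {
        self.0.store(false, Ordering::Relaxed);
    }
}
```

A loser gets false back immediately and carries on, which is the non-blocking behaviour described above.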

Works on every Atomic* type, stable since Rust 1.0.

#275 Jul 2026

atomics concurrency

275. fetch_max — Track a High-Water Mark Without the Race

“Check if it’s bigger, then store it” is two operations — and another thread can sneak between them. fetch_max is the whole thing in one atomic call.

Tracking peak latency across worker threads looks harmless:

use std::sync::atomic::{AtomicU64, Ordering};

static PEAK: AtomicU64 = AtomicU64::new(0);

fn record(latency: u64) {
    // RACE: another thread can store
    // between the load and the store
    if latency > PEAK.load(Ordering::Relaxed) {
        PEAK.store(latency, Ordering::Relaxed);
    }
}

Thread A reads 10, decides its 31 is a new peak. Thread B reads 10, decides its 50 is too. B stores 50, then A stores 31 — your peak just went down. Classic check-then-act race, and it’ll pass every test on your laptop.

The fix is one line:

fn record(latency: u64) {
    PEAK.fetch_max(latency, Ordering::Relaxed);
}

fetch_max compares and stores as a single atomic operation, so there is no window for another thread to slip through. Like fetch_add and the other single-op fetch_* methods, it returns the previous value, which gives you new-record detection for free:

let prev = PEAK.fetch_max(latency, Ordering::Relaxed);
if prev < latency {
    println!("new record: {latency}ms");
}
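To see the no-race claim under real contention, here is a small sketch that records each sample from its own scoped thread. peak_of is a hypothetical helper written for this example, and one thread per sample is only sensible for a demo.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: records every sample from its own thread and
/// returns the resulting high-water mark (0 if there are no samples).
pub fn peak_of(samples: &[u64]) -> u64 {
    let peak = AtomicU64::new(0);
    thread::scope(|s| {
        for &latency in samples {
            let peak = &peak;
            s.spawn(move || {
                // One atomic compare-and-store; the order in which the
                // threads run does not affect the final maximum.
                peak.fetch_max(latency, Ordering::Relaxed);
            });
        }
    });
    // All threads have been joined by the time scope returns.
    peak.load(Ordering::Relaxed)
}
```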

fetch_min is the mirror image for low-water marks. One gotcha: initialize it to the identity for min — u64::MAX, not 0 — or nothing will ever be smaller than your starting value:

let floor = AtomicU64::new(u64::MAX);
for l in [12, 5, 9] {
    floor.fetch_min(l, Ordering::Relaxed);
}
assert_eq!(floor.load(Ordering::Relaxed), 5);

This morning’s bite 274 covered fetch_update for arbitrary update rules — but check the shelf first: if your rule is just “keep the bigger one”, fetch_max is a single hardware-friendly call with no closure and no retry loop.

Available on all integer Atomic* types, stable since Rust 1.45.

#274 Jul 2026

atomics concurrency

274. fetch_update — The CAS Loop You Keep Writing by Hand

There’s no fetch_add variant that stops at a cap, so you write a compare_exchange_weak loop by hand. fetch_update is that loop, done right, in one call.

The Atomic* types ship fetch_add, fetch_or, fetch_min — but the moment your update rule isn’t one of those, you’re hand-rolling a compare-and-swap loop. A rate limiter that counts hits but saturates at a cap:

use std::sync::atomic::{AtomicU32, Ordering};

const CAP: u32 = 3;
let hits = AtomicU32::new(0);

let mut cur = hits.load(Ordering::Relaxed);
while cur < CAP {
    match hits.compare_exchange_weak(
        cur,
        cur + 1,
        Ordering::Relaxed,
        Ordering::Relaxed,
    ) {
        Ok(_) => break,
        Err(actual) => cur = actual,
    }
}

Easy to get subtly wrong: forget to reload on failure, spin forever, or check the cap on the stale value. fetch_update owns the loop — you supply only the transform, returning Some(new) to store or None to leave it alone:

let hits = AtomicU32::new(0);

for _ in 0..5 {
    let _ = hits.fetch_update(
        Ordering::Relaxed,
        Ordering::Relaxed,
        |n| (n < CAP).then(|| n + 1),
    );
}

assert_eq!(hits.load(Ordering::Relaxed), CAP);

The return value tells you what happened: Ok(prev) if your closure returned Some and the store went through, Err(prev) if it returned None — either way you get the previous value, so “did we hit the cap?” is just .is_err().
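A minimal sketch of reading that return value, assuming the same cap of 3 as above. try_hit is a hypothetical helper name; it reports whether the hit was counted.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

const CAP: u32 = 3;

/// Hypothetical helper: counts one hit unless the cap is already reached.
/// Returns true if the hit was counted, false if the limiter is saturated.
pub fn try_hit(hits: &AtomicU32) -> bool {
    hits.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |n| {
        // Some(new) stores the new value, None leaves the counter alone.
        (n < CAP).then(|| n + 1)
    })
    // Ok(prev) means the store went through; Err(prev) means we hit the cap.
    .is_ok()
}
```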

Two things to keep in mind. The closure can run more than once under contention (another thread changed the value between your load and the swap), so keep it pure — no side effects. And it takes two Orderings: the first for the successful store, the second for the loads; (Relaxed, Relaxed) is fine for counters.

Stable since Rust 1.45 on the integer Atomic* types; AtomicBool and AtomicPtr gained it in Rust 1.53.

#157 May 2026

atomics concurrency sync std

157. Atomic* — The Thread-Safe Cell for Scalars

A Cell<T> lets a single thread mutate through &self: get/set instead of &mut. The atomic types in std::sync::atomic are the same shape, just Sync. You get a counter, flag, or pointer that many threads can poke at without a Mutex: no lock acquisition, no guard, no poisoned lock to unwrap.

The pain: `Mutex<u64>` for a single counter

A request counter shared across worker threads is the textbook reach-for-Arc<Mutex<_>> case — and the textbook overkill:

use std::sync::{Arc, Mutex};
use std::thread;

let hits = Arc::new(Mutex::new(0u64));
let mut handles = Vec::new();

for _ in 0..8 {
    let h = Arc::clone(&hits);
    handles.push(thread::spawn(move || {
        for _ in 0..1000 {
            let mut g = h.lock().unwrap();   // lock, increment, unlock — 1000 times
            *g += 1;
        }
    }));
}
for h in handles { h.join().unwrap(); }
assert_eq!(*hits.lock().unwrap(), 8_000);

Eight threads contending on a lock for an n += 1 is a lot of ceremony to add one to an integer. The CPU has a single instruction for this. Rust exposes it.

The fix: `AtomicU64` (or `AtomicUsize`, `AtomicBool`, …)

use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

let hits = Arc::new(AtomicU64::new(0));
let mut handles = Vec::new();

for _ in 0..8 {
    let h = Arc::clone(&hits);
    handles.push(thread::spawn(move || {
        for _ in 0..1000 {
            h.fetch_add(1, Ordering::Relaxed);   // one instruction, no lock
        }
    }));
}
for h in handles { h.join().unwrap(); }
assert_eq!(hits.load(Ordering::Relaxed), 8_000);

No lock(), no guard, no unwrap in the hot loop. fetch_add is a single read-modify-write; on x86 it compiles to one lock-prefixed instruction (lock xadd when the old value is used). The Arc is still there because the threads need shared ownership, but the interior is lock-free.

The API is just `Cell`’s API, with orderings

Every atomic has the same small surface:

use std::sync::atomic::{AtomicUsize, Ordering};

let n = AtomicUsize::new(7);

// like Cell::get / Cell::set
let v = n.load(Ordering::Relaxed);     assert_eq!(v, 7);
n.store(42, Ordering::Relaxed);
assert_eq!(n.load(Ordering::Relaxed), 42);

// like Cell::replace
let old = n.swap(100, Ordering::Relaxed);
assert_eq!(old, 42);
assert_eq!(n.load(Ordering::Relaxed), 100);

Notice what’s missing: there is no &mut T anywhere. You never borrow the inside. You read out a copy or write one in. That’s why this works across threads at all — there’s nothing to alias.
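For a side-by-side view of the "same shape" claim, here is a small sketch that runs the same set/replace sequence through a Cell and through an AtomicUsize. The two function names are hypothetical and exist only for this comparison.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Single-threaded version: Cell::set, Cell::replace, Cell::get.
pub fn cell_round_trip() -> (usize, usize) {
    let c = Cell::new(7);
    c.set(42);
    let old = c.replace(100);
    (old, c.get())
}

/// Thread-safe version: the same three steps, each with an Ordering.
pub fn atomic_round_trip() -> (usize, usize) {
    let n = AtomicUsize::new(7);
    n.store(42, Ordering::Relaxed);
    let old = n.swap(100, Ordering::Relaxed);
    (old, n.load(Ordering::Relaxed))
}
```

Both go through &self only, and both hand values in and out by copy.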

Read-modify-write: the real reason atomics exist

The fetch_* family is where atomics earn their keep. Each is a single uninterruptible round-trip:

use std::sync::atomic::{AtomicI32, Ordering};

let n = AtomicI32::new(10);

assert_eq!(n.fetch_add(5, Ordering::Relaxed), 10);  // returns old
assert_eq!(n.load(Ordering::Relaxed), 15);

assert_eq!(n.fetch_sub(3, Ordering::Relaxed), 15);
assert_eq!(n.fetch_or(0b1000, Ordering::Relaxed), 12);
assert_eq!(n.fetch_and(0b1100, Ordering::Relaxed), 0b1100);
assert_eq!(n.load(Ordering::Relaxed), 0b1100);

fetch_add, fetch_sub, fetch_or, fetch_and, fetch_xor, fetch_min, fetch_max — each one returns the value before the operation. That “before” is what makes them composable: you know exactly which thread did the increment that took you from 999 to 1000.
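The "which thread took you to 1000" point can be sketched as follows. milestone_hits is a hypothetical helper: several threads bump a shared counter and count how many increments landed exactly on the milestone.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: returns how many increments observed that they
/// were the one that moved the counter to `milestone`.
pub fn milestone_hits(threads: usize, per_thread: u64, milestone: u64) -> u32 {
    let counter = AtomicU64::new(0);
    let crossed = AtomicU32::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    // fetch_add returns the value *before* the increment,
                    // so `prev + 1` is the value this call produced.
                    let prev = counter.fetch_add(1, Ordering::Relaxed);
                    if prev + 1 == milestone {
                        crossed.fetch_add(1, Ordering::Relaxed);
                    }
                }
            });
        }
    });
    crossed.load(Ordering::Relaxed)
}
```

Because every increment sees a distinct previous value, at most one of them can land on the milestone, however the threads interleave.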

For anything more complex than a single op (clamp, toggle a state machine, transform), reach for update instead of hand-rolling a compare_exchange loop.

`AtomicBool`: the flag that doesn’t need a `Mutex`

The most common “I just want one bit” case:

use std::sync::atomic::{AtomicBool, Ordering};

let stop = AtomicBool::new(false);

// thread A
stop.store(true, Ordering::Release);

// thread B's hot loop
if stop.load(Ordering::Acquire) {
    // shut down
}
assert!(stop.load(Ordering::Acquire));

Release on the writer + Acquire on the reader pairs everything written before the store with everything read after the load — the standard cancellation-flag pattern. Relaxed would be fine if stop is the only thing the two threads share; use Acquire/Release when the flag is gating other writes.
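A runnable sketch of that cancellation pattern, with both sides in real threads. run_until_stopped and the work_done counter are hypothetical names for this example; the worker just counts loop iterations as a stand-in for real work.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::thread;

/// Hypothetical demo: a worker spins until the stop flag is raised.
/// Returns how many iterations it completed.
pub fn run_until_stopped() -> u64 {
    let stop = AtomicBool::new(false);
    let work_done = AtomicU64::new(0);
    thread::scope(|s| {
        // Thread B: the hot loop that polls the flag.
        s.spawn(|| {
            while !stop.load(Ordering::Acquire) {
                work_done.fetch_add(1, Ordering::Relaxed);
                std::hint::spin_loop();
            }
        });
        // Thread A: wait until the worker has made progress, then cancel.
        while work_done.load(Ordering::Relaxed) == 0 {
            std::hint::spin_loop();
        }
        stop.store(true, Ordering::Release);
    });
    work_done.load(Ordering::Relaxed)
}
```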

The full menu

std::sync::atomic ships an atomic for each primitive integer width up to 64 bits, plus bool and raw pointers:

Type	Notes
`AtomicBool`	Lock-free flags
`AtomicU8` / `U16` / `U32` / `U64` / `Usize`	Unsigned counters, bitmasks
`AtomicI8` / `I16` / `I32` / `I64` / `Isize`	Signed deltas
`AtomicPtr<T>`	Raw `*mut T`, for hand-rolled lock-free structures

Not every target has every width: some 32-bit targets (32-bit MIPS and PowerPC, for example) have no 64-bit atomics, and there the AtomicU64/AtomicI64 types are simply absent. cfg(target_has_atomic = "64") lets you gate code that requires them. On modern x86_64 and aarch64, all of the above are lock-free.
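One way the gating could look, as a sketch: a counter that uses AtomicU64 where the target has 64-bit atomics and falls back to a Mutex<u64> elsewhere. The imp module, Counter and bump are hypothetical names for this example.

```rust
// Hypothetical Counter type; only the cfg predicate comes from the text above.
#[cfg(target_has_atomic = "64")]
mod imp {
    use std::sync::atomic::{AtomicU64, Ordering};

    pub struct Counter(AtomicU64);

    impl Counter {
        pub const fn new() -> Self {
            Counter(AtomicU64::new(0))
        }
        /// Increments and returns the new value.
        pub fn bump(&self) -> u64 {
            self.0.fetch_add(1, Ordering::Relaxed) + 1
        }
    }
}

#[cfg(not(target_has_atomic = "64"))]
mod imp {
    use std::sync::Mutex;

    // Fallback for targets without 64-bit atomics: same API, lock inside.
    pub struct Counter(Mutex<u64>);

    impl Counter {
        pub const fn new() -> Self {
            Counter(Mutex::new(0))
        }
        /// Increments and returns the new value.
        pub fn bump(&self) -> u64 {
            let mut g = self.0.lock().unwrap();
            *g += 1;
            *g
        }
    }
}

pub use imp::Counter;
```

Callers see one Counter type either way; only the module that gets compiled differs.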

What you give up vs `Mutex<T>`

Atomics work only on values the CPU already knows how to swap in one instruction. The moment you need to atomically update two fields together — a counter and a timestamp, say — you’re back to Mutex<T>. There is no AtomicStruct. You can’t fetch_push a Vec.
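For the counter-plus-timestamp case, a minimal sketch of the Mutex version. Stats, record and the millisecond timestamp parameter are hypothetical names; the point is only that both fields change under one lock.

```rust
use std::sync::Mutex;

/// Hypothetical pair of fields that must always change together.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Stats {
    pub hits: u64,
    pub last_seen_ms: u64,
}

/// Updates both fields under one lock, so no reader can observe a new
/// hit count paired with an old timestamp.
pub fn record(stats: &Mutex<Stats>, now_ms: u64) {
    let mut g = stats.lock().unwrap();
    g.hits += 1;
    g.last_seen_ms = now_ms;
}
```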

The other thing you give up is loud failure. A Mutex poisoned by a panic returns an Err; a deadlock blocks forever and shows up in a stack dump. An atomic happily does the wrong thing forever if you pick the wrong Ordering — the bug manifests as a flaky test under heavy load on a weakly-ordered CPU, and not at all on your laptop. Use SeqCst when in doubt; reach for Relaxed/Acquire/Release only when you can name what’s being synchronized with what.

When to reach for atomics

Counters, flags, generation numbers, fetch_add-style ID allocators, the “is this initialized yet” bit. Anything where the value fits in a register and the only operation is read / write / one-shot RMW.
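As one example from that list, here is a sketch of a fetch_add-style ID allocator. IdAllocator, next_id and allocate_concurrently are hypothetical names, and starting at 1 is an arbitrary choice for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical allocator that hands out unique, increasing IDs.
pub struct IdAllocator(AtomicU64);

impl IdAllocator {
    pub const fn new() -> Self {
        IdAllocator(AtomicU64::new(1))
    }
    /// Each call returns a value no other call has returned.
    pub fn next_id(&self) -> u64 {
        self.0.fetch_add(1, Ordering::Relaxed)
    }
}

/// Allocates IDs from several threads at once and returns them sorted.
pub fn allocate_concurrently(threads: usize, per_thread: usize) -> Vec<u64> {
    let ids = IdAllocator::new();
    let mut all: Vec<u64> = thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..threads {
            handles.push(s.spawn(|| {
                (0..per_thread).map(|_| ids.next_id()).collect::<Vec<u64>>()
            }));
        }
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    });
    all.sort_unstable();
    all
}
```

Relaxed is enough here because the ID itself is the only thing being shared; nothing else is published through it.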

Anything fatter — a config map, a parsed AST, a connection pool — wants a Mutex<T> or RwLock<T> wrapped in an Arc. And for the “compute once, then read forever” case across threads, there’s a purpose-built tool — that’s this afternoon’s bite.

#087 Apr 2026

concurrency atomics std

87. Atomic update — Kill the Compare-and-Swap Loop

Every Rust developer who’s written lock-free code has written the same compare_exchange loop. Rust 1.95 finally gives atomics an update method that does it for you.

The old way

Atomically doubling a counter used to mean writing a retry loop yourself:

use std::sync::atomic::{AtomicUsize, Ordering};

let counter = AtomicUsize::new(10);

loop {
    let current = counter.load(Ordering::Relaxed);
    let new_val = current * 2;
    match counter.compare_exchange(
        current, new_val,
        Ordering::SeqCst, Ordering::Relaxed,
    ) {
        Ok(_) => break,
        Err(_) => continue,
    }
}
// counter is now 20

It works, but it’s boilerplate — and easy to get wrong (use the wrong ordering, forget to retry, etc.).

The new way: `update`

use std::sync::atomic::{AtomicUsize, Ordering};

let counter = AtomicUsize::new(10);

counter.update(Ordering::SeqCst, Ordering::SeqCst, |x| x * 2);
// counter is now 20

One line. No loop. No chance of forgetting to retry on contention.

The method takes two orderings (one for the store on success, one for the load on failure) and a closure that transforms the current value. It handles the compare-and-swap retry loop internally.
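To make "handles the retry loop internally" concrete, here is a rough sketch of what such a loop can look like when written by hand. update_sketch is a hypothetical free function for illustration and is not the standard library's implementation; it assumes the second ordering is one that is valid for a load (Relaxed, Acquire, or SeqCst).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the retry loop: applies `f` until the
/// compare-and-swap succeeds, then returns the previous value.
pub fn update_sketch(
    a: &AtomicUsize,
    set_order: Ordering,
    fetch_order: Ordering,
    mut f: impl FnMut(usize) -> usize,
) -> usize {
    let mut prev = a.load(fetch_order);
    loop {
        // `f` is re-run on every attempt, with the freshest value seen.
        match a.compare_exchange_weak(prev, f(prev), set_order, fetch_order) {
            Ok(old) => return old,
            // Another thread changed the value (or the weak CAS failed
            // spuriously): retry with what is actually stored.
            Err(actual) => prev = actual,
        }
    }
}
```

The closure may be called several times under contention, which is why it should stay free of side effects.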

It returns the previous value

Just like fetch_add and friends, update returns the value before the update:

use std::sync::atomic::{AtomicUsize, Ordering};

let counter = AtomicUsize::new(5);

let prev = counter.update(Ordering::SeqCst, Ordering::SeqCst, |x| x + 3);
assert_eq!(prev, 5);  // was 5
assert_eq!(counter.load(Ordering::SeqCst), 8);  // now 8

This makes it perfect for “fetch-and-modify” patterns where you need the old value.

Works on all atomic types

update isn’t just for AtomicUsize — it’s available on AtomicBool, AtomicIsize, AtomicUsize, and AtomicPtr too:

use std::sync::atomic::{AtomicBool, Ordering};

let flag = AtomicBool::new(false);
flag.update(Ordering::SeqCst, Ordering::SeqCst, |x| !x);
assert_eq!(flag.load(Ordering::SeqCst), true);

When to use `update` vs `fetch_add`

If your operation is a simple add, sub, or bitwise op, the specialized fetch_* methods are still better — they compile down to a single atomic instruction on most architectures.

Use update when your transformation is more complex: clamping, toggling state machines, applying arbitrary functions. Anywhere you’d previously hand-roll a CAS loop.

Summary

Method	Use when
`fetch_add`, `fetch_or`, etc.	Simple arithmetic/bitwise ops
`update`	Arbitrary transformations (Rust 1.95+)
Manual CAS loop	Never again (mostly)

Available on stable since Rust 1.95.0 for AtomicBool, AtomicIsize, AtomicUsize, and AtomicPtr.
