98. sort_by_cached_key — Stop Recomputing Expensive Sort Keys
sort_by_key sounds like it computes the key once per element. It doesn’t — it calls your closure at every comparison, so an n-element sort can pay for that key O(n log n) times. If the key is expensive, sort_by_cached_key is the fix you’ve been looking for.
The trap
The signature reads nicely: “sort by this key.” The implementation, less so — the closure fires on every comparison, not once per element:
| |
For identity keys that cost a pointer-deref, nobody cares. For anything that allocates — .to_string(), .to_lowercase(), format!(...), a regex capture, a trimmed-and-lowered filename — the cost compounds quickly. I’ve seen a profile where 80% of total runtime was the key closure being called 40,000 times to sort 2,000 items.
The fix
slice::sort_by_cached_key runs your closure exactly once per element, stashes the results in a scratch buffer, then sorts against the cache. This is the Schwartzian transform, wrapped up in a method call:
| |
Same result, linear key-function calls. The memory trade is a Vec<(K, usize)> the size of the slice — cheap next to the cost of re-running an allocating closure on every compare.
When to reach for which
The rule is about where your time goes, not how fancy the key looks:
| |
Use sort_by_key for cheap, Copy-ish keys. Use sort_by_cached_key the moment your closure allocates, hashes, parses, or otherwise does real work — it’s the difference between O(n log n) and O(n) calls to that closure.