בס״ד
לכבוד הקדוש ברוך הוא — for the glory of the Holy One, blessed be He
captrack
Capacity telemetry for Rust collections — call-site macros that record actual observed capacities, with zero overhead when disabled.
Two ways to use it
- Library: t*! macros in your source. Replace Vec::with_capacity(64) with tvec!("my_module/rows", 64) at the call-sites you care about. Zero overhead by default; flip the telemetry feature on in a bench profile to collect real capacity data. Start at Quick start.
- Tool: captrack-pgo on unmodified code. No source changes, no macros: the CLI temporarily instruments your whole workspace, runs your tests and/or benches, collects a capacity profile, and rewrites bare constructors to data-driven with_capacity(N) values, then removes every trace of itself. Start at the end-to-end walkthrough.
What it does
captrack wraps every major Rust collection constructor with a named macro.
When the telemetry feature is off (the default) each macro expands to
the bare constructor — the compiler sees exactly Vec::with_capacity(n) etc.
and optimises accordingly. When telemetry is on, each macro returns a
thin Tracked* wrapper that records per-call-site statistics in a global
lock-free registry (using scc::HashMap):
- Registry key: (file, line, column) of the macro call-site, not the name string. Each distinct source location is one independent entry.
- creation_count: total number of instances created at that call-site (atomic u64, updated on construction).
- samples: a reservoir-sampled list of observed capacities/lengths (Vitter Algorithm R, bounded to CAPTRACK_SAMPLE_CAP, default 4096). Pushed on every Drop, into_iter, or cap_inspect_at call:
  - For capacity-based collections (Vec, VecDeque, String, HashMap, HashSet, IndexMap, IndexSet, BytesMut, SmallVec, hashbrown::HashMap, BinaryHeap): records inner.capacity(). This is the allocated backing-store size, which grows monotonically, so the final value equals the peak.
  - For length-based collections (BTreeMap, BTreeSet, DashMap, scc::HashMap, scc::HashSet, scc::TreeIndex): records inner.len() at the moment of Drop. This is NOT peak occupancy: if the collection shrinks before Drop, the recorded value undercounts the true peak.
- total_observed: the true population count of all capacity observations (including those beyond the reservoir cap). When total_observed > samples.len() the reservoir is saturated and percentile statistics should be read as approximate (±5% for uniform distributions, per regression tests).
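The bounded sampling described above can be sketched in a few lines. This is a minimal illustration of Algorithm R, not captrack's implementation: the Reservoir type, its fields, and the xorshift generator are all assumptions made so the example needs no external crates.

```rust
/// Hypothetical bounded reservoir (Algorithm R). Names are illustrative only
/// and are not part of captrack's API.
pub struct Reservoir {
    pub cap: usize,
    pub samples: Vec<u64>,
    pub total_observed: u64,
    rng_state: u64,
}

impl Reservoir {
    pub fn new(cap: usize, seed: u64) -> Self {
        Reservoir {
            cap,
            samples: Vec::with_capacity(cap),
            total_observed: 0,
            rng_state: seed | 1, // xorshift must not start at zero
        }
    }

    /// xorshift64 step: a tiny deterministic PRNG (an assumption of this sketch).
    fn next_u64(&mut self) -> u64 {
        let mut x = self.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.rng_state = x;
        x
    }

    /// Keep the first `cap` values, then replace a random slot with
    /// probability cap / total_observed.
    pub fn observe(&mut self, value: u64) {
        self.total_observed += 1;
        if self.samples.len() < self.cap {
            self.samples.push(value);
            return;
        }
        // Pick j uniformly in [0, total_observed); keep the value only if j
        // lands inside the reservoir. Modulo bias is ignored in this sketch.
        let j = self.next_u64() % self.total_observed;
        if j < self.cap as u64 {
            self.samples[j as usize] = value;
        }
    }

    /// True once more values were observed than the reservoir can hold.
    pub fn is_saturated(&self) -> bool {
        self.total_observed > self.samples.len() as u64
    }
}
```

Once is_saturated() returns true, the samples are a random subset of everything observed, which is why percentiles computed from them are approximate.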
At the end of a benchmark, call captrack::dump_capacity_stats("path.json") to
write a sorted JSON report.
total_observed is omitted when the reservoir was never saturated
(total_observed == samples.len()), keeping the format backward-compatible
with pre-reservoir readers.
Entries are sorted by max(samples) descending so the largest allocations
surface first. The samples list is a reservoir snapshot; aggregate
statistics (median, p95, p99, stddev) are computed in post-processing:
use SampleStats;
// After deserialising the JSON — for each entry:
if let Some = from_samples
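For readers who want to see what such post-processing involves, here is a dependency-free sketch. The Summary type and summarize function are hypothetical and are not captrack's SampleStats. The sketch assumes nearest-rank percentiles and a population standard deviation, which may differ from what the crate computes.

```rust
/// Hypothetical summary type; captrack's own SampleStats may differ.
#[derive(Debug, PartialEq)]
pub struct Summary {
    pub median: u64,
    pub p95: u64,
    pub p99: u64,
    pub max: u64,
    pub stddev: f64,
}

/// Nearest-rank percentile over an already sorted, non-empty slice.
fn percentile(sorted: &[u64], pct: usize) -> u64 {
    // rank = ceil(pct / 100 * n), clamped to at least 1.
    let rank = (pct * sorted.len() + 99) / 100;
    sorted[rank.max(1) - 1]
}

/// Returns None for an empty sample list, mirroring an Option-style API.
pub fn summarize(samples: &[u64]) -> Option<Summary> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let n = sorted.len() as f64;
    let mean = sorted.iter().map(|&v| v as f64).sum::<f64>() / n;
    let variance = sorted
        .iter()
        .map(|&v| (v as f64 - mean).powi(2))
        .sum::<f64>()
        / n;
    Some(Summary {
        median: percentile(&sorted, 50), // lower middle value for even n
        p95: percentile(&sorted, 95),
        p99: percentile(&sorted, 99),
        max: *sorted.last().unwrap(),
        stddev: variance.sqrt(),
    })
}
```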
Set CAPTRACK_SAMPLE_CAP at process start to override the reservoir size
(e.g. CAPTRACK_SAMPLE_CAP=128 for minimal memory, CAPTRACK_SAMPLE_CAP=65536
for high-accuracy profiling of skewed distributions).
Use the data to replace guesses like Vec::with_capacity(16) with
data-driven values.
Quick start
[dependencies]
captrack = "0.1"
indexmap = "2"   # for tmap!/tset!
# To use TrackedIndexMap as a type alias even without telemetry:
# captrack = { version = "0.1", features = ["indexmap"] }
use captrack::{tbtreemap, tmap, tvec};
// Named, zero-overhead in production:
let mut v = tvec!("name", cap);
let mut m = tmap!("name", cap);
let mut b = tbtreemap!("name", cap);
Three orthogonal axes
Axis 1 — telemetry on/off
# Enable telemetry (e.g. in a benchmark profile):
[dependencies]
captrack = { version = "0.1", features = ["telemetry"] }
Off (default) = zero overhead, bare constructors. TrackedX names are still
available as type aliases to the underlying std/third-party types via
src/aliases.rs.
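To make the telemetry-off behaviour concrete, here is a minimal sketch of how a named macro can collapse to the bare constructor. The sketch_tvec! macro and the AliasTrackedVec alias are hypothetical stand-ins written for this example; captrack's real macro definitions are not shown in this document and may look different.

```rust
/// Hypothetical alias: with telemetry off, the "tracked" name is just Vec.
pub type AliasTrackedVec<T> = Vec<T>;

/// Hypothetical stand-in for a t*!-style macro in telemetry-off mode.
/// The name literal is accepted at the call-site but never used at runtime,
/// so the expansion is exactly Vec::with_capacity(cap).
macro_rules! sketch_tvec {
    ($name:literal, $cap:expr) => {
        Vec::with_capacity($cap)
    };
}

pub fn build_rows() -> AliasTrackedVec<u32> {
    let mut rows = sketch_tvec!("my_module/rows", 64);
    rows.extend([1, 2, 3]);
    rows
}
```

Because the alias and the macro both resolve to plain Vec here, the function signature does not need any #[cfg] switch.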
On = Tracked* wrapper structs, global lock-free registry (scc::HashMap
keyed by call-site location), JSON dump. Enabling telemetry automatically
activates all optional mirror features (bytes, indexmap, dashmap, scc,
smallvec, hashbrown).
The TrackedX alias mirror features let you use e.g. TrackedIndexMap in
off-feature mode without pulling in telemetry overhead:
[dependencies]
captrack = { version = "0.1", features = ["indexmap"] }  # TrackedIndexMap alias, no telemetry
// Works in both modes — no #[cfg] needed:
captrack::dump_capacity_stats("path.json")?;
Axis 2 — hasher choice
Three levels, from coarsest to finest:
Level A — global default via feature flag
| Feature | CapHasher |
|---|---|
| (none) | RandomState (DoS-safe, std default) |
| fxhash | fxhash::FxBuildHasher (fast, non-cryptographic) |
| ahash | ahash::RandomState |
| foldhash | foldhash::fast::RandomState |
| rustc-hash | rustc_hash::FxBuildHasher |
Select at most one — compile_error! fires otherwise.
# Your Cargo.toml:
captrack = { version = "0.1", features = ["ahash"] }
// All hash macros now use ahash as the default:
let m = tmap!("name", cap);
Level B — per-call override via ;-arm
use ;
// uses CapHasher (global default):
let m1 = tmap!;
// this one call uses ahash regardless of CapHasher:
let m2 = tmap!;
All 8 hash macros (tfxmap!, tfxset!, tmap!, tset!, tdashmap!,
tsccmap!, tsccset!, thashbrownmap!) support the ;-arm.
Level C — custom family via declare_collections!
// In your crate root (once) — requires captrack in [dependencies]:
declare_collections!
// Generated macros:
// my_vec! my_vecdeque! my_btreemap! my_btreeset! my_bytesmut!
// my_fxmap! my_fxset! my_map! my_set!
// my_dashmap! my_sccmap! my_sccset! my_scctree!
let rows = my_vec!;
let index = my_map!;
// index uses MyExoticHasher by default
The generated macros delegate to ::captrack::t*! with the custom hasher
injected via the ;-arm. The telemetry on/off decision is made by captrack's
own feature flag, not yours.
Axis 3 — enforcing the discipline (clippy)
Copy clippy.toml.example (fully or partially) into your project's
clippy.toml to ban bare collection constructors. The captrack macros carry
#[allow(clippy::disallowed_methods, clippy::disallowed_types)] internally so
they are always safe — the ban applies only to hand-written bare constructors.
# clippy.toml (your project) — partial example:
disallowed-methods = [
    { path = "std::vec::Vec::with_capacity", reason = "use captrack::tvec!(\"name\", cap)" },
    { path = "std::collections::HashMap::with_capacity_and_hasher", reason = "use captrack::tfxmap!(\"name\", cap)" },
    # ... see clippy.toml.example for the full list
]
All 17 macros
| Macro | Collection | Notes |
|---|---|---|
| tvec! | Vec<T> | |
| tvecdeque! | VecDeque<T> | |
| tbtreemap! | BTreeMap<K,V> | cap hint accepted, ignored |
| tbtreeset! | BTreeSet<T> | cap hint accepted, ignored |
| tbytesmut! | bytes::BytesMut | requires bytes crate |
| tfxmap! | std::HashMap<K,V,S> | ;-arm supported |
| tfxset! | std::HashSet<T,S> | ;-arm supported |
| tmap! | indexmap::IndexMap<K,V,S> | ;-arm, requires indexmap |
| tset! | indexmap::IndexSet<T,S> | ;-arm, requires indexmap |
| tdashmap! | dashmap::DashMap<K,V,S> | ;-arm, requires dashmap |
| tsccmap! | scc::HashMap<K,V,S> | ;-arm, requires scc |
| tsccset! | scc::HashSet<T,S> | ;-arm, requires scc |
| tscctree! | scc::TreeIndex<K,V> | cap hint accepted, ignored |
| tstring! | String | capacity-based |
| tbinaryheap! | BinaryHeap<T> | capacity-based |
| thashbrownmap! | hashbrown::HashMap<K,V,S> | ;-arm supported; capacity-based |
| tsmallvec! | smallvec::SmallVec<A> | requires smallvec crate; capacity-based |
t*_owned! — initial-capacity-only siblings
Ten of the macros above have an _owned sibling: tvec_owned!,
tvecdeque_owned!, tbytesmut_owned!, tfxmap_owned!, tfxset_owned!,
tmap_owned!, tset_owned!, tdashmap_owned!, tsccmap_owned!,
tsccset_owned!. Unlike their t*! counterparts, these:
- always return the bare collection type (Vec<T>, HashMap<K,V,S>, …) in both feature modes: no Tracked* wrapper, no .into_inner() call;
- record only the initial requested capacity as a single sample, instead of tracking the Drop-time peak.
Use them at call-sites where the capacity you pass in already is the final
size (e.g. tvec_owned!("name", input.len())) and you want a plain
collection at the function boundary. tbtreemap!, tbtreeset!, and
tscctree! have no _owned variant — those types have no with_capacity
constructor, so an initial-capacity sample would always be 0.
Tracked types (telemetry mode)
When telemetry is enabled the macros return Tracked* wrappers:
TrackedVec<T>, TrackedHashMap<K,V,S>, TrackedIndexMap<K,V,S>, etc.
All wrappers implement Deref/DerefMut to the underlying collection so
existing code continues to work transparently.
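The wrapper pattern can be illustrated with a small, self-contained sketch. SketchTrackedVec and the DROPPED log are hypothetical names invented for this example. Where captrack is described as using a lock-free scc::HashMap keyed by call-site location, the sketch substitutes a Mutex-guarded Vec to stay dependency-free.

```rust
use std::ops::{Deref, DerefMut};
use std::sync::Mutex;

/// Hypothetical global log of (call-site name, capacity at drop).
/// A Mutex<Vec> stands in for a real registry in this sketch.
pub static DROPPED: Mutex<Vec<(&'static str, usize)>> = Mutex::new(Vec::new());

/// Hypothetical wrapper; not captrack's TrackedVec.
pub struct SketchTrackedVec<T> {
    site: &'static str,
    inner: Vec<T>,
}

impl<T> SketchTrackedVec<T> {
    pub fn with_capacity(site: &'static str, cap: usize) -> Self {
        SketchTrackedVec { site, inner: Vec::with_capacity(cap) }
    }
}

impl<T> Deref for SketchTrackedVec<T> {
    type Target = Vec<T>;
    fn deref(&self) -> &Vec<T> {
        &self.inner
    }
}

impl<T> DerefMut for SketchTrackedVec<T> {
    fn deref_mut(&mut self) -> &mut Vec<T> {
        &mut self.inner
    }
}

impl<T> Drop for SketchTrackedVec<T> {
    fn drop(&mut self) {
        // Record the capacity the vector ended up with when it goes away.
        if let Ok(mut log) = DROPPED.lock() {
            log.push((self.site, self.inner.capacity()));
        }
    }
}

/// Existing Vec-style code keeps compiling against the wrapper.
pub fn fill(n: u32) -> usize {
    let mut v = SketchTrackedVec::with_capacity("demo/fill", 2);
    for i in 0..n {
        v.push(i); // resolved through DerefMut
    }
    v.len() // resolved through Deref; `v` is dropped (and logged) on return
}
```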
Profile-guided capacity optimization (captrack-pgo)
captrack-pgo is a companion CLI that closes the measure-apply loop on an
unmodified codebase — you don't adopt the t*! macros; the tool
instruments a throwaway state of your tree, profiles it, and leaves only
plain-Rust with_capacity(N) numbers behind.
Setup (once)
# The Dylint plugin compiles against rustc_private and is distributed as
# source (see "Not published on crates.io" below) — clone this repo to get it:
# → the plugin lives at captrack/captrack-pgo-lint/ (nightly toolchain
# auto-installs on first use via its rust-toolchain.toml)
Every instrument/apply/string-reuse invocation takes
--lint-path <path>/captrack/captrack-pgo-lint (or finds it automatically
when it's a sibling directory of your current directory).
End-to-end walkthrough
The full manual pipeline, step by step. This is the same sequence the
one-command measure orchestration runs, but split out so you can profile
tests as well as benches (test suites often exercise far more call-sites
than benches do) and inspect intermediate state:
# 1. Wire: add captrack (with the telemetry feature) to every member's
# Cargo.toml. Reverted by `unwire` at the end.
# 2. Instrument: the Dylint plugin wraps every collection constructor in
# TrackedX::wrap_from(...) — tests and benches included (--all-targets).
# Reverted by `uninstrument`.
# 3. Run whatever exercises your real workloads. Each process writes its own
# dump file (profile-<binary>-<pid>-<start_ms>.json) into CAPTRACK_DUMP_DIR.
# Use an absolute path and a FRESH directory per profiling session — dumps
# accumulate and merge picks up everything matching the glob.
# 4. Merge all per-process dumps into one profile.
# 5. Restore the tree — instrumentation was throwaway state.
# 6. (Optional) classify per-site distributions and inject per-site policies.
# 7. Review, then apply. --force is needed here: the staleness guard compares
# against the instrument-time snapshot, and step 5 legitimately changed the
# files back.
# 8. Verify, and keep what you like.
Notes:
- The lint's rules engine is conservative by design: sites with fewer than 10 observations, dynamic capacity expressions (Vec::with_capacity(n) where n is a runtime value), and sites whose observed peak is within 4× of the current literal are all left alone. A large profiled workspace legitimately producing only a handful of rewrites usually means the code was already well-sized; check apply --dry-run output for the per-site reasoning.
- vec![x; n] and vec![a, b, c] are never rewritten (their length is fixed by the macro arguments); the empty vec![] is treated as Vec::new().
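The three skip rules in the first note can be read as a small decision function. The sketch below is one possible interpretation, not the lint's actual code: the Decision and CurrentCap types are hypothetical, and "within 4×" is assumed to mean that neither the peak nor the literal exceeds four times the other.

```rust
#[derive(Debug, PartialEq)]
pub enum Decision {
    Rewrite,
    Skip(&'static str),
}

/// How the existing constructor states its capacity (hypothetical model).
pub enum CurrentCap {
    Bare,         // e.g. Vec::new()
    Literal(u64), // e.g. Vec::with_capacity(16)
    Dynamic,      // e.g. Vec::with_capacity(n) with a runtime n
}

pub fn decide(observations: u64, peak: u64, current: &CurrentCap) -> Decision {
    // Rule 1: too little data to trust the profile.
    if observations < 10 {
        return Decision::Skip("too few observations");
    }
    match current {
        // Rule 2: the author already sizes this site at runtime.
        CurrentCap::Dynamic => Decision::Skip("dynamic capacity expression"),
        // Rule 3 (assumed reading): literal and peak are within a factor of 4.
        CurrentCap::Literal(lit)
            if peak <= lit.saturating_mul(4) && *lit <= peak.saturating_mul(4) =>
        {
            Decision::Skip("peak within 4x of literal")
        }
        _ => Decision::Rewrite,
    }
}
```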
One-command variant (measure, bench-only)
When benches alone are representative, measure runs steps 1–5 in one
RAII-guarded command (cleanup runs even on panic):
captrack-pgo measure --workspace <path> \
--bench tx_pipeline --bench wal_append --bench filter_eval
# → target/captrack-pgo/merged.json
Internally: wire → instrument → for each --bench, cargo bench --no-run then run the binary with CAPTRACK_DUMP_DIR set + a per-bench
timeout → merge the per-bench dumps into one profile → uninstrument →
unwire. Pass --captrack-path <checkout> to wire against a local
captrack checkout instead of the crates.io release.
Analyze — distribution shapes
Once you have a profile, you can optionally classify each site's distribution shape and inject tailored cap policies before apply:
captrack-pgo analyze --profile merged.json --write-policy
# UnimodalTight → cap_from=max (zero realloc, minimal waste)
# UnimodalSpread → cap_from=p95 (global default)
# Bimodal → cap_from=median × 2.0 (size typical case)
# HeavyTail → cap_from=p95 × 0.5 (don't follow the tail)
# MostlyZero → cap_from=p99 over non-zero subset
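The shape-to-policy table above maps directly onto a match expression. The following sketch restates that mapping in Rust. The Shape and SitePolicy types and the nonzero_only field are hypothetical names chosen for illustration, and how the tool actually classifies a distribution is not covered here.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Shape {
    UnimodalTight,
    UnimodalSpread,
    Bimodal,
    HeavyTail,
    MostlyZero,
}

/// Hypothetical policy record; field names follow the CLI flag names.
#[derive(Debug, PartialEq)]
pub struct SitePolicy {
    pub cap_from: &'static str,
    pub cap_mul: f64,
    /// Assumed flag: compute the statistic over non-zero samples only.
    pub nonzero_only: bool,
}

pub fn policy_for(shape: Shape) -> SitePolicy {
    let (cap_from, cap_mul, nonzero_only) = match shape {
        Shape::UnimodalTight => ("max", 1.0, false),  // zero realloc, minimal waste
        Shape::UnimodalSpread => ("p95", 1.0, false), // global default
        Shape::Bimodal => ("median", 2.0, false),     // size the typical case
        Shape::HeavyTail => ("p95", 0.5, false),      // do not follow the tail
        Shape::MostlyZero => ("p99", 1.0, true),      // p99 over non-zero subset
    };
    SitePolicy { cap_from, cap_mul, nonzero_only }
}
```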
Apply — flags and hasher swap
captrack-pgo apply --profile merged.json [--workspace <path>] \
[--hasher fx|ahash|foldhash|none] \
[--cap-from max|mean|median|p95|p99] \
[--cap-mul <float>] \
[--cap-round pow2|to8|exact] \
[--dry-run]
- Without --hasher (or --hasher none): only capacity hints are updated (Vec::new() → Vec::with_capacity(N), etc.).
- With --hasher fx: matched HashMap / HashSet / IndexMap / IndexSet / DashMap / scc::HashMap / scc::HashSet / hashbrown::HashMap constructors are upgraded to with_capacity_and_hasher(N, ::fxhash::FxBuildHasher::default()). Other hasher options: ahash (::ahash::RandomState::new()) and foldhash (::foldhash::fast::RandomState::default()).
- The chosen hasher crate must already be a dependency of your workspace (captrack-pgo emits a reminder).
- Type-ascribed lets are rewritten via multi-span suggestions (Phase N): let m: HashMap<K, V> = HashMap::new(); becomes let m: HashMap<K, V, FxBuildHasher> = HashMap::with_capacity_and_hasher(N, FxBuildHasher::default()); The ascription's generic argument list and the constructor are rewritten atomically through cargo fix, so the diff never goes through an intermediate E0308 state.
- Already-fast hashers are detected and skipped (Phase O): a binding ascribed HashMap<K, V, FxBuildHasher> or any BuildHasherDefault<FxHasher> alias (e.g. shamir-db's THasher) is recognised as already-fast and the hasher-swap suggestion is suppressed. The capacity-only rewrite still fires.
Apply — capacity policy knobs
Defaults reproduce the pre-M11 behaviour exactly (next_pow2(p95)):
| Flag | Env var | Values | Default |
|---|---|---|---|
| --cap-from | CAPTRACK_PGO_CAP_FROM | max, mean, median, p95, p99 | p95 |
| --cap-mul | CAPTRACK_PGO_CAP_MUL | any float > 0 | 1.0 |
| --cap-round | CAPTRACK_PGO_CAP_ROUND | pow2, to8, exact | pow2 |
Formula: cap = round_mode( source_statistic × cap_mul ).
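A worked version of the formula may help when choosing flags. The sketch below is illustrative rather than the tool's source. It assumes the scaled statistic is rounded up to an integer before the rounding mode is applied, and that to8 means "round up to the next multiple of 8". Round and compute_cap are hypothetical names.

```rust
/// Hypothetical mirror of the --cap-round values.
#[derive(Debug, Clone, Copy)]
pub enum Round {
    Pow2,
    To8,
    Exact,
}

/// cap = round_mode(source_statistic × cap_mul), under the stated assumptions.
pub fn compute_cap(statistic: u64, cap_mul: f64, round: Round) -> u64 {
    // Scale first, rounding any fractional result upward.
    let scaled = (statistic as f64 * cap_mul).ceil() as u64;
    match round {
        Round::Pow2 => scaled.next_power_of_two(),
        Round::To8 => (scaled + 7) / 8 * 8, // assumed: next multiple of 8
        Round::Exact => scaled,
    }
}
```

With the defaults (p95, multiplier 1.0, pow2), a site whose p95 is 100 would get a capacity of 128 under these assumptions; the same site with to8 would get 104.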
Examples:
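# Never reallocate (capacity = peak observed value):
captrack-pgo apply --profile merged.json --cap-from max --cap-round exact
# Conservative: median × 2, rounded to next power of two:
captrack-pgo apply --profile merged.json --cap-from median --cap-mul 2.0 --cap-round pow2
# Exact 99th-percentile value (no rounding):
captrack-pgo apply --profile merged.json --cap-from p99 --cap-round exact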
Per-site policy override in the profile JSON:
Individual hot-path sites can override the global policy by adding a policy
field to their entry in the profile JSON. Each sub-field is independent;
missing ones fall back to the global CLI defaults.
A site with "policy": { "cap_from": "max" } will always use its peak value,
regardless of the --cap-from flag passed on the command line. Other fields
(cap_mul, cap_round) not listed in the per-site policy still come from
the CLI defaults.
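The field-by-field fallback can be pictured as merging optional overrides onto the global policy. This is a sketch of that merge logic only. Policy, PolicyOverride, and effective are hypothetical names, and the real profile is JSON rather than Rust structs.

```rust
/// Global policy as resolved from CLI flags or env vars (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Policy {
    pub cap_from: String,
    pub cap_mul: f64,
    pub cap_round: String,
}

/// Per-site "policy" object: every sub-field is optional (hypothetical type).
#[derive(Debug, Default)]
pub struct PolicyOverride {
    pub cap_from: Option<String>,
    pub cap_mul: Option<f64>,
    pub cap_round: Option<String>,
}

/// Each present field wins; each missing field falls back to the global value.
pub fn effective(global: &Policy, site: &PolicyOverride) -> Policy {
    Policy {
        cap_from: site
            .cap_from
            .clone()
            .unwrap_or_else(|| global.cap_from.clone()),
        cap_mul: site.cap_mul.unwrap_or(global.cap_mul),
        cap_round: site
            .cap_round
            .clone()
            .unwrap_or_else(|| global.cap_round.clone()),
    }
}
```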
Internally this runs cargo dylint --fix with the captrack-pgo-lint
plugin, which resolves collection constructors at the semantic (HIR) level
and emits rustfix suggestions. A before/after manifest is written to
target/captrack-pgo/last-apply.json.
Undo
Revert the last apply or instrument:
captrack-pgo undo
String buffer-reuse (independent of the profile)
Rewrite a String that is fully reassigned inside a loop
(s = format!(...)) into buffer-reusing { s.clear(); s.push_str(&(...)); }
form:
captrack-pgo string-reuse --workspace <path> --dry-run
Gated by CAPTRACK_PGO_STRING_REUSE=1. Conservative: fires only for a
single top-level s = <expr>; in a loop body where <expr> does not read
s. The suggestion is MaybeIncorrect, so review it (--dry-run) before
applying. See CLAUDE.md → "String buffer-reuse lint" for details.
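A before/after pair shows the shape of this rewrite. The two functions below are written for illustration (their names and the row-{id} format string are assumptions). The second uses the { s.clear(); s.push_str(&(...)); } form described above, which keeps the buffer owned by s alive across iterations.

```rust
/// Before: `s` is fully reassigned each iteration, so every pass replaces
/// the old String with a freshly allocated one.
pub fn label_lens_reassign(ids: &[u32]) -> (Vec<usize>, String) {
    let mut s = String::new();
    let mut lens = Vec::with_capacity(ids.len());
    for id in ids {
        s = format!("row-{id}");
        lens.push(s.len());
    }
    (lens, s)
}

/// After: the documented rewrite shape. The right-hand side does not read
/// `s`, so clearing it first and appending gives the same contents.
pub fn label_lens_reuse(ids: &[u32]) -> (Vec<usize>, String) {
    let mut s = String::new();
    let mut lens = Vec::with_capacity(ids.len());
    for id in ids {
        { s.clear(); s.push_str(&(format!("row-{id}"))); }
        lens.push(s.len());
    }
    (lens, s)
}
```

Both functions return the same values. The equivalence only holds because the expression never reads s, which matches the condition under which the lint fires.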
Architecture — wrap_from + CapInspect
captrack-pgo instrument does NOT replace the constructor. It WRAPS
the original expression:
// before
let v = vec!;
// after
let v = wrap_from;
This is universal — it works for any constructor shape (X::new(),
X::with_capacity(N), X::default(), vec![…], smallvec![…],
X::from_iter(…), custom builders) without losing element
initialisation. TrackedX derefs to X, so receiver / &v /
indexing positions continue to compile against the wrapped value.
The data-flow guard skips by-value escape positions (return, struct
field init, function argument, type-ascribed let init) because
TrackedX != X there would trip E0308. For those sites, the lint
instead injects a borrow-only inspection at the consumption point
(Phase L), leaving the binding's type untouched:
// before
// after
The block expression evaluates cap_inspect_at(&v, …) for its
side-effect (record v.capacity() against the binding's call-site)
then yields v unchanged. Type stays Vec<u8>; no E0308.
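The borrow-only inspection can be sketched as follows. sketch_cap_inspect_at and the OBSERVED log are hypothetical stand-ins. The real cap_inspect_at signature is not shown in this document, so the argument list here is an assumption.

```rust
use std::cell::RefCell;

thread_local! {
    /// Hypothetical per-thread log of (site, capacity) observations.
    static OBSERVED: RefCell<Vec<(&'static str, usize)>> = RefCell::new(Vec::new());
}

/// Hypothetical stand-in for cap_inspect_at: borrows the collection,
/// records its capacity, and returns nothing.
pub fn sketch_cap_inspect_at(v: &Vec<u8>, site: &'static str) {
    OBSERVED.with(|log| log.borrow_mut().push((site, v.capacity())));
}

/// A by-value escape position: the function returns the Vec itself.
pub fn build_payload() -> Vec<u8> {
    let mut v = Vec::with_capacity(4);
    v.extend_from_slice(b"hello world");
    // Inspect through a borrow, then yield `v` unchanged. The block's type
    // is still Vec<u8>, so the return type does not change.
    { sketch_cap_inspect_at(&v, "demo/payload"); v }
}

/// Snapshot of what has been recorded on the current thread.
pub fn observed() -> Vec<(&'static str, usize)> {
    OBSERVED.with(|log| log.borrow().clone())
}
```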
Requirements: cargo install cargo-dylint dylint-link and the
nightly-2026-04-16 toolchain (pinned in captrack-pgo-lint/rust-toolchain.toml).
Not published on crates.io: captrack-pgo-lint is a Dylint plugin — it
compiles against rustc_private on a pinned nightly, which crates.io's
stable-only toolchain can't build. It's excluded from the workspace
(Cargo.toml's [workspace] exclude) and distributed as source only,
cloned alongside this repo (as a sibling directory) and built on demand via
cargo dylint --path <lint-path>. captrack-pgo instrument/apply resolve
<lint-path> automatically when captrack-pgo-lint/ sits next to
captrack-pgo/, or accept --lint-path to point elsewhere. captrack,
captrack-macros, and captrack-pgo are ordinary crates and do publish to
crates.io.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT License (LICENSE-MIT)
at your option.