What it does
mod-alloc is a #[global_allocator] wrapper that tracks every
allocation and answers four questions for the code that runs while
it is installed:
- How many allocations did this code path make?
- How many total bytes did it allocate?
- What was the peak resident memory?
- Which call sites did most of the allocating? (with the
backtraces feature)
The whole crate is std-only. No backtrace, no addr2line, no
gimli, no libc. Inline frame-pointer walking on x86_64 and
aarch64 for call-site capture; raw mmap / VirtualAlloc for
the per-thread arena and the global aggregation table.
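To make the counter tier concrete, here is a minimal sketch of a counting wrapper around the system allocator. It is not mod-alloc's implementation: `CountingAlloc`, `Snapshot`, and `snapshot()` are hypothetical names, and "peak" here means peak live heap bytes requested through the allocator, used as a stand-in for the peak figure listed above.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicU64, Ordering::Relaxed};

/// Hypothetical counting allocator (illustration only, not mod-alloc's type).
pub struct CountingAlloc {
    allocs: AtomicU64,
    total_bytes: AtomicU64,
    live_bytes: AtomicU64,
    peak_bytes: AtomicU64,
}

/// Point-in-time copy of the counters.
#[derive(Clone, Copy, Debug)]
pub struct Snapshot {
    pub allocs: u64,
    pub total_bytes: u64,
    pub live_bytes: u64,
    pub peak_bytes: u64,
}

impl CountingAlloc {
    pub const fn new() -> Self {
        Self {
            allocs: AtomicU64::new(0),
            total_bytes: AtomicU64::new(0),
            live_bytes: AtomicU64::new(0),
            peak_bytes: AtomicU64::new(0),
        }
    }

    pub fn snapshot(&self) -> Snapshot {
        Snapshot {
            allocs: self.allocs.load(Relaxed),
            total_bytes: self.total_bytes.load(Relaxed),
            live_bytes: self.live_bytes.load(Relaxed),
            peak_bytes: self.peak_bytes.load(Relaxed),
        }
    }
}

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Forward to the system allocator, then update counters on success.
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            let size = layout.size() as u64;
            self.allocs.fetch_add(1, Relaxed);
            self.total_bytes.fetch_add(size, Relaxed);
            let live = self.live_bytes.fetch_add(size, Relaxed) + size;
            // Lock-free high-water mark.
            self.peak_bytes.fetch_max(live, Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.live_bytes.fetch_sub(layout.size() as u64, Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc::new();

fn main() {
    let before = GLOBAL.snapshot();
    let buf: Vec<u8> = Vec::with_capacity(1024);
    black_box(&buf); // keep the allocation from being optimised away
    let after = GLOBAL.snapshot();
    println!("allocations: {}", after.allocs - before.allocs);
    println!("bytes: {}", after.total_bytes - before.total_bytes);
    println!("peak live bytes: {}", after.peak_bytes);
}
```

Only atomics are touched inside `alloc` and `dealloc`, so the hooks never allocate and cannot re-enter themselves. The default `realloc` provided by `GlobalAlloc` routes through these two methods, so resizes are counted as well.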
Quick start
use mod_alloc::ModAlloc;

#[global_allocator]
static GLOBAL: ModAlloc = ModAlloc::new();
Feature flags
[dependencies]
mod-alloc = "0.9" # Tier 1: counters (default)
mod-alloc = { version = "0.9", features = ["backtraces"] } # Tier 2: call-site capture
mod-alloc = { version = "0.9", features = ["dhat-compat"] } # Tier 3: DHAT JSON output (v0.9.3)
| Feature | What it adds | Status |
|---|---|---|
| counters | Four lock-free counters via GlobalAlloc (default) | shipped (v0.9.0) |
| backtraces | Inline FP walk + per-call-site aggregation | shipped (v0.9.1) |
| dhat-compat | Emit JSON for the official DHAT viewer | planned (v0.9.3) |
Backtraces
Enabling the backtraces feature requires frame pointers in the
caller's build:
# .cargo/config.toml
[build]
rustflags = ["-C", "force-frame-pointers=yes"]
The crate's build.rs emits a cargo:warning= at compile time if
RUSTFLAGS does not include this flag. Without it the walker degrades
gracefully: it returns shallow or empty traces instead of crashing.
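The graceful degradation described above can be illustrated with a simulated walk. The sketch below does not read real registers or memory and is not the crate's walker: it models the stack as a slice of words in which each frame record holds the caller's record index followed by a return address, and `walk_frames` is a hypothetical name. The point is the validation: any record that falls outside the stack or points the wrong way ends the walk with whatever was collected so far.

```rust
/// Walk a chain of frame records in a mock stack (simulation only).
/// Layout assumed for each record at index `fp`:
///   stack[fp]     = index of the caller's record (the "saved frame pointer")
///   stack[fp + 1] = return address (0 marks the end of the chain)
fn walk_frames(stack: &[usize], mut fp: usize, max_depth: usize) -> Vec<usize> {
    let mut trace = Vec::new();
    while trace.len() < max_depth {
        // Both words of the record must lie inside the stack.
        let Some(ret_idx) = fp.checked_add(1) else { break };
        let (Some(&next), Some(&ret)) = (stack.get(fp), stack.get(ret_idx)) else {
            break;
        };
        if ret == 0 {
            break;
        }
        trace.push(ret);
        // Callers live at higher addresses; anything else is a bad or final link.
        if next <= fp {
            break;
        }
        fp = next;
    }
    trace
}

fn main() {
    // Three well-formed frames chained 0 -> 2 -> 4.
    let good = [2usize, 0x1000, 4, 0x2000, 0, 0x3000];
    println!("full trace:    {:#x?}", walk_frames(&good, 0, 16));

    // A garbage saved pointer (99) yields a shallow trace instead of a crash.
    let broken = [99usize, 0x1000];
    println!("shallow trace: {:#x?}", walk_frames(&broken, 0, 16));

    // A starting pointer outside the stack yields an empty trace.
    println!("empty trace:   {:#x?}", walk_frames(&broken, 50, 16));
}
```

A real walker would apply the same kind of bounds and monotonicity checks to raw addresses before dereferencing them, which is why a build without frame pointers tends to produce short traces rather than faults.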
The aggregation-table size is configurable at process start:
MOD_ALLOC_BUCKETS=16384
The default is 4,096 buckets (~384 KB). Accepted values range from
64 to 1,048,576 and are rounded up to the next power of two.
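As a rough sketch of that sizing rule, the helper below resolves a bucket count from an optional setting. `resolve_bucket_count` is a hypothetical name, and two behaviours are assumptions rather than documented facts: out-of-range values are clamped to the nearest bound, and unparsable values fall back to the default.

```rust
/// Hypothetical helper, not part of mod-alloc's API.
fn resolve_bucket_count(raw: Option<&str>) -> usize {
    const DEFAULT: usize = 4_096;
    const MIN: usize = 64;
    const MAX: usize = 1_048_576;

    let requested = raw
        .and_then(|s| s.trim().parse::<usize>().ok())
        .unwrap_or(DEFAULT); // assumption: bad input falls back to the default

    // Assumption: clamp first, then round up. MAX is itself a power of two,
    // so rounding never pushes the result past the upper bound.
    requested.clamp(MIN, MAX).next_power_of_two()
}

fn main() {
    let from_env = std::env::var("MOD_ALLOC_BUCKETS").ok();
    let buckets = resolve_bucket_count(from_env.as_deref());
    println!("aggregation table: {buckets} buckets");
}
```

Under these assumptions a request for 5,000 buckets resolves to 8,192. A power-of-two size lets a table reduce a hash to a slot with a bit mask instead of a division.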
Performance
Measured per allocation, end to end, on a Windows x86_64 dev host
with cargo run --release --example bench_overhead:
| Build | Per alloc + dealloc cycle |
|---|---|
| Tier 1 only (counters, default) | 34.9 ns |
| Tier 1 + Tier 2 (backtraces) | ~1,950 ns |
Tier 1 comes in well under the 50 ns target from the spec
(REPS.md section 6). Tier 2 is currently roughly 10x above the
200 ns target in that section; closing that gap is tracked for
v0.9.1.1. The Tier 2 path is correct and recursion-safe in the
current release; the optimisation is a separate, focused pass.
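For readers who want a comparable number on their own machine, a per-cycle figure can be approximated with a plain timing loop. This is a minimal sketch, not the crate's bench_overhead example: `ns_per_cycle` is a hypothetical helper, and the 64-byte size and iteration count are arbitrary choices.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::hint::black_box;
use std::time::Instant;

/// Average wall-clock nanoseconds for one alloc + dealloc pair,
/// measured through whatever global allocator is installed.
fn ns_per_cycle(iterations: u32, size: usize) -> f64 {
    assert!(iterations > 0 && size > 0, "need at least one non-empty allocation");
    let layout = Layout::from_size_align(size, 8).expect("valid layout");

    let start = Instant::now();
    for _ in 0..iterations {
        // SAFETY: the layout has non-zero size, and every pointer is freed
        // with the same layout it was allocated with.
        unsafe {
            let ptr = alloc(layout);
            if ptr.is_null() {
                handle_alloc_error(layout);
            }
            black_box(ptr); // stop the optimiser from eliding the pair
            dealloc(ptr, layout);
        }
    }
    start.elapsed().as_nanos() as f64 / f64::from(iterations)
}

fn main() {
    let ns = ns_per_cycle(1_000_000, 64);
    println!("{ns:.1} ns per alloc + dealloc cycle");
}
```

Running the same loop with and without a profiling allocator installed, in a release build, gives the overhead as the difference between the two results. Numbers from a loop like this vary with the host and the underlying allocator, so they are only indicative.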
Why a new allocation profiler
dhat is the de facto standard for allocation profiling in Rust,
but its dependency chain (backtrace 0.3.76 → addr2line 0.25.1)
forces consumers to MSRV 1.85+. For projects with a broader MSRV
target, that cost is real.
mod-alloc provides the same core capability with inline
backtrace capture (frame-pointer-based, x86_64 + aarch64) and
no external dependencies. The trade-off is fewer architectures
supported in 1.0; ARM32, RISC-V, and others land based on
demand.
Status
| Milestone | Version | State |
|---|---|---|
| Name-claim placeholder | v0.1.0 | shipped |
| Real GlobalAlloc + Tier 1 counters | v0.9.0 | shipped |
| Tier 2: inline backtrace capture | v0.9.1 | shipped |
| Tier 2 perf optimisation | v0.9.1.1 | planned |
| Symbolication for reports | v0.9.2 | planned |
| Tier 3: DHAT-compatible JSON output | v0.9.3 | planned |
| dev-bench integration (drop dhat) | v0.9.4 | planned |
| Stable API (1.0) | v1.0.0 | planned |
The 1.0 release freezes the public API and the wire format.
Breaking changes after that require a major bump.
Out of scope
- Replacing the system allocator. Use mimalloc or jemallocator for that.
- Use-after-free / double-free detection. Use AddressSanitizer.
- Source-level instrumentation (build.rs, proc macros). The one build.rs in this crate exists solely to detect missing frame pointers at compile time.
Minimum supported Rust version
1.75, pinned in Cargo.toml and verified by CI on every push.
License
Apache-2.0. See LICENSE.