# hypomnesis
ὑπόμνησις — External RAM and VRAM, measured.
🚀 **0.2.1** is the first dogfooding-driven patch. Five wear-and-tear additions surfaced by `hf-fetch-model` 0.10.1's adoption: a `test-helpers`-gated `GpuDeviceInfoBuilder` so downstream tests can synthesise `#[non_exhaustive]` `GpuDeviceInfo` fixtures; a `name_or_unknown()` convenience to settle consumer divergence on the fallback phrase; `format_total` / `format_used` parity helpers for `report`-feature consumers; a `HypomnesisError` Display-vs-structured-fields contract codified in the doc-comment; and a `README.md` "Used by" + brief refresh. All additive under the `#[non_exhaustive]` policy carried over from v0.2.0 (`Snapshot::all`, `gpu_processes`, the `hmn` CLI, `format_free` / `print_free`). See `CHANGELOG.md` for the v0.2.1 entry, `docs/roadmap-v0.2.1.md` for the wave-by-wave rationale, and `docs/hypomnesis-adoption.md` for the underlying dogfooding report.
## Table of Contents

- [Install](#install)
- [Usage](#usage)
- [Binary (`hmn`)](#binary-hmn)
- [Capabilities](#capabilities)
- [Feature Flags](#feature-flags)
- [Used by](#used-by)
- [License](#license)
- [Development](#development)
## Install
```toml
[dependencies]
hypomnesis = "0.2"
```
The default feature set (`nvml`, `dxgi`, `nvidia-smi-fallback`) covers process RSS and per-process / device-wide GPU memory on both Windows (`IDXGIAdapter3` + NVML) and Linux (NVML), with a `nvidia-smi` subprocess fallback. The `dxgi` dependency on the `windows` crate is target-conditional — Linux users pay nothing for it.
For candle-mi-compatible delta and printing helpers (`MemoryReport`, `print_delta`, `print_before_after`, `ram_mb`, `vram_mb`):
```toml
hypomnesis = { version = "0.2", features = ["report"] }
```
For a stripped-down build (process RSS only, no GPU backends):
```toml
hypomnesis = { version = "0.2", default-features = false }
```
## Usage
```rust
use hypomnesis::Snapshot;
```
Expected output (RTX 5060 Ti, Windows, idle process):
```text
RAM: 142475264 bytes
GPU 0 [NVIDIA GeForce RTX 5060 Ti]: 1.8 / 16.0 GiB used
This process: 119 MiB (per-process)
```
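The `GiB` / `MiB` figures are plain base-1024 conversions of the raw byte counts. A minimal sketch of the arithmetic, assuming nothing about the crate's own helpers (`fmt_bytes` here is hypothetical, not hypomnesis's `format_used`):

```rust
/// Format a byte count as MiB or GiB, base-1024.
/// Hypothetical helper for illustration; hypomnesis ships its own formatters.
fn fmt_bytes(bytes: u64) -> String {
    const MIB: f64 = 1024.0 * 1024.0;
    const GIB: f64 = MIB * 1024.0;
    let b = bytes as f64;
    if b >= GIB {
        format!("{:.1} GiB", b / GIB) // device-scale figures get one decimal
    } else {
        format!("{:.0} MiB", b / MIB) // per-process figures round to whole MiB
    }
}

fn main() {
    println!("{}", fmt_bytes(16 * 1024 * 1024 * 1024)); // 16 GiB device total
    println!("{}", fmt_bytes(119 * 1024 * 1024));       // ~119 MiB per-process
}
```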
## Binary (`hmn`)
hypomnesis ships a small CLI binary, `hmn`, behind the default-off `cli` feature. Install it with:

```sh
cargo install hypomnesis --features cli
```
Two subcommands:
Example default output (single NVIDIA dGPU, the maintainer's reference machine — Ryzen 9 5950X has no iGPU, so only one adapter surfaces):
```text
GPU 0 [NVIDIA GeForce RTX 5060 Ti]: free 13284 MiB / 16384 MiB
```
Illustrative output on a heterogeneous machine (NVIDIA dGPU + Intel/AMD iGPU on Windows). Not yet verified end-to-end on real hardware — see `docs/roadmap-v0.2.0.md`, "Verification plan":
```text
GPU 0 [NVIDIA GeForce RTX 5060 Ti]: free 13284 MiB / 16384 MiB
GPU 1 [Intel Iris Xe Graphics]: free 32768 MiB / 32768 MiB
```
`hmn ps` (illustrative — empty on machines with no active CUDA workload):
```text
  PID  NAME           VRAM     DEVICE
12345  lm-studio.exe  8.2 GiB  NVIDIA GeForce RTX 5060 Ti
67890  python.exe     1.4 GiB  NVIDIA GeForce RTX 5060 Ti
```
A one-line summary is written to stderr after each `hmn ps` run:
```text
hmn: 2 compute processes found.
hmn: 0 compute processes found matching pid=99 device=0.  # with filters
```
The stderr summary is always printed, even when the table is empty, so interactive users get an unambiguous "command worked, here's the count" line without breaking stdout's scriptability. Pipelines like `hmn ps | awk 'NR>1 {print $1}'` or `hmn ps --json | jq` work as expected. Redirect `2>/dev/null` to suppress the summary.
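The split itself is ordinary stdout/stderr discipline. A self-contained sketch of the idea (the `summary` helper is hypothetical, mirroring the observed output format rather than hmn's source):

```rust
use std::io::Write;

/// Build the one-line summary written to stderr after each run.
/// Hypothetical helper that mirrors hmn's observed output, not its source.
fn summary(count: usize, filters: Option<&str>) -> String {
    match filters {
        Some(f) => format!("hmn: {count} compute processes found matching {f}."),
        None => format!("hmn: {count} compute processes found."),
    }
}

fn main() {
    // Table rows go to stdout, so `hmn ps | awk ...` sees only the table...
    println!("  PID  NAME           VRAM     DEVICE");
    // ...while the count goes to stderr, suppressible with 2>/dev/null.
    writeln!(std::io::stderr(), "{}", summary(0, None)).ok();
}
```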
Limitations (intrinsic to the underlying data sources, not bugs):
- **Compute-only.** `hmn ps` enumerates only processes with an active CUDA context. Browsers using GPU compositing, games, and pure-graphics apps do not appear. This is a property of the NVML and `nvidia-smi --query-compute-apps` data sources.
- **Windows process names may be `?`.** `nvidia-smi` writes a literal `?` for protected processes whose image name it cannot read. The library preserves this as `Some("?")` rather than failing the row.
- **WDDM bug parity.** The R570 `u64::MAX` sentinel and `used > total` corruption checks that the library handles for the calling process are applied per-row in `hmn ps`; affected rows are dropped rather than reported as garbage.
- **Windows compute-process attribution is `nvidia-smi`-backed.** `IDXGIAdapter3::QueryVideoMemoryInfo` only answers for the calling process, and NVML's per-process query returns `NVML_VALUE_NOT_AVAILABLE` under WDDM. So `hmn ps` on Windows is honest-but-second-class compared to Linux's clean NVML enumeration.
## Capabilities
| Metric | Windows | Linux |
|---|---|---|
| Process RSS | `K32GetProcessMemoryInfo` | `/proc/self/status` (no `unsafe`) |
| Device-wide GPU memory | NVML (`nvml.dll`) | NVML (`libnvidia-ml.so.1`) |
| Per-process GPU memory | DXGI (`IDXGIAdapter3::QueryVideoMemoryInfo`) | NVML (`nvmlDeviceGetComputeRunningProcesses`) |
| Fallback | `nvidia-smi` subprocess | `nvidia-smi` subprocess |
hypomnesis uses `IDXGIAdapter3` on Windows because WDDM means the kernel memory manager — not the NVIDIA driver — owns GPU allocations, so NVML's per-process query returns `NOT_AVAILABLE` under Windows. DXGI 1.4 is the only reliable per-process source. On Linux, NVML's `nvmlDeviceGetComputeRunningProcesses_v3` returns true per-process figures.
The crate handles two known driver bugs out of the box:
- **NVML `u64::MAX` sentinel** — some R570-series drivers report `0xFFFFFFFFFFFFFFFF` for every running process's memory (observed on an RTX 5060 Ti). hypomnesis detects this and falls back to `nvidia-smi`.
- **`used > total` corruption** — the crate sanity-checks each per-process reading against the device-wide total and falls back to `nvidia-smi` on detected corruption.
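Both checks reduce to a pair of comparisons. A self-contained sketch (the function name and shape are hypothetical; the crate's internal handling is not public API):

```rust
/// Returns true when a per-process VRAM reading looks corrupted and the
/// nvidia-smi fallback should be used instead. Hypothetical sketch of the
/// two checks described above, not hypomnesis's internal function.
fn reading_is_corrupt(used_bytes: u64, device_total_bytes: u64) -> bool {
    // R570 sentinel: the driver reports u64::MAX (0xFFFF_FFFF_FFFF_FFFF)
    // for every process instead of a real figure.
    if used_bytes == u64::MAX {
        return true;
    }
    // Sanity check: a single process cannot use more than the device total.
    used_bytes > device_total_bytes
}

fn main() {
    let total = 16 * 1024 * 1024 * 1024; // 16 GiB device
    assert!(reading_is_corrupt(u64::MAX, total));
    assert!(reading_is_corrupt(total + 1, total));
    assert!(!reading_is_corrupt(2 * 1024 * 1024 * 1024, total));
    println!("ok");
}
```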
## Feature Flags
| Feature | Default | Description |
|---|---|---|
| `nvml` | yes | NVML dynamic load via `libloading` (Linux + Windows-WDDM device-wide) |
| `dxgi` | yes | Windows per-process VRAM via `IDXGIAdapter3` (no-op on non-Windows) |
| `nvidia-smi-fallback` | yes | Subprocess fallback when NVML / DXGI fail or are disabled |
| `report` | no | `MemoryReport` delta + `print_delta` / `print_before_after` / `ram_mb` / `vram_mb` helpers (candle-mi parity, candidate for candle-mi v0.2 migration via Cargo flag flip); `format_free` / `print_free` / `format_total` / `format_used` formatting helpers on `GpuDeviceInfo` |
| `debug-output` | no | Print raw NVML / DXGI values to stderr (diagnostic) |
| `cli` | no | Build the `hmn` CLI binary (pulls `clap` 4 as a dependency). Library users do not need this; install via `cargo install hypomnesis --features cli`. |
| `test-helpers` | no | Expose `GpuDeviceInfoBuilder` for downstream tests that need synthetic `GpuDeviceInfo` fixtures. Default-off, additive — production code must never enable it. |
## Used by
- hf-fetch-model — Hugging Face model weights and metadata fetcher (uses `device_info` for `inspect --check-gpu`)
Forthcoming: candle-mi is expected to migrate its in-tree memory module to hypomnesis (`features = ["report"]`) after v0.2.1 lands.
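If the migration lands as described, candle-mi's side of the flip would be a one-line manifest change (illustrative sketch, not a committed candle-mi manifest):

```toml
# candle-mi/Cargo.toml: hypothetical post-migration dependency
[dependencies]
hypomnesis = { version = "0.2", features = ["report"] }
```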
## License
Licensed under either of Apache License, Version 2.0 or MIT License at your option.
## Development
- Exclusively developed with Claude Code (dev) and Augment Code (review)
- Git workflow managed with Fork
- All code follows `CONVENTIONS.md`, derived from Amphigraphic-Strict's Grit — a strict Rust subset designed to improve AI coding accuracy.