#[non_exhaustive]
pub struct Snapshot {
    pub ram_bytes: u64,
    pub gpu: Option<ProcessGpuInfo>,
    pub gpu_device: Option<GpuDeviceInfo>,
}
Combined snapshot of process RAM and GPU memory state at a point in time.
Constructed via Snapshot::now (one device) or Snapshot::all
(every visible GPU). RAM measurement is mandatory; both GPU fields
are best-effort and set to None when no backend is usable.
#[non_exhaustive]: fields may be added in future releases.
Fields (Non-exhaustive)§
This struct is marked as non-exhaustive: it cannot be constructed with Struct { .. } syntax outside the defining crate, cannot be matched against without a wildcard .., and struct update syntax will not work.

ram_bytes: u64
Process resident set size in bytes.

gpu: Option<ProcessGpuInfo>
Per-process GPU memory information for the requested device. None when no GPU source is usable.

gpu_device: Option<GpuDeviceInfo>
Device-wide GPU information for the requested device. None when no GPU source is usable.
Implementations§
impl Snapshot

pub fn now(device_index: u32) -> Result<Self>
Capture a fresh snapshot of process RAM and GPU memory for the given device index.
RAM is always measured. GPU measurement failures are non-fatal —
the corresponding fields are set to None rather than producing an error.
§Performance
Each call performs a full NVML init/shutdown cycle (and, on Windows,
a fresh IDXGIFactory1 walk). This adds a few milliseconds of
overhead per call — fine for occasional sampling around training
steps or model loads, less ideal for tight per-frame polling. A
long-lived NVML context is planned for v0.2.
§Per-process vs device-wide
When DXGI (Windows) or NVML (Linux) succeeds, gpu.used_bytes
is genuinely per-process and gpu.is_per_process is true. When
the dispatcher falls back to nvidia-smi (no NVML/DXGI
available, or WDDM NVML_VALUE_NOT_AVAILABLE), gpu.used_bytes
reflects the device-wide total and gpu.is_per_process is
false. Callers that need true per-process accounting should
check is_per_process before interpreting the value.
§Errors
Returns crate::HypomnesisError::Ram if the platform RAM
query fails — including the Linux path, where /proc/self/status
read errors and VmRSS parse failures are wrapped into the
Ram variant rather than surfaced as Io.
Examples found in repository
fn main() {
    println!("--- hypomnesis print_demo ---");

    let before = Snapshot::now(0).expect("Snapshot::now failed");

    // Allocate ~50 MiB on the heap to produce a visible RAM delta.
    // Use vec![0_u8; ...] (zeroed allocation) so the OS commits pages.
    let hold: Vec<u8> = vec![0_u8; 50 * 1024 * 1024];

    let after = Snapshot::now(0).expect("Snapshot::now failed");
    let report = MemoryReport::new(before, after);

    println!("--- print_delta ---");
    report.print_delta("alloc 50 MiB");

    println!("--- print_before_after ---");
    report.print_before_after("alloc 50 MiB");

    println!("--- format_delta (returned as String, no newline added by us) ---");
    print!("{}", report.format_delta("alloc 50 MiB"));

    println!("--- format_before_after (returned as String, no newline added by us) ---");
    print!("{}", report.format_before_after("alloc 50 MiB"));

    // Keep the allocation alive until here so the after-snapshot still
    // reflects it (otherwise the optimizer could elide `hold`).
    drop(hold);
}

pub fn all() -> Result<Vec<Self>>
Capture a fresh snapshot of process RAM and GPU memory for every
visible GPU.
On Linux: enumerates NVIDIA dGPU(s) via NVML. AMD / Intel iGPUs
do not surface — there is no AMD / Intel backend yet (an AMD
ROCm SMI backend and an Apple Metal backend are possibilities
for a later release; see docs/roadmap-v0.2.0.md).
On Windows: enumerates NVIDIA dGPU(s) via NVML plus every
other DXGI adapter that exposes
DedicatedVideoMemory > 0 or SharedSystemMemory > 0 (e.g.
AMD / Intel iGPUs). For non-NVIDIA adapters, total_bytes is
DedicatedVideoMemory when non-zero (matches what dGPUs and
UMA-allocated iGPUs expose), otherwise the WDDM shared-memory
budget (SharedSystemMemory). The semantics of total_bytes
therefore differ subtly between dGPUs and iGPUs. The Microsoft
Basic Render Driver (VendorId = 0x1414) is always skipped — it
has no real GPU memory to report.
Each returned Snapshot carries the same ram_bytes: a single
RSS measurement is taken once and reused, since the wall-time
delta across the GPU walk is microseconds and per-snapshot
re-measurement would add no useful precision.
Returns an empty Vec when no GPUs are visible. Callers needing
RAM-only state should use crate::process_rss or
Self::now (which returns a single Snapshot with gpu and
gpu_device set to None).
§Performance
Each NVIDIA device queried calls crate::process_gpu_info and
crate::device_info, each of which performs a fresh NVML
init / shutdown cycle (and, on Windows, a fresh IDXGIFactory1
walk). On Windows, Snapshot::all additionally walks
IDXGIFactory1 once more for the non-NVIDIA enumeration. For an
N-GPU system the worst-case cost is therefore
N × (NVML init + shutdown + DXGI walk) + 1 × DXGI walk on
Windows, and N × (NVML init + shutdown) on Linux. A long-lived
NVML context is planned for a later release.
§Errors
Returns crate::HypomnesisError::Ram if the platform RAM
query fails — including the Linux path, where /proc/self/status
read errors and VmRSS parse failures are wrapped into the
Ram variant rather than surfaced as Io.
impl Snapshot

Convenience formatting helpers, available with features = ["report"].
Located on Snapshot (rather than MemoryReport) for parity with
candle-mi’s MemorySnapshot::ram_mb / vram_mb API surface, so
candle-mi v0.2 can adopt hypomnesis with a thin adapter wrapper
rather than relocating the methods.
pub fn vram_mb(&self) -> Option<f64>
Per-process VRAM usage as megabytes, if available.
Returns None when gpu is None (no GPU source succeeded).
Reflects the dispatcher’s mixed semantics: per-process when
DXGI / NVML produced the value, device-wide when the
nvidia-smi fallback was used (check gpu.is_per_process).