Module gpu


GPU Monitoring Module (MLOPS-005)

btop-inspired GPU monitoring for the terminal training dashboard.

§Toyota Way: Andon

Visual alerting system for immediate problem detection. Thermal throttling, memory pressure, and power limits trigger alerts.
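The threshold idea behind Andon can be sketched as follows. This is a minimal illustration, not the crate's implementation: `SketchThresholds`, `SketchSample`, `SketchAlert`, `check_sample`, and all field names are hypothetical stand-ins, and the real `AndonThresholds`, `GpuMetrics`, and `GpuAlert` types may be shaped differently.

```rust
/// Hypothetical alert kinds, one per condition named in the description above.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchAlert {
    ThermalThrottling { temp_c: f64 },
    MemoryPressure { used_fraction: f64 },
    PowerLimit { draw_fraction: f64 },
}

/// Hypothetical thresholds; the caller picks the values.
pub struct SketchThresholds {
    pub max_temp_c: f64,
    pub max_memory_fraction: f64,
    pub max_power_fraction: f64,
}

/// Hypothetical single-device sample.
pub struct SketchSample {
    pub temp_c: f64,
    pub memory_used_mb: f64,
    pub memory_total_mb: f64,
    pub power_draw_w: f64,
    pub power_limit_w: f64,
}

/// Compare one sample against the thresholds and collect every alert that fires.
pub fn check_sample(s: &SketchSample, t: &SketchThresholds) -> Vec<SketchAlert> {
    let mut alerts = Vec::new();
    if s.temp_c > t.max_temp_c {
        alerts.push(SketchAlert::ThermalThrottling { temp_c: s.temp_c });
    }
    // Guard against a zero total so the ratio is always well defined.
    if s.memory_total_mb > 0.0 {
        let used_fraction = s.memory_used_mb / s.memory_total_mb;
        if used_fraction > t.max_memory_fraction {
            alerts.push(SketchAlert::MemoryPressure { used_fraction });
        }
    }
    if s.power_limit_w > 0.0 {
        let draw_fraction = s.power_draw_w / s.power_limit_w;
        if draw_fraction > t.max_power_fraction {
            alerts.push(SketchAlert::PowerLimit { draw_fraction });
        }
    }
    alerts
}
```

Returning a `Vec` rather than a single alert lets one sample raise several conditions at once, for example a hot device that is also at its power limit.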

§Example

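use entrenar::monitor::gpu::{GpuMonitor, GpuMetrics, GpuAlert};

// `?` needs an enclosing function that returns a Result. Boxing the error
// assumes the monitor's error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let monitor = GpuMonitor::new()?;
    // Take one snapshot and print a line per entry.
    let metrics = monitor.sample();
    for m in &metrics {
        println!("GPU {}: {}°C, {}% util", m.device_id, m.temperature_celsius, m.utilization_percent);
    }
    Ok(())
}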

Structs§

AndonThresholds
Andon thresholds for GPU alerts
GpuAndonSystem
GPU Andon system for alert management
GpuMetrics
GPU metrics snapshot (inspired by btop’s GPU visualization)
GpuMetricsBuffer
GPU metrics history buffer (ring buffer)
GpuMonitor
GPU monitor that collects metrics
GpuProcess
Process using GPU resources
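
A fixed-capacity history like the one `GpuMetricsBuffer` is described as providing can be sketched with a `VecDeque`. The type `HistoryBuffer` and its methods below are hypothetical and only illustrate the ring-buffer behaviour (the oldest entry is dropped once the buffer is full); they are not the crate's API.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-capacity history: pushing into a full buffer evicts the oldest item.
pub struct HistoryBuffer<T> {
    capacity: usize,
    items: VecDeque<T>,
}

impl<T> HistoryBuffer<T> {
    /// Create an empty buffer. Panics if `capacity` is zero.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { capacity, items: VecDeque::with_capacity(capacity) }
    }

    /// Append a new item, dropping the oldest one when the buffer is full.
    pub fn push(&mut self, item: T) {
        if self.items.len() == self.capacity {
            self.items.pop_front();
        }
        self.items.push_back(item);
    }

    /// Iterate from oldest to newest.
    pub fn iter(&self) -> impl Iterator<Item = &T> {
        self.items.iter()
    }

    /// Most recently pushed item, if any.
    pub fn latest(&self) -> Option<&T> {
        self.items.back()
    }

    /// Number of items currently stored.
    pub fn len(&self) -> usize {
        self.items.len()
    }
}
```

Bounding the history keeps memory use constant however long a training run lasts, which suits a dashboard that only draws the most recent window of samples.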

Enums§

GpuAlert
GPU alert types for Andon system

Functions§

format_gpu_panel
Format GPU metrics for terminal display
render_progress_bar
Render a progress bar for terminal display
render_sparkline
Render a sparkline from values
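
For orientation, here is a rough sketch of how a text progress bar and a sparkline can be rendered. The functions `sketch_progress_bar` and `sketch_sparkline`, their signatures, and the glyph choices are assumptions made for illustration; the actual `render_progress_bar` and `render_sparkline` may take different parameters and use different characters.

```rust
/// Eight block glyphs, from lowest to highest.
const BLOCKS: [char; 8] = ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];

/// Hypothetical progress bar: `percent` is clamped to 0..=100 and mapped onto `width` cells.
pub fn sketch_progress_bar(percent: f64, width: usize) -> String {
    let pct = percent.clamp(0.0, 100.0);
    // Round to the nearest whole cell and never exceed the bar width.
    let filled = (((pct / 100.0) * width as f64).round() as usize).min(width);
    let mut bar = "█".repeat(filled);
    bar.push_str(&"░".repeat(width - filled));
    bar
}

/// Hypothetical sparkline: each value is scaled between the slice's min and max
/// and drawn with one of the eight block glyphs.
pub fn sketch_sparkline(values: &[f64]) -> String {
    if values.is_empty() {
        return String::new();
    }
    let min = values.iter().copied().fold(f64::INFINITY, f64::min);
    let max = values.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;
    values
        .iter()
        .map(|&v| {
            // A flat series has no range, so draw it with the lowest glyph.
            let idx = if range == 0.0 {
                0
            } else {
                (((v - min) / range) * 7.0).round() as usize
            };
            BLOCKS[idx.min(7)]
        })
        .collect()
}
```

Scaling the sparkline against the slice's own minimum and maximum makes small fluctuations visible, at the cost of hiding absolute magnitude; a fixed 0 to 100 scale would be the alternative for percentage metrics.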