pub struct NvidiaGpu {
    pub device_count: u32,
    pub processes: u32,
    pub gpu_time: u32,
    pub mem_time: u32,
    pub mem_total: u64,
    pub mem_free: u64,
    pub ecc_errors: u32,
    pub energy: u32,
    pub temperature: u32,
    pub fan_speed: u32,
}
NVIDIA GPU Statistics - Format (5703,1)
GPU performance metrics from the NVIDIA Management Library (NVML).
XDR Definition (sFlow NVML)
/* NVIDIA GPU statistics */
/* opaque = counter_data; enterprise = 5703; format = 1 */
struct nvidia_gpu {
    unsigned int device_count;  /* see nvmlDeviceGetCount */
    unsigned int processes;     /* see nvmlDeviceGetComputeRunningProcesses */
    unsigned int gpu_time;      /* total milliseconds in which one or more
                                   kernels was executing on GPU
                                   sum across all devices */
    unsigned int mem_time;      /* total milliseconds during which global device
                                   memory was being read/written
                                   sum across all devices */
    unsigned hyper mem_total;   /* sum of framebuffer memory across devices
                                   see nvmlDeviceGetMemoryInfo */
    unsigned hyper mem_free;    /* sum of free framebuffer memory across devices
                                   see nvmlDeviceGetMemoryInfo */
    unsigned int ecc_errors;    /* sum of volatile ECC errors across devices
                                   see nvmlDeviceGetTotalEccErrors */
    unsigned int energy;        /* sum of millijoules across devices
                                   see nvmlDeviceGetPowerUsage */
    unsigned int temperature;   /* maximum temperature in degrees Celsius
                                   across devices
                                   see nvmlDeviceGetTemperature */
    unsigned int fan_speed;     /* maximum fan speed in percent across devices
                                   see nvmlDeviceGetFanSpeed */
}
ERRATUM: The specification uses a comma instead of a semicolon in the format comment
(enterprise = 5703, format=1 should be enterprise = 5703; format = 1), which is
inconsistent with all other sFlow specifications. The corrected version is shown above.
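The XDR layout above can be decoded by hand as a sanity check. The following is a minimal sketch, assuming the record body is exactly the ten fields in declaration order, encoded big-endian as XDR requires (4 bytes per unsigned int, 8 bytes per unsigned hyper, 48 bytes in total). The helper names `be_u32`, `be_u64` and `decode_nvidia_gpu` and the constant `NVIDIA_GPU_XDR_LEN` are hypothetical and are not part of this crate's API; the struct is redefined locally only so that the example compiles on its own.

```rust
/// Local mirror of the documented struct so the sketch compiles on its own.
#[derive(Debug, PartialEq)]
pub struct NvidiaGpu {
    pub device_count: u32,
    pub processes: u32,
    pub gpu_time: u32,
    pub mem_time: u32,
    pub mem_total: u64,
    pub mem_free: u64,
    pub ecc_errors: u32,
    pub energy: u32,
    pub temperature: u32,
    pub fan_speed: u32,
}

/// Body size: eight 4-byte unsigned ints plus two 8-byte unsigned hypers.
pub const NVIDIA_GPU_XDR_LEN: usize = 8 * 4 + 2 * 8;

/// Read a big-endian u32 at `off`; the caller guarantees the bounds.
fn be_u32(buf: &[u8], off: usize) -> u32 {
    let mut a = [0u8; 4];
    a.copy_from_slice(&buf[off..off + 4]);
    u32::from_be_bytes(a)
}

/// Read a big-endian u64 at `off`; the caller guarantees the bounds.
fn be_u64(buf: &[u8], off: usize) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(&buf[off..off + 8]);
    u64::from_be_bytes(a)
}

/// Hypothetical helper (not the crate's parser): decode one record body.
/// Returns `None` when the buffer is shorter than the fixed layout.
pub fn decode_nvidia_gpu(buf: &[u8]) -> Option<NvidiaGpu> {
    if buf.len() < NVIDIA_GPU_XDR_LEN {
        return None;
    }
    Some(NvidiaGpu {
        device_count: be_u32(buf, 0),
        processes: be_u32(buf, 4),
        gpu_time: be_u32(buf, 8),
        mem_time: be_u32(buf, 12),
        mem_total: be_u64(buf, 16),
        mem_free: be_u64(buf, 24),
        ecc_errors: be_u32(buf, 32),
        energy: be_u32(buf, 36),
        temperature: be_u32(buf, 40),
        fan_speed: be_u32(buf, 44),
    })
}
```

The two hypers sit in the middle of the record, so the offsets are not a uniform stride of four bytes; keeping them explicit makes that easy to check against the definition.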
Fields

device_count: u32
Number of GPU devices

processes: u32
Number of running compute processes

gpu_time: u32
Total GPU time in milliseconds (sum across all devices)

mem_time: u32
Total memory access time in milliseconds (sum across all devices)

mem_total: u64
Total framebuffer memory in bytes (sum across all devices)

mem_free: u64
Free framebuffer memory in bytes (sum across all devices)

ecc_errors: u32
Sum of volatile ECC errors across all devices

energy: u32
Total energy consumption in millijoules (sum across all devices)

temperature: u32
Maximum temperature in degrees Celsius across all devices

fan_speed: u32
Maximum fan speed in percent across all devices
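
Because most fields are sums across devices, turning them into per-host figures takes a little arithmetic. The sketch below makes two assumptions that the definition does not state outright: that `gpu_time` is a cumulative counter that wraps at 2^32, and that the polling interval in milliseconds is known to the caller. The function names `mem_used_bytes` and `gpu_utilization` are hypothetical and are not part of this crate.

```rust
/// Framebuffer memory in use across all devices, in bytes.
/// Saturates at zero in case a sample reports mem_free > mem_total.
pub fn mem_used_bytes(mem_total: u64, mem_free: u64) -> u64 {
    mem_total.saturating_sub(mem_free)
}

/// Average fraction of the interval during which a device was executing
/// kernels, derived from two successive `gpu_time` samples.
///
/// ASSUMPTION: `gpu_time` is a cumulative millisecond counter summed over
/// devices, so the delta is divided by `interval_ms * device_count`.
/// `wrapping_sub` handles a single wrap of the 32-bit counter.
pub fn gpu_utilization(
    prev_gpu_time: u32,
    cur_gpu_time: u32,
    interval_ms: u64,
    device_count: u32,
) -> Option<f64> {
    if interval_ms == 0 || device_count == 0 {
        return None; // nothing meaningful to divide by
    }
    let busy_ms = cur_gpu_time.wrapping_sub(prev_gpu_time) as f64;
    Some(busy_ms / (interval_ms as f64 * device_count as f64))
}
```

Note that `temperature` and `fan_speed` are maxima rather than sums, so they should be reported as-is and not divided by `device_count`.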