trueno-cupti: CUDA Profiling for ComputeBrick Analysis
Rust bindings for NVIDIA CUPTI (CUDA Profiling Tools Interface). Enables detailed GPU profiling for cbtop visualization.
Features
- Activity Tracing: Kernel execution, memory copies, synchronization
- Metrics Collection: SM utilization, warp occupancy, memory throughput
- Callback API: Real-time notifications of CUDA operations
- PC Sampling: Instruction-level performance analysis
Example
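use trueno_cupti::{ActivityKind, Profiler};

// Box<dyn Error> assumes the crate's error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a profiler and choose which activity kinds to record.
    let profiler = Profiler::new()?;
    profiler.enable(ActivityKind::Kernel)?;
    profiler.enable(ActivityKind::MemoryCopy)?;

    // Run CUDA workload...

    // Collect the buffered activity records and print each duration.
    let records = profiler.flush()?;
    for record in records {
        println!("{}: {} ns", record.name, record.duration_ns);
    }
    Ok(())
}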
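The example above prints each record's name and duration. For a cbtop-style view it is often more useful to group them per kernel, and the following is a minimal sketch of that aggregation. It does not use the crate itself: `Record` is a hypothetical stand-in that models only the `name` and `duration_ns` fields shown above, with assumed types `String` and `u64`. `KernelSummary` and `summarize` are illustrative names, not part of trueno-cupti.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an activity record. Only the two fields used
/// in the example above are modeled, and their types are assumptions.
#[derive(Debug, Clone)]
pub struct Record {
    pub name: String,
    pub duration_ns: u64,
}

/// Per-name summary: how many records were seen and their total duration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KernelSummary {
    pub name: String,
    pub launches: u64,
    pub total_ns: u64,
}

/// Groups records by name and returns one summary per name,
/// sorted by total time (largest first), with ties broken by name.
pub fn summarize(records: &[Record]) -> Vec<KernelSummary> {
    // Map from name to (record count, accumulated nanoseconds).
    let mut totals: HashMap<&str, (u64, u64)> = HashMap::new();
    for r in records {
        let entry = totals.entry(r.name.as_str()).or_insert((0, 0));
        entry.0 += 1;
        entry.1 += r.duration_ns;
    }

    let mut out: Vec<KernelSummary> = totals
        .into_iter()
        .map(|(name, (launches, total_ns))| KernelSummary {
            name: name.to_string(),
            launches,
            total_ns,
        })
        .collect();

    // HashMap iteration order is unspecified, so sort for stable output.
    out.sort_by(|a, b| b.total_ns.cmp(&a.total_ns).then_with(|| a.name.cmp(&b.name)));
    out
}
```

If the records returned by `flush()` expose the same two fields, they could be converted into `Record` values and passed to `summarize` as a slice. Sorting by total time puts the most expensive kernels at the top of the list.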
Hardware Requirements
- NVIDIA GPU (Compute Capability 3.5+; CUDA 11.0 dropped support for 3.0)
- CUDA Toolkit 11.0+ with CUPTI
- Linux (primary), Windows (experimental)