Expand description
CUDA profiler: wraps ncu, nsys, and CUPTI. See spec sections 4.1.1 (ncu), 4.1.2 (nsys), 4.1.3 (CUPTI).
Structs§
- NcuProfiler
- Wraps
ncuCLI for kernel-level profiling. - Nsys
Kernel Stat - Parsed nsys stats output — one entry per kernel.
- Nsys
Profiler - Wraps
nsysCLI for system-wide timeline profiling.
Enums§
- NcuSection
- ncu metric sections — lazily collect only what’s requested.
Functions§
- ncu_
metrics_ to_ profile - Build a FullProfile from ncu metrics.
- parse_
ncu_ csv - Parse ncu CSV output into a metric name → value map. ncu CSV format: “ID”,“Metric Name”,“Metric Unit”,“Metric Value”
- profile_
binary - Profile an arbitrary binary via nsys.
- profile_
cublas - Profile cuBLAS operations.
- profile_
kernel - Profile a CUDA PTX kernel via ncu.
- profile_
python - Profile a Python script via nsys + perf stat.
- run_
trace - Run
cgp trace— system-wide timeline via nsys.