pub fn profile_kernel( name: &str, size: u32, roofline: bool, _metrics: Option<&str>, ) -> Result<()>
Profile a CUDA PTX kernel via ncu.