CUDA kernel symbol canonicalization.
Kernels come from nsys / ncu in two shapes:
- Demangled C++: void sgemm_128x128<float>(float const*, float const*, float*, int, int, int) → canonical sgemm_128x128<float> (drop leading return type, drop parenthesized parameter list, KEEP template parameters: they distinguish instantiations).
- Raw mangled (_Z...) → delegate to c++filt like cxx does, then apply the same normalization.
Template parameters are MANDATORY to preserve: agents treat
sgemm<float> and sgemm<half> as different rows, and a canonical
form that dropped them would merge the two and lose the distinction.
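
The normalization for the demangled shape can be sketched roughly as follows. This is a minimal illustration, not this module's actual implementation: the names canonicalize_demangled and strip_param_list are hypothetical, and the sketch assumes the symbol carries at most one trailing parameter list and no operator names containing '<', '>' or '('. The raw mangled (_Z...) path is not covered here.

```rust
/// Hypothetical helper: canonicalize an already-demangled kernel name.
/// Keeps template parameters, drops the return type and parameter list.
fn canonicalize_demangled(symbol: &str) -> String {
    // Step 1: drop the trailing parenthesized parameter list, if present.
    let head = strip_param_list(symbol.trim());

    // Step 2: drop the leading return type. The name starts after the
    // last space that is NOT nested inside <...> or (...), so spaces in
    // template arguments such as `unsigned int` are left alone.
    let mut depth = 0i32;
    let mut start = 0usize;
    for (i, c) in head.char_indices() {
        match c {
            '<' | '(' => depth += 1,
            '>' | ')' => depth -= 1,
            ' ' if depth == 0 => start = i + 1,
            _ => {}
        }
    }
    head[start..].to_string()
}

/// Hypothetical helper: remove the final balanced `(...)` group by
/// scanning backwards from the closing parenthesis to its match.
fn strip_param_list(s: &str) -> &str {
    if !s.ends_with(')') {
        return s;
    }
    let mut depth = 0i32;
    for (i, c) in s.char_indices().rev() {
        match c {
            ')' => depth += 1,
            '(' => {
                depth -= 1;
                if depth == 0 {
                    return s[..i].trim_end();
                }
            }
            _ => {}
        }
    }
    // Unbalanced parentheses: leave the input untouched.
    s
}
```

Under these assumptions, calling canonicalize_demangled on the sgemm_128x128 example above yields sgemm_128x128<float>, and sgemm<float> and sgemm<half> remain distinct because the angle-bracket contents are never touched.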