iro-cuda-ffi-kernels
Reference CUDA kernels for iro-cuda-ffi.
These kernels are compiled with nvcc at build time and demonstrate the expected ABI and wrapper patterns. They are intended as examples and integration tests, not as a production math library.
Notes:
- Requires a CUDA toolkit (nvcc) to build.
- Tests require --features cuda-tests.
- Benchmarks (including cudarc cross-validation) live in iro-cuda-ffi-benchmarks.
- iro-cuda-ffi vs cudarc benchmarks are sanity checks, not a competition.
- Run benchmarks in release mode and serially to avoid GPU contention.
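
The notes above might translate into invocations like the following. This is a sketch, not the project's documented workflow: it assumes the crates are members of one Cargo workspace and that the benchmarks are ordinary Cargo bench targets. Only the crate names and the cuda-tests feature come from this README.

```shell
# Assumption: nvcc must be discoverable for the build step; check it first.
nvcc --version

# Run the kernel tests with the opt-in feature named in the notes.
cargo test -p iro-cuda-ffi-kernels --features cuda-tests

# Assumption: benchmarks are Cargo bench targets. `cargo bench` builds with
# the optimized bench profile, which covers the release-mode requirement.
# Start one run at a time so runs do not compete for the GPU.
cargo bench -p iro-cuda-ffi-benchmarks
```

If the benchmarks turn out to be plain binaries rather than bench targets, the equivalent would be a `cargo run --release` invocation in that crate.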