rocmrc
Minimal safe ROCm bindings for Rust — the AMD analog of cudarc.
Status: pre-release (
0.1). Surface mirrorscudarcclosely so call sites stay near-identical when porting from NVIDIA targets. License: dualMIT OR Apache-2.0.
Why
ROCm exposes HIP, rocBLAS, and hipBLASLt — the AMD equivalents of CUDA, cuBLAS, and cuBLASLt. cudarc covers the NVIDIA side of the ML stack with idiomatic Rust wrappers; rocmrc is the matching set of wrappers for AMD GPUs. The goal is that a downstream crate (e.g. a ML runtime) can keep separate #[cfg(feature = "cuda")] / #[cfg(feature = "rocm")] backends whose code reads almost identically.
Design choices that mirror cudarc:
- Stream-bound handles (
RocblasHandleowns anArc<HipStream>, etc.). - RAII descriptors for hipBLASLt (
MatrixLayout,MatmulDesc,MatmulPref). - Trait-based dispatch over scalar
Tfor the BLAS surface:Gemm<T>,Gemv<T>,Axpy<T>,Scal<T>,Nrm2<T>,Dot<T>,Copy<T>. - Descriptor pattern for fused matmul with bias / ReLU / GELU epilogues.
Modules
| Module | What it wraps | Feature gate |
|---|---|---|
driver |
HIP runtime API (streams, memory, modules, kernel launch) | always-on |
hiprtc |
hipRTC (runtime compilation of HIP C++ → hsaco code objects) | always-on |
rocblas |
rocBLAS L1/L2/L3 via Gemm<T> / Gemv<T> / Axpy<T> / Scal<T> / ... |
rocblas |
hipblaslt |
hipBLASLt fused matmul (bias / ReLU / GELU epilogues) via RAII descriptors | hipblaslt |
Quickstart
A minimal HIP kernel compiled at runtime via hipRTC, loaded as a module, and launched on the default stream — see examples/vec_add.rs for the full code:
use ;
const SRC: &str = r#"
extern "C" __global__
void vec_add(float* out, const float* a, const float* b, int n) {
int i = blockIdx.x * blockDim.x + threadIdx.x;
if (i < n) out[i] = a[i] + b[i];
}
"#;
let ctx = new?;
let = compile?;
let module = from_hsaco?;
let func = module.get_function?;
// ...allocate slices, copy data, launch, copy back...
Run it:
ROCM_PATH=/opt/rocm
Feature flags
| Feature | Purpose |
|---|---|
rocblas |
Enable the rocBLAS module + link librocblas (or load it at runtime with dynamic-loading). |
hipblaslt |
Enable the hipBLASLt module + link libhipblaslt (or load it at runtime with dynamic-loading). |
dynamic-loading |
Resolve ROCm symbols at runtime via libloading instead of link-time. Skips the -l directives in build.rs. |
rocm-XYYYY |
Selects which ROCm patch the committed bindings target. Pick exactly one matching your installed ROCm runtime. |
The rocm-XYYYY features encode major.minor.patch as XX YY Z (e.g. 7.2.1 → rocm-07021).
ROCm version axis
The version axis follows what AMD actually publishes on repo.radeon.com/rocm/apt/<ver>/ubuntu jammy main. AMD does not publish .0 patches on jammy, so the axis is patch-suffixed:
| Feature | ROCm version |
|---|---|
rocm-06002 |
6.0.2 |
rocm-06015 |
6.1.5 |
rocm-06024 |
6.2.4 |
rocm-06033 |
6.3.3 |
rocm-06043 |
6.4.3 |
rocm-07002 |
7.0.2 |
rocm-07011 |
7.1.1 |
rocm-07021 |
7.2.1 |
rocm-07022 |
7.2.2 |
rocm-07023 |
7.2.3 |
Versions are added on demand as new ROCm releases are processed by the bindings generator.
Linking modes
Static (default). build.rs emits cargo:rustc-link-lib=dylib=… for each enabled module (amdhip64, hiprtc, and rocblas / hipblaslt when their features are on). Requires the matching .so files to be on the linker search path at build time — typically /opt/rocm/lib, which build.rs adds via cargo:rustc-link-search=native=$ROCM_PATH/lib.
Dynamic (--features dynamic-loading). build.rs skips the -l directives entirely; instead, each generated sys/mod.rs ships rocmlib() and is_rocmlib_present() helpers (powered by libloading) that resolve the ROCm libraries on first call. The search-path logic lives in get_lib_name_candidates — it tries lib<name>.so first (resolved via LD_LIBRARY_PATH / ldconfig), then versioned sonames, then /opt/rocm/lib/ prefixes. Set ROCM_PATH=/path/to/rocm at runtime to point at a non-default install.
Examples
# driver + hiprtc only — no rocm-* feature needed
# rocBLAS GEMM + L1 round trip
# hipBLASLt matmul with bias + ReLU epilogue
The examples live in examples/ and are kept in lock-step with the safe-wrapper API.
Regenerating bindings
Bindings are committed in-tree under src/<mod>/sys/mod.rs so contributors don't need ROCm installed locally to build rocmrc (modulo the rocm-* feature gating). When AMD ships a new ROCm patch and you want to add it to the version axis, regenerate via the dockerized bindings_generator:
The container holds slices of every supported ROCm patch (rocBLAS, hipBLASLt, HIP, and a shared rocm-llvm), runs bindgen per version for the BLAS modules, and merges per-version outputs into a unified mod.rs with #[cfg(feature = "rocm-XYYYY")] gates where symbols changed. bindings_generator/regen-bindings.sh is the entrypoint; the merge logic lives in bindings_generator/src/merge.rs.
License
Dual-licensed under either:
at your option.
Acknowledgments
API shape, layout, and the dual-license arrangement follow cudarc by Chelsea Lowman and contributors. rocmrc exists so the Rust ML/GPU stack can target AMD with the same ergonomics.