§baracuda-forge
Build-time CUDA kernel compiler for the baracuda ecosystem. Drop it into
your [build-dependencies] and turn .cu files into a static library or
PTX with nvcc, with incremental rebuilds, parallel compilation, GPU
compute-capability auto-detection, and integrated CUTLASS support.
baracuda-forge is the build-time companion to baracuda’s runtime
wrappers (baracuda-driver, baracuda-runtime, …). Use forge to turn
your .cu files into a library; use the runtime crates to launch the
kernels from Rust.
§Features
- Compute Capability Detection: auto-detect from nvidia-smi or CUDA_COMPUTE_CAP, with per-file overrides for mixed architectures.
- Incremental Builds: only recompile modified kernels using SHA-256 content hashing.
- CUDA Toolkit Auto-Detection: find nvcc and include paths via the shared baracuda_build detector.
- C++ Standard Auto-Select: defaults to c++20 on CUDA ≥ 12.0, c++17 on older toolkits. Override via KernelBuilder::cpp_std.
- External Dependencies: built-in CUTLASS support, or fetch any git repo.
- Parallel Compilation: configurable thread percentage for parallel builds.
- Flexible Source Selection: directory, glob, files, or exclude patterns.
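Two of the defaults listed above are simple enough to sketch in plain Rust. The block below is an illustration only, not the crate's implementation: `default_cpp_std` and `worker_threads` are hypothetical helper names, and the rounding rule for the thread percentage (round down, never below one thread) is an assumption.

```rust
use std::thread;

/// Pick a C++ standard from the CUDA toolkit major version, following the
/// rule stated in the feature list: c++20 on CUDA >= 12.0, c++17 otherwise.
fn default_cpp_std(cuda_major: u32) -> &'static str {
    if cuda_major >= 12 {
        "c++20"
    } else {
        "c++17"
    }
}

/// Turn a thread percentage (0.0 to 1.0) into a worker count.
/// Assumption: the fraction is applied to the available cores, rounded
/// down, and at least one worker is always used.
fn worker_threads(percentage: f32, available: usize) -> usize {
    let clamped = percentage.clamp(0.0, 1.0);
    let n = (available as f32 * clamped).floor() as usize;
    n.max(1)
}

fn main() {
    // Number of hardware threads reported by the OS, or 1 if unknown.
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);

    println!("CUDA 12 -> {}", default_cpp_std(12));
    println!("CUDA 11 -> {}", default_cpp_std(11));
    println!(
        "50% of {} cores -> {} worker(s)",
        cores,
        worker_threads(0.5, cores)
    );
}
```

Under these assumptions, a machine with 8 hardware threads and a percentage of 0.5 would compile with 4 workers.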
§Quick start
    // build.rs
    use baracuda_forge::KernelBuilder;

    fn main() {
        // Cargo sets OUT_DIR for build scripts.
        let out_dir = std::env::var("OUT_DIR").expect("OUT_DIR must be set");

        KernelBuilder::new()
            .source_dir("src/kernels")
            .exclude(&["*_test.cu"]) // leave test kernels out of the library
            .arg("-O3")
            .thread_percentage(0.5)
            .build_lib(format!("{}/libkernels.a", out_dir))
            .expect("CUDA compilation failed");

        // Tell Cargo where the library lives and to link against it.
        println!("cargo:rustc-link-search={}", out_dir);
        println!("cargo:rustc-link-lib=kernels");
    }

§Acknowledgments
baracuda-forge is a vendored fork of cudaforge by Guoqing Bao,
adapted to baracuda’s workspace conventions. See the NOTICE file at the
crate root for full provenance.
Structs§
- BuildCache: Build cache for tracking file modifications.
- ComputeCapability: Compute capability configuration.
- CudaToolkit: CUDA toolkit information.
- DependencyManager: Dependency manager for handling multiple external dependencies.
- ExternalDependency: External dependency configuration.
- GpuArch: GPU architecture specification.
- KernelBuilder: Main builder for CUDA kernel compilation.
- ParallelConfig: Parallel build configuration.
- PtxOutput: Output from PTX compilation.
- SourceSelector: Source file selection configuration.
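To illustrate the idea behind a content-hash build cache like the one BuildCache describes, here is a minimal, dependency-free sketch. It is not the crate's code: `ContentCache` and `needs_rebuild` are hypothetical names, and the standard library's `DefaultHasher` stands in for SHA-256 only so the example compiles without external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a build cache: source path -> content hash.
#[derive(Default)]
struct ContentCache {
    hashes: HashMap<String, u64>,
}

impl ContentCache {
    /// Hash the bytes of a source file. DefaultHasher is NOT SHA-256; it
    /// is used here only to keep the sketch free of dependencies.
    fn digest(contents: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        contents.hash(&mut hasher);
        hasher.finish()
    }

    /// Record the current hash and report whether the file is new or its
    /// contents differ from the previously recorded hash.
    fn needs_rebuild(&mut self, path: &str, contents: &[u8]) -> bool {
        let digest = Self::digest(contents);
        match self.hashes.insert(path.to_string(), digest) {
            Some(previous) => previous != digest,
            None => true,
        }
    }
}

fn main() {
    let mut cache = ContentCache::default();
    let v1 = b"__global__ void add() {}";
    let v2 = b"__global__ void add() { /* edited */ }";

    println!("first build:  {}", cache.needs_rebuild("add.cu", v1));
    println!("unchanged:    {}", cache.needs_rebuild("add.cu", v1));
    println!("after edit:   {}", cache.needs_rebuild("add.cu", v2));
}
```

Hashing contents rather than comparing timestamps means that touching a file without changing it does not trigger a recompile. A real cache would also need to persist the map between builds.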
Enums§
- Error: Errors that can occur during CUDA kernel building.
Functions§
- collect_headers: Collect header files (.cuh) from directories.
- detect_compute_cap: Detect compute capability from system.
- get_gpu_arch_string: Get GPU architecture string for nvcc (e.g., “sm_90a” or “sm_80”).
- resolve_cutlass_from_cargo_checkouts: Try to resolve CUTLASS from cargo’s git checkouts directory.
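The relationship between a detected compute capability and the nvcc architecture string can be sketched as follows. This is a guess at the shape of the logic, not the signatures of detect_compute_cap or get_gpu_arch_string: `parse_compute_cap` and `arch_string` are hypothetical helpers. The sketch assumes that CUDA_COMPUTE_CAP holds an integer such as 80 or 90, that only capability 90 receives the "a" suffix, and that 80 is a reasonable fallback.

```rust
/// Parse a compute capability override such as "80" or "90".
/// Assumption: the value is a plain integer; anything else is rejected.
fn parse_compute_cap(raw: &str) -> Option<u32> {
    raw.trim().parse::<u32>().ok()
}

/// Format an nvcc architecture string such as "sm_80" or "sm_90a".
/// Assumption: only compute capability 90 gets the "a" suffix.
fn arch_string(cap: u32) -> String {
    if cap == 90 {
        format!("sm_{}a", cap)
    } else {
        format!("sm_{}", cap)
    }
}

fn main() {
    // CUDA_COMPUTE_CAP is the override named in the feature list.
    // The fallback of 80 is hypothetical; a real detector would query
    // nvidia-smi when the variable is unset.
    let cap = std::env::var("CUDA_COMPUTE_CAP")
        .ok()
        .and_then(|v| parse_compute_cap(&v))
        .unwrap_or(80);

    println!("-arch={}", arch_string(cap));
}
```

Setting CUDA_COMPUTE_CAP before the build is a common way to compile on a machine without a GPU, such as a CI runner, where detection through nvidia-smi is not possible.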
Type Aliases§
- Result: Result type alias for baracuda-forge operations.