
Crate baracuda_forge


§baracuda-forge

Build-time CUDA kernel compiler for the baracuda ecosystem. Add it to your [build-dependencies] to compile .cu files into a static library or PTX via nvcc. It provides incremental rebuilds, parallel compilation, GPU compute-capability auto-detection, and integrated CUTLASS support.

baracuda-forge is the build-time companion to baracuda’s runtime wrappers (baracuda-driver, baracuda-runtime, …). Use forge to turn your .cu files into a library; use the runtime crates to launch the kernels from Rust.

§Features

  • Compute Capability Detection — auto-detect from nvidia-smi or CUDA_COMPUTE_CAP, with per-file overrides for mixed architectures.
  • Incremental Builds — only recompile modified kernels using SHA-256 content hashing.
  • CUDA Toolkit Auto-Detection — find nvcc and include paths via the shared baracuda_build detector.
  • C++ Standard Auto-Select — defaults to c++20 on CUDA ≥ 12.0, c++17 on older toolkits. Override via KernelBuilder::cpp_std.
  • External Dependencies — built-in CUTLASS support, or fetch any git repo.
  • Parallel Compilation — configurable thread percentage for parallel builds.
  • Flexible Source Selection — directory, glob, files, or exclude patterns.
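Two of the rules above are simple enough to sketch in plain Rust. The block below is only an illustration of the stated behavior, not the crate's implementation: `default_cpp_std` and `worker_count` are hypothetical names, and the rounding rule for the thread percentage (a fraction of the logical CPU count, rounded up, never below one) is an assumption.

```rust
use std::thread;

/// Hypothetical helper: default C++ standard for a CUDA major version,
/// following the rule in the feature list (c++20 on CUDA >= 12.0, else c++17).
fn default_cpp_std(cuda_major: u32) -> &'static str {
    if cuda_major >= 12 {
        "c++20"
    } else {
        "c++17"
    }
}

/// Hypothetical helper: turn a thread percentage into a worker count.
/// Assumption: the value is a fraction in 0.0..=1.0 of the logical CPUs,
/// rounded up, with a minimum of one worker.
fn worker_count(percentage: f64, available: usize) -> usize {
    let clamped = percentage.clamp(0.0, 1.0);
    ((available as f64 * clamped).ceil() as usize).max(1)
}

fn main() {
    let cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    println!("CUDA 12 -> {}", default_cpp_std(12));
    println!("CUDA 11 -> {}", default_cpp_std(11));
    println!("50% of {} CPUs -> {} workers", cpus, worker_count(0.5, cpus));
}
```

In a real build script these decisions are made by KernelBuilder itself; the sketch only shows what the defaults amount to.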

§Quick start

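// build.rs
use baracuda_forge::KernelBuilder;

fn main() {
    // Cargo sets OUT_DIR for build scripts; compiled artifacts belong there.
    let out_dir = std::env::var("OUT_DIR").expect("OUT_DIR must be set");

    KernelBuilder::new()
        .source_dir("src/kernels") // compile the .cu files in this directory
        .exclude(&["*_test.cu"]) // skip files matching this pattern
        .arg("-O3") // extra compiler argument
        .thread_percentage(0.5) // thread percentage for parallel compilation
        .build_lib(format!("{}/libkernels.a", out_dir))
        .expect("CUDA compilation failed");

    // Tell Cargo where the library lives and to link it.
    println!("cargo:rustc-link-search={}", out_dir);
    println!("cargo:rustc-link-lib=kernels");
}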

§Acknowledgments

baracuda-forge is a vendored fork of cudaforge by Guoqing Bao, adapted to baracuda’s workspace conventions. See the NOTICE file at the crate root for full provenance.

Structs§

BuildCache
Build cache for tracking file modifications.
ComputeCapability
Compute capability configuration.
CudaToolkit
CUDA toolkit information.
DependencyManager
Dependency manager for handling multiple external dependencies.
ExternalDependency
External dependency configuration.
GpuArch
GPU architecture specification.
KernelBuilder
Main builder for CUDA kernel compilation.
ParallelConfig
Parallel build configuration.
PtxOutput
Output from PTX compilation.
SourceSelector
Source file selection configuration.
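To make the idea behind BuildCache concrete, here is a minimal sketch of content-hash change detection. It is not the crate's code: `SketchCache` and `needs_rebuild` are hypothetical names, and the standard library's `DefaultHasher` stands in for the SHA-256 digest mentioned in the feature list so that the example needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a SHA-256 digest: std-only and not cryptographic.
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical cache: source path -> hash of the contents last compiled.
#[derive(Default)]
struct SketchCache {
    seen: HashMap<String, u64>,
}

impl SketchCache {
    /// Returns true when the file is new or its contents changed,
    /// and records the new hash either way.
    fn needs_rebuild(&mut self, path: &str, contents: &[u8]) -> bool {
        let hash = content_hash(contents);
        self.seen.insert(path.to_string(), hash) != Some(hash)
    }
}

fn main() {
    let mut cache = SketchCache::default();
    let v1 = b"__global__ void add() {}";
    let v2 = b"__global__ void add() { /* edited */ }";

    println!("first build:  {}", cache.needs_rebuild("add.cu", v1));
    println!("unchanged:    {}", cache.needs_rebuild("add.cu", v1));
    println!("after edit:   {}", cache.needs_rebuild("add.cu", v2));
}
```

Hashing contents rather than comparing modification times means that touching a file without changing it does not trigger a recompile. A persistent cache would also need to be written to disk between builds, which this sketch leaves out.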

Enums§

Error
Errors that can occur during CUDA kernel building.

Functions§

collect_headers
Collect header files (.cuh) from directories.
detect_compute_cap
Detect the compute capability of the system.
get_gpu_arch_string
Get GPU architecture string for nvcc (e.g., “sm_90a” or “sm_80”).
resolve_cutlass_from_cargo_checkouts
Try to resolve CUTLASS from cargo’s git checkouts directory.
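As a rough illustration of how a CUDA_COMPUTE_CAP override can become an nvcc architecture string such as "sm_80", consider the sketch below. The helper names `parse_compute_cap` and `arch_string` are hypothetical, the accepted input formats ("80" or "8.0") are an assumption, and the sketch does not handle architecture-specific suffixes such as the "a" in "sm_90a". For real builds, use detect_compute_cap and get_gpu_arch_string.

```rust
use std::env;

/// Hypothetical parser: accepts "80" or "8.0" style values and returns 80.
/// Assumption: dots are separators only; anything non-numeric is rejected.
fn parse_compute_cap(raw: &str) -> Option<u32> {
    let digits: String = raw.trim().chars().filter(|c| *c != '.').collect();
    digits.parse().ok()
}

/// Format a numeric capability as an nvcc architecture name, e.g. 80 -> "sm_80".
fn arch_string(cap: u32) -> String {
    format!("sm_{cap}")
}

fn main() {
    let from_env = env::var("CUDA_COMPUTE_CAP")
        .ok()
        .as_deref()
        .and_then(parse_compute_cap);

    match from_env {
        Some(cap) => println!("-arch={}", arch_string(cap)),
        // The crate is documented as also detecting via nvidia-smi;
        // that fallback is not reproduced in this sketch.
        None => println!("CUDA_COMPUTE_CAP not set or not parseable"),
    }
}
```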

Type Aliases§

Result
Result type alias for baracuda-forge operations.