baracuda_forge/lib.rs
//! # baracuda-forge
//!
//! Build-time CUDA kernel compiler for the [baracuda] ecosystem. Drop it into
//! your `[build-dependencies]` and turn `.cu` files into a static library or
//! PTX with `nvcc`, with incremental rebuilds, parallel compilation, GPU
//! compute-capability auto-detection, and integrated CUTLASS support.
//!
//! `baracuda-forge` is the **build-time** companion to baracuda's runtime
//! wrappers (`baracuda-driver`, `baracuda-runtime`, ...). Use forge to turn
//! your `.cu` files into a library; use the runtime crates to launch the
//! kernels from Rust.
//!
//! [baracuda]: https://github.com/ciresnave/baracuda
//!
//! ## Features
//!
//! - **Compute Capability Detection**: auto-detect from `nvidia-smi` or
//!   `CUDA_COMPUTE_CAP`, with per-file overrides for mixed architectures.
//! - **Incremental Builds**: only recompile modified kernels using SHA-256
//!   content hashing.
//! - **CUDA Toolkit Auto-Detection**: find `nvcc` and include paths via the
//!   shared [`baracuda_build`] detector.
//! - **C++ Standard Auto-Select**: defaults to `c++20` on CUDA ≥ 12.0,
//!   `c++17` on older toolkits. Override via [`KernelBuilder::cpp_std`].
//! - **External Dependencies**: built-in CUTLASS support, or fetch any git repo.
//! - **Parallel Compilation**: configurable thread percentage for parallel builds.
//! - **Flexible Source Selection**: directory, glob, files, or exclude patterns.
//!
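//! The C++ standard rule above can be sketched as a small function. This is
//! only an illustration: `default_cpp_std` is a hypothetical name, not part of
//! this crate's API, and the real selection logic may differ.
//!
//! ```rust
//! /// Hypothetical helper: maps a CUDA toolkit version to a C++ standard,
//! /// following the documented default (c++20 on CUDA >= 12.0, else c++17).
//! fn default_cpp_std(major: u32, minor: u32) -> &'static str {
//!     // Tuples compare lexicographically, so this checks major first.
//!     if (major, minor) >= (12, 0) { "c++20" } else { "c++17" }
//! }
//!
//! fn main() {
//!     assert_eq!(default_cpp_std(12, 4), "c++20");
//!     assert_eq!(default_cpp_std(11, 8), "c++17");
//!     // `-std=` is the nvcc option that selects the C++ dialect.
//!     println!("-std={}", default_cpp_std(12, 4));
//! }
//! ```
//!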
//! ## Quick start
//!
//! ```no_run
//! use baracuda_forge::KernelBuilder;
//!
//! fn main() {
//!     let out_dir = std::env::var("OUT_DIR").expect("OUT_DIR must be set");
//!
//!     KernelBuilder::new()
//!         .source_dir("src/kernels")
//!         .exclude(&["*_test.cu"])
//!         .arg("-O3")
//!         .thread_percentage(0.5)
//!         .build_lib(format!("{}/libkernels.a", out_dir))
//!         .expect("CUDA compilation failed");
//!
//!     println!("cargo:rustc-link-search={}", out_dir);
//!     println!("cargo:rustc-link-lib=kernels");
//! }
//! ```
//!
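//! As a rough sketch of what a thread percentage such as `0.5` might mean in
//! practice, the snippet below converts a fraction of the available hardware
//! threads into a worker count. The helper name `worker_count`, the rounding
//! (round up) and the minimum of one worker are assumptions made for
//! illustration; they are not taken from this crate's implementation.
//!
//! ```rust
//! use std::thread;
//!
//! /// Hypothetical helper: number of workers for a given fraction of threads.
//! fn worker_count(available: usize, fraction: f64) -> usize {
//!     // Keep the fraction within [0, 1], round up, and never return zero.
//!     let fraction = fraction.clamp(0.0, 1.0);
//!     ((available as f64 * fraction).ceil() as usize).max(1)
//! }
//!
//! fn main() {
//!     // Fall back to a single thread if the parallelism cannot be queried.
//!     let available = thread::available_parallelism()
//!         .map(|n| n.get())
//!         .unwrap_or(1);
//!     println!("{} of {} threads", worker_count(available, 0.5), available);
//! }
//! ```
//!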
//! ## Acknowledgments
//!
//! `baracuda-forge` is a vendored fork of [`cudaforge`] by Guoqing Bao,
//! adapted to baracuda's workspace conventions. See the `NOTICE` file at the
//! crate root for full provenance.
//!
//! [`cudaforge`]: https://github.com/guoqingbao/cudaforge

#![deny(missing_docs)]

mod builder;
mod compute_cap;
mod dependency;
mod error;
mod hash;
mod parallel;
mod source;
mod toolkit;

pub use builder::{KernelBuilder, PtxOutput};
pub use compute_cap::{detect_compute_cap, get_gpu_arch_string, ComputeCapability, GpuArch};
pub use dependency::{resolve_cutlass_from_cargo_checkouts, DependencyManager, ExternalDependency};
pub use error::{Error, Result};
pub use hash::BuildCache;
pub use parallel::ParallelConfig;
pub use source::{collect_headers, SourceSelector};
pub use toolkit::CudaToolkit;