rakka_accel/lib.rs
//! # rakka-accel
//!
//! An actor-shaped facade over compute-acceleration backends, built on the
//! [rakka](../../rakka) actor runtime. NVIDIA CUDA is the first shipping
//! implementation ([`rakka-accel-cuda`](../rakka_accel_cuda)); the same
//! trait surface accommodates AMD ROCm, Apple Metal, Intel oneAPI, and
//! Vulkan compute when those land.
//!
//! ```toml
//! [dependencies]
//! rakka-accel = { version = "0.0", features = ["cuda"] }
//! ```
//!
//! ```ignore
//! use rakka_accel::prelude::*;
//! use rakka_accel::cuda; // re-export of `rakka-accel-cuda`
//! ```
//!
//! ## What this crate is
//!
//! A **thin core** that names the abstractions every backend has to
//! satisfy (sketched just after this list):
//!
//! - [`AccelBackend`] — marker trait identifying a backend, with
//!   associated `Device`, `Stream`, `Event`, `Error` types.
//! - [`AccelRef`] — generation-validated typed device pointer
//!   parametric over the backend.
//! - [`AccelError`] — typed error enum, `#[non_exhaustive]` so
//!   backends can add `LibraryError` variants without breaking core.
//! - [`CompletionStrategy`] — async wakeup contract for kernel
//!   completion (host-fn callback, sync, polled).
//! - [`KernelOp`] — marker trait for typed op envelopes (Sgemm,
//!   RngFillUniform, etc.).
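//!
//! A rough sketch of the shapes these items imply; the associated-type
//! names come from the list above, but the bounds, fields, and variant
//! names here are assumptions rather than the literal definitions:
//!
//! ```ignore
//! /// Implemented once per backend crate (sketch).
//! pub trait AccelBackend: Send + Sync + 'static {
//!     type Device;
//!     type Stream;
//!     type Event;
//!     type Error: Into<AccelError>;
//! }
//!
//! /// Generation-validated, typed device pointer tied to one backend (sketch).
//! pub struct AccelRef<B: AccelBackend, T> {
//!     // device pointer + generation counter + PhantomData<(B, T)>
//! }
//!
//! /// How kernel completion wakes the awaiting task (sketch; variant
//! /// names are illustrative).
//! pub enum CompletionStrategy {
//!     HostFnCallback,
//!     Synchronize,
//!     Poll,
//! }
//! ```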
//!
//! The core deliberately ships **no concrete actors**. Each backend
//! crate provides its own `DeviceActor`, `KernelActor` family, and
//! library wrappers. The umbrella re-exports the active backend at
//! `rakka_accel::cuda` (and, eventually, `rakka_accel::rocm`,
//! `rakka_accel::metal`, etc.) so users have one stable import path.
//!
//! ## What this crate is not
//!
//! - A least-common-denominator API. Backends expose more than the
//!   trait surface — `rakka_accel::cuda::kernel::CudnnActor` has a
//!   richer message set than `KernelOp` knows about, and that's
//!   fine. The trait surface is for portable code; backend-specific
//!   work uses the concrete crate directly (see the sketch below).
//! - A device-abstraction layer like wgpu or SYCL. We don't try to
//!   compile one shader to many targets. We supervise the right
//!   library on the right hardware.
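//!
//! To make that split concrete, a minimal sketch; the function names and
//! the generic parameters of [`AccelRef`] shown here are illustrative
//! assumptions, while `rakka_accel::cuda::kernel::CudnnActor` is the
//! backend-specific item mentioned above:
//!
//! ```ignore
//! use rakka_accel::prelude::*;
//!
//! // Portable: bounded on the core traits only, so it compiles against
//! // any backend that implements `AccelBackend`.
//! async fn scale_in_place<B: AccelBackend>(
//!     buf: AccelRef<B, f32>,
//!     factor: f32,
//! ) -> AccelResult<()> {
//!     // ...send a `KernelOp` envelope to the backend's kernel actor...
//!     todo!()
//! }
//!
//! // Backend-specific: reach into the concrete crate for its richer
//! // message set, gated on the `cuda` feature.
//! #[cfg(feature = "cuda")]
//! fn cudnn_specific() {
//!     use rakka_accel::cuda::kernel::CudnnActor;
//!     // ...
//! }
//! ```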

#![cfg_attr(docsrs, feature(doc_cfg))]

pub mod backend;
pub mod completion;
pub mod error;
pub mod gpu_ref;
pub mod kernel;

pub use backend::{AccelBackend, AccelDevice, AccelStream};
pub use completion::CompletionStrategy;
pub use error::AccelError;
pub use gpu_ref::AccelRef;
pub use kernel::KernelOp;

/// Re-export the CUDA backend at a stable path. Active when the
/// `cuda` feature is on.
#[cfg(feature = "cuda")]
#[cfg_attr(docsrs, doc(cfg(feature = "cuda")))]
pub use rakka_accel_cuda as cuda;

pub mod prelude {
    //! Canonical re-exports. `use rakka_accel::prelude::*;`.
    pub use crate::backend::{AccelBackend, AccelDevice, AccelStream};
    pub use crate::completion::CompletionStrategy;
    pub use crate::error::{AccelError, AccelResult};
    pub use crate::gpu_ref::AccelRef;
    pub use crate::kernel::KernelOp;
}