//! # rakka-accel
//!
//! Actor-shaped facade for compute-acceleration backends, on top of the
//! [rakka](../../rakka) actor runtime. NVIDIA CUDA is the first
//! shipping implementation ([`rakka-accel-cuda`](../rakka_accel_cuda));
//! the same trait surface accommodates AMD ROCm, Apple Metal, Intel
//! oneAPI, and Vulkan compute when those land.
//!
//! ```toml
//! [dependencies]
//! rakka-accel = { version = "0.0", features = ["cuda"] }
//! ```
//!
//! ```ignore
//! use rakka_accel::prelude::*;
//! use rakka_accel::cuda; // re-export of `rakka-accel-cuda`
//! ```
//!
//! ## What this crate is
//!
//! A **thin core** that names the abstractions every backend has to
//! satisfy:
//!
//! - [`AccelBackend`] — marker trait identifying a backend, with
//! associated `Device`, `Stream`, `Event`, `Error` types.
//! - [`AccelRef`] — generation-validated typed device pointer
//! parametric over the backend.
//! - [`AccelError`] — typed error enum, `#[non_exhaustive]` so
//! variants for new backend libraries (e.g. a `LibraryError`) can be
//! added in core without breaking downstream matches.
//! - [`CompletionStrategy`] — async wakeup contract for kernel
//! completion (host-fn callback, sync, polled).
//! - [`KernelOp`] — marker trait for typed op envelopes (Sgemm,
//! RngFillUniform, etc.).
//!
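//! A minimal sketch of that surface, assuming nothing beyond the item
//! names above (the bounds, fields, and variant names here are
//! illustrative, not the crate's actual definitions):
//!
//! ```ignore
//! /// Marker identifying a backend, e.g. a unit struct per target.
//! pub trait AccelBackend: Sized + 'static {
//!     type Device;
//!     type Stream;
//!     type Event;
//!     type Error: Into<AccelError>;
//! }
//!
//! /// Typed device pointer; the generation tag lets stale handles be
//! /// rejected instead of dereferenced.
//! pub struct AccelRef<T, B: AccelBackend> {
//!     /* raw pointer, allocation generation, PhantomData<(T, B)> */
//! }
//!
//! /// How a waiter is woken when a kernel completes.
//! pub enum CompletionStrategy {
//!     HostCallback, // host-fn callback enqueued behind the kernel
//!     Sync,         // block the calling task on the event
//!     Polled,       // poll the event from the runtime
//! }
//!
//! /// Marker for typed op envelopes (Sgemm, RngFillUniform, etc.).
//! pub trait KernelOp: Send + 'static {}
//! ```
//!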
//! The core deliberately ships **no concrete actors**. Each backend
//! crate provides its own `DeviceActor`, `KernelActor` family, and
//! library wrappers. The umbrella re-exports the active backend at
//! `rakka_accel::cuda` (and, eventually, `rakka_accel::rocm`,
//! `rakka_accel::metal`, etc.) so users have one stable import path.
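//!
//! As an illustration, a backend crate satisfies the marker on a unit
//! struct; the concrete type names below are assumptions for the
//! sketch, not `rakka-accel-cuda`'s actual items:
//!
//! ```ignore
//! pub struct Cuda;
//!
//! impl rakka_accel::AccelBackend for Cuda {
//!     type Device = CudaDevice; // hypothetical wrapper types
//!     type Stream = CudaStream;
//!     type Event  = CudaEvent;
//!     type Error  = CudaError;
//! }
//! ```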
//!
//! ## What this crate is not
//!
//! - A least-common-denominator API. Backends expose more than the
//! trait surface — `rakka_accel::cuda::kernel::CudnnActor` has a
//! richer message set than `KernelOp` knows about, and that's
//! fine. The trait surface is for portable code; backend-specific
//! work uses the concrete crate directly.
//! - A device-abstraction layer like wgpu or SYCL. We don't try to
//! compile one shader to many targets. We supervise the right
//! library on the right hardware.
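//!
//! Portable code, then, stays generic over the marker. A minimal
//! sketch, assuming only the trait surface above (the function and
//! its body are illustrative):
//!
//! ```ignore
//! use rakka_accel::{AccelBackend, AccelError, AccelRef};
//!
//! /// Runs against any backend; no CUDA-specific types leak in.
//! fn validate<T, B: AccelBackend>(
//!     buf: &AccelRef<T, B>,
//! ) -> Result<(), AccelError> {
//!     // check the allocation generation before handing the pointer
//!     // to a kernel actor
//!     todo!()
//! }
//! ```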
mod types; // private module holding the core definitions (name assumed)

pub use types::{AccelBackend, AccelError, AccelRef, CompletionStrategy, KernelOp};
/// Re-export the CUDA backend at a stable path. Active when the
/// `cuda` feature is on.
#[cfg(feature = "cuda")]
pub use rakka_accel_cuda as cuda;