//! # atomr-accel
//!
//! Actor-shaped facade for compute-acceleration backends, on top of the
//! [atomr](../../atomr) actor runtime. NVIDIA CUDA is the first
//! shipping implementation ([`atomr-accel-cuda`](../atomr_accel_cuda));
//! the same trait surface accommodates AMD ROCm, Apple Metal, Intel
//! oneAPI, and Vulkan compute when those land.
//!
//! ```toml
//! [dependencies]
//! atomr-accel      = "0.1"
//! atomr-accel-cuda = "0.1"   # active backend
//! ```
//!
//! ```ignore
//! use atomr_accel::prelude::*;
//! use atomr_accel_cuda as cuda;
//! ```
//!
//! ## What this crate is
//!
//! A **thin core** that names the abstractions every backend has to
//! satisfy:
//!
//! - [`AccelBackend`] — marker trait identifying a backend, with
//!   associated `Device`, `Stream`, `Event`, `Error` types.
//! - [`AccelDtype`] / [`DType`] — backend-agnostic numeric data-type
//!   trait + discriminant. Backends layer their own `*Dtype` trait
//!   on top with FFI mappings.
//! - [`AccelRef`] — generation-validated typed device pointer,
//!   parametric over the backend.
//! - [`AccelError`] — typed error enum, `#[non_exhaustive]` so
//!   backends can add `LibraryError` variants without breaking core.
//! - [`CompletionStrategy`] — async wakeup contract for kernel
//!   completion (host-fn callback, sync, polled).
//! - [`KernelOp`] — marker trait for typed op envelopes (`Sgemm`,
//!   `RngFillUniform`, etc.).
//!
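//! The point of this surface is that portable code can be written once,
//! generic over the backend. A minimal sketch of that shape, assuming
//! only the associated types named above (`launch_portably` and its body
//! are illustrative, not part of this crate):
//!
//! ```ignore
//! use atomr_accel::prelude::*;
//!
//! // Generic over any backend: only associated types appear in the
//! // signature, so the same function retargets by swapping `B`.
//! fn launch_portably<B: AccelBackend>(
//!     device: &B::Device,
//!     stream: &B::Stream,
//! ) -> Result<(), B::Error> {
//!     // Real work goes through a backend crate's actors; the core
//!     // crate itself ships no way to launch anything.
//!     todo!()
//! }
//! ```
//!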
//! The core deliberately ships **no concrete actors**. Each backend
//! crate (`atomr-accel-cuda`, future `atomr-accel-rocm`,
//! `atomr-accel-metal`, …) provides its own `DeviceActor`,
//! `KernelActor` family, and library wrappers, and depends on this
//! crate for the trait surface.
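//!
//! The generation validation behind [`AccelRef`] is the familiar
//! generational-handle idea: a ref minted against one device epoch is
//! rejected after the device resets or frees the allocation. A toy
//! illustration (these types are stand-ins, not this crate's
//! definitions):
//!
//! ```ignore
//! struct ToyRef    { addr: u64, generation: u64 }
//! struct ToyDevice { generation: u64 }
//!
//! fn still_valid(r: &ToyRef, d: &ToyDevice) -> bool {
//!     // A stale ref fails this check instead of dereferencing
//!     // freed device memory.
//!     r.generation == d.generation
//! }
//! ```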
//!
//! ## What this crate is not
//!
//! - A least-common-denominator API. Backends expose more than the
//!   trait surface — `atomr_accel_cuda::kernel::CudnnActor` has a
//!   richer message set than `KernelOp` knows about, and that's
//!   fine. The trait surface is for portable code; backend-specific
//!   work uses the concrete crate directly.
//! - A device-abstraction layer like wgpu or SYCL. We don't try to
//!   compile one shader to many targets. We supervise the right
//!   library on the right hardware.

#![cfg_attr(docsrs, feature(doc_cfg))]

pub mod backend;
pub mod completion;
pub mod dtype;
pub mod error;
pub mod gpu_ref;
pub mod kernel;

pub use backend::{AccelBackend, AccelDevice, AccelStream};
pub use completion::CompletionStrategy;
pub use dtype::{AccelDtype, DType};
pub use error::AccelError;
pub use gpu_ref::AccelRef;
pub use kernel::KernelOp;

pub mod prelude {
    //! Canonical re-exports: `use atomr_accel::prelude::*;`.
    pub use crate::backend::{AccelBackend, AccelDevice, AccelStream};
    pub use crate::completion::CompletionStrategy;
    pub use crate::error::{AccelError, AccelResult};
    pub use crate::gpu_ref::AccelRef;
    pub use crate::kernel::KernelOp;
}