Crate rakka_accel_cuda

rakka-accel-cuda

GPU acceleration via the actor model. Wraps NVIDIA CUDA libraries as actors on top of rakka. See README.md and the architecture document under docs/ for the full design.

Foundation Phase F1 (current)

The crate is currently at Foundation Phase F1. Phases F2–F5 (cuDNN, cuFFT, NCCL, TensorRT, the PythonGpuBridge) and the four blueprint sub-crates are deferred.

Modules

completion
Completion strategies (§5.10).
device
DeviceActor (outer tier) + ContextActor (inner tier) — §5.11.
dispatcher
GpuDispatcher (§5.1) — pinned single-thread runtime that ensures the actor’s CUDA context stays current on the same OS thread for the actor’s whole lifetime.
error
Error taxonomy and the supervisor decider for context-poisoning recovery (§5.3, §5.11 of the architecture document).
gpu_ref
GpuRef<T> — opaque, message-friendly handle to a GPU buffer (§5.8).
graph
GraphActor — record a CUDA stream-capture once, replay many.
host
Host-side support: pinned (page-locked) memory pool + PinnedBuf<T>.
kernel
Kernel-actor wrappers around CUDA library handles (§3.2).
memory
Managed (unified) memory.
p2p
P2P (peer-to-peer) topology + cross-device async memcpy.
pipeline
Multi-stream pipeline pattern.
placement
PlacementActor — picks the best-fit DeviceActor for each request based on a configurable PlacementPolicy.
prelude
Common imports for users of rakka-accel-cuda.
replay
Deterministic-replay harness.
stream
Stream allocation strategies (§5.7).
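
The dispatcher entry above describes a pinned single-thread runtime that keeps an actor's CUDA context current on one OS thread. The sketch below illustrates that general pattern using only the standard library; it is not the crate's GpuDispatcher, and the names PinnedDispatcher and run are hypothetical, introduced purely for illustration. It assumes that funnelling every job through one dedicated worker thread is enough to model the pinning, and it leaves out the actual CUDA context setup.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// A unit of work to execute on the pinned thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical stand-in for a pinned dispatcher (not the crate's type).
pub struct PinnedDispatcher {
    tx: Option<Sender<Job>>,
    worker: Option<JoinHandle<()>>,
}

impl PinnedDispatcher {
    /// Spawns the single worker thread that will run every submitted job.
    pub fn new() -> Self {
        let (tx, rx) = mpsc::channel::<Job>();
        let worker = thread::Builder::new()
            .name("gpu-dispatcher".into())
            .spawn(move || {
                // Assumption: a real implementation would make the CUDA
                // context current here, once, before draining the queue.
                for job in rx {
                    job();
                }
            })
            .expect("failed to spawn dispatcher thread");
        Self { tx: Some(tx), worker: Some(worker) }
    }

    /// Runs `f` on the pinned thread and blocks until its result arrives.
    pub fn run<R, F>(&self, f: F) -> R
    where
        F: FnOnce() -> R + Send + 'static,
        R: Send + 'static,
    {
        let (reply_tx, reply_rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            // Ignore the error if the caller stopped waiting for the reply.
            let _ = reply_tx.send(f());
        });
        self.tx
            .as_ref()
            .expect("dispatcher already shut down")
            .send(job)
            .expect("dispatcher thread exited");
        reply_rx.recv().expect("job ended without replying")
    }
}

impl Drop for PinnedDispatcher {
    fn drop(&mut self) {
        // Dropping the sender closes the channel, which ends the worker loop.
        drop(self.tx.take());
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

Because every closure passed to run executes on the same worker, any thread-affine state set up on that thread stays valid for the dispatcher's whole lifetime, which is the property the module description calls out. A blocking run keeps the sketch short; an actor runtime would more likely deliver the reply as a message.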