GPU-facing configuration and transfer descriptors.
The crate keeps these types in the public API so storage callers can describe KV-cache chunk transfers without depending on a CUDA runtime in the core crate. The actual GPU execution layer is intentionally outside this package.
Structs
- CudaChunkTransferDescriptor - Precomputed routing metadata for a chunk that should be transferred to a GPU destination in layer order.
- CudaChunkTransferHit
- CudaConfig - Runtime configuration for the optional CUDA/GPU tier.
- CudaSessionTransferRequest - A session-scoped transfer request for streaming KV chunks in layer order to a GPU-facing consumer.
- CudaSessionTransferStats - Aggregate outcome for a session-scoped streaming transfer.
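
The split described above (plain descriptor types in the core crate, GPU execution elsewhere) can be illustrated with a minimal sketch. Every name below (`ChunkDescriptor`, `TransferStats`, `ChunkSink`, `stream_in_layer_order`, `RecordingSink`) and every field is a hypothetical stand-in chosen for illustration; none of them is taken from this crate's actual definitions, which this page does not show.

```rust
// Hypothetical stand-ins only: field names and types are assumptions,
// not the real layout of the Cuda* structs listed above.

/// Plain-data description of one chunk to move. No CUDA types are involved,
/// so a core crate could expose something like this without a GPU dependency.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChunkDescriptor {
    pub layer: u32,
    pub chunk_id: u64,
    pub len_bytes: usize,
}

/// Aggregate outcome of one session-scoped streaming transfer.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct TransferStats {
    pub chunks_sent: usize,
    pub chunks_failed: usize,
    pub bytes_sent: usize,
}

/// Assumed extension point: a GPU execution layer in a separate crate
/// would implement this trait.
pub trait ChunkSink {
    fn send(&mut self, chunk: &ChunkDescriptor) -> Result<(), String>;
}

/// Sorts the chunks by (layer, chunk_id), hands each one to the sink in that
/// order, and tallies the results.
pub fn stream_in_layer_order<S: ChunkSink>(
    mut chunks: Vec<ChunkDescriptor>,
    sink: &mut S,
) -> TransferStats {
    chunks.sort_by_key(|c| (c.layer, c.chunk_id));
    let mut stats = TransferStats::default();
    for chunk in &chunks {
        match sink.send(chunk) {
            Ok(()) => {
                stats.chunks_sent += 1;
                stats.bytes_sent += chunk.len_bytes;
            }
            Err(_) => stats.chunks_failed += 1,
        }
    }
    stats
}

/// CPU-only sink that records the layer of every chunk it is offered and
/// optionally rejects one layer, standing in for a real GPU consumer.
pub struct RecordingSink {
    pub layers_seen: Vec<u32>,
    pub reject_layer: Option<u32>,
}

impl ChunkSink for RecordingSink {
    fn send(&mut self, chunk: &ChunkDescriptor) -> Result<(), String> {
        self.layers_seen.push(chunk.layer);
        if self.reject_layer == Some(chunk.layer) {
            return Err(format!("layer {} rejected", chunk.layer));
        }
        Ok(())
    }
}
```

In this sketch the caller only builds descriptors and reads back the stats; anything device-specific would sit behind the `ChunkSink` implementation, which is one plausible way to keep a CUDA runtime out of the core crate.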