pub struct Arena {
pub buffer: Buffer,
pub f16_buffer: Option<Buffer>,
pub offsets: HashMap<NodeId, usize>,
pub lens: HashMap<NodeId, usize>,
pub size: usize,
}
One contiguous arena buffer plus per-node byte offsets. Lives for the entire lifetime of the executable graph.
Fields
buffer: Buffer
Underlying GPU buffer. Bound as a single STORAGE_READ_WRITE resource for every kernel; offsets disambiguate per-node access.
f16_buffer: Option<Buffer>
Optional shadow buffer holding f16 versions of every value
written via write_f32. Sized at half the arena byte budget
(each f32 element pairs with an f16 element at the same logical
index, i.e. f16_off = f32_off / 2). Created only when the
device exposes the SHADER_F16 feature; matmul kernels with
f16-typed B input bind both buffer (for f32 activations) and
f16_buffer (for f16 weights). Halves global memory traffic
on the dominant matmul reads.
offsets: HashMap<NodeId, usize>
Per-node byte offset into buffer.
lens: HashMap<NodeId, usize>
Per-node byte length.
size: usize
Total arena size in bytes.
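The relationship between the two buffers described for f16_buffer can be sketched as plain offset arithmetic. This is only an illustration under the stated rule f16_off = f32_off / 2; the helper names `f16_offset` and `element_positions` are hypothetical and are not part of the documented API.

```rust
/// Size in bytes of one f32 element and one f16 element.
const F32_BYTES: usize = 4;
const F16_BYTES: usize = 2;

/// Hypothetical helper: byte offset in the f16 shadow buffer for data that
/// starts at byte `f32_off` in the main arena buffer.
fn f16_offset(f32_off: usize) -> usize {
    // An f16 is half the width of an f32, so the same logical element index
    // lands at half the byte offset.
    f32_off / 2
}

/// Hypothetical helper: byte positions of element `i` of a node in the main
/// buffer and in the shadow buffer, given the node's f32 byte offset.
fn element_positions(f32_off: usize, i: usize) -> (usize, usize) {
    (
        f32_off + i * F32_BYTES,
        f16_offset(f32_off) + i * F16_BYTES,
    )
}
```

For a node at f32 byte offset 1024, element 3 sits at byte 1036 in the main buffer and byte 518 in the shadow buffer, which is again exactly half.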
Implementations
impl Arena
pub fn from_plan(device: &Device, plan: &MemoryPlan) -> Self
Build an arena from a memory plan. Allocates one big buffer sized to fit every node’s offset+length.
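A minimal sketch of the sizing rule, assuming the plan boils down to the same per-node offset and length maps that Arena stores. The layout of MemoryPlan is not shown on this page, so `required_arena_size` and the integer `NodeId` alias below are hypothetical stand-ins.

```rust
use std::collections::HashMap;

// Assumption: NodeId is modeled as a plain integer for this sketch.
type NodeId = u32;

/// Hypothetical helper: smallest buffer size (in bytes) that covers every
/// node's slot, i.e. the maximum of offset + length over all nodes.
fn required_arena_size(
    offsets: &HashMap<NodeId, usize>,
    lens: &HashMap<NodeId, usize>,
) -> usize {
    offsets
        .iter()
        // A node without a recorded length contributes only its offset.
        .map(|(id, &off)| off + lens.get(id).copied().unwrap_or(0))
        .max()
        // An empty plan needs no storage.
        .unwrap_or(0)
}
```

Taking the maximum rather than the sum of lengths matters when slots overlap or leave gaps, since only the furthest end offset determines the allocation.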
pub fn has(&self, id: NodeId) -> bool
pub fn offset(&self, id: NodeId) -> usize
pub fn len_of(&self, id: NodeId) -> usize
pub fn set_actual_len(&mut self, id: NodeId, bytes: usize)
Override the actual data length (in bytes) for a node. The backend calls this after planning to record true elem*4 sizes instead of the alignment-padded slot sizes.
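The difference between the two sizes can be illustrated as follows. This is a sketch only: the alignment the planner uses is not stated here, so it is passed in as a parameter, and both function names are hypothetical.

```rust
/// True data length in bytes for `elems` f32 values (elem * 4).
fn actual_len_bytes(elems: usize) -> usize {
    elems * std::mem::size_of::<f32>()
}

/// Hypothetical planner-side slot size: `actual` rounded up to the next
/// multiple of `align`. The real alignment value is an assumption.
fn padded_len_bytes(actual: usize, align: usize) -> usize {
    assert!(align > 0, "alignment must be non-zero");
    (actual + align - 1) / align * align
}
```

With 100 elements and an assumed 256-byte alignment, the slot would occupy 512 bytes while the actual data length is 400 bytes; the latter is the value a caller would pass as `bytes` to set_actual_len.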
pub fn write_f32(&self, queue: &Queue, id: NodeId, data: &[f32])
Write f32 data into the node’s slot. The queue performs an
async transfer; subsequent kernel dispatches on the same queue
see the new bytes. When the device supports SHADER_F16, also
downcasts and writes the same data into the f16 shadow buffer
at offset f32_offset / 2 — so matmul kernels with f16 weight
bindings can read directly from there at half the bandwidth.
pub fn read_f32(&self, device: &Device, queue: &Queue, id: NodeId) -> Vec<f32>
Read a node’s bytes back to host f32 via a staging buffer +
blocking map. Used by run() for output extraction.
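On the host side, write_f32 and read_f32 both have to move between f32 values and the raw bytes held in the buffer. The sketch below shows one way that conversion could look; it assumes little-endian byte order, and the helper names are hypothetical rather than taken from this crate.

```rust
/// Hypothetical helper: serialize f32 values into the byte layout that would
/// be copied into a node's slot (assumes little-endian storage).
fn f32s_to_bytes(data: &[f32]) -> Vec<u8> {
    data.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Hypothetical helper: decode bytes read back from a staging buffer into
/// f32 values. Trailing bytes that do not fill a whole f32 are ignored.
fn bytes_to_f32s(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}
```

A write followed by a read of the same node should round-trip exactly, because no numeric conversion is involved on the f32 path; only the f16 shadow copy is lossy.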
pub fn read_bytes_range(
    &self,
    device: &Device,
    queue: &Queue,
    byte_off: usize,
    len: usize,
) -> Vec<u8>
Read a byte range from the arena (used for packed GGUF weights).
pub fn write_bytes_range(&self, queue: &Queue, byte_off: usize, data: &[u8])
Write raw bytes into the arena at byte_off.
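Because read_bytes_range and write_bytes_range take raw byte offsets rather than node ids, a caller may want to validate a range against the arena's size before using it. The sketch below is one possible pre-check, not something this page says Arena performs. The 4-byte copy alignment is an assumption based on wgpu-style buffer copies, and `check_range` and `RangeError` are hypothetical names.

```rust
/// Hypothetical error type for the pre-check below.
#[derive(Debug, PartialEq)]
enum RangeError {
    Misaligned,
    OutOfBounds,
}

/// Assumption: buffer copies require 4-byte aligned offsets and lengths.
const COPY_ALIGN: usize = 4;

/// Hypothetical helper: check that `byte_off..byte_off + len` is aligned and
/// lies inside an arena of `arena_size` bytes.
fn check_range(arena_size: usize, byte_off: usize, len: usize) -> Result<(), RangeError> {
    if byte_off % COPY_ALIGN != 0 || len % COPY_ALIGN != 0 {
        return Err(RangeError::Misaligned);
    }
    // checked_add guards against overflow for very large offsets.
    match byte_off.checked_add(len) {
        Some(end) if end <= arena_size => Ok(()),
        _ => Err(RangeError::OutOfBounds),
    }
}
```

A caller could run `check_range(arena.size, byte_off, data.len())` before write_bytes_range and pad packed weight data to a multiple of four bytes if the check reports misalignment.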
Auto Trait Implementations
impl Freeze for Arena
impl !RefUnwindSafe for Arena
impl Send for Arena
impl Sync for Arena
impl Unpin for Arena
impl UnsafeUnpin for Arena
impl !UnwindSafe for Arena
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.