Struct CpuBackend 

Source
pub struct CpuBackend { /* private fields */ }

Implementations§

Source§

impl CpuBackend

Source

pub fn new() -> Self

Trait Implementations§

Source§

impl ComputeBackend for CpuBackend

Source§

fn matmul( &self, a: &BufferHandle, b: &BufferHandle, out: &BufferHandle, m: u32, n: u32, k: u32, ) -> Result<()>

Matrix multiply: C[m,n] = A[m,k] * B[k,n]. For single-token decode (m = 1), this reduces to a matrix-vector product (GEMV). Buffers hold row-major f32 data.
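To make the indexing concrete, here is a minimal reference sketch of the row-major product on plain slices. The function name `matmul_ref` and the use of slices in place of `BufferHandle` are assumptions for illustration; this is not the crate's implementation.

```rust
/// Reference row-major matmul: c[m x n] = a[m x k] * b[k x n].
/// Hypothetical helper operating on slices, not on BufferHandle.
fn matmul_ref(a: &[f32], b: &[f32], c: &mut [f32], m: usize, n: usize, k: usize) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    for i in 0..m {
        for j in 0..n {
            // Dot product of row i of A with column j of B.
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}
```

With m = 1 the outer loop runs once, which is the GEMV case mentioned above.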

Source§

fn softmax( &self, input: &BufferHandle, output: &BufferHandle, size: u32, ) -> Result<()>

Softmax: out[i] = exp(input[i] - max) / sum_j(exp(input[j] - max)), where max is the largest element of input. Subtracting max keeps the computation numerically stable.
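The max-subtraction trick can be sketched on plain slices as follows. `softmax_ref` is a hypothetical name, and the slice-based signature is an assumption standing in for the buffer-based one.

```rust
/// Numerically stable softmax on slices (illustrative sketch only).
fn softmax_ref(input: &[f32], output: &mut [f32]) {
    assert_eq!(input.len(), output.len());
    if input.is_empty() {
        return;
    }
    // Largest element; subtracting it keeps every exponent <= 0.
    let max = input.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let mut sum = 0.0f32;
    for (o, &x) in output.iter_mut().zip(input) {
        *o = (x - max).exp();
        sum += *o;
    }
    // Normalize so the outputs sum to 1.
    for o in output.iter_mut() {
        *o /= sum;
    }
}
```

Without the subtraction, an input such as 1000.0 would overflow `exp` to infinity in f32; with it, the largest exponent is always exp(0) = 1.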

Source§

fn rms_norm( &self, input: &BufferHandle, weight: &BufferHandle, output: &BufferHandle, size: u32, eps: f32, ) -> Result<()>

RMSNorm: out[i] = (input[i] / rms) * weight[i], where rms = sqrt(mean(input^2) + eps).
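A minimal slice-based sketch of the formula above, with eps added inside the square root as stated. The name `rms_norm_ref` and the slice parameters are assumptions for illustration.

```rust
/// RMSNorm on slices: out[i] = input[i] / rms * weight[i],
/// with rms = sqrt(mean(input^2) + eps). Illustrative sketch only.
fn rms_norm_ref(input: &[f32], weight: &[f32], output: &mut [f32], eps: f32) {
    let n = input.len();
    assert!(n > 0);
    assert_eq!(weight.len(), n);
    assert_eq!(output.len(), n);
    let mean_sq = input.iter().map(|x| x * x).sum::<f32>() / n as f32;
    // Compute the reciprocal once and multiply, instead of dividing per element.
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    for i in 0..n {
        output[i] = input[i] * inv_rms * weight[i];
    }
}
```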

Source§

fn rope( &self, q: &BufferHandle, k: &BufferHandle, pos: u32, head_dim: u32, freq_base: f32, _n_heads_q: u32, _n_heads_k: u32, ) -> Result<()>

RoPE (Rotary Position Embedding), applied in place to the Q and K buffers. head_dim is the dimension per attention head (typically 128). The rotation is applied to adjacent pairs: (q[2i], q[2i+1]) is rotated by the angle pos * freq, and likewise for K.
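The pairwise rotation can be sketched as below. The page does not define freq, so the sketch assumes the common RoPE convention freq_i = freq_base^(-2i / head_dim); that formula, the name `rope_ref`, and the slice-based signature are all assumptions, not confirmed details of this backend.

```rust
/// Rotates adjacent pairs (x[2i], x[2i+1]) of every head in place.
/// ASSUMPTION: freq_i = freq_base^(-2i / head_dim), the usual RoPE choice.
fn rope_ref(x: &mut [f32], pos: u32, head_dim: usize, freq_base: f32) {
    assert!(head_dim > 0 && head_dim % 2 == 0);
    assert_eq!(x.len() % head_dim, 0);
    // The buffer is treated as consecutive heads of head_dim elements each.
    for head in x.chunks_exact_mut(head_dim) {
        for i in 0..head_dim / 2 {
            let freq = freq_base.powf(-2.0 * i as f32 / head_dim as f32);
            let (sin, cos) = (pos as f32 * freq).sin_cos();
            let (x0, x1) = (head[2 * i], head[2 * i + 1]);
            // Standard 2D rotation by the angle pos * freq.
            head[2 * i] = x0 * cos - x1 * sin;
            head[2 * i + 1] = x0 * sin + x1 * cos;
        }
    }
}
```

Under these assumptions the same routine would be called once on the Q data and once on the K data. Because each pair undergoes a pure rotation, the norm of every pair is preserved.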

Source§

fn silu( &self, input: &BufferHandle, output: &BufferHandle, size: u32, ) -> Result<()>

SiLU (Sigmoid Linear Unit): out[i] = input[i] * sigmoid(input[i]). Also known as swish; used in the SwiGLU FFN.
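Since sigmoid(x) = 1 / (1 + exp(-x)), the activation simplifies to x / (1 + exp(-x)). A minimal sketch on slices, with `silu_ref` as a hypothetical name:

```rust
/// SiLU on slices: out[i] = x * sigmoid(x) = x / (1 + exp(-x)).
/// Illustrative sketch only.
fn silu_ref(input: &[f32], output: &mut [f32]) {
    assert_eq!(input.len(), output.len());
    for (o, &x) in output.iter_mut().zip(input) {
        *o = x / (1.0 + (-x).exp());
    }
}
```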

Source§

fn element_mul( &self, a: &BufferHandle, b: &BufferHandle, output: &BufferHandle, size: u32, ) -> Result<()>

Element-wise multiply: out[i] = a[i] * b[i]
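In a SwiGLU feed-forward block, an element-wise multiply typically combines a SiLU-activated gate projection with an up projection. The sketch below fuses the two steps on plain slices; the names `swiglu_gate_ref`, `gate`, and `up` are hypothetical, and how callers actually sequence `silu` and `element_mul` on this backend is an assumption.

```rust
/// out[i] = silu(gate[i]) * up[i], i.e. SiLU followed by an element-wise multiply.
/// Hypothetical helper; the backend exposes the two steps as separate operations.
fn swiglu_gate_ref(gate: &[f32], up: &[f32], out: &mut [f32]) {
    assert_eq!(gate.len(), up.len());
    assert_eq!(gate.len(), out.len());
    for i in 0..out.len() {
        let activated = gate[i] / (1.0 + (-gate[i]).exp()); // silu(gate[i])
        out[i] = activated * up[i]; // element-wise multiply
    }
}
```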

Source§

fn add( &self, a: &BufferHandle, b: &BufferHandle, output: &BufferHandle, size: u32, ) -> Result<()>

Element-wise add: out[i] = a[i] + b[i]

Source§

fn name(&self) -> &str

Source§

fn device_info(&self) -> DeviceInfo

Source§

fn allocate(&self, size_bytes: usize) -> Result<BufferHandle>

Source§

fn free(&self, handle: BufferHandle) -> Result<()>

Source§

fn copy_to_device(&self, data: &[u8], handle: &BufferHandle) -> Result<()>

Source§

fn copy_from_device(&self, handle: &BufferHandle, data: &mut [u8]) -> Result<()>
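The copy methods take raw bytes, while the compute operations work on f32 data, so callers need a conversion between the two. The helpers below are one possible sketch; their names are hypothetical, and the use of native byte order is an assumption that this page does not confirm.

```rust
/// Serializes f32 values into bytes suitable for a byte-slice upload.
/// ASSUMPTION: the backend expects native-endian f32 data.
fn f32s_to_bytes(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_ne_bytes()).collect()
}

/// Reinterprets a downloaded byte buffer as f32 values (native-endian).
fn bytes_to_f32s(bytes: &[u8]) -> Vec<f32> {
    assert_eq!(bytes.len() % 4, 0, "length must be a multiple of 4");
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}
```

Under that assumption, a buffer for n floats would be allocated with `size_bytes = n * 4`.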

Source§

fn copy_buffer( &self, src: &BufferHandle, dst: &BufferHandle, size: usize, ) -> Result<()>

Source§

fn copy_buffer_offset( &self, src: &BufferHandle, dst: &BufferHandle, src_offset: usize, dst_offset: usize, size: usize, ) -> Result<()>

Source§

fn synchronize(&self) -> Result<()>

Source§

fn attn_score( &self, _q: &BufferHandle, _k_cache: &BufferHandle, _scores: &BufferHandle, _head_dim: u32, _seq_len: u32, _head_offset: u32, _kv_offset: u32, _kv_stride: u32, ) -> Result<()>

Computes attention scores for one head: scores[pos] = Q[head_offset..] · K_cache[pos*kv_stride+kv_offset..], a dot product over head_dim elements for each pos in 0..seq_len.
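The offset and stride arithmetic can be sketched on plain slices as follows. `attn_score_ref` is a hypothetical name; the sketch applies no 1/sqrt(head_dim) scaling because the formula above does not mention one.

```rust
/// scores[pos] = dot(q[head_offset..][..head_dim],
///                   k_cache[pos * kv_stride + kv_offset..][..head_dim]).
/// Illustrative sketch only; no scaling factor is applied.
fn attn_score_ref(
    q: &[f32],
    k_cache: &[f32],
    scores: &mut [f32],
    head_dim: usize,
    seq_len: usize,
    head_offset: usize,
    kv_offset: usize,
    kv_stride: usize,
) {
    let q_head = &q[head_offset..head_offset + head_dim];
    for pos in 0..seq_len {
        // Each cached position occupies kv_stride floats; kv_offset selects the head.
        let start = pos * kv_stride + kv_offset;
        let k_head = &k_cache[start..start + head_dim];
        scores[pos] = q_head.iter().zip(k_head).map(|(a, b)| a * b).sum::<f32>();
    }
}
```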
Source§

fn attn_value( &self, _weights: &BufferHandle, _v_cache: &BufferHandle, _output: &BufferHandle, _head_dim: u32, _seq_len: u32, _kv_offset: u32, _kv_stride: u32, _out_offset: u32, ) -> Result<()>

Computes the weighted sum of values for one head: out[out_offset+d] = sum_pos(weights[pos] * V[pos*kv_stride+kv_offset+d]), for d in 0..head_dim and pos in 0..seq_len.
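A minimal slice-based sketch of this aggregation follows. `attn_value_ref` is a hypothetical name, and slices stand in for the buffer handles.

```rust
/// output[out_offset + d] = sum over pos of
///     weights[pos] * v_cache[pos * kv_stride + kv_offset + d].
/// Illustrative sketch only.
fn attn_value_ref(
    weights: &[f32],
    v_cache: &[f32],
    output: &mut [f32],
    head_dim: usize,
    seq_len: usize,
    kv_offset: usize,
    kv_stride: usize,
    out_offset: usize,
) {
    for d in 0..head_dim {
        let mut acc = 0.0f32;
        for pos in 0..seq_len {
            acc += weights[pos] * v_cache[pos * kv_stride + kv_offset + d];
        }
        // Only the head_dim elements starting at out_offset are written.
        output[out_offset + d] = acc;
    }
}
```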
Source§

fn quantized_matmul( &self, _weights: &BufferHandle, _input: &BufferHandle, _output: &BufferHandle, _n_rows: u32, _n_cols: u32, _dtype: DType, ) -> Result<()>

Fused dequantize + matrix-vector multiply for quantized weights. GPU backends override this with fused VRAM kernels. The default implementation falls back to regular matmul, which assumes the weights have already been dequantized.
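To illustrate what "fused" means here, the sketch below dequantizes each weight while accumulating the dot product, so no full f32 copy of the weight matrix is ever materialized. The quantization scheme (int8 weights with one f32 scale per row) is purely hypothetical and is not claimed to match any `DType` variant of this crate; the name `q8_gemv_ref` and the separate `scales` slice are likewise assumptions.

```rust
/// Fused dequantize + GEMV for a HYPOTHETICAL format:
/// int8 weights, row-major [n_rows x n_cols], one f32 scale per row.
/// output[r] = scales[r] * sum_c(weights[r][c] * input[c]).
fn q8_gemv_ref(
    weights: &[i8],
    scales: &[f32],
    input: &[f32],
    output: &mut [f32],
    n_rows: usize,
    n_cols: usize,
) {
    assert_eq!(weights.len(), n_rows * n_cols);
    assert_eq!(scales.len(), n_rows);
    assert_eq!(input.len(), n_cols);
    assert_eq!(output.len(), n_rows);
    for r in 0..n_rows {
        let row = &weights[r * n_cols..(r + 1) * n_cols];
        let mut acc = 0.0f32;
        for (&w, &x) in row.iter().zip(input) {
            // Convert on the fly instead of dequantizing the whole row first.
            acc += w as f32 * x;
        }
        // The per-row scale is applied once, after accumulation.
        output[r] = acc * scales[r];
    }
}
```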
Source§

impl Default for CpuBackend

Source§

fn default() -> Self

Returns the “default value” for a type.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.