pub enum LlmBackendDispatch {
Cpu(CpuLlmBackend),
Metal(MetalLlmBackend),
}
Variants§
Cpu(CpuLlmBackend)
Metal(MetalLlmBackend)
Implementations§
impl LlmBackendDispatch
pub fn is_cpu(&self) -> bool
Returns true when the backend is CPU-only (thread-safe for concurrent
compute calls). Returns false for GPU backends (Metal/MLX) whose
command buffers do not support concurrent encoding from multiple threads.
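A caller might use is_cpu to decide whether compute calls can be fanned out across threads. The sketch below is illustrative only: Dispatch is a hypothetical stand-in for LlmBackendDispatch (the real variants wrap CpuLlmBackend and MetalLlmBackend), and worker_count is an assumed helper that is not part of this crate.

```rust
/// Hypothetical stand-in for `LlmBackendDispatch`, so the sketch compiles on its own.
#[derive(Debug, Clone)]
enum Dispatch {
    Cpu,
    Metal,
}

impl Dispatch {
    /// Mirrors the documented contract: true only for the CPU backend.
    fn is_cpu(&self) -> bool {
        matches!(self, Dispatch::Cpu)
    }
}

/// Assumed helper: how many threads may issue compute calls concurrently.
/// GPU backends are serialized to one caller because their command buffers
/// do not support concurrent encoding from multiple threads.
fn worker_count(backend: &Dispatch, available: usize) -> usize {
    if backend.is_cpu() {
        available.max(1)
    } else {
        1
    }
}
```

With this helper, a CPU backend on an 8-thread machine would get 8 workers, while a Metal backend would always get 1.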
pub fn from_kind(kind: LlmBackendKind) -> Result<Self>
pub fn from_kind_with_model_bytes(kind: LlmBackendKind, model_bytes: u64) -> Result<Self>
Select a backend, optionally accounting for the model’s weight footprint.
model_bytes is the total weight size in bytes (0 = unknown). On Apple
Silicon (unified memory), Metal is chosen for Auto when the model fits
with a 1.5× KV-cache headroom factor; otherwise the CPU backend is used,
because swapping GPU memory severely degrades throughput.
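The Auto rule can be sketched roughly as follows. This is not the crate's implementation: Kind, Chosen, choose, and the memory_budget_bytes parameter are hypothetical names, and the handling of model_bytes == 0 (unknown size) is an assumption, since the description does not say which backend is picked in that case.

```rust
/// Hypothetical stand-in for `LlmBackendKind`; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Auto,
    Cpu,
    Metal,
}

/// Hypothetical result type naming the selected backend.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Chosen {
    Cpu,
    Metal,
}

/// Sketch of the documented heuristic: under `Auto`, pick Metal only when
/// the weights plus 1.5x headroom fit in the given memory budget.
fn choose(kind: Kind, model_bytes: u64, memory_budget_bytes: u64) -> Chosen {
    match kind {
        Kind::Cpu => Chosen::Cpu,
        Kind::Metal => Chosen::Metal,
        Kind::Auto => {
            // 1.5 * model_bytes in integer arithmetic, saturating on overflow.
            let needed = model_bytes.saturating_add(model_bytes / 2);
            // ASSUMPTION: an unknown size (0) is treated as "fits".
            if model_bytes == 0 || needed <= memory_budget_bytes {
                Chosen::Metal
            } else {
                Chosen::Cpu
            }
        }
    }
}
```

Under these assumptions, an 8 GiB model with a 16 GiB budget needs 12 GiB and selects Metal, while a 12 GiB model needs 18 GiB and falls back to CPU.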
Trait Implementations§
impl Clone for LlmBackendDispatch
fn clone(&self) -> LlmBackendDispatch
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for LlmBackendDispatch
impl LlmBackend for LlmBackendDispatch
fn name(&self) -> &'static str
fn linear_3d(&self, x: &Tensor, weight: &Tensor) -> Result<Tensor>
fn rms_norm(&self, x: &Tensor, weight: &Tensor, eps: f32) -> Result<Tensor>
fn layer_norm(&self, x: &Tensor, weight: &Tensor, bias: Option<&Tensor>, eps: f32) -> Result<Tensor>
fn silu(&self, x: &Tensor) -> Result<Tensor>
fn gelu(&self, x: &Tensor) -> Result<Tensor>
fn add(&self, a: &Tensor, b: &Tensor) -> Result<Tensor>
fn mul(&self, a: &Tensor, b: &Tensor) -> Result<Tensor>
fn apply_rope_positions(&self, x: &Tensor, positions: &[usize], base: f32) -> Result<Tensor>
fn gqa_attention(&self, q: &Tensor, k: &Tensor, v: &Tensor, n_kv_heads: usize, causal: bool) -> Result<Tensor>
Compute logits for ALL positions in the sequence.
Returns seq_len vectors, each of length vocab_size.
The default implementation delegates to the CPU reference kernel; backends may override it.
fn linear_3d_bias(&self, x: &Tensor, weight: &Tensor, bias: Option<&Tensor>) -> Result<Tensor>
Linear projection with an optional bias added over the last dimension.
Backend-agnostic: computes linear_3d, then folds in the bias on the host,
so every backend gets correct bias handling for free.
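Folding a bias in on the host amounts to adding the bias vector to every row of the projected output. The sketch below assumes a flat, row-major f32 buffer instead of the crate's Tensor type; add_bias_last_dim is a hypothetical helper, not a function from this crate.

```rust
/// Adds `bias` to each row of a row-major buffer whose last dimension has
/// `bias.len()` elements. A `None` bias leaves the buffer unchanged.
fn add_bias_last_dim(out: &mut [f32], bias: Option<&[f32]>) -> Result<(), String> {
    let Some(bias) = bias else {
        return Ok(());
    };
    // The buffer must split evenly into rows of the bias length.
    if bias.is_empty() || out.len() % bias.len() != 0 {
        return Err(format!(
            "output length {} is not a multiple of bias length {}",
            out.len(),
            bias.len()
        ));
    }
    for row in out.chunks_exact_mut(bias.len()) {
        for (o, b) in row.iter_mut().zip(bias) {
            *o += *b;
        }
    }
    Ok(())
}
```

Because the addition happens after the backend's linear_3d call returns, the same code path serves every backend, at the cost of one extra pass over the output on the host.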
fn apply_rope_partial(&self, x: &Tensor, positions: &[usize], base: f32, rotary_dim: usize) -> Result<Tensor>
RoPE over only the first rotary_dim channels (Phi partial rotary).
Computed on the CPU reference kernel for all backends: it is cheap and
avoids backend-specific partial-rotary support.
Auto Trait Implementations§
impl Freeze for LlmBackendDispatch
impl RefUnwindSafe for LlmBackendDispatch
impl Send for LlmBackendDispatch
impl Sync for LlmBackendDispatch
impl Unpin for LlmBackendDispatch
impl UnsafeUnpin for LlmBackendDispatch
impl UnwindSafe for LlmBackendDispatch
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.