Struct StubModelExecutor

pub struct StubModelExecutor { /* private fields */ }

Stub model executor - MVP implementation

Returns dummy outputs to allow pipeline testing without real models.

Implementations

impl StubModelExecutor

pub fn new( model_id: impl Into<ModelId>, vocab_size: usize, tensor_factory: Arc<dyn TensorFactory>, ) -> Self
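As a rough illustration of what a stub executor does, the sketch below uses hypothetical stand-in types (`ModelIdLike`, `StubLike`) rather than this crate's own, and omits the tensor factory. It shows how an `impl Into<...>` constructor argument accepts a plain string, and how a dummy logits row can be sized by `vocab_size`. The all-zero logits are an assumption, not documented behavior.

```rust
/// Hypothetical stand-in for the crate's `ModelId` (assumed here to wrap a string).
#[derive(Debug, Clone, PartialEq)]
pub struct ModelIdLike(pub String);

impl From<&str> for ModelIdLike {
    fn from(s: &str) -> Self {
        ModelIdLike(s.to_string())
    }
}

/// Hypothetical stub: keeps only what it needs to fabricate outputs.
pub struct StubLike {
    pub model_id: ModelIdLike,
    pub vocab_size: usize,
}

impl StubLike {
    /// Mirrors the shape of `new`: the id accepts anything convertible into the id type.
    pub fn new(model_id: impl Into<ModelIdLike>, vocab_size: usize) -> Self {
        Self { model_id: model_id.into(), vocab_size }
    }

    /// Returns one dummy logits row of length `vocab_size` (all zeros, an assumption).
    pub fn dummy_logits(&self) -> Vec<f32> {
        vec![0.0; self.vocab_size]
    }
}
```

A pipeline test can then check shapes and control flow without loading any weights.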

Trait Implementations

impl ModelExecutor for StubModelExecutor

fn info(&self) -> &ModelInfo

Get model information and metadata

fn prefill<'life0, 'life1, 'async_trait>( &'life0 self, input: &'life1 PrefillInput, ) -> Pin<Box<dyn Future<Output = Result<PrefillOutput>> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait, 'life1: 'async_trait,

Execute prefill phase (process initial prompt)

fn decode<'life0, 'life1, 'async_trait>( &'life0 self, input: &'life1 DecodeInput, ) -> Pin<Box<dyn Future<Output = Result<DecodeOutput>> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait, 'life1: 'async_trait,

Execute decode phase (generate next token)
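The `'life0`, `'life1`, and `'async_trait` lifetimes in the signatures above are the form rustdoc shows for methods generated by the `async_trait` macro: an `async fn` rewritten to return a pinned, boxed future. The sketch below writes that shape by hand using only the standard library. `TinyExecutor`, `TinyStub`, and `block_on` are hypothetical names for illustration and are not part of this crate.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait with one method written in the boxed-future form.
pub trait TinyExecutor {
    fn decode<'a>(
        &'a self,
        last_token: &'a u32,
    ) -> Pin<Box<dyn Future<Output = Result<u32, String>> + Send + 'a>>;
}

/// Hypothetical stub: "generates" the next token by adding one.
pub struct TinyStub;

impl TinyExecutor for TinyStub {
    fn decode<'a>(
        &'a self,
        last_token: &'a u32,
    ) -> Pin<Box<dyn Future<Output = Result<u32, String>> + Send + 'a>> {
        // The async block borrows `last_token`, so the future is tied to 'a.
        Box::pin(async move { Ok::<u32, String>(*last_token + 1) })
    }
}

/// Waker that does nothing; sufficient for futures that never suspend on I/O.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal polling driver for the sketch (a real program would use an async runtime).
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Callers normally just write `executor.decode(&input).await`; the boxed form only matters when reading the generated signatures.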

fn capabilities(&self) -> ExecutorCapabilities

Get executor capabilities

fn status(&self) -> ExecutorStatus

Get current executor status

fn kv_capacity(&self) -> Option<usize>

Per-request KV capacity in tokens when the executor owns a smaller runtime cache window than the model’s declared context length.
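One way a caller might combine this value with the declared context length is sketched below. The helper names are hypothetical, and treating `None` as "no smaller runtime window" is an assumption based on the description above.

```rust
/// Effective per-request token budget: the smaller of the model's declared
/// context length and the executor's runtime KV window, when one is reported.
pub fn effective_context(declared_context: usize, kv_capacity: Option<usize>) -> usize {
    match kv_capacity {
        Some(cap) => declared_context.min(cap),
        // Assumption: `None` means the executor imposes no smaller window.
        None => declared_context,
    }
}

/// Whether a request of `prompt_tokens + max_new_tokens` fits in that budget.
pub fn fits(
    prompt_tokens: usize,
    max_new_tokens: usize,
    declared_context: usize,
    kv_capacity: Option<usize>,
) -> bool {
    prompt_tokens.saturating_add(max_new_tokens)
        <= effective_context(declared_context, kv_capacity)
}
```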

fn batch_prefill<'life0, 'life1, 'async_trait>( &'life0 self, inputs: &'life1 [PrefillInput], ) -> Pin<Box<dyn Future<Output = Result<Vec<PrefillOutput>, FerrumError>> + Send + 'async_trait>>
where 'life0: 'async_trait, 'life1: 'async_trait, Self: 'async_trait,

Batch prefill: process multiple prompts’ prefill in ONE forward pass.

fn batch_decode<'life0, 'life1, 'async_trait>( &'life0 self, inputs: &'life1 [DecodeInput], ) -> Pin<Box<dyn Future<Output = Result<Vec<DecodeOutput>, FerrumError>> + Send + 'async_trait>>
where 'life0: 'async_trait, 'life1: 'async_trait, Self: 'async_trait,

Batch decode: process multiple sequences in one forward pass.

fn unified_decode<'life0, 'life1, 'async_trait>( &'life0 self, _batch: &'life1 UnifiedBatch, ) -> Pin<Box<dyn Future<Output = Result<Vec<Option<Vec<f32>>>, FerrumError>> + Send + 'async_trait>>
where 'life0: 'async_trait, 'life1: 'async_trait, Self: 'async_trait,

Unified mixed-batch forward: process a UnifiedBatch containing any combination of prefill chunks (one or more q_tokens per item, possibly continuing from pos_offset > 0) and decode steps (q_tokens.len() == 1, is_final_chunk = true) in a single model forward pass.
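The item shapes described above can be sketched with a hypothetical `BatchItem` struct. The field names `q_tokens`, `pos_offset`, and `is_final_chunk` come from the description, but the struct itself, the `classify` helper, and the rule that only final chunks yield a logits row are assumptions, not the crate's `UnifiedBatch` definition.

```rust
/// Hypothetical batch item; the real `UnifiedBatch` layout may differ.
#[derive(Debug, Clone)]
pub struct BatchItem {
    pub q_tokens: Vec<u32>,
    pub pos_offset: usize,
    pub is_final_chunk: bool,
}

#[derive(Debug, PartialEq)]
pub enum ItemKind {
    PrefillChunk,
    DecodeStep,
}

/// A decode step is a single query token marked as final; everything else
/// is treated as a prefill chunk.
pub fn classify(item: &BatchItem) -> ItemKind {
    if item.q_tokens.len() == 1 && item.is_final_chunk {
        ItemKind::DecodeStep
    } else {
        ItemKind::PrefillChunk
    }
}

/// Stub forward: one entry per item. Assumption: only final chunks produce a
/// logits row (`Some`), while intermediate prefill chunks produce `None`.
pub fn stub_unified_forward(items: &[BatchItem], vocab_size: usize) -> Vec<Option<Vec<f32>>> {
    items
        .iter()
        .map(|item| {
            if item.is_final_chunk {
                Some(vec![0.0; vocab_size])
            } else {
                None
            }
        })
        .collect()
}
```

The `Vec<Option<Vec<f32>>>` shape matches the return type in the signature; what `None` means in the real trait is not stated on this page.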

fn forward<'life0, 'life1, 'async_trait>( &'life0 self, _input: &'life1 Arc<dyn TensorLike>, ) -> Pin<Box<dyn Future<Output = Result<Arc<dyn TensorLike>, FerrumError>> + Send + 'async_trait>>
where 'life0: 'async_trait, 'life1: 'async_trait, Self: 'async_trait,

Optional: full forward pass (for non-autoregressive use cases)

fn truncate_kv<'life0, 'life1, 'async_trait>( &'life0 self, _kv_cache: &'life1 Arc<dyn KvCacheHandle>, _new_len: usize, ) -> Pin<Box<dyn Future<Output = Result<(), FerrumError>> + Send + 'async_trait>>
where 'life0: 'async_trait, 'life1: 'async_trait, Self: 'async_trait,

Roll the KV cache for this executor’s sequence back to new_len. Used by speculative decoding on partial rejection so the next iteration sees a KV prefix that matches the accepted token stream. Default: Ok(()) — executors that don’t cache per-sequence state (stub, mock) are inherently tolerant; real LLM executors override.
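To make the rollback concrete, here is a minimal sketch with a toy cache that stores one token id per cached position. `ToyKvCache` and `rollback_after_rejection` are hypothetical; a real executor would truncate key/value tensors rather than a token list.

```rust
/// Toy per-sequence cache: one entry per cached position.
#[derive(Debug, Default)]
pub struct ToyKvCache {
    cached_tokens: Vec<u32>,
}

impl ToyKvCache {
    /// Appends entries for newly processed tokens.
    pub fn append(&mut self, tokens: &[u32]) {
        self.cached_tokens.extend_from_slice(tokens);
    }

    pub fn len(&self) -> usize {
        self.cached_tokens.len()
    }

    pub fn tokens(&self) -> &[u32] {
        &self.cached_tokens
    }

    /// Drops every entry at position >= `new_len`; a no-op if already shorter.
    pub fn truncate(&mut self, new_len: usize) {
        self.cached_tokens.truncate(new_len);
    }
}

/// After a partial rejection, keep the prefix that existed before drafting
/// plus the accepted draft tokens, and discard the rest.
pub fn rollback_after_rejection(cache: &mut ToyKvCache, len_before_draft: usize, accepted: usize) {
    cache.truncate(len_before_draft + accepted);
}
```

With five prompt tokens cached and four draft tokens appended, accepting two drafts leaves seven entries, so the next forward pass sees a prefix that matches the accepted stream.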

fn forward_verify<'life0, 'life1, 'async_trait>( &'life0 self, inputs: &'life1 [DecodeInput], ) -> Pin<Box<dyn Future<Output = Result<Vec<DecodeOutput>, FerrumError>> + Send + 'async_trait>>
where 'life0: 'async_trait, 'life1: 'async_trait, Self: 'async_trait,

Multi-position decode-verify: one forward over N+1 tokens, producing one logits row per position. Used by speculative decoding’s target path so we don’t pay N+1 sequential forwards.
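How a caller might consume the N+1 logits rows is sketched below using greedy acceptance: a draft token is kept while it equals the target's argmax at that position. This is a simplification chosen for illustration. Speculative decoding is often implemented with probabilistic acceptance instead, and `argmax` and `greedy_verify` are hypothetical helpers, not part of this crate.

```rust
/// Index of the largest value in `row` (first index wins on ties).
pub fn argmax(row: &[f32]) -> usize {
    let mut best = 0;
    for (i, &v) in row.iter().enumerate() {
        if v > row[best] {
            best = i;
        }
    }
    best
}

/// Given N draft tokens and N+1 target logits rows, returns
/// (number of accepted draft tokens, next token chosen by the target).
/// Row `i` is the target's prediction for the position of draft token `i`;
/// the last row supplies a bonus token when every draft is accepted.
pub fn greedy_verify(draft: &[usize], target_rows: &[Vec<f32>]) -> (usize, usize) {
    assert_eq!(
        target_rows.len(),
        draft.len() + 1,
        "expected N+1 logits rows for N draft tokens"
    );
    let mut accepted = 0;
    for (d, row) in draft.iter().zip(target_rows) {
        if argmax(row) != *d {
            break;
        }
        accepted += 1;
    }
    // On a mismatch this is the target's correction; otherwise the bonus token.
    (accepted, argmax(&target_rows[accepted]))
}
```

The accepted count is what a caller would then pass along when rolling the KV cache back to the accepted prefix.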

fn cache_metrics_snapshot(&self) -> Option<Value>

Optional model/executor cache metrics.

fn lora_metrics_snapshot(&self) -> Option<Value>

Optional LoRA runtime metrics.

fn warmup<'life0, 'async_trait>( &'life0 mut self, ) -> Pin<Box<dyn Future<Output = Result<(), FerrumError>> + Send + 'async_trait>>
where 'life0: 'async_trait, Self: 'async_trait,

Warm up executor (load model, allocate memory, etc.)

fn shutdown<'life0, 'async_trait>( &'life0 mut self, ) -> Pin<Box<dyn Future<Output = Result<(), FerrumError>> + Send + 'async_trait>>
where 'life0: 'async_trait, Self: 'async_trait,

Shutdown executor gracefully

fn release_cache(&self, _cache_id: &str)

Release KV cache and state for a completed sequence.
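Because `release_cache` takes `&self`, an executor that does track per-sequence state needs interior mutability to free it. The sketch below shows one way to do that with a `Mutex`-guarded map. `ToyCacheRegistry` is hypothetical, and the underscore-prefixed `_cache_id` parameter in the signature above suggests the stub itself simply ignores the id.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical per-sequence state store keyed by cache id.
#[derive(Default)]
pub struct ToyCacheRegistry {
    // The Mutex provides interior mutability so release can take `&self`.
    caches: Mutex<HashMap<String, Vec<u32>>>,
}

impl ToyCacheRegistry {
    /// Registers state for a running sequence.
    pub fn insert(&self, cache_id: &str, tokens: Vec<u32>) {
        self.caches.lock().unwrap().insert(cache_id.to_string(), tokens);
    }

    /// Frees the state of a completed sequence; unknown ids are ignored.
    pub fn release_cache(&self, cache_id: &str) {
        self.caches.lock().unwrap().remove(cache_id);
    }

    /// Number of sequences that still hold state.
    pub fn live_sequences(&self) -> usize {
        self.caches.lock().unwrap().len()
    }
}
```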

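Auto Trait Implementations

impl !RefUnwindSafe for StubModelExecutor

impl !UnwindSafe for StubModelExecutor

impl Freeze for StubModelExecutor

impl Send for StubModelExecutor

impl Sync for StubModelExecutor

impl Unpin for StubModelExecutor

impl UnsafeUnpin for StubModelExecutor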

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<F, T> IntoSample<T> for F
where T: FromSample<F>,

fn into_sample(self) -> T

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Sized + Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Sized + Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.
