Struct Eagle3Session 

pub struct Eagle3Session { /* private fields */ }

Owned EAGLE-3 draft session.

Dropping the session frees the underlying speculative context.

§Lifetime contract (manual)

The session holds raw pointers to both the target and draft LlamaContexts, so the compiler does not enforce their lifetimes. The caller must keep both contexts alive (i.e. not drop them) for as long as the session exists.
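
One way to satisfy this contract with plain local variables is to declare both contexts before the session, because Rust drops locals in reverse declaration order. The sketch below illustrates only that ordering rule; Tracked and drop_order are hypothetical stand-ins and do not touch the real LlamaContext or Eagle3Session types.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log that records the order in which values are dropped.
type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical stand-in for a context or session; it only records its own drop.
struct Tracked {
    name: &'static str,
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Declares the two "contexts" first and the "session" last, then reports
// the order in which they were dropped at the end of the inner scope.
fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _target = Tracked { name: "target", log: Rc::clone(&log) };
        let _draft = Tracked { name: "draft", log: Rc::clone(&log) };
        let _session = Tracked { name: "session", log: Rc::clone(&log) };
        // Locals are dropped in reverse declaration order: session, draft, target.
    }
    let order = log.borrow().clone();
    order
}
```

If the session and the contexts live in the same struct instead, note that struct fields drop in declaration order, so the session field would need to come before the context fields.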

Implementations§

impl Eagle3Session

pub fn new( target: &LlamaContext<'_>, draft: &LlamaContext<'_>, n_seq: u32, n_draft_max: i32, ) -> Result<Self, Eagle3SessionError>

Construct an EAGLE-3 draft session with upstream defaults for n_min and p_min.

Equivalent to new_with_config(target, draft, Eagle3SessionConfig::new(n_seq, n_draft_max)).

§Errors

Returns Eagle3SessionError::Init or Eagle3SessionError::InvalidConfig.

pub fn new_with_config( target: &LlamaContext<'_>, draft: &LlamaContext<'_>, config: Eagle3SessionConfig, ) -> Result<Self, Eagle3SessionError>

Construct an EAGLE-3 draft session with full speculative draft parameters.

target must be a LlamaContextType::Default context over the main model. draft must be a LlamaContextType::Default context over a separate EAGLE-3 draft model trained against that target.

§Errors

Returns Eagle3SessionError::Init (e.g. the draft model is not a valid EAGLE-3 model) or Eagle3SessionError::InvalidConfig.

pub fn config(&self) -> Eagle3SessionConfig

Session configuration passed at construction.

pub fn need_embd(&self) -> bool

True when the speculative backend needs post-norm embeddings on the target context (llama_set_embeddings).

pub fn need_embd_pre_norm(&self) -> bool

True when the speculative backend needs pre-norm hidden states on the target context (llama_set_embeddings_pre_norm).

The pre-norm setting is configured automatically during session init; callers normally do not need to set it manually.

pub fn n_draft_max(&self) -> i32

Configured maximum number of tokens drafted per draft call.

pub fn n_min(&self) -> i32

Configured minimum draft tokens (n_min).

pub fn p_min(&self) -> f32

Configured draft probability floor (p_min).

pub fn n_seq(&self) -> u32

Configured number of sequences.

pub fn print_stats(&self)

Log speculative-decoding statistics (draft/accept counts and timings) via llama.cpp LOG_INF. Install a log callback with crate::log_set to capture output.

pub fn begin( &mut self, seq_id: i32, prompt: &[LlamaToken], ) -> Result<(), Eagle3SessionError>

Optional: call once at the start of a fresh generation with the prompt tokens that were just decoded into the target context.

§Errors

Returns Eagle3SessionError::BadSeqId if seq_id is out of range.
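
The valid range for seq_id is not spelled out here. Assuming ids are 0-based and bounded by the configured n_seq, a caller-side check might look like the sketch below. seq_id_in_range is a hypothetical helper, not part of this crate, and the real bounds check may differ.

```rust
/// Returns true when `seq_id` falls in `0..n_seq`.
/// Assumption: sequence ids are 0-based and bounded by the session's `n_seq`.
fn seq_id_in_range(seq_id: i32, n_seq: u32) -> bool {
    // Negative ids fail the conversion to u32 and are rejected outright.
    u32::try_from(seq_id).map_or(false, |id| id < n_seq)
}
```

The conversion step matters because seq_id is an i32 while n_seq is a u32; casting a negative id with `as u32` would wrap to a large positive value instead of being rejected.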

pub fn process(&mut self, batch: &LlamaBatch) -> Result<(), Eagle3SessionError>

Hand the session a batch that was just decoded on the target context.

Call this after every successful target.decode(batch) so that upstream can harvest the target hidden states that EAGLE-3 drafts from.

§Errors

Returns Eagle3SessionError::Process if the underlying call fails.

pub fn draft( &mut self, seq_id: i32, n_past: i32, id_last: LlamaToken, ) -> Result<Vec<LlamaToken>, Eagle3SessionError>

Generate up to n_draft_max speculative tokens.

n_past is the number of tokens already in the target KV cache for seq_id. id_last is the last token accepted on the target (usually the token you just sampled).

§Errors

Returns Eagle3SessionError::BadSeqId if seq_id is out of range.
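
A possible way to keep n_past and id_last consistent between draft calls is sketched below. It assumes the verification batch decoded on the target held id_last followed by the drafted tokens, and that KV entries for rejected drafts are removed afterwards; that is a common speculative-decoding loop shape, not something this page confirms. Cursor and advance are hypothetical names, and tokens are modelled as plain i32 rather than LlamaToken.

```rust
/// Hypothetical per-sequence bookkeeping between draft calls.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Cursor {
    /// Tokens already in the target KV cache for this sequence.
    n_past: i32,
    /// Last token accepted on the target; assumed not yet in the KV cache.
    id_last: i32,
}

/// Advances the cursor after one verification step.
/// Assumption: the verified batch was `id_last` followed by the drafts, so the
/// cache grows by one token for `id_last` plus one per accepted draft.
/// `next_token` is the token the target sampled after the accepted prefix.
fn advance(cur: Cursor, n_accepted: u16, next_token: i32) -> Cursor {
    Cursor {
        n_past: cur.n_past + 1 + i32::from(n_accepted),
        id_last: next_token,
    }
}
```

Under these assumptions, a step that accepts three drafts from a cursor at n_past = 10 leaves n_past = 14, and the newly sampled token becomes the id_last for the next draft call.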

pub fn accept( &mut self, seq_id: i32, n_accepted: u16, ) -> Result<(), Eagle3SessionError>

Inform the session how many draft tokens the target verifier accepted.

Pass 0 when every draft was rejected.

§Errors

Returns Eagle3SessionError::BadSeqId if seq_id is out of range.
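
How n_accepted is computed is left to the caller. A minimal sketch, assuming greedy verification in which a draft token is accepted only if it equals the token the target picks at the same position, is shown below. count_accepted is a hypothetical helper, and tokens are modelled as plain i32 rather than LlamaToken.

```rust
/// Counts how many leading draft tokens match the target's choices.
/// Assumption: greedy verification, where acceptance stops at the first mismatch.
fn count_accepted(draft: &[i32], target: &[i32]) -> u16 {
    let n = draft
        .iter()
        .zip(target)
        .take_while(|(d, t)| d == t)
        .count();
    // The accept signature takes a u16, so saturate rather than wrap.
    u16::try_from(n).unwrap_or(u16::MAX)
}
```

The result is 0 when the first draft token already disagrees with the target, which matches the "every draft was rejected" case described above.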

Trait Implementations§

impl Debug for Eagle3Session

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Drop for Eagle3Session

fn drop(&mut self)

Executes the destructor for this type.

fn pin_drop(self: Pin<&mut Self>)

🔬This is a nightly-only experimental API. (pin_ergonomics)

Execute the destructor for this type, but different to Drop::drop, it requires self to be pinned.

impl Send for Eagle3Session

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.