Module eagle


Safe wrapper around the C++ EAGLE-3 draft session.

Eagle3Session drives EAGLE-3 speculative decoding (COMMON_SPECULATIVE_TYPE_DRAFT_EAGLE3 in upstream llama.cpp). EAGLE-3 pairs a target model with a small, separately trained EAGLE-3 draft model that predicts the next tokens from hidden states extracted from the target model.

The draft algorithm lives in upstream’s common/speculative.cpp (common_speculative_impl_draft_eagle3). This module wraps it through the same stable C shim used for MTP (llama-cpp-sys-4/mtp_shim/); the two techniques share an identical session lifecycle and differ only in how the draft context is built.

§EAGLE-3 vs MTP

| | EAGLE-3 (Eagle3Session) | MTP (crate::mtp::MtpSession) |
| --- | --- | --- |
| Draft weights | a separate EAGLE-3 draft model | the same model as the target |
| Draft context type | LlamaContextType::Default | LlamaContextType::Mtp |
| Requirement | draft model must expose 3 target-extract layers | target model must have MTP heads |

§Setup

use llama_cpp_4::context::params::LlamaContextParams;
use llama_cpp_4::eagle::{Eagle3Session, Eagle3SessionConfig};

let n_draft_max = 3;

// Target: the main model, a normal (default) context.
let target = main_model.new_context(&backend, LlamaContextParams::default())?;

// Draft: a SEPARATE EAGLE-3 draft model, also a default context.
let draft = eagle3_model.new_context(&backend, LlamaContextParams::default())?;

let config = Eagle3SessionConfig::new(1, n_draft_max);
let mut session = Eagle3Session::new_with_config(&target, &draft, config)?;

§Speculative loop

The loop has the same shape as MTP. After each decode on the target context, call process, then call draft to get candidate tokens. Verify the candidates on the target, then report how many were accepted with accept.

target.decode(&mut batch)?;
session.process(&batch)?;
let drafts = session.draft(0, n_past, last_token)?;
// verify `drafts` against the target, count acceptances ...
session.accept(0, n_accepted)?;
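
The verification step elided in the comment above is left to the caller. As a minimal sketch, assuming greedy (argmax) verification and tokens represented as plain i32 values, the acceptance count is the length of the longest prefix on which the draft tokens agree with the target's own predictions. The helper name count_accepted and the target_preds slice are hypothetical and not part of this crate; how target_preds is obtained from the target context is not shown.

```rust
/// Hypothetical helper (not part of this crate): greedy verification.
///
/// `drafts[i]` is the i-th token proposed by the draft session.
/// `target_preds[i]` is the token the target model itself selects at the
/// position where `drafts[i]` was proposed.
///
/// Returns how many leading draft tokens the target agrees with. Counting
/// stops at the first mismatch, because every later draft token was
/// conditioned on a token the target rejected.
fn count_accepted(drafts: &[i32], target_preds: &[i32]) -> usize {
    drafts
        .iter()
        .zip(target_preds.iter())
        .take_while(|(draft, pred)| draft == pred)
        .count()
}
```

Under these assumptions the result would be the n_accepted value passed to accept. Sampling-based verification schemes use a different acceptance rule and are not covered by this sketch.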

§Hidden-state extraction

EAGLE-3 needs the target model to expose internal hidden states. The session configures the required extraction on both contexts at construction time; need_embd and need_embd_pre_norm report which kind the active backend requested (rarely needed by callers).

Structs§

Eagle3Session
Owned EAGLE-3 draft session.
Eagle3SessionConfig
Parameters for Eagle3Session::new_with_config.

Enums§

Eagle3SessionError
Errors raised by the EAGLE-3 draft session.