pub trait Backend: Send + Sync {
    // Required methods
    fn name(&self) -> &str;
    fn ready(&self) -> bool;
    fn generate<'life0, 'async_trait>(
        &'life0 self,
        req: Resolved,
    ) -> Pin<Box<dyn Future<Output = Result<TokenStream, GenerateError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;

    // Provided methods
    fn capabilities(&self) -> BackendCapabilities { ... }
    fn generate_v2<'life0, 'async_trait>(
        &'life0 self,
        _req: ResolvedV2,
    ) -> Pin<Box<dyn Future<Output = Result<TokenStreamV2, GenerateError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait { ... }
    fn embed<'life0, 'async_trait>(
        &'life0 self,
        _req: EmbedResolved,
    ) -> Pin<Box<dyn Future<Output = Result<EmbedResult, EmbedError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait { ... }
    fn stop<'life0, 'async_trait>(
        &'life0 self,
        _timeout: Duration,
    ) -> Pin<Box<dyn Future<Output = Result<(), GenerateError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait { ... }
}
An inference backend.
Implementations are owned by the daemon and shared across requests through
Arc<dyn Backend>. Methods take &self; concurrent invocations of
generate() are serialised by the daemon’s admission queue, not by the
trait.
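As a rough illustration of the sharing model described above, the sketch below clones an Arc<dyn Backend> into several threads and calls a &self method from each. The trait here is a simplified stand-in that reproduces only name() and ready(); EchoBackend and names_from_threads are hypothetical names introduced for the example and are not part of the crate.

```rust
use std::sync::Arc;
use std::thread;

// Simplified stand-in for the real trait: only the two synchronous
// required methods are reproduced here.
pub trait Backend: Send + Sync {
    fn name(&self) -> &str;
    fn ready(&self) -> bool;
}

// Hypothetical backend used only for illustration.
struct EchoBackend;

impl Backend for EchoBackend {
    fn name(&self) -> &str {
        "echo"
    }
    fn ready(&self) -> bool {
        true
    }
}

// Clone the Arc into `n` threads and call a `&self` method from each.
// The `Send + Sync` supertraits are what allow the Arc to cross threads.
pub fn names_from_threads(backend: Arc<dyn Backend>, n: usize) -> Vec<String> {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let b = Arc::clone(&backend);
            thread::spawn(move || b.name().to_string())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Because every method takes &self, an implementation that needs mutable state has to use interior mutability (atomics, Mutex, and so on) rather than relying on exclusive access.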
Required Methods§
fn name(&self) -> &str
Stable identifier for the backend, e.g. "mock", "llamacpp",
"anthropic". Echoed in Response::Done::backend for diagnostic
purposes (ADR 0007).
fn ready(&self) -> bool
Whether the backend has finished its boot sequence and can serve
requests. The daemon does not create its inference listener until
every registered backend reports true (see THREAT_MODEL.md F-13).
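The readiness gate described above might look roughly like the following. This is a sketch under assumptions: the trait is a cut-down stand-in, SlowBackend and all_ready are hypothetical names, and the daemon's actual listener setup is not shown in this page.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Simplified stand-in for the real trait.
pub trait Backend: Send + Sync {
    fn name(&self) -> &str;
    fn ready(&self) -> bool;
}

// Hypothetical backend whose boot sequence flips a flag when it completes.
pub struct SlowBackend {
    booted: AtomicBool,
}

impl SlowBackend {
    pub fn new() -> Self {
        Self { booted: AtomicBool::new(false) }
    }

    // Called by the (imaginary) boot task once model loading has finished.
    pub fn finish_boot(&self) {
        self.booted.store(true, Ordering::Release);
    }
}

impl Backend for SlowBackend {
    fn name(&self) -> &str {
        "slow"
    }
    fn ready(&self) -> bool {
        self.booted.load(Ordering::Acquire)
    }
}

// Gate: true only when every registered backend reports ready.
// A caller would create the inference listener only after this returns true.
pub fn all_ready(backends: &[Arc<dyn Backend>]) -> bool {
    backends.iter().all(|b| b.ready())
}
```

Since ready() is synchronous and takes &self, an atomic flag is one simple way to publish boot completion without locking.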
fn generate<'life0, 'async_trait>(
    &'life0 self,
    req: Resolved,
) -> Pin<Box<dyn Future<Output = Result<TokenStream, GenerateError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Begin a generation and return a stream of TokenEvent values.
Errors returned here surface as Response::Error before any tokens
reach the client. Errors that occur after the first token has streamed
terminate the stream without a Done.
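The two error paths above can be modelled with a small relay function. Everything in this sketch is an assumption made for illustration: Response, TokenEvent, and GenerateError are simplified stand-ins (only the Done variant's backend field and the Error variant are mentioned in this page), a plain iterator stands in for the asynchronous TokenStream, and any in-stream error is treated as a mid-stream failure.

```rust
// Hypothetical stand-ins for the real types, which this page does not show.
#[derive(Debug, PartialEq)]
pub enum Response {
    Token(String),
    Done { backend: String },
    Error(String),
}

pub type TokenEvent = String;
pub type GenerateError = String;

// `outcome` models the result of generate(): Err means the call failed before
// a stream existed; Ok carries an iterator standing in for the token stream.
pub fn relay<I>(outcome: Result<I, GenerateError>, backend: &str) -> Vec<Response>
where
    I: IntoIterator<Item = Result<TokenEvent, GenerateError>>,
{
    // Failure before streaming: the client sees a single Error response.
    let stream = match outcome {
        Ok(stream) => stream,
        Err(err) => return vec![Response::Error(err)],
    };

    let mut out = Vec::new();
    for event in stream {
        match event {
            Ok(token) => out.push(Response::Token(token)),
            // Failure during streaming: stop here and never emit Done.
            Err(_) => return out,
        }
    }

    // Clean completion: Done echoes the backend name.
    out.push(Response::Done { backend: backend.to_string() });
    out
}
```

A client can therefore treat the absence of a final Done as the signal that a generation was cut short.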
Provided Methods§
fn capabilities(&self) -> BackendCapabilities
Capabilities the backend advertises to the daemon and (via
the admin status surface) to consumers. Default: text-only v1
backend, no v2, no multimodal, no tools — matches the v0.1
mock and llamacpp shape so existing implementors compile
unchanged.
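To make the "existing implementors compile unchanged" point concrete, here is a minimal sketch of a provided method with an all-false default. The field names v2 and embed are taken from this page; multimodal and tools are guesses based on the prose, and the real BackendCapabilities struct may be shaped differently. LegacyBackend and V2Backend are hypothetical.

```rust
// Assumed shape of the capability struct; `multimodal` and `tools` are
// guessed field names, not confirmed by this page.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct BackendCapabilities {
    pub v2: bool,
    pub multimodal: bool,
    pub tools: bool,
    pub embed: bool,
}

// Simplified stand-in for the real trait.
pub trait Backend: Send + Sync {
    fn name(&self) -> &str;

    // Provided method: text-only v1 unless the implementor overrides it.
    fn capabilities(&self) -> BackendCapabilities {
        BackendCapabilities::default()
    }
}

// A v0.1-style backend: it never mentions capabilities() and still compiles.
struct LegacyBackend;

impl Backend for LegacyBackend {
    fn name(&self) -> &str {
        "legacy"
    }
}

// A newer backend opts in to v2 and leaves everything else at the default.
struct V2Backend;

impl Backend for V2Backend {
    fn name(&self) -> &str {
        "v2-capable"
    }
    fn capabilities(&self) -> BackendCapabilities {
        BackendCapabilities { v2: true, ..BackendCapabilities::default() }
    }
}
```

Struct update syntax keeps an override forward compatible within the crate: capability flags added later stay at their defaults.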
fn generate_v2<'life0, 'async_trait>(
    &'life0 self,
    _req: ResolvedV2,
) -> Pin<Box<dyn Future<Output = Result<TokenStreamV2, GenerateError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Begin a v2 generation and return a stream of TokenEventV2
values. Default impl returns GenerateError::Internal("v2 not supported by this backend") — adapters opt in by overriding.
The daemon checks capabilities().v2 before calling this on
the v2 path; the default false capability prevents dispatch
from reaching here for non-v2 backends.
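The interaction between the default impl and the capability check can be sketched with the standard library alone. Assumptions are spelled out here and in the comments: supports_v2() stands in for capabilities().v2, String and Vec<String> stand in for ResolvedV2 and TokenStreamV2, the boxed-future signature is written by hand rather than generated by a macro, and block_on is a toy executor. What the daemon does when the capability is false is not documented here, so the sketch simply returns None.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Debug, PartialEq)]
pub enum GenerateError {
    Internal(String),
}

// Stand-in for the real stream type.
pub type TokenStreamV2 = Vec<String>;
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

pub trait Backend: Send + Sync {
    // Stand-in for `capabilities().v2`.
    fn supports_v2(&self) -> bool {
        false
    }

    // Default: reject, mirroring the documented error message.
    fn generate_v2<'a>(&'a self, _req: String) -> BoxFuture<'a, Result<TokenStreamV2, GenerateError>> {
        Box::pin(async {
            Err::<TokenStreamV2, GenerateError>(GenerateError::Internal(
                "v2 not supported by this backend".to_string(),
            ))
        })
    }
}

struct V1Only; // relies on both defaults
impl Backend for V1Only {}

struct V2Echo; // hypothetical adapter that opts in
impl Backend for V2Echo {
    fn supports_v2(&self) -> bool {
        true
    }
    fn generate_v2<'a>(&'a self, req: String) -> BoxFuture<'a, Result<TokenStreamV2, GenerateError>> {
        Box::pin(async move { Ok::<TokenStreamV2, GenerateError>(vec![req]) })
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Toy executor: polls in a loop. Only suitable for futures that do no real I/O.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}

// Check the capability first; None means the call was never dispatched.
pub fn dispatch_v2(backend: &dyn Backend, req: String) -> Option<Result<TokenStreamV2, GenerateError>> {
    if !backend.supports_v2() {
        return None;
    }
    Some(block_on(backend.generate_v2(req)))
}
```

With the guard in place the default error is a second line of defence: it is returned only if something calls generate_v2() without consulting the capability first.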
fn embed<'life0, 'async_trait>(
    &'life0 self,
    _req: EmbedResolved,
) -> Pin<Box<dyn Future<Output = Result<EmbedResult, EmbedError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Compute embeddings for the request’s input strings (per
ADR 0017). Default impl returns EmbedError::Unsupported;
adapters opt in by overriding this method and returning
embed = true from capabilities(). The daemon binds the embed
socket only when the active backend’s capability is true,
so the default impl is a fail-safe that production traffic
reaches only through misconfiguration.
fn stop<'life0, 'async_trait>(
    &'life0 self,
    _timeout: Duration,
) -> Pin<Box<dyn Future<Output = Result<(), GenerateError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Best-effort graceful shutdown. The daemon calls this when it stops; the adapter should release model memory, terminate worker threads, and free any other long-lived resources before the timeout elapses.
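One way an adapter could honour the deadline is to signal its worker and wait for an acknowledgement with a bounded receive. The sketch below is synchronous and uses only std channels; Worker, its spawn() helper, and the String error are hypothetical, and the real method is async and returns GenerateError. Note that a struct holding a Receiver is not Sync, so a real implementor would need a different handle type to satisfy the trait bounds.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError, Sender};
use std::thread;
use std::time::Duration;

// Hypothetical adapter internals: one worker thread plus two channels.
pub struct Worker {
    shutdown_tx: Sender<()>,
    done_rx: Receiver<()>,
}

impl Worker {
    // `cleanup_time` simulates how long releasing resources takes.
    pub fn spawn(cleanup_time: Duration) -> Self {
        let (shutdown_tx, shutdown_rx) = mpsc::channel::<()>();
        let (done_tx, done_rx) = mpsc::channel::<()>();
        thread::spawn(move || {
            // Block until asked to stop (or until the owner is dropped).
            let _ = shutdown_rx.recv();
            // Simulated release of model memory and other resources.
            thread::sleep(cleanup_time);
            // Acknowledge; the send fails harmlessly if nobody is waiting.
            let _ = done_tx.send(());
        });
        Self { shutdown_tx, done_rx }
    }

    // Ask the worker to stop and wait at most `timeout` for confirmation.
    pub fn stop(&self, timeout: Duration) -> Result<(), String> {
        self.shutdown_tx
            .send(())
            .map_err(|_| "worker already gone".to_string())?;
        match self.done_rx.recv_timeout(timeout) {
            Ok(()) => Ok(()),
            Err(RecvTimeoutError::Timeout) => Err("shutdown deadline exceeded".to_string()),
            Err(RecvTimeoutError::Disconnected) => {
                Err("worker exited without confirming".to_string())
            }
        }
    }
}
```

Returning an error on timeout instead of blocking indefinitely keeps the shutdown best-effort: the caller decides whether to wait longer or abandon the worker.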
Dyn Compatibility§
This trait is dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety".