pub trait EmbedBackend: Send + Sync {
    // Required methods
    fn embed_batch(&self, encodings: &[Encoding]) -> Result<Vec<Vec<f32>>>;
    fn supports_clone(&self) -> bool;
    fn clone_backend(&self) -> Box<dyn EmbedBackend>;
    fn is_gpu(&self) -> bool;

    // Provided method
    fn max_tokens(&self) -> usize { ... }
}
Trait for embedding backends.
Implementations must be Send + Sync so they can be moved across threads (e.g.
into a ring-buffer pipeline) and shared by reference between them. The trait
is object-safe, so callers use &dyn EmbedBackend or Box<dyn EmbedBackend>.
§GPU vs CPU scheduling
- CPU backends (is_gpu() == false): cloned per rayon thread via clone_backend.
- GPU backends (is_gpu() == true): use a ring-buffer pipeline with RING_SIZE = 4 for bounded memory.
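The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual scheduler: `Encoding`, `Result`, `Schedule`, `plan_schedule`, and `ToyBackend` are hypothetical stand-ins introduced here, and only `RING_SIZE = 4` and the trait method names come from the documentation. The sketch also assumes that a backend which cannot be cloned is driven through the single-instance pipeline.

```rust
// Hypothetical stand-ins: the crate's real `Encoding` and `Result` types are not shown on this page.
#[derive(Clone, Debug)]
pub struct Encoding {
    pub ids: Vec<u32>,
}
pub type Result<T> = std::result::Result<T, String>;

// Reduced copy of the trait (the provided `max_tokens` method is omitted).
pub trait EmbedBackend: Send + Sync {
    fn embed_batch(&self, encodings: &[Encoding]) -> Result<Vec<Vec<f32>>>;
    fn supports_clone(&self) -> bool;
    fn clone_backend(&self) -> Box<dyn EmbedBackend>;
    fn is_gpu(&self) -> bool;
}

/// Ring size quoted in the documentation.
pub const RING_SIZE: usize = 4;

/// Hypothetical description of how a caller would drive a backend.
pub enum Schedule {
    /// One independent clone per worker thread (CPU path).
    PerThreadClones(Vec<Box<dyn EmbedBackend>>),
    /// A single instance fed through a bounded ring of `slots` batches (GPU path).
    RingBuffer { slots: usize },
}

/// Pick a schedule from the backend's self-reported capabilities.
pub fn plan_schedule(backend: &dyn EmbedBackend, workers: usize) -> Schedule {
    if !backend.is_gpu() && backend.supports_clone() {
        // `supports_clone()` was checked, so `clone_backend()` is safe to call.
        let clones = (0..workers).map(|_| backend.clone_backend()).collect();
        Schedule::PerThreadClones(clones)
    } else {
        // Assumption: anything that is a GPU backend or cannot clone uses one bounded pipeline.
        Schedule::RingBuffer { slots: RING_SIZE }
    }
}

/// Toy backend used only to exercise the sketch.
pub struct ToyBackend {
    pub gpu: bool,
}

impl EmbedBackend for ToyBackend {
    fn embed_batch(&self, encodings: &[Encoding]) -> Result<Vec<Vec<f32>>> {
        // Fake one-dimensional "embedding": the token count of each encoding.
        Ok(encodings.iter().map(|e| vec![e.ids.len() as f32]).collect())
    }
    fn supports_clone(&self) -> bool {
        !self.gpu
    }
    fn clone_backend(&self) -> Box<dyn EmbedBackend> {
        assert!(self.supports_clone(), "clone_backend called on a non-cloneable backend");
        Box::new(ToyBackend { gpu: self.gpu })
    }
    fn is_gpu(&self) -> bool {
        self.gpu
    }
}
```

Under these assumptions, `plan_schedule(&ToyBackend { gpu: false }, 3)` yields three clones, while a backend reporting `is_gpu() == true` yields a ring of `RING_SIZE` slots.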
Required Methods§
fn supports_clone(&self) -> bool
Whether this backend supports cheap cloning for per-thread instances.
CPU backends return true; GPU backends typically return false.
fn clone_backend(&self) -> Box<dyn EmbedBackend>
Create a cheap clone of this backend for per-thread use in rayon.
§Panics
May panic if supports_clone returns
false. Callers must check supports_clone() first.
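One way a caller might honor the "check supports_clone() first" contract is sketched below. It uses `std::thread::scope` rather than rayon to stay dependency-free. `Encoding`, `Result`, `try_clone`, `embed_parallel`, and `ToyBackend` are hypothetical names introduced for illustration, and the sequential fallback for non-cloneable backends is an assumption, not documented behavior.

```rust
use std::thread;

// Hypothetical stand-ins: the crate's real `Encoding` and `Result` types are not shown on this page.
#[derive(Clone, Debug)]
pub struct Encoding {
    pub ids: Vec<u32>,
}
pub type Result<T> = std::result::Result<T, String>;

// Reduced copy of the trait (the provided `max_tokens` method is omitted).
pub trait EmbedBackend: Send + Sync {
    fn embed_batch(&self, encodings: &[Encoding]) -> Result<Vec<Vec<f32>>>;
    fn supports_clone(&self) -> bool;
    fn clone_backend(&self) -> Box<dyn EmbedBackend>;
    fn is_gpu(&self) -> bool;
}

/// Non-panicking wrapper: clone only when the backend says it is supported.
pub fn try_clone(backend: &dyn EmbedBackend) -> Option<Box<dyn EmbedBackend>> {
    if backend.supports_clone() {
        Some(backend.clone_backend())
    } else {
        None
    }
}

/// Embed each chunk on its own thread with its own clone, preserving chunk order.
/// Assumption: backends that cannot clone are simply used sequentially.
pub fn embed_parallel(backend: &dyn EmbedBackend, chunks: &[Vec<Encoding>]) -> Result<Vec<Vec<f32>>> {
    let mut out = Vec::new();
    if !backend.supports_clone() {
        for chunk in chunks {
            out.extend(backend.embed_batch(chunk)?);
        }
        return Ok(out);
    }
    let per_chunk: Vec<Result<Vec<Vec<f32>>>> = thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| {
                // Clone on the spawning thread, then move the clone into the worker.
                let local = backend.clone_backend();
                s.spawn(move || local.embed_batch(chunk))
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    });
    for result in per_chunk {
        out.extend(result?);
    }
    Ok(out)
}

/// Toy backend used only to exercise the sketch.
pub struct ToyBackend {
    pub gpu: bool,
}

impl EmbedBackend for ToyBackend {
    fn embed_batch(&self, encodings: &[Encoding]) -> Result<Vec<Vec<f32>>> {
        // Fake one-dimensional "embedding": the token count of each encoding.
        Ok(encodings.iter().map(|e| vec![e.ids.len() as f32]).collect())
    }
    fn supports_clone(&self) -> bool {
        !self.gpu
    }
    fn clone_backend(&self) -> Box<dyn EmbedBackend> {
        assert!(self.supports_clone(), "clone_backend called on a non-cloneable backend");
        Box::new(ToyBackend { gpu: self.gpu })
    }
    fn is_gpu(&self) -> bool {
        self.gpu
    }
}
```

Wrapping the check in a helper such as `try_clone` keeps the panic path out of ordinary call sites; whether the real crate offers anything similar is not stated here.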
Provided Methods§
fn max_tokens(&self) -> usize
Maximum token count this model supports (position embedding limit).
ClassicBert: 512. ModernBERT: set by the model config. Tokens beyond this
limit are truncated during tokenization.
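The truncation step can be pictured with a small sketch. `truncate_to_limit` is a hypothetical helper, not part of the crate, and it takes the limit as a plain argument where a real caller would presumably pass `backend.max_tokens()`. It simply drops trailing token ids; real tokenizers usually also preserve special tokens such as a final separator, which this sketch ignores.

```rust
/// Hypothetical helper: keep at most `max_tokens` token ids, dropping the tail.
pub fn truncate_to_limit(mut ids: Vec<u32>, max_tokens: usize) -> Vec<u32> {
    // `Vec::truncate` is a no-op when the vector is already short enough.
    ids.truncate(max_tokens);
    ids
}
```

For a model with a 512-token limit, a 600-token input would keep only its first 512 ids, while shorter inputs pass through unchanged.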