pub trait RerankBackend: Send + Sync {
    // Required methods
    fn score_batch(&self, encodings: &[Encoding]) -> Result<Vec<f32>>;
    fn is_gpu(&self) -> bool;

    // Provided methods
    fn max_tokens(&self) -> usize { ... }
    fn name(&self) -> &'static str { ... }
}
Trait for cross-encoder rerank backends.
Parallel to [EmbedBackend], but the forward pass terminates in a
scalar relevance score per pair instead of a pooled vector. Used by
the retrieve-then-rerank pipeline: a bi-encoder ([EmbedBackend])
retrieves top-K cheaply, then RerankBackend re-scores those K
candidates with the cross-encoder’s higher-quality cross-attention
over the concatenated [CLS] query [SEP] doc [SEP] sequence.
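The rerank stage of that pipeline can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `PairEncoding`, the `Rerank` trait, `rerank_top_k`, and `FixedScores` are hypothetical stand-ins introduced here so the example compiles on its own, and `Result<_, String>` replaces the crate's own `Result` alias.

```rust
// Hypothetical stand-in for the crate's `Encoding`; only carries a doc id
// so the example can map scores back to candidates.
struct PairEncoding {
    doc_id: usize,
}

// Hypothetical, reduced version of the trait: only the scoring method.
trait Rerank {
    fn score_batch(&self, pairs: &[PairEncoding]) -> Result<Vec<f32>, String>;
}

/// Re-score the K candidates returned by the bi-encoder stage and return
/// `(doc_id, score)` pairs sorted by descending relevance.
fn rerank_top_k<R: Rerank>(
    reranker: &R,
    candidates: &[PairEncoding],
) -> Result<Vec<(usize, f32)>, String> {
    let scores = reranker.score_batch(candidates)?;
    if scores.len() != candidates.len() {
        return Err(format!(
            "expected {} scores, got {}",
            candidates.len(),
            scores.len()
        ));
    }
    let mut ranked: Vec<(usize, f32)> = candidates
        .iter()
        .map(|c| c.doc_id)
        .zip(scores)
        .collect();
    // `total_cmp` is a total order over f32, so a NaN score cannot panic the sort.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    Ok(ranked)
}

/// Toy backend for illustration only: looks up a fixed score per doc id.
struct FixedScores(Vec<f32>);

impl Rerank for FixedScores {
    fn score_batch(&self, pairs: &[PairEncoding]) -> Result<Vec<f32>, String> {
        pairs
            .iter()
            .map(|p| {
                self.0
                    .get(p.doc_id)
                    .copied()
                    .ok_or_else(|| format!("no score for doc {}", p.doc_id))
            })
            .collect()
    }
}
```

Because the reranker only sees the K candidates, its cost scales with K rather than with the size of the corpus, which is why the cheaper bi-encoder runs first.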
§Why a separate trait
Cross-encoders share BERT’s trunk with bi-encoders, but the head and
pooling differ: bi-encoder = CLS pool + L2-normalize, cross-encoder
= CLS pool + linear(hidden -> 1) + sigmoid. The two return shapes are
incompatible (Vec<Vec<f32>> vs Vec<f32>), so unifying them under
a single trait would force every caller to handle an awkward sum
type. Sibling traits keep both call sites direct.
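To make the shape difference concrete, here is a small sketch of the two heads applied to an already-pooled CLS vector. The function names, the `weight`/`bias` parameters, and the zero-norm fallback are assumptions made for illustration; the actual backends presumably run these steps inside their tensor framework.

```rust
/// Bi-encoder head: L2-normalize the CLS vector. The output keeps the hidden
/// dimensionality, so a batch yields `Vec<Vec<f32>>`.
fn bi_encoder_head(cls: &[f32]) -> Vec<f32> {
    let norm = cls.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        // Assumption for this sketch: leave an all-zero vector unchanged.
        return cls.to_vec();
    }
    cls.iter().map(|x| x / norm).collect()
}

/// Cross-encoder head: linear(hidden -> 1) followed by sigmoid. The output is
/// a single scalar, so a batch yields `Vec<f32>`.
fn cross_encoder_head(cls: &[f32], weight: &[f32], bias: f32) -> f32 {
    let logit = cls.iter().zip(weight).map(|(x, w)| x * w).sum::<f32>() + bias;
    1.0 / (1.0 + (-logit).exp())
}
```

One head returns a vector per input and the other a scalar per pair, which is the incompatibility the sibling traits avoid papering over.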
Required Methods§
fn score_batch(&self, encodings: &[Encoding]) -> Result<Vec<f32>>
Score a batch of pre-tokenized pairs and return one score per
encoding. Scores are sigmoid-activated and lie in [0, 1].
The encoding’s token_type_ids should mark the query side as
0 and the doc side as 1 (standard BERT pair convention); this
is what tokenizers::Tokenizer::encode((query, doc), ..)
produces.
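The expected segment layout can be illustrated without the tokenizer. The helper below is hypothetical (it is not part of this crate or of `tokenizers`) and only reproduces the standard BERT pair convention for `[CLS] query [SEP] doc [SEP]`, where the lengths count tokens without special tokens.

```rust
/// Build `token_type_ids` for `[CLS] query [SEP] doc [SEP]`.
/// `query_len` and `doc_len` exclude the special tokens.
fn pair_type_ids(query_len: usize, doc_len: usize) -> Vec<u32> {
    // Segment 0: [CLS] + query tokens + first [SEP].
    let mut ids = vec![0u32; query_len + 2];
    // Segment 1: doc tokens + final [SEP].
    ids.extend(std::iter::repeat(1u32).take(doc_len + 1));
    ids
}
```

For a two-token query and a three-token doc this yields four `0`s followed by four `1`s. An encoding whose type ids are all `0` was most likely produced by encoding a single concatenated string rather than a `(query, doc)` pair.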
§Errors
Returns an error if tensor construction or the forward pass fails.
Provided Methods§
fn max_tokens(&self) -> usize
Maximum token count this model supports.