

Trait RerankBackend 

pub trait RerankBackend: Send + Sync {
    // Required methods
    fn score_batch(&self, encodings: &[Encoding]) -> Result<Vec<f32>>;
    fn is_gpu(&self) -> bool;

    // Provided methods
    fn max_tokens(&self) -> usize { ... }
    fn name(&self) -> &'static str { ... }
}

Trait for cross-encoder rerank backends.

Parallel to EmbedBackend, but the forward pass ends in one scalar relevance score per (query, doc) pair instead of a pooled vector. It is used by the retrieve-then-rerank pipeline: a bi-encoder (EmbedBackend) cheaply retrieves the top-K candidates, then RerankBackend re-scores those K candidates with a cross-encoder, whose cross-attention over the concatenated [CLS] query [SEP] doc [SEP] sequence gives higher-quality scores.
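The two stages can be sketched in plain Rust. This is a minimal illustration, not this crate's pipeline: `retrieve_top_k` and `rerank` are hypothetical helper names, and the similarity and score slices stand in for the outputs of a real EmbedBackend and of `score_batch`.

```rust
/// Stage 1 (hypothetical helper): keep the `k` document indices with the
/// highest bi-encoder similarity, best first.
fn retrieve_top_k(similarities: &[f32], k: usize) -> Vec<usize> {
    let mut ids: Vec<usize> = (0..similarities.len()).collect();
    ids.sort_by(|&a, &b| similarities[b].total_cmp(&similarities[a]));
    ids.truncate(k);
    ids
}

/// Stage 2 (hypothetical helper): re-order the retrieved candidates by
/// cross-encoder score. `scores[i]` belongs to `candidates[i]`, matching the
/// one-score-per-encoding contract of `score_batch`.
fn rerank(candidates: &[usize], scores: &[f32]) -> Vec<usize> {
    assert_eq!(candidates.len(), scores.len());
    let mut paired: Vec<(usize, f32)> = candidates
        .iter()
        .copied()
        .zip(scores.iter().copied())
        .collect();
    paired.sort_by(|a, b| b.1.total_cmp(&a.1));
    paired.into_iter().map(|(id, _)| id).collect()
}

fn main() {
    // Made-up bi-encoder similarities for four documents.
    let similarities: [f32; 4] = [0.20, 0.90, 0.50, 0.80];
    let top = retrieve_top_k(&similarities, 3);

    // Made-up cross-encoder scores, one per retrieved candidate.
    let cross_scores: [f32; 3] = [0.30, 0.95, 0.10];
    let order = rerank(&top, &cross_scores);
    println!("retrieved {:?}, reranked {:?}", top, order);
}
```

Only the K retrieved candidates reach the second stage, so the cross-encoder cost scales with K rather than with corpus size.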

§Why a separate trait

Cross-encoders share BERT’s trunk with bi-encoders, but what follows the trunk differs: bi-encoder = CLS pool + L2-normalize, cross-encoder = CLS pool + linear(hidden → 1) + sigmoid. The two return shapes are incompatible (Vec<Vec<f32>> vs Vec<f32>), so unifying them under a single trait would force every caller to handle an awkward sum type. Keeping them as sibling traits lets each call site use its concrete return type directly.
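To make the shape difference concrete, here is a rough sketch of the two heads applied to a single pooled CLS vector. The function names, the `weight` and `bias` parameters, and the zero-norm guard are illustrative assumptions; a real backend would run these as tensor operations.

```rust
/// Bi-encoder head (illustrative): L2-normalize the pooled CLS vector.
/// Returns a vector of the same length as the input.
fn bi_encoder_head(cls: &[f32]) -> Vec<f32> {
    let norm = cls.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        // Assumption for this sketch: leave an all-zero vector unchanged.
        return cls.to_vec();
    }
    cls.iter().map(|x| x / norm).collect()
}

/// Cross-encoder head (illustrative): linear(hidden -> 1) followed by a
/// sigmoid. Returns a single score in [0, 1].
fn cross_encoder_head(cls: &[f32], weight: &[f32], bias: f32) -> f32 {
    assert_eq!(cls.len(), weight.len());
    let logit = cls.iter().zip(weight).map(|(x, w)| x * w).sum::<f32>() + bias;
    1.0 / (1.0 + (-logit).exp())
}

fn main() {
    let cls = [3.0_f32, 4.0];
    // One vector per input: batches become Vec<Vec<f32>>.
    println!("embedding = {:?}", bi_encoder_head(&cls));
    // One scalar per input: batches become Vec<f32>.
    println!("score = {}", cross_encoder_head(&cls, &[0.1, -0.2], 0.0));
}
```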

Required Methods§


fn score_batch(&self, encodings: &[Encoding]) -> Result<Vec<f32>>

Score a batch of pre-tokenized pairs and return one score per encoding. Scores are sigmoid-activated and lie in [0, 1].

The encoding’s token_type_ids should mark the query side as 0 and the doc side as 1 (standard BERT pair convention); this is what tokenizers::Tokenizer::encode((query, doc), ..) produces.
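As a rough illustration of that layout, the sketch below builds the segment ids by hand for a `[CLS] query [SEP] doc [SEP]` sequence. `pair_token_type_ids` is a hypothetical helper, and it assumes the BERT-style template in which the final `[SEP]` belongs to the doc segment; in practice the tokenizer produces these ids.

```rust
/// Hypothetical helper: segment ids for `[CLS] query [SEP] doc [SEP]`,
/// given the number of query and doc tokens (special tokens excluded).
fn pair_token_type_ids(query_len: usize, doc_len: usize) -> Vec<u32> {
    // [CLS] + query tokens + first [SEP] all belong to segment 0.
    let mut ids = vec![0u32; query_len + 2];
    // Doc tokens + final [SEP] belong to segment 1.
    ids.extend(std::iter::repeat(1u32).take(doc_len + 1));
    ids
}

fn main() {
    // Two query tokens and three doc tokens.
    println!("{:?}", pair_token_type_ids(2, 3));
}
```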

§Errors

Returns an error if tensor construction or the forward pass fails.


fn is_gpu(&self) -> bool

Whether this backend runs on a GPU.
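For orientation, a toy implementor of the two required methods might look like the sketch below. Everything here is a stand-in so that the example compiles on its own: the `Result` alias, the `Encoding` struct, and the local re-declaration of the trait (required methods only) are assumptions, not this crate's definitions, and `OverlapBackend` scores by token overlap rather than running a model.

```rust
use std::error::Error;

// Stand-in for the crate's Result alias (assumption).
type Result<T> = std::result::Result<T, Box<dyn Error + Send + Sync>>;

// Stand-in for the tokenizer's Encoding type (assumption).
struct Encoding {
    ids: Vec<u32>,
    type_ids: Vec<u32>,
}

// Local copy of the trait with only its required methods, for illustration.
trait RerankBackend: Send + Sync {
    fn score_batch(&self, encodings: &[Encoding]) -> Result<Vec<f32>>;
    fn is_gpu(&self) -> bool;
}

/// Toy backend: the score is the fraction of doc-side tokens (segment 1)
/// that also appear on the query side (segment 0). Not a real cross-encoder,
/// but it honours the contract: one score per encoding, each in [0, 1].
struct OverlapBackend;

impl RerankBackend for OverlapBackend {
    fn score_batch(&self, encodings: &[Encoding]) -> Result<Vec<f32>> {
        encodings
            .iter()
            .map(|enc| -> Result<f32> {
                if enc.ids.len() != enc.type_ids.len() {
                    return Err("ids and type_ids differ in length".into());
                }
                let side = |seg: u32| -> Vec<u32> {
                    enc.ids
                        .iter()
                        .zip(&enc.type_ids)
                        .filter(|(_, t)| **t == seg)
                        .map(|(id, _)| *id)
                        .collect()
                };
                let (query, doc) = (side(0), side(1));
                if doc.is_empty() {
                    return Ok(0.0);
                }
                let hits = doc.iter().filter(|t| query.contains(t)).count();
                Ok(hits as f32 / doc.len() as f32)
            })
            .collect()
    }

    fn is_gpu(&self) -> bool {
        false // pure CPU toy
    }
}

fn main() {
    let backend = OverlapBackend;
    let batch = [Encoding { ids: vec![7, 8, 8, 9], type_ids: vec![0, 0, 1, 1] }];
    match backend.score_batch(&batch) {
        Ok(scores) => println!("scores = {:?}, gpu = {}", scores, backend.is_gpu()),
        Err(e) => eprintln!("scoring failed: {e}"),
    }
}
```

Callers that hold a `&dyn RerankBackend` can use `is_gpu` to pick a batch size or log the device without knowing the concrete backend.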

Provided Methods§


fn max_tokens(&self) -> usize

Maximum token count this model supports.


fn name(&self) -> &'static str

Short human-readable label for this backend.

Implementors§