Enum CrossEncoder 

Source
pub enum CrossEncoder {
    Lexical {
        degraded: bool,
    },
    Neural {
        model: Arc<BertModel>,
        tokenizer: Arc<Tokenizer>,
        classifier_weight: Tensor,
        classifier_bias: Tensor,
        device: Device,
    },
}

Cross-encoder for (query, document) relevance scoring.

Variants§

§

Lexical

Lightweight lexical cross-encoder using term overlap signals.

degraded is true when this variant exists because a configured neural cross-encoder failed to initialise (HF Hub unreachable, model checksum mismatch, etc.) and the runtime fell back. It is false for the originally configured lexical tier (the operator opted in to keyword-tier or smart-tier without cross-encoder reranking).

v0.7.0 R3-S2: the distinction surfaces in the recall response’s meta.reranker_used field as "degraded_lexical" vs "lexical", giving clients (MCP + HTTP) an in-band signal when their reranker has been downgraded. The original G8 fix only emitted a tracing::warn!; closing G8 per the playbook required an in-response field, so the earlier closure claim was overstated.
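As a minimal sketch of that mapping, the snippet below derives a reranker_used label from a simplified stand-in enum. RerankerState and the "neural" label are assumptions made for illustration; only "lexical" and "degraded_lexical" come from the description above.

```rust
/// Stand-in for the states described on this page; the real `Neural`
/// variant carries model handles that are omitted here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RerankerState {
    Lexical { degraded: bool },
    Neural,
}

/// Map the state to a `meta.reranker_used` label.
/// "lexical" and "degraded_lexical" are quoted from the docs above;
/// "neural" is an assumed label for illustration only.
pub fn reranker_used(state: RerankerState) -> &'static str {
    match state {
        RerankerState::Lexical { degraded: true } => "degraded_lexical",
        RerankerState::Lexical { degraded: false } => "lexical",
        RerankerState::Neural => "neural",
    }
}
```

A client can branch on the returned string to detect a downgrade without reading server logs.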

Fields

§degraded: bool
§

Neural

Neural BERT-based cross-encoder (ms-marco-MiniLM-L-6-v2).

v0.7.0 #1084 — model is Arc<BertModel> (no mutex), same pattern as Embedder::Local. The pre-#1084 design held an Arc<Mutex<BertModel>> and locked across the full neural rerank forward pass, serialising every rerank-tier recall on a single global mutex. Candle’s BertModel::forward takes &self (inference-only; weights are read-only) so the mutex was unnecessary.
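The sharing pattern can be sketched with a toy model. ToyModel and parallel_forward are hypothetical names, not part of this crate or of Candle. The point is only that a forward method taking &self can be called concurrently through a plain Arc, with no Mutex.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for an inference-only model: `forward` takes
/// `&self` and only reads the weights.
pub struct ToyModel {
    pub weights: Vec<f32>,
}

impl ToyModel {
    /// Dot product of the weights with the input.
    pub fn forward(&self, input: &[f32]) -> f32 {
        self.weights.iter().zip(input).map(|(w, x)| w * x).sum()
    }
}

/// Run one forward pass per input on its own thread, sharing the model
/// through a plain `Arc` (no `Mutex`); outputs come back in input order.
pub fn parallel_forward(model: Arc<ToyModel>, inputs: Vec<Vec<f32>>) -> Vec<f32> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            let model = Arc::clone(&model);
            thread::spawn(move || model.forward(&input))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Because no lock is taken, the threads never wait on one another; with a Mutex around the model, each forward pass would run strictly one at a time.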

Fields

§model: Arc<BertModel>
§tokenizer: Arc<Tokenizer>
§classifier_weight: Tensor
§classifier_bias: Tensor
§device: Device

Implementations§

Source§

impl CrossEncoder

Source

pub fn new() -> Self

Create a new lexical cross-encoder (no model download required).

This is the “originally lexical” path — the operator either chose keyword-/semantic-tier (no cross-encoder reranking) or explicitly opted into the lexical variant. Use Self::new_neural to attempt the neural path with fall-back-to-lexical semantics.

Source

pub fn new_neural() -> Self

Create a neural cross-encoder by downloading ms-marco-MiniLM-L-6-v2.

Falls back to lexical if the download or model load fails. The fallback is marked degraded: true so the recall response surfaces reranker_used = "degraded_lexical" per R3-S2. v0.7.0 promises this in-band signal; before R3 the fallback only emitted a tracing::warn!, and a tracing event is not a per-response field that operators can branch on.

v0.6.3.1 (P3, G8): when the neural path fails (e.g. HF Hub unreachable, model checksum mismatch), the constructor emits a structured tracing event, reranker.fallback, so operators can see the otherwise silent neural→lexical degrade. The eprintln! output remains for backward-compatible startup logs.
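The fall-back-to-lexical shape might look like the sketch below. Encoder, neural_or_fallback, and the loader closure are hypothetical stand-ins; the real constructor takes no arguments and downloads the model itself.

```rust
/// Stand-in for the enum on this page; `Neural` holds a label instead of
/// real model handles.
#[derive(Debug, PartialEq)]
pub enum Encoder {
    Lexical { degraded: bool },
    Neural { model_name: String },
}

/// Build an encoder from a fallible loader. On failure, report the reason
/// and fall back to the lexical variant marked `degraded: true`.
pub fn neural_or_fallback<F>(load: F) -> Encoder
where
    F: FnOnce() -> Result<String, String>,
{
    match load() {
        Ok(model_name) => Encoder::Neural { model_name },
        Err(reason) => {
            // The real constructor emits a `reranker.fallback` tracing
            // event; plain stderr output stands in for it here.
            eprintln!("reranker.fallback: {}", reason);
            Encoder::Lexical { degraded: true }
        }
    }
}
```

The constructor still returns Self rather than Result, so callers never handle a load error directly; the degraded flag is the only trace of the failure that travels with the value.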

Source

pub fn score(&self, query: &str, title: &str, content: &str) -> f32

Score a single (query, document) pair.

Returns a relevance score in 0.0..=1.0.
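To illustrate what a term-overlap score bounded to 0.0..=1.0 can look like, here is a small sketch. It is not the crate's scoring formula, which this page does not spell out; overlap_score and terms are hypothetical helpers that simply measure the fraction of distinct query terms found in the title or content.

```rust
use std::collections::HashSet;

/// Lowercase a string and split it into distinct alphanumeric terms.
fn terms(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Fraction of distinct query terms that appear in the title or content.
/// Illustrative only: this is NOT the crate's scoring formula.
pub fn overlap_score(query: &str, title: &str, content: &str) -> f32 {
    let query_terms = terms(query);
    if query_terms.is_empty() {
        return 0.0;
    }
    let mut doc_terms = terms(title);
    doc_terms.extend(terms(content));
    let hits = query_terms.intersection(&doc_terms).count();
    hits as f32 / query_terms.len() as f32
}
```

Dividing by the number of query terms keeps the result inside the documented range regardless of document length.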

Source

pub fn is_neural(&self) -> bool

Whether this is a neural cross-encoder.

Source

pub fn is_degraded_lexical(&self) -> bool

v0.7.0 R3-S2: whether this cross-encoder is a degraded lexical fallback (i.e., a neural variant was attempted at startup or mid-flight and the runtime fell back). Returns false for Neural and for the originally configured Lexical (operator opted into keyword-/semantic-tier without cross-encoder reranking). The recall response surfaces this distinction as meta.reranker_used = "degraded_lexical" so clients can detect the silent downgrade in-band. This is what makes good on the G8 closure claim, which tracing-event-only signalling had overstated.

Source

pub fn rerank( &self, query: &str, candidates: Vec<(Memory, f64)>, ) -> Vec<(Memory, f64)>

Rerank a set of candidates by blending their original scores with cross-encoder scores.

Blend formula: final = 0.6 * original + 0.4 * cross_encoder
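A worked instance of the blend, as a sketch (the function and constant names are illustrative, not the crate's):

```rust
/// Weights quoted in the blend formula above.
const ORIGINAL_WEIGHT: f64 = 0.6;
const CROSS_ENCODER_WEIGHT: f64 = 0.4;

/// final = 0.6 * original + 0.4 * cross_encoder
pub fn blend(original: f64, cross_encoder: f64) -> f64 {
    ORIGINAL_WEIGHT * original + CROSS_ENCODER_WEIGHT * cross_encoder
}
```

With these weights, a candidate that arrives at 0.9 but gets a cross-encoder score of 0.2 ends at about 0.62, while one that arrives at 0.7 with a cross-encoder score of 0.9 ends at about 0.78 and overtakes it.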

#1597 pool cap: only the strongest RERANK_POOL_MAX candidates by incoming blended score are cross-encoded; the remainder keep their blended scores and rank below the reranked head (head sorted by final_score descending, tail sorted by blended score descending — no candidate is dropped). A pool at or under the cap is fully reranked and returned sorted by final_score descending, as before.
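The head/tail behaviour of the pool cap can be sketched as follows. rerank_capped is a hypothetical helper: pool_max stands in for RERANK_POOL_MAX (whose value is not given here) and cross_score stands in for the cross-encoder.

```rust
/// Cross-encode only the strongest `pool_max` candidates and keep the rest
/// below them. `pool_max` and `cross_score` are illustrative stand-ins.
pub fn rerank_capped<T>(
    mut candidates: Vec<(T, f64)>,
    pool_max: usize,
    cross_score: impl Fn(&T) -> f64,
) -> Vec<(T, f64)> {
    // Strongest incoming scores first.
    candidates.sort_by(|a, b| b.1.total_cmp(&a.1));
    // Everything past the cap becomes the tail and is not cross-encoded.
    let tail = candidates.split_off(pool_max.min(candidates.len()));
    // Blend the head, then sort it by final score, descending.
    let mut head: Vec<(T, f64)> = candidates
        .into_iter()
        .map(|(item, original)| {
            let ce = cross_score(&item);
            (item, 0.6 * original + 0.4 * ce)
        })
        .collect();
    head.sort_by(|a, b| b.1.total_cmp(&a.1));
    // The tail is already sorted descending; append it after the head.
    head.extend(tail);
    head
}
```

Note that a tail candidate can carry a higher number than a head candidate and still rank below it, because the tail is appended rather than merged; the output length always equals the input length.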

v0.7.0 L2-8 contract: the bare rerank is the pre-L2-8 behavior — no reflection boost is applied. Daemons that want the reflection-aware boost must call Self::rerank_with_reflection_boost (which is what BatchedReranker does by default with ReflectionBoostConfig::default). Keeping the bare method boost-free is a deliberate regression-pin discipline: the L2-8 recall test for boost = 1.0 uses rerank_with_reflection_boost(.., &ReflectionBoostConfig::disabled()) and asserts byte-identical output to rerank(..).

Source

pub fn rerank_with_reflection_boost( &self, query: &str, candidates: Vec<(Memory, f64)>, boost_config: &ReflectionBoostConfig, ) -> Vec<(Memory, f64)>

v0.7.0 L2-8 — rerank with a post-step reflection-aware boost.

  1. Same blend as Self::rerank (0.6 * original + 0.4 * ce).
  2. After the blend, multiply each candidate’s final_score by ReflectionBoostConfig::factor_for. Observations get a multiplier of 1.0 (unchanged); reflections get boost * (1.0 + per_depth_increment * clamp(depth, 0..=cap)).
  3. Sort descending after the boost so the output ordering reflects the post-boost ranking.
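Steps 2 and 3 can be sketched as below. BoostParams, Kind, and their fields are assumed shapes chosen to mirror the formula; they are not the crate's ReflectionBoostConfig or Memory types.

```rust
/// Hypothetical mirror of the boost parameters named in the formula.
pub struct BoostParams {
    pub boost: f64,
    pub per_depth_increment: f64,
    pub depth_cap: u32,
}

/// Assumed classification of a candidate.
pub enum Kind {
    Observation,
    Reflection { depth: u32 },
}

/// Observations: 1.0. Reflections:
/// boost * (1.0 + per_depth_increment * clamp(depth, 0..=cap)).
pub fn factor_for(params: &BoostParams, kind: &Kind) -> f64 {
    match kind {
        Kind::Observation => 1.0,
        Kind::Reflection { depth } => {
            let capped = (*depth).min(params.depth_cap);
            params.boost * (1.0 + params.per_depth_increment * f64::from(capped))
        }
    }
}

/// Step 2 (scale each blended score) followed by step 3 (sort descending).
pub fn apply_boost(mut scored: Vec<(Kind, f64)>, params: &BoostParams) -> Vec<(Kind, f64)> {
    for (kind, score) in scored.iter_mut() {
        *score *= factor_for(params, kind);
    }
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

With boost = 1.0 and per_depth_increment = 0.0 every factor is exactly 1.0, which is the arithmetic reason a disabled boost can reproduce the bare rerank output.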

Operationally this means: a reflection that the cross-encoder scored at parity with its source observations moves up; the movement is bounded (capped per-depth multiplier, single global boost factor) so a mediocre reflection cannot leapfrog a well-matched observation. The boost is a thumb-on-the-scale, not a free pass.

#1597 pool cap + batched forward pass. Only the strongest RERANK_POOL_MAX candidates by incoming blended score receive a cross-encoder score (in one batched forward pass on the Neural variant); the remainder keep their blended scores, internally sorted descending, appended after the reranked head. No candidate is ever dropped. A pool at or under the cap degenerates to the historical full rerank.

Source

pub fn rerank_batch( &self, queries: Vec<(String, Vec<(Memory, f64)>)>, ) -> Vec<Vec<(Memory, f64)>>

v0.7 G9 — batched rerank for concurrent recall.

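Process all (query, candidates) jobs in a single tokenize + single forward pass on the Neural variant, instead of one pass per (query, candidate) pair. When this method was introduced, that also meant holding the BERT mutex once for the whole batch; since #1084 the model is a plain Arc<BertModel> and there is no mutex to hold.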

Throughput target: ~3× for parallel recall vs. per-query rerank() calls.

Output ordering: result[i] corresponds to queries[i]. Each inner vector is sorted by descending blended score, identical to rerank(). Lexical variant delegates per-query (no batching win since lexical scoring is already CPU-trivial).
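The flatten, score once, regroup shape behind that ordering guarantee can be sketched as follows. rerank_batched and score_pairs are hypothetical: score_pairs stands in for one batched forward pass, and the pool cap is left out for brevity.

```rust
/// Score every (query, candidate) pair from all jobs with ONE call to
/// `score_pairs`, then regroup so that `result[i]` corresponds to `jobs[i]`.
pub fn rerank_batched<T>(
    jobs: Vec<(String, Vec<(T, f64)>)>,
    score_pairs: impl Fn(&[(&str, &T)]) -> Vec<f64>,
) -> Vec<Vec<(T, f64)>> {
    // Flatten all jobs into a single list of pairs, in job order.
    let pairs: Vec<(&str, &T)> = jobs
        .iter()
        .flat_map(|(query, cands)| cands.iter().map(move |(item, _)| (query.as_str(), item)))
        .collect();
    // One batched scoring call for everything.
    let scores = score_pairs(&pairs);
    assert_eq!(scores.len(), pairs.len(), "one score per pair");
    drop(pairs);
    // Hand the scores back out in the same order they were flattened.
    let mut scores = scores.into_iter();
    jobs.into_iter()
        .map(|(_, cands)| {
            let mut out: Vec<(T, f64)> = cands
                .into_iter()
                .map(|(item, original)| {
                    let ce = scores.next().expect("length checked above");
                    (item, 0.6 * original + 0.4 * ce)
                })
                .collect();
            out.sort_by(|a, b| b.1.total_cmp(&a.1));
            out
        })
        .collect()
}
```

Because the scores are consumed in the same order the pairs were flattened, each job gets exactly its own scores back and the outer vector keeps the input order.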

Source

pub fn rerank_batch_with_reflection_boost( &self, queries: Vec<(String, Vec<(Memory, f64)>)>, boost_config: &ReflectionBoostConfig, ) -> Vec<Vec<(Memory, f64)>>

v0.7.0 L2-8 — batched rerank with a post-step reflection-aware boost applied per candidate. Same boost arithmetic as Self::rerank_with_reflection_boost, factored so the boost shape lives in a single helper.

Trait Implementations§

Source§

impl Default for CrossEncoder

Source§

fn default() -> Self

Returns the “default value” for a type.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> ErasedDestructor for T
where T: 'static,

Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
Source§

impl<T> PolicyExt for T
where T: ?Sized,

Source§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Sized + Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.
Source§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Sized + Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.