pub struct CrossEncoderEngine { /* private fields */ }
Cross-encoder reranking engine.
Thread-safe — shared via Arc. Maintains a pool of independent ONNX sessions
so concurrent rerank calls never contend on a single mutex.
Implementations
impl CrossEncoderEngine
pub async fn new(cache_dir: Option<String>) -> Result<Self>
Load or download the reranker model.
Downloads Xenova/bge-reranker-base ONNX INT8 model from HuggingFace Hub
if not already cached. Builds RERANKER_POOL_SIZE independent sessions.
pub async fn score_pairs( &self, query: &str, passages: &[String], ) -> Result<Vec<f32>>
Score a batch of (query, passage) pairs.
Passages are split into chunks of RERANKER_CHUNK_SIZE and dispatched
in parallel across the session pool (round-robin). Within each chunk,
pairs are processed in mini-batches of RERANKER_ONNX_BATCH_SIZE to
reduce sequence-padding overhead. Chunk results are reassembled in input
order.
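The chunk-and-dispatch scheme described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the engine's actual code: FakeSession, score_chunked, and the two constant values are hypothetical stand-ins, and the "model" simply returns each passage's length so the ordering is easy to check.

```rust
use std::thread;

// Hypothetical values; the crate's real constants are not shown in these docs.
const CHUNK_SIZE: usize = 4; // stands in for RERANKER_CHUNK_SIZE
const ONNX_BATCH_SIZE: usize = 2; // stands in for RERANKER_ONNX_BATCH_SIZE

/// Stand-in for one pooled ONNX session (hypothetical type).
struct FakeSession;

impl FakeSession {
    /// Pretend model call: one score per pair, here just the passage length.
    fn run(&self, _query: &str, batch: &[String]) -> Vec<f32> {
        batch.iter().map(|p| p.len() as f32).collect()
    }

    /// Score one chunk in mini-batches of ONNX_BATCH_SIZE pairs.
    fn score_chunk(&self, query: &str, chunk: &[String]) -> Vec<f32> {
        chunk
            .chunks(ONNX_BATCH_SIZE)
            .flat_map(|batch| self.run(query, batch))
            .collect()
    }
}

/// Split passages into chunks, give chunk i to session i % pool.len(),
/// run the chunks in parallel, and concatenate results in input order.
fn score_chunked(pool: &[FakeSession], query: &str, passages: &[String]) -> Vec<f32> {
    thread::scope(|s| {
        // collect() forces every chunk to be spawned before any join.
        let handles: Vec<_> = passages
            .chunks(CHUNK_SIZE)
            .enumerate()
            .map(|(i, chunk)| {
                let session = &pool[i % pool.len()]; // round-robin assignment
                s.spawn(move || session.score_chunk(query, chunk))
            })
            .collect();
        // Joining in spawn order keeps scores aligned with the input passages.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("scoring worker panicked"))
            .collect()
    })
}
```

Calling score_chunked with a three-session pool and ten passages yields ten scores in the same order as the passages, regardless of which worker finishes first.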
Returns Err(InferenceError::Overloaded) immediately when more than
RERANKER_MAX_CONCURRENT callers are active — the API layer falls back
to unranked results rather than queuing indefinitely (DAK-5893 fix).
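One common way to get this fail-fast behaviour is an atomic counter paired with an RAII guard. The sketch below assumes that approach; the docs do not say how the engine implements it. Limiter, Permit, and try_acquire are hypothetical names, and InferenceError is redeclared locally with only the Overloaded variant.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Local stand-in for the crate's error type (only the variant named in the docs).
#[derive(Debug, PartialEq)]
enum InferenceError {
    Overloaded,
}

/// Hypothetical admission limiter: counts callers currently inside score_pairs.
struct Limiter {
    active: AtomicUsize,
    max_concurrent: usize,
}

/// RAII guard: dropping it releases the caller's slot.
struct Permit<'a> {
    active: &'a AtomicUsize,
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::AcqRel);
    }
}

impl Limiter {
    fn new(max_concurrent: usize) -> Self {
        Limiter { active: AtomicUsize::new(0), max_concurrent }
    }

    /// Admit the caller or fail immediately; never waits for a free slot.
    fn try_acquire(&self) -> Result<Permit<'_>, InferenceError> {
        let previous = self.active.fetch_add(1, Ordering::AcqRel);
        if previous >= self.max_concurrent {
            // Over the limit: undo the increment and reject.
            self.active.fetch_sub(1, Ordering::AcqRel);
            return Err(InferenceError::Overloaded);
        }
        Ok(Permit { active: &self.active })
    }

    /// Number of callers currently holding a permit.
    fn active_requests_count(&self) -> usize {
        self.active.load(Ordering::Acquire)
    }
}
```

A caller would hold the Permit for the duration of the scoring work, and on Err(InferenceError::Overloaded) return its candidates in their original order instead of retrying.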
Returns a relevance score in [0, 1] for each passage.
Higher scores indicate greater relevance to the query.
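Because the returned vector is aligned with the input passages, reranking amounts to sorting passage indices by score. The sketch below shows that step, plus a logistic sigmoid, which is one common way to map a raw cross-encoder logit into the unit interval. Whether the engine uses a sigmoid internally is an assumption, and both function names are hypothetical.

```rust
/// Logistic sigmoid: maps any real-valued logit into (0, 1).
/// Assumption: the docs do not state how the engine normalises scores.
fn sigmoid(logit: f32) -> f32 {
    1.0 / (1.0 + (-logit).exp())
}

/// Indices of `scores`, ordered from most to least relevant.
fn rank_descending(scores: &[f32]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..scores.len()).collect();
    // total_cmp gives a total order on f32, so NaN cannot cause a panic here.
    order.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]));
    order
}
```

Applying rank_descending to the output of score_pairs gives the order in which to present the passages; the scores themselves do not need to be modified.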
pub fn onnx_batch_size(&self) -> usize
Configured ONNX mini-batch size (pairs per session.run() call).
pub fn active_requests_count(&self) -> usize
Current number of active concurrent score_pairs calls.
Used by metrics and health checks (DAK-5893).
pub fn max_concurrent(&self) -> usize
Maximum concurrent score_pairs calls before Overloaded is returned.
Trait Implementations
Auto Trait Implementations
impl !Freeze for CrossEncoderEngine
impl !RefUnwindSafe for CrossEncoderEngine
impl Send for CrossEncoderEngine
impl Sync for CrossEncoderEngine
impl Unpin for CrossEncoderEngine
impl UnsafeUnpin for CrossEncoderEngine
impl !UnwindSafe for CrossEncoderEngine
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.