pub struct BatchedReranker { /* private fields */ }
Concurrent rerank coalescer.
Wraps a CrossEncoder and serializes concurrent recall reranks through
a single worker thread. The worker buffers up to max_batch requests
or waits up to max_wait_ms (whichever first), then issues one
rerank_batch call. The Mutex around the BERT model is held for the
whole batch instead of once per (query, candidate) — the throughput
fix mandated by G9.
Single-request latency: the worker flushes immediately when the
queue is empty after pulling the first job, so a lone request only
pays one recv_timeout(0) round-trip — no artificial waiting.
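The flush rule described above can be sketched with a plain std::sync::mpsc channel. This is a minimal illustration, not the crate's worker: the function name collect_batch and its generic job type are hypothetical, and the exact order of the checks in the real worker is an assumption drawn from the description.

```rust
use std::sync::mpsc::Receiver;
use std::time::{Duration, Instant};

/// Collect one batch from `rx` (hypothetical helper, not part of the crate).
/// Blocks for the first job, polls once with a zero timeout, and only keeps
/// waiting (up to `max_wait`) when that poll found more work queued.
/// Returns `None` once every sender has been dropped.
pub fn collect_batch<T>(
    rx: &Receiver<T>,
    max_batch: usize,
    max_wait: Duration,
) -> Option<Vec<T>> {
    // Block until at least one job arrives, or the channel closes.
    let first = rx.recv().ok()?;
    let mut batch = vec![first];
    let deadline = Instant::now() + max_wait;
    let mut first_poll = true;

    while batch.len() < max_batch {
        // The first poll uses a zero timeout so a lone request flushes at
        // once; later polls wait for whatever is left of the flush window.
        let timeout = if first_poll {
            Duration::ZERO
        } else {
            deadline.saturating_duration_since(Instant::now())
        };
        first_poll = false;
        match rx.recv_timeout(timeout) {
            Ok(job) => batch.push(job),
            // Timeout or disconnect: flush what has been buffered so far.
            Err(_) => break,
        }
    }
    Some(batch)
}
```

A caller would loop on collect_batch and hand each returned Vec to a single batched scoring call, so the model lock is taken once per batch rather than once per job.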
Implementations
impl BatchedReranker
pub fn new(encoder: CrossEncoder) -> Self
Wrap an existing CrossEncoder with the default batching parameters
(max_batch = 32, max_wait_ms = 5).
pub fn with_params(
    encoder: CrossEncoder,
    max_batch: usize,
    max_wait_ms: u64,
) -> Self
Wrap an existing CrossEncoder with custom batching parameters.
pub fn with_reflection_boost(
    encoder: CrossEncoder,
    boost: ReflectionBoostConfig,
) -> Self
v0.7.0 L2-8 — wrap an existing CrossEncoder with a custom
reflection-boost config alongside default batching parameters.
Used by the recall integration tests to pin specific boost shapes
(e.g. disabled() for the regression test).
pub fn with_score_floor(
    encoder: CrossEncoder,
    floor: RerankerScoreFloor,
) -> Self
v0.7.0 #1319 — wrap a CrossEncoder with a post-blend score
floor. The reflection-boost knob is left at the daemon default
(1.2); use Self::with_full_params to set both at once.
Default constructors leave the floor Off — flipping it on
here is an explicit operator-opt-in.
pub fn rerank(
    &self,
    query: &str,
    candidates: Vec<(Memory, f64)>,
) -> Vec<(Memory, f64)>
Submit a single rerank request. Blocks until the result is available.
#1579 B10: auto-select. The wrapper keeps both execution
paths and picks one per call via use_batched_rerank_path:
- Direct (no worker round-trip) when the encoder is
  lexical or degraded-lexical, or when fewer than
  BATCHED_RERANK_MIN_CONCURRENCY requests are in flight (nothing to
  coalesce with). A lexical encoder has no shared-model mutex to
  amortise; criterion benchmarks showed the coalescing flush window
  made the batched path 12× slower at N=8 (~7.6 ms vs ~0.65 ms).
- Coalesced (worker thread, one rerank_batch per flush) for neural
  encoders under real concurrency, which preserves the G9 win
  (~3× at N=8 neural).
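The selection rule in the list above can be sketched as a small predicate. Everything here is a stand-in: the EncoderKind enum, the MIN_CONCURRENCY value of 2, and the function name use_batched_path are assumptions for illustration. The real use_batched_rerank_path and the actual value of BATCHED_RERANK_MIN_CONCURRENCY are not shown in this page.

```rust
/// Hypothetical stand-in for the encoder variants named in the text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EncoderKind {
    Neural,
    Lexical,
    DegradedLexical,
}

/// Illustrative threshold only; the crate's real constant is not stated here.
pub const MIN_CONCURRENCY: usize = 2;

/// Returns true when a call should go through the coalescing worker:
/// only neural encoders benefit, and only when enough requests are in
/// flight for there to be something to coalesce with.
pub fn use_batched_path(kind: EncoderKind, in_flight: usize) -> bool {
    kind == EncoderKind::Neural && in_flight >= MIN_CONCURRENCY
}
```

Under these assumptions a lexical encoder always takes the direct path, whatever the concurrency, which matches the "lexical traffic never reaches the worker" property that worker_submissions is used to test.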
If the worker is unavailable for any reason (channel closed),
falls back to a direct rerank call on the underlying encoder
(with the wrapper’s configured reflection boost applied).
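The closed-channel fallback can be illustrated with std::sync::mpsc, where send returns an error once the receiving side is gone. This is a sketch under assumptions: Job, direct_rerank, and submit_or_fallback are hypothetical names, scores stand in for the real (Memory, f64) pairs, and the reflection boost is omitted.

```rust
use std::sync::mpsc::{channel, Sender};

/// Hypothetical job: a query plus a one-shot reply channel.
pub struct Job {
    pub query: String,
    pub reply: Sender<Vec<f64>>,
}

/// Placeholder for a direct call on the underlying encoder.
pub fn direct_rerank(query: &str) -> Vec<f64> {
    vec![query.len() as f64]
}

/// Submit to the worker; fall back to the direct path if the worker's
/// channel is closed or the worker drops the job without replying.
pub fn submit_or_fallback(worker: &Sender<Job>, query: &str) -> Vec<f64> {
    let (reply_tx, reply_rx) = channel();
    let job = Job { query: query.to_owned(), reply: reply_tx };
    if worker.send(job).is_err() {
        // Receiver dropped: the worker is gone, so score directly.
        return direct_rerank(query);
    }
    // Block for the worker's answer; a dropped reply sender also falls back.
    reply_rx.recv().unwrap_or_else(|_| direct_rerank(query))
}
```

The point of the design is that a dead worker degrades throughput rather than failing the recall: the caller still receives a ranked result.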
pub fn rerank_coalesced(
    &self,
    query: &str,
    candidates: Vec<(Memory, f64)>,
) -> Vec<(Memory, f64)>
#1579 B10 — force the COALESCED (worker) path regardless of the
auto-select. Kept public so the throughput bench
(benches/reranker_throughput.rs) and regression tests can keep
measuring the raw batched machinery after rerank started
auto-selecting away from it at small N. Applies the same
post-blend score floor as Self::rerank.
pub fn worker_submissions(&self) -> usize
#1579 B10 — lifetime count of jobs submitted to the coalescing worker. Observability hook for the auto-select regression tests (“lexical traffic never reaches the worker”) and operator diagnostics.
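A lifetime counter of this kind is commonly backed by an atomic integer. The sketch below assumes that shape; the SubmissionCounter type and its record method are hypothetical, and the crate's actual field layout is not documented here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical counter, bumped once per job handed to the worker.
#[derive(Default)]
pub struct SubmissionCounter {
    submitted: AtomicUsize,
}

impl SubmissionCounter {
    /// Record one submission. Relaxed ordering suffices for a pure tally.
    pub fn record(&self) {
        self.submitted.fetch_add(1, Ordering::Relaxed);
    }

    /// Lifetime count of recorded submissions.
    pub fn worker_submissions(&self) -> usize {
        self.submitted.load(Ordering::Relaxed)
    }
}
```

A regression test in this style would drive lexical traffic through the wrapper and then assert that the count is still zero.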
pub fn score_floor(&self) -> RerankerScoreFloor
v0.7.0 #1319 — accessor for the configured score floor, used by
operator-facing diagnostics. NOTE (n22): the memory_capabilities
envelope does not currently surface this value; wiring the floor
through config and exposing it in capabilities is tracked under
#1319 / n14.
pub fn reflection_boost(&self) -> &ReflectionBoostConfig
v0.7.0 L2-8 — expose the configured boost for the
memory_capabilities reporter.
pub fn encoder(&self) -> &CrossEncoder
Direct access to the wrapped encoder. Useful for callers that want to bypass the coalescer (tests, benchmarks).
pub fn is_neural(&self) -> bool
Convenience shortcut for self.encoder().is_neural(). Most
callers in the recall pipeline only need to check the variant
for capability reporting.
pub fn is_degraded_lexical(&self) -> bool
v0.7.0 R3-S2 — shortcut for self.encoder().is_degraded_lexical().
The recall path reads this to drive the in-band reranker_used
signal exposed via RecallMeta.
Trait Implementations
impl Drop for BatchedReranker
Auto Trait Implementations
impl !Freeze for BatchedReranker
impl !RefUnwindSafe for BatchedReranker
impl !UnwindSafe for BatchedReranker
impl Send for BatchedReranker
impl Sync for BatchedReranker
impl Unpin for BatchedReranker
impl UnsafeUnpin for BatchedReranker
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> ErasedDestructor for T
where
    T: 'static,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.