Skip to main content

PositionalIndexer

Struct PositionalIndexer 

Source
pub struct PositionalIndexer { /* private fields */ }
Expand description

Positional indexer for cache-aware routing.

Uses a single DashMap<(usize, ContentHash), SeqEntry> — keyed by (position, content_hash). Grows unboundedly (no capacity limit). Jump search gives amortized O(D/J + W) matching complexity.

Write-path methods take a caller-owned &mut WorkerBlockMap (one per worker). This gives direct HashMap access (~5ns) instead of DashMap hash+shard locking (~25ns), achieving zero-contention for single-writer-per-worker.

Implementations§

Source§

impl PositionalIndexer

Source

pub fn new(jump_size: usize) -> Self

Create a new PositionalIndexer with the given jump size.

jump_size controls how many positions the search algorithm skips at a time. Larger values reduce lookups on long matching prefixes but increase scan range when workers drain. Default: 64.

Source

pub fn worker_id(&self, worker: &str) -> Option<u32>

Get the internal u32 ID for a worker URL, if it has been interned.

Used by consumers to look up scores in OverlapScores by worker URL. Returns None if the worker has never been seen by this indexer.

Source

pub fn apply_stored( &self, worker_id: u32, blocks: &[StoredBlock], parent_seq_hash: Option<SequenceHash>, worker_blocks: &mut WorkerBlockMap, ) -> Result<(), ApplyError>

Apply a “blocks stored” event for a worker.

worker_id: internal ID from [intern_worker]. blocks: ordered sequence of stored blocks (each with seq_hash + content_hash). parent_seq_hash: if Some, the sequence extends from the parent’s position + 1. If None, the sequence starts from position 0. worker_blocks: caller-owned reverse lookup for this worker.

Source

pub fn apply_removed( &self, worker_id: u32, seq_hashes: &[SequenceHash], worker_blocks: &mut WorkerBlockMap, )

Apply a “blocks removed” event for a worker.

worker_id: internal ID from [intern_worker]. seq_hashes: position-aware block hashes to remove (from proto). worker_blocks: caller-owned reverse lookup for this worker.

Note on orphaned entries: Removing a block at position N does not cascade to blocks at positions > N. Those entries become orphaned — they remain in the index but won’t match queries because the rolling prefix hash chain is broken at the gap. This is harmless: orphaned entries waste a small amount of memory and are cleaned up when the worker is cleared or removed. Backends typically evict from the tail (LRU), so mid-sequence gaps are rare in practice.

Source

pub fn apply_cleared(&self, worker_id: u32, worker_blocks: &mut WorkerBlockMap)

Apply a “cache cleared” event — drain blocks, clean index, caller keeps the empty map.

worker_id: internal ID from [intern_worker]. worker_blocks: caller-owned reverse lookup — drained and left empty.

Source

pub fn remove_worker(&self, worker_id: u32, worker_blocks: WorkerBlockMap)

Remove a worker entirely — takes ownership of blocks, cleans index, worker is gone.

worker_id: internal ID from [intern_worker]. worker_blocks: caller-owned reverse lookup — consumed.

Source

pub fn current_size(&self) -> usize

Get total number of blocks across all workers.

Source

pub fn find_matches( &self, content_hashes: &[ContentHash], early_exit: bool, ) -> OverlapScores

Find overlap scores for a request’s content hash sequence.

Uses jump search: strides by jump_size positions, only scanning intermediate positions when workers drain (stop matching). Complexity: amortized O(D/J + W) where D=depth, J=jump_size, W=workers.

When early_exit is true, returns immediately after finding any match at position 0 (score = 1 for all matching workers). Useful when the caller only needs to know whether any worker has cached data for this sequence.

Assumption: Block sequences are prefix-closed — if a worker has a block at position N, it has blocks at all positions 0..N. This holds when backends evict from the tail (LRU). If apply_removed creates a mid-sequence gap, the rolling prefix hash detects it (the chain breaks at the gap), but the jump heuristic may over-count if it lands past the gap. In practice, backends only evict tail blocks.

Source

pub fn intern_worker(&self, worker: &str) -> u32

Intern a worker URL to an internal u32 ID. Fast path: DashMap shard read (no lock). Slow path: DashMap entry API (once per worker).

Trait Implementations§

Source§

impl Debug for PositionalIndexer

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl Default for PositionalIndexer

Source§

fn default() -> Self

Returns the “default value” for a type. Read more

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more