pub struct PrimaryKeyIndex { /* private fields */ }
Thread-safe primary key deduplication index.
Deduplication on the hot path is synchronous: BloomFilter::may_contain(),
FxHashSet::contains(), and TextDictReader::ordinal() are all sync calls.
Interior mutability for the mutable state (bloom + uncommitted set) is
behind parking_lot::Mutex. The committed data is only mutated via
&mut self methods (commit/abort path), so no lock is needed for it.
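The locking split described above can be sketched roughly as follows. This is an illustration only: the type and field names are hypothetical, std::sync::Mutex stands in for parking_lot::Mutex so the sketch has no dependencies, and a plain HashSet stands in for the bloom filter.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical hot-path state: everything that `&self` methods mutate.
#[derive(Default)]
struct HotState {
    /// Stand-in for the bloom filter; a set keeps the sketch short.
    bloom_stand_in: HashSet<String>,
    uncommitted: HashSet<String>,
}

/// Hypothetical layout, not the crate's actual private fields.
#[derive(Default)]
struct LockLayoutSketch {
    /// Mutated through `&self`, so it sits behind a lock.
    state: Mutex<HotState>,
    /// Mutated only through `&mut self`, so no lock is needed.
    committed: Vec<String>,
}

impl LockLayoutSketch {
    /// Hot path: many threads may call this through a shared reference.
    /// Returns `true` if the key was not yet registered as uncommitted.
    fn register(&self, key: &str) -> bool {
        let mut state = self.state.lock().unwrap();
        state.bloom_stand_in.insert(key.to_owned());
        state.uncommitted.insert(key.to_owned())
    }

    /// Commit path: `&mut self` proves exclusive access, so `committed`
    /// is updated without locking and `get_mut` skips the lock entirely.
    fn commit(&mut self) {
        let state = self.state.get_mut().unwrap();
        self.committed.extend(state.uncommitted.drain());
    }
}
```

The borrow checker enforces the second half of the design: commit() cannot be called while any other thread still holds a shared reference to the index.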
Implementations§
impl PrimaryKeyIndex
pub fn new(
    field: Field,
    pk_data: Vec<PkSegmentData>,
    snapshot: SegmentSnapshot,
) -> Self
Create a new PrimaryKeyIndex by scanning committed segments.
Iterates each segment’s fast-field text dictionary to populate the bloom filter with all existing primary key values. The snapshot keeps ref counts alive so segments aren’t deleted while we hold data.
CPU-intensive: call from spawn_blocking, not on the async runtime.
pub fn from_persisted(
    field: Field,
    bloom: BloomFilter,
    pk_data: Vec<PkSegmentData>,
    new_data: &[PkSegmentData],
    snapshot: SegmentSnapshot,
) -> Self
Create from a pre-loaded bloom filter (loaded from pk_bloom.bin).
Skips dictionary iteration entirely when the persisted bloom covers
all current segments. pk_data contains data for ALL current segments.
If new_data is non-empty, its keys are inserted into the bloom
before returning (incremental update). new_data is a borrowed slice
holding the subset of segments not covered by the persisted bloom.
pub fn bloom_to_bytes(&self) -> Vec<u8>
Serialize the bloom filter for persistence to pk_bloom.bin.
pub fn memory_bytes(&self) -> usize
Memory used by the bloom filter and uncommitted set.
pub fn check_and_insert(&self, doc: &Document) -> Result<()>
Check whether a document’s primary key is unique, and if so, register it.
Returns Ok(()) if the key is new (inserted into bloom + uncommitted set).
Returns Err(DuplicatePrimaryKey) if the key already exists.
Returns Err(Document) if the primary key field is missing or empty.
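A minimal sketch of how such a check could be ordered, assuming the bloom filter is consulted first and exact lookups run only on a bloom hit. Everything here is hypothetical: TinyBloom is a toy filter, PkError only mirrors the two error cases listed above, a BTreeSet stands in for the committed text dictionaries, and the key is passed directly instead of being extracted from a Document.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeSet, HashSet};
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
enum PkError {
    Duplicate,
    MissingKey,
}

/// Toy bloom filter with two hash probes; purely illustrative.
struct TinyBloom {
    bits: Vec<bool>,
}

impl TinyBloom {
    fn new(num_bits: usize) -> Self {
        Self { bits: vec![false; num_bits] }
    }

    /// Two bit positions for a key, derived from two seeded hashes.
    fn slots(&self, key: &str) -> [usize; 2] {
        let mut out = [0usize; 2];
        for (seed, slot) in out.iter_mut().enumerate() {
            let mut hasher = DefaultHasher::new();
            (seed as u64, key).hash(&mut hasher);
            *slot = (hasher.finish() % self.bits.len() as u64) as usize;
        }
        out
    }

    fn insert(&mut self, key: &str) {
        for i in self.slots(key) {
            self.bits[i] = true;
        }
    }

    fn may_contain(&self, key: &str) -> bool {
        self.slots(key).iter().all(|&i| self.bits[i])
    }
}

struct Hot {
    bloom: TinyBloom,
    uncommitted: HashSet<String>,
}

/// Hypothetical stand-in for the index; not the crate's real layout.
struct CheckSketch {
    hot: Mutex<Hot>,
    committed: BTreeSet<String>,
}

impl CheckSketch {
    fn with_committed(keys: &[&str]) -> Self {
        let mut bloom = TinyBloom::new(1024);
        for key in keys {
            bloom.insert(key);
        }
        Self {
            hot: Mutex::new(Hot { bloom, uncommitted: HashSet::new() }),
            committed: keys.iter().map(|k| k.to_string()).collect(),
        }
    }

    fn check_and_insert(&self, key: Option<&str>) -> Result<(), PkError> {
        let key = match key {
            Some(k) if !k.is_empty() => k,
            _ => return Err(PkError::MissingKey),
        };
        let mut hot = self.hot.lock().unwrap();
        // A bloom miss is definitive; only a hit needs the exact lookups.
        if hot.bloom.may_contain(key)
            && (hot.uncommitted.contains(key) || self.committed.contains(key))
        {
            return Err(PkError::Duplicate);
        }
        hot.bloom.insert(key);
        hot.uncommitted.insert(key.to_owned());
        Ok(())
    }
}
```

Because a bloom filter never reports a false negative, a miss lets the common "new key" case skip the exact lookups; a false positive only costs the extra lookups.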
pub fn refresh_incremental(
    &mut self,
    new_data: Vec<PkSegmentData>,
    snapshot: SegmentSnapshot,
)
Refresh after commit: merge new segment data, prune removed segments, insert new keys into bloom, and clear uncommitted set.
Only new_data (segments not already held) needs to be loaded by the
caller. Existing data for segments still in snapshot is retained.
The snapshot keeps ref counts alive so segments aren’t deleted.
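The refresh steps (prune, merge, insert into bloom, clear) might look like the sketch below. The types are assumptions made for illustration: segment data is modeled as a map from segment ID to its keys, the snapshot as a set of live segment IDs, and the bloom filter as a HashSet.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical stand-in for the index state touched by a refresh.
struct RefreshSketch {
    /// Segment ID -> primary keys held for that segment.
    segments: BTreeMap<String, Vec<String>>,
    bloom_stand_in: HashSet<String>,
    uncommitted: HashSet<String>,
}

impl RefreshSketch {
    fn refresh_incremental(
        &mut self,
        new_data: Vec<(String, Vec<String>)>,
        live_ids: &HashSet<String>,
    ) {
        // 1. Prune data for segments that are no longer in the snapshot.
        self.segments.retain(|id, _| live_ids.contains(id));
        // 2. Merge the new segments and add their keys to the bloom.
        for (id, keys) in new_data {
            for key in &keys {
                self.bloom_stand_in.insert(key.clone());
            }
            self.segments.insert(id, keys);
        }
        // 3. Everything uncommitted is now covered by committed data.
        self.uncommitted.clear();
    }

    fn committed_segment_ids(&self) -> impl Iterator<Item = &str> {
        self.segments.keys().map(String::as_str)
    }
}
```

A caller would presumably compare committed_segment_ids() against the new snapshot to decide which segments still need loading into new_data.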
pub fn committed_segment_ids(&self) -> impl Iterator<Item = &str>
Iterator over segment IDs already held in this PK index.
pub fn rollback_uncommitted_key(&self, doc: &Document)
Roll back an uncommitted key registration (e.g. when channel send fails after check_and_insert succeeded). Bloom may retain the key but that only causes harmless false positives, never missed duplicates.
pub fn clear_uncommitted(&mut self)
Clear uncommitted keys (e.g. on abort). Bloom may retain stale entries but that only causes harmless false positives (extra committed-segment lookups), never missed duplicates.
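Why a stale bloom entry is harmless after rollback_uncommitted_key or clear_uncommitted can be shown with a small sketch. It assumes a bloom hit is always confirmed by an exact lookup before a duplicate is reported. The struct is hypothetical, uses an insert-only HashSet in place of the bloom filter, and takes &mut self throughout for brevity (the real rollback method takes &self).

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the index; not the crate's real layout.
#[derive(Default)]
struct RollbackSketch {
    /// Insert-only, like a bloom filter: entries are never removed.
    bloom_stand_in: HashSet<String>,
    uncommitted: HashSet<String>,
    committed: HashSet<String>,
}

impl RollbackSketch {
    /// Returns `true` if the key was new and has been registered.
    fn check_and_insert(&mut self, key: &str) -> bool {
        let exact_hit = self.uncommitted.contains(key) || self.committed.contains(key);
        // A stale bloom entry reaches the exact lookups, which then miss.
        if self.bloom_stand_in.contains(key) && exact_hit {
            return false;
        }
        self.bloom_stand_in.insert(key.to_owned());
        self.uncommitted.insert(key.to_owned());
        true
    }

    /// Only the exact set is rolled back; the bloom keeps the key.
    fn rollback_uncommitted_key(&mut self, key: &str) {
        self.uncommitted.remove(key);
    }

    fn clear_uncommitted(&mut self) {
        self.uncommitted.clear();
    }
}
```

After a rollback the key can be registered again: the stale bloom entry triggers the exact lookups, they find nothing, and the insert proceeds.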
Auto Trait Implementations§
impl !Freeze for PrimaryKeyIndex
impl !RefUnwindSafe for PrimaryKeyIndex
impl Send for PrimaryKeyIndex
impl Sync for PrimaryKeyIndex
impl Unpin for PrimaryKeyIndex
impl UnsafeUnpin for PrimaryKeyIndex
impl !UnwindSafe for PrimaryKeyIndex
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.