pub struct FtsIndex<B>
where
    B: FtsBackend,
{ /* private fields */ }
Full-text search index generic over storage backend.
Provides identical indexing, search, and highlighting logic for Origin (redb), Lite (in-memory), and WASM deployments.
Writes accumulate in an in-memory Memtable. When the memtable
exceeds its threshold, it is flushed to an immutable segment
stored via the backend. Queries merge the active memtable with
all persisted segments.
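The write path described above can be pictured with a small self-contained sketch. Everything in it (`ToyIndex`, `Postings`, the entry-count threshold) is hypothetical and chosen for illustration; the real memtable, segment format, and flush threshold are not shown on this page.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical postings map: term -> set of doc ids.
type Postings = BTreeMap<String, BTreeSet<u32>>;

/// Toy index: one mutable memtable plus immutable flushed segments.
struct ToyIndex {
    memtable: Postings,
    memtable_entries: usize,
    flush_threshold: usize,
    segments: Vec<Postings>,
}

impl ToyIndex {
    fn new(flush_threshold: usize) -> Self {
        Self {
            memtable: Postings::new(),
            memtable_entries: 0,
            flush_threshold,
            segments: Vec::new(),
        }
    }

    /// Record one (term, doc) posting; flush once the threshold is exceeded.
    fn add(&mut self, term: &str, doc: u32) {
        if self.memtable.entry(term.to_string()).or_default().insert(doc) {
            self.memtable_entries += 1;
        }
        if self.memtable_entries > self.flush_threshold {
            // The full memtable becomes an immutable segment.
            let full = std::mem::take(&mut self.memtable);
            self.segments.push(full);
            self.memtable_entries = 0;
        }
    }

    /// Merge postings from every segment and the active memtable.
    fn lookup(&self, term: &str) -> BTreeSet<u32> {
        let mut out = BTreeSet::new();
        for source in self.segments.iter().chain(std::iter::once(&self.memtable)) {
            if let Some(docs) = source.get(term) {
                out.extend(docs.iter().copied());
            }
        }
        out
    }
}
```

In this sketch a lookup always consults both layers, so a document is visible to queries whether or not its postings have been flushed yet.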
An optional MemoryGovernor can be injected via FtsIndex::set_governor
to enforce per-engine memory budgets on large allocations (compaction,
segment merge, query term collection). When no governor is set, allocations
proceed without budget enforcement. This is the correct behaviour for
NodeDB-Lite and WASM deployments, where nodedb-mem is not available.
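The optional-governor pattern can be sketched as follows. `ToyGovernor`, `try_reserve`, and `ToyEngine` are hypothetical stand-ins; the actual MemoryGovernor API is not documented on this page, so this only illustrates the "enforce if present, skip if absent" shape.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a memory governor: a fixed byte budget.
struct ToyGovernor {
    budget: usize,
    used: AtomicUsize,
}

impl ToyGovernor {
    fn new(budget: usize) -> Self {
        Self { budget, used: AtomicUsize::new(0) }
    }

    /// Try to reserve `bytes`; fail without reserving if over budget.
    fn try_reserve(&self, bytes: usize) -> Result<(), String> {
        let mut current = self.used.load(Ordering::Relaxed);
        loop {
            let next = current
                .checked_add(bytes)
                .filter(|n| *n <= self.budget)
                .ok_or_else(|| format!("budget exceeded: {bytes} bytes requested"))?;
            match self.used.compare_exchange_weak(
                current,
                next,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Ok(()),
                Err(actual) => current = actual,
            }
        }
    }
}

/// Toy engine holding an optional governor, as the index does.
struct ToyEngine {
    governor: Option<Arc<ToyGovernor>>,
}

impl ToyEngine {
    /// Large allocation: checked against the budget only when a governor is set.
    fn alloc_buffer(&self, bytes: usize) -> Result<Vec<u8>, String> {
        if let Some(governor) = &self.governor {
            governor.try_reserve(bytes)?;
        }
        Ok(vec![0u8; bytes])
    }
}
```

An engine built with `governor: None` never fails on budget grounds, which mirrors the unset-governor behaviour described above.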
Implementations
impl<B> FtsIndex<B>
where
    B: FtsBackend,
pub fn set_collection_analyzer(
    &self,
    tid: u64,
    collection: &str,
    analyzer_name: &str,
) -> Result<(), <B as FtsBackend>::Error>
Set the analyzer for a collection. Persists to backend metadata.
pub fn set_collection_language(
    &self,
    tid: u64,
    collection: &str,
    lang_code: &str,
) -> Result<(), <B as FtsBackend>::Error>
Set the language for a collection. Persists to backend metadata.
pub fn get_collection_analyzer(
    &self,
    tid: u64,
    collection: &str,
) -> Result<Option<String>, <B as FtsBackend>::Error>
Get the configured analyzer name for a collection.
pub fn get_collection_language(
    &self,
    tid: u64,
    collection: &str,
) -> Result<Option<String>, <B as FtsBackend>::Error>
Get the configured language for a collection.
pub fn analyze_for_collection(
    &self,
    tid: u64,
    collection: &str,
    text: &str,
) -> Result<Vec<String>, <B as FtsBackend>::Error>
Analyze text using the collection’s configured analyzer.
Falls back to the standard English analyzer if no analyzer is configured.
pub fn tokenize_raw_for_collection(
    &self,
    tid: u64,
    collection: &str,
    text: &str,
) -> Result<Vec<String>, <B as FtsBackend>::Error>
Tokenize text WITHOUT stemming for fuzzy matching.
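The difference between the analyzed and the raw token streams can be illustrated with a toy pipeline. The suffix stripper below is a deliberately naive, hypothetical stand-in for a real stemmer, and `tokenize_raw`, `toy_stem`, and `analyze` are illustrative names, not the crate's functions.

```rust
/// Lowercase and split on non-alphanumeric characters; no stemming.
fn tokenize_raw(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Toy stemmer (assumption): strips a trailing "ing" or "s" when at
/// least three characters remain. Real stemmers are far more careful.
fn toy_stem(token: &str) -> String {
    for suffix in ["ing", "s"] {
        if let Some(stem) = token.strip_suffix(suffix) {
            if stem.len() >= 3 {
                return stem.to_string();
            }
        }
    }
    token.to_string()
}

/// Full analysis: raw tokenization followed by stemming.
fn analyze(text: &str) -> Vec<String> {
    tokenize_raw(text).iter().map(|t| toy_stem(t)).collect()
}
```

Fuzzy matching typically compares edit distance between surface forms, which is presumably why an unstemmed token stream is exposed separately: stemming first would distort the distance between the query and the indexed word.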
impl<B> FtsIndex<B>
where
    B: FtsBackend,
pub fn read_fieldnorm(
    &self,
    tid: u64,
    collection: &str,
    doc_id: Surrogate,
) -> Result<Option<u32>, <B as FtsBackend>::Error>
Get the fieldnorm (SmallFloat-encoded doc length) for a doc.
Returns the decoded approximate u32 length, or None if not stored.
pub fn write_fieldnorm(
    &self,
    tid: u64,
    collection: &str,
    doc_id: Surrogate,
    doc_length: u32,
) -> Result<(), <B as FtsBackend>::Error>
Write a fieldnorm byte for a surrogate. Grows the array if needed.
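To show why a fieldnorm read returns only an approximate length, here is a sketch of a lossy one-byte length encoding together with a grow-on-write array. The bit layout is an assumption made for this example (exact below 8, then three mantissa bits per power of two); it is not claimed to match the SmallFloat layout the crate actually uses.

```rust
/// Encode a length into one byte (hypothetical layout).
/// Values below 8 are exact; larger values keep the leading one
/// plus the three bits below it.
fn encode_len(n: u32) -> u8 {
    if n < 8 {
        return n as u8;
    }
    let e = 31 - n.leading_zeros(); // index of the highest set bit, 3..=31
    let m = (n >> (e - 3)) & 0b111; // three bits below the leading one
    (((e - 2) << 3) | m) as u8
}

/// Decode to the smallest length that maps to `code`.
/// Codes above 239 are never produced by `encode_len`.
fn decode_len(code: u8) -> u32 {
    let code = u32::from(code);
    if code < 8 {
        return code;
    }
    let e = (code >> 3) + 2;
    let m = code & 0b111;
    (0b1000 | m) << (e - 3)
}

/// Store the encoded length at index `doc`, growing the array if needed.
fn write_norm(norms: &mut Vec<u8>, doc: u32, len: u32) {
    let i = doc as usize;
    if i >= norms.len() {
        norms.resize(i + 1, 0);
    }
    norms[i] = encode_len(len);
}
```

Under this layout a length of 100 decodes to 96: the encoding is monotonic but rounds down to the nearest representable value, which is acceptable for length normalization.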
impl<B> FtsIndex<B>
where
    B: FtsBackend,
pub fn index_stats(
    &self,
    tid: u64,
    collection: &str,
) -> Result<(u32, f32), <B as FtsBackend>::Error>
Get total document count and average document length for a collection.
Returns (total_docs, avg_doc_len). If the collection is empty,
returns (0, 1.0) to avoid division by zero.
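A minimal sketch of the empty-collection convention, assuming the statistics are derived from a plain slice of document lengths (the real implementation reads them from the backend; `toy_index_stats` and `length_ratio` are hypothetical names):

```rust
/// Returns (total_docs, avg_doc_len); (0, 1.0) for an empty collection.
fn toy_index_stats(doc_lengths: &[u32]) -> (u32, f32) {
    if doc_lengths.is_empty() {
        return (0, 1.0);
    }
    let total: u64 = doc_lengths.iter().map(|&l| u64::from(l)).sum();
    let count = doc_lengths.len();
    (count as u32, total as f32 / count as f32)
}

/// Length ratio as used in BM25-style normalization. With the (0, 1.0)
/// convention the divisor is never zero, so the result stays finite.
fn length_ratio(doc_len: u32, avg_doc_len: f32) -> f32 {
    doc_len as f32 / avg_doc_len
}
```

Returning 1.0 rather than 0.0 for the average means callers can divide by it unconditionally.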
impl<B> FtsIndex<B>
where
    B: FtsBackend,
pub fn put_synonym_group(
    &self,
    tid: u64,
    record: &SynonymGroupRecord,
) -> Result<(), <B as FtsBackend>::Error>
Persist a synonym group. Overwrites any existing group with the same name.
pub fn delete_synonym_group(
    &self,
    tid: u64,
    name: &str,
) -> Result<bool, <B as FtsBackend>::Error>
Delete a synonym group. Returns true if it existed.
pub fn get_synonym_group(
    &self,
    tid: u64,
    name: &str,
) -> Result<Option<SynonymGroupRecord>, <B as FtsBackend>::Error>
Read a single synonym group by name. Returns None if not found or tombstoned.
pub fn list_synonym_groups(
    &self,
    tid: u64,
) -> Result<Vec<SynonymGroupRecord>, <B as FtsBackend>::Error>
List all synonym group records for a tenant.
pub fn build_synonym_map_for_tenant(
    &self,
    _tid: u64,
    all_groups: &[SynonymGroupRecord],
) -> SynonymMap
Build an in-memory SynonymMap from a slice of synonym group records.
Each term in every group maps to all other terms in that group
(bidirectional OR-expansion). Terms are analyzed with the default
analyzer so synonym keys match the stemmed tokens produced at query
time by search_with_mode.
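The bidirectional OR-expansion can be sketched with plain collections. `ToySynonymMap`, `build_synonym_map`, and `expand` are hypothetical; groups are modelled as lists of strings rather than SynonymGroupRecord values, and lowercasing stands in for the analyzer step.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical synonym map: term -> every other term in its groups.
type ToySynonymMap = HashMap<String, BTreeSet<String>>;

/// Map each term of every group to all other terms of that group.
fn build_synonym_map(groups: &[Vec<&str>]) -> ToySynonymMap {
    let mut map = ToySynonymMap::new();
    for group in groups {
        // Assumption: lowercasing replaces the real analyzer here.
        let terms: Vec<String> = group.iter().map(|t| t.to_lowercase()).collect();
        for term in &terms {
            let others = map.entry(term.clone()).or_default();
            others.extend(terms.iter().filter(|t| *t != term).cloned());
        }
    }
    map
}

/// Expand query tokens with their synonyms, keeping first-seen order
/// and dropping duplicates.
fn expand(tokens: Vec<String>, map: &ToySynonymMap) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    for token in tokens {
        let synonyms = map.get(&token).cloned().unwrap_or_default();
        if !out.contains(&token) {
            out.push(token);
        }
        for synonym in synonyms {
            if !out.contains(&synonym) {
                out.push(synonym);
            }
        }
    }
    out
}
```

Because the map keys go through the same normalization as the query tokens, a lookup at query time hits regardless of how the group was originally written.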
pub fn expand_query_with_synonyms(
    &self,
    tid: u64,
    tokens: Vec<String>,
) -> Result<Vec<String>, <B as FtsBackend>::Error>
Load all synonym groups for a tenant, build the expansion map, and
return the query tokens expanded with their synonyms.
Called at FTS query time inside search_with_mode to expand query
tokens before BM25 scoring.
impl<B> FtsIndex<B>
where
    B: FtsBackend,
pub fn new(backend: B) -> FtsIndex<B>
Create a new FTS index with the given backend and default BM25 params.
pub fn with_params(backend: B, params: Bm25Params) -> FtsIndex<B>
Create a new FTS index with custom BM25 parameters.
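For orientation, this is how the two conventional BM25 parameters affect a per-term score. The sketch uses a common BM25 formulation with a non-negative IDF; `ToyBm25Params` and its `k1`/`b` fields are assumptions, since the fields of Bm25Params and the exact variant used by the index are not shown on this page.

```rust
/// Hypothetical parameter struct; k1 = 1.2 and b = 0.75 are the
/// conventional textbook values, not confirmed defaults of this crate.
#[derive(Clone, Copy)]
struct ToyBm25Params {
    k1: f32, // term-frequency saturation
    b: f32,  // strength of length normalization, 0.0..=1.0
}

/// BM25 contribution of a single query term for a single document.
fn bm25_term_score(
    p: ToyBm25Params,
    tf: f32,          // occurrences of the term in the document
    df: u32,          // number of documents containing the term
    total_docs: u32,  // documents in the collection
    doc_len: f32,     // length of this document
    avg_doc_len: f32, // average document length
) -> f32 {
    let n = total_docs as f32;
    let df = df as f32;
    let idf = (1.0 + (n - df + 0.5) / (df + 0.5)).ln();
    let norm = 1.0 - p.b + p.b * doc_len / avg_doc_len;
    idf * tf * (p.k1 + 1.0) / (tf + p.k1 * norm)
}
```

Setting `b` to 0.0 disables length normalization entirely, while a larger `k1` lets repeated occurrences of a term keep adding to the score for longer before saturating.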
pub fn set_governor(&mut self, governor: Arc<MemoryGovernor>)
Inject a MemoryGovernor to enforce per-engine memory budgets on
large allocations (compaction, merge, query). When not set, all
allocations proceed without budget enforcement.
Injecting a governor is the correct pattern for Origin deployments.
NodeDB-Lite and WASM builds should leave the governor unset (no
nodedb-mem dependency).
pub fn backend_mut(&mut self) -> &mut B
Mutable access to the underlying backend.
pub fn index_document(
    &self,
    tid: u64,
    collection: &str,
    doc_id: Surrogate,
    text: &str,
) -> Result<(), FtsIndexError<<B as FtsBackend>::Error>>
Index a document’s text content.
Returns Err(FtsIndexError::SurrogateOutOfRange) if doc_id is
Surrogate::ZERO (the unassigned sentinel) or exceeds
MAX_INDEXABLE_SURROGATE. The FTS memtable uses the surrogate’s raw
u32 value as a direct array index into per-doc fieldnorm storage, so
values near u32::MAX would cause multi-GiB allocations. Out-of-range
surrogates are therefore rejected with an error at this boundary. A
debug_assert! would not be sufficient, because it is compiled out of
release builds and the bad value would pass through silently.
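The boundary check can be sketched as a plain validation function. `ToyIndexError`, `TOY_MAX_INDEXABLE`, and `check_surrogate` are hypothetical; in particular the cap value is invented for the example, because the real MAX_INDEXABLE_SURROGATE is not stated on this page.

```rust
#[derive(Debug, PartialEq)]
enum ToyIndexError {
    SurrogateOutOfRange(u32),
}

/// Hypothetical cap, chosen only for this sketch.
const TOY_MAX_INDEXABLE: u32 = 1 << 24;

/// Validate a raw surrogate before it is used as an array index.
/// Zero is treated as the unassigned sentinel.
fn check_surrogate(raw: u32) -> Result<usize, ToyIndexError> {
    if raw == 0 || raw > TOY_MAX_INDEXABLE {
        return Err(ToyIndexError::SurrogateOutOfRange(raw));
    }
    Ok(raw as usize)
}

/// Grow-on-write storage that is only reached with a validated index,
/// so the largest possible allocation is bounded by the cap.
fn store_norm(norms: &mut Vec<u8>, raw: u32, value: u8) -> Result<(), ToyIndexError> {
    let i = check_surrogate(raw)?;
    if i >= norms.len() {
        norms.resize(i + 1, 0);
    }
    norms[i] = value;
    Ok(())
}
```

Returning a `Result` keeps the check active in release builds, which is the point the documentation makes about not relying on `debug_assert!`.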
pub fn remove_document(
    &self,
    tid: u64,
    collection: &str,
    doc_id: Surrogate,
) -> Result<(), <B as FtsBackend>::Error>
Remove a document from the index.
pub fn purge_collection(
    &self,
    tid: u64,
    collection: &str,
) -> Result<usize, <B as FtsBackend>::Error>
Purge all entries for a collection. Returns count of removed entries.
pub fn purge_tenant(&self, tid: u64) -> Result<usize, <B as FtsBackend>::Error>
Purge all entries for a tenant across every collection.
impl<B> FtsIndex<B>
where
    B: FtsBackend,
pub fn search(
    &self,
    tid: u64,
    collection: &str,
    query: &str,
    top_k: usize,
    fuzzy_enabled: bool,
    prefilter: Option<&SurrogateBitmap>,
) -> Result<Vec<TextSearchResult>, FtsIndexError<<B as FtsBackend>::Error>>
Search the index using BM25 scoring.
Supports NOT <term> and -<term> negation in the query string.
Returns Err(FtsIndexError::InvalidQuery) for ill-formed queries such
as NOT-only queries or unsupported parenthesised groups.
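The negation rules above can be illustrated with a small parser sketch. `ParsedQuery`, `QueryError`, and `parse_query` are hypothetical, and details such as treating only uppercase `NOT` as a keyword are assumptions; the real parser's grammar is not specified beyond the two sentences above.

```rust
#[derive(Debug, PartialEq)]
struct ParsedQuery {
    include: Vec<String>,
    exclude: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum QueryError {
    Invalid(&'static str),
}

/// Split a query into positive and negated terms.
fn parse_query(query: &str) -> Result<ParsedQuery, QueryError> {
    if query.contains('(') || query.contains(')') {
        return Err(QueryError::Invalid("parenthesised groups are not supported"));
    }
    let mut include = Vec::new();
    let mut exclude = Vec::new();
    let mut words = query.split_whitespace();
    while let Some(word) = words.next() {
        if word == "NOT" {
            // `NOT term`: the next word is excluded.
            match words.next() {
                Some(term) => exclude.push(term.to_lowercase()),
                None => return Err(QueryError::Invalid("NOT must be followed by a term")),
            }
        } else if let Some(term) = word.strip_prefix('-') {
            // `-term`: shorthand for the same exclusion.
            if term.is_empty() {
                return Err(QueryError::Invalid("'-' must be followed by a term"));
            }
            exclude.push(term.to_lowercase());
        } else {
            include.push(word.to_lowercase());
        }
    }
    if include.is_empty() {
        // A query made only of negations has nothing to score.
        return Err(QueryError::Invalid("query needs at least one positive term"));
    }
    Ok(ParsedQuery { include, exclude })
}
```

A NOT-only query is rejected because negation can only filter a candidate set; without a positive term there is no candidate set to filter.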
pub fn search_with_mode(
    &self,
    tid: u64,
    collection: &str,
    query: &str,
    top_k: usize,
    fuzzy_enabled: bool,
    mode: QueryMode,
    prefilter: Option<&SurrogateBitmap>,
) -> Result<Vec<TextSearchResult>, FtsIndexError<<B as FtsBackend>::Error>>
Search with explicit boolean mode (AND or OR).
Supports NOT <term> and -<term> negation in the query string.
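The difference between the two boolean modes comes down to how per-term posting sets are combined before scoring. The sketch below assumes an in-memory postings map; `ToyQueryMode` and `candidates` are hypothetical and do not reflect how QueryMode is actually defined or evaluated.

```rust
use std::collections::{BTreeSet, HashMap};

#[derive(Clone, Copy)]
enum ToyQueryMode {
    And,
    Or,
}

/// Candidate documents for a list of query terms.
/// AND intersects the posting sets; OR takes their union.
fn candidates(
    postings: &HashMap<&str, BTreeSet<u32>>,
    terms: &[&str],
    mode: ToyQueryMode,
) -> BTreeSet<u32> {
    let empty: BTreeSet<u32> = BTreeSet::new();
    // A term with no postings contributes an empty set.
    let mut sets = terms.iter().map(|t| postings.get(*t).unwrap_or(&empty));
    match mode {
        ToyQueryMode::Or => sets.flat_map(|s| s.iter().copied()).collect(),
        ToyQueryMode::And => {
            let first = match sets.next() {
                Some(s) => s.clone(),
                None => return BTreeSet::new(),
            };
            sets.fold(first, |acc, s| acc.intersection(s).copied().collect())
        }
    }
}
```

In AND mode a single unknown term empties the result, whereas in OR mode it is simply ignored; scoring and negation would then be applied to whichever candidate set survives.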
Auto Trait Implementations
impl<B> !Freeze for FtsIndex<B>
impl<B> !RefUnwindSafe for FtsIndex<B>
impl<B> Send for FtsIndex<B>
where
    B: Send,
impl<B> !Sync for FtsIndex<B>
impl<B> Unpin for FtsIndex<B>
where
    B: Unpin,
impl<B> UnsafeUnpin for FtsIndex<B>
where
    B: UnsafeUnpin,
impl<B> UnwindSafe for FtsIndex<B>
where
    B: UnwindSafe,
Blanket Implementations
impl<T> ArchivePointee for T
type ArchivedMetadata = ()
fn pointer_metadata(
    _: &<T as ArchivePointee>::ArchivedMetadata,
) -> <T as Pointee>::Metadata
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
impl<T> LayoutRaw for T
fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>
impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool
fn resolve_niched(out: Place<NichedOption<T, N1>>)
Writes data to out indicating that a T is niched.
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the
equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.