pub struct IndexWriter<D: DirectoryWriter + 'static> { /* private fields */ }
Async IndexWriter for adding documents and committing segments
Features:
- Parallel indexing with multiple segment builders
- Streams documents to disk immediately (no in-memory document storage)
- Uses string interning for terms (reduced allocations)
- Uses hashbrown HashMap (faster than BTreeMap)
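The string interning mentioned in the feature list can be illustrated with a minimal sketch. The `TermInterner` type and its methods are hypothetical and are not part of this crate's API; the sketch uses the standard library `HashMap` rather than the hashbrown crate so that it stays dependency-free.

```rust
use std::collections::HashMap;

/// Hypothetical term interner: maps each distinct term to a small integer id.
#[derive(Default)]
pub struct TermInterner {
    ids: HashMap<String, u32>,
    terms: Vec<String>,
}

impl TermInterner {
    /// Returns the id for `term`. Owned copies of the text are allocated
    /// only the first time a term is seen; repeats are a single lookup.
    pub fn intern(&mut self, term: &str) -> u32 {
        if let Some(&id) = self.ids.get(term) {
            return id;
        }
        let id = self.terms.len() as u32;
        self.terms.push(term.to_owned());
        self.ids.insert(term.to_owned(), id);
        id
    }

    /// Resolves an id back to its term text, if the id is known.
    pub fn resolve(&self, id: u32) -> Option<&str> {
        self.terms.get(id as usize).map(String::as_str)
    }

    /// Number of distinct terms seen so far.
    pub fn len(&self) -> usize {
        self.terms.len()
    }

    /// True when no term has been interned yet.
    pub fn is_empty(&self) -> bool {
        self.terms.is_empty()
    }
}
```

Postings can then store the `u32` id instead of a fresh `String` per occurrence, which is where the reduction in allocations would come from.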
Implementations
impl<D: DirectoryWriter + 'static> IndexWriter<D>
pub async fn create(
    directory: D,
    schema: Schema,
    config: IndexConfig,
) -> Result<Self>
Create a new index in the directory
pub async fn create_with_config(
    directory: D,
    schema: Schema,
    config: IndexConfig,
    builder_config: SegmentBuilderConfig,
) -> Result<Self>
Create a new index with custom builder config
pub async fn open(directory: D, config: IndexConfig) -> Result<Self>
Open an existing index for writing
pub async fn open_with_config(
    directory: D,
    config: IndexConfig,
    builder_config: SegmentBuilderConfig,
) -> Result<Self>
Open an existing index with custom builder config
pub fn set_tokenizer<T: Tokenizer>(&mut self, field: Field, tokenizer: T)
Set tokenizer for a field
pub async fn add_document(&self, doc: Document) -> Result<DocId>
Add a document
Documents are distributed randomly across multiple builders for parallel indexing.
Random distribution avoids atomic contention and provides better load balancing.
When a builder reaches max_docs_per_segment, it is committed and a new one starts.
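The distribution and rollover behaviour described above can be modelled with a small, dependency-free sketch. Every name here (`XorShift64`, `Builder`, `Distributor`) is hypothetical, and the xorshift generator is only a stand-in for whatever random source the crate actually uses; only `max_docs_per_segment` comes from the documentation.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from zero, so substitute a fixed non-zero seed.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical stand-in for a segment builder: it only counts documents.
#[derive(Default)]
pub struct Builder {
    docs: usize,
}

pub struct Distributor {
    builders: Vec<Builder>,
    max_docs_per_segment: usize,
    rng: XorShift64,
    /// Document counts of the builders that were handed off as full segments.
    pub committed: Vec<usize>,
}

impl Distributor {
    pub fn new(num_builders: usize, max_docs_per_segment: usize, seed: u64) -> Self {
        assert!(num_builders > 0 && max_docs_per_segment > 0);
        Self {
            builders: (0..num_builders).map(|_| Builder::default()).collect(),
            max_docs_per_segment,
            rng: XorShift64::new(seed),
            committed: Vec::new(),
        }
    }

    /// Adds one document to a randomly chosen builder and returns its slot index.
    pub fn add_document(&mut self) -> usize {
        let slot = (self.rng.next_u64() % self.builders.len() as u64) as usize;
        let builder = &mut self.builders[slot];
        builder.docs += 1;
        if builder.docs >= self.max_docs_per_segment {
            // Full builder: hand it off and leave a fresh one in the same slot.
            let full = std::mem::take(builder);
            self.committed.push(full.docs);
        }
        slot
    }

    /// Documents still held by builders that have not reached the limit.
    pub fn pending_docs(&self) -> usize {
        self.builders.iter().map(|b| b.docs).sum()
    }
}
```

Because each call picks a slot independently, no shared counter has to be updated to decide where a document goes. In this model, every committed segment holds exactly `max_docs_per_segment` documents.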
pub fn pending_build_count(&self) -> usize
Get the number of pending background builds
pub fn pending_merge_count(&self) -> usize
Get the number of pending background merges
pub async fn maybe_merge(&self)
Check merge policy and spawn background merges if needed
This is called automatically, via SegmentManager, after each segment build completes. It can also be called manually to trigger a merge check.
pub async fn wait_for_merges(&self)
Wait for all pending merges to complete
pub async fn cleanup_orphan_segments(&self) -> Result<usize>
Clean up orphan segment files that are not registered
Orphan files can be left behind if the process halts after segment files are written but before they are registered in segments.json. Call this after opening an index to reclaim disk space from incomplete operations.
Returns the number of orphan segments deleted.
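The orphan check amounts to a set difference between the segments found on disk and those that are registered. The sketch below shows that idea only: `find_orphans` and the segment ids are hypothetical, and the real file layout and the format of segments.json are not described here.

```rust
use std::collections::HashSet;

/// Returns the ids of segments that exist on disk but are missing from
/// the registry, sorted and without duplicates.
pub fn find_orphans(on_disk: &[&str], registered: &[&str]) -> Vec<String> {
    let registered: HashSet<&str> = registered.iter().copied().collect();
    let mut orphans: Vec<String> = on_disk
        .iter()
        .copied()
        .filter(|id| !registered.contains(id))
        .map(str::to_owned)
        .collect();
    orphans.sort();
    orphans.dedup();
    orphans
}
```

A cleanup pass would then delete the files belonging to each returned id and report how many segments it removed.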
pub async fn get_builder_stats(&self) -> Option<SegmentBuilderStats>
Get current builder statistics for debugging (aggregated from all builders)
pub async fn flush(&self) -> Result<()>
Flush current builders to background processing (non-blocking)
This takes all current builders that contain documents and spawns background tasks to build them. It returns immediately, so use commit() when durability is required.
New documents can continue to be added while segments are being built.
pub async fn commit(&self) -> Result<()>
Commit all pending segments to disk and wait for completion
This flushes any current builders and waits for ALL background builds to complete. This provides a durability guarantee: once commit() returns, all data is persisted.
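The difference between flush() and commit() can be sketched with plain threads. This is a simplified model, not the crate's implementation: `ToyWriter` is hypothetical, it uses `std::thread` and `&mut self` where the real methods are async and take `&self`, and the `persisted` vector merely stands in for durable storage.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical writer model: `current` buffers documents and
/// `pending` tracks background builds that have not been joined yet.
#[derive(Default)]
pub struct ToyWriter {
    current: Vec<String>,
    pending: Vec<JoinHandle<()>>,
    /// Stand-in for durable storage: one entry (a doc count) per finished segment.
    pub persisted: Arc<Mutex<Vec<usize>>>,
}

impl ToyWriter {
    pub fn add_document(&mut self, doc: &str) {
        self.current.push(doc.to_owned());
    }

    /// Hands buffered documents to a background thread and returns without waiting.
    pub fn flush(&mut self) {
        if self.current.is_empty() {
            return;
        }
        let docs = std::mem::take(&mut self.current);
        let persisted = Arc::clone(&self.persisted);
        self.pending.push(thread::spawn(move || {
            // A real build would write segment files here.
            persisted.lock().unwrap().push(docs.len());
        }));
    }

    /// Flushes, then blocks until every background build has finished.
    pub fn commit(&mut self) {
        self.flush();
        for handle in self.pending.drain(..) {
            handle.join().expect("segment build panicked");
        }
    }

    /// Number of background builds not yet joined by `commit`.
    pub fn pending_build_count(&self) -> usize {
        self.pending.len()
    }
}
```

In this model, documents added after a flush() land in a new buffer, and only commit() guarantees that every earlier build has finished before it returns.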
pub async fn force_merge(&self) -> Result<()>
Force merge all segments into one
Auto Trait Implementations
impl<D> !Freeze for IndexWriter<D>
impl<D> !RefUnwindSafe for IndexWriter<D>
impl<D> Send for IndexWriter<D>
impl<D> Sync for IndexWriter<D>
impl<D> Unpin for IndexWriter<D>
impl<D> !UnwindSafe for IndexWriter<D>
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.