IndexWriter

Struct IndexWriter 

Source
pub struct IndexWriter<D: DirectoryWriter + 'static> { /* private fields */ }
Expand description

Async IndexWriter for adding documents and committing segments

Features:

  • Parallel indexing with multiple segment builders
  • Streams documents to disk immediately (no in-memory document storage)
  • Uses string interning for terms (reduced allocations)
  • Uses hashbrown HashMap (faster than BTreeMap)

Implementations§

Source§

impl<D: DirectoryWriter + 'static> IndexWriter<D>

Source

pub async fn create( directory: D, schema: Schema, config: IndexConfig, ) -> Result<Self>

Create a new index in the directory

Source

pub async fn create_with_config( directory: D, schema: Schema, config: IndexConfig, builder_config: SegmentBuilderConfig, ) -> Result<Self>

Create a new index with custom builder config

Source

pub async fn open(directory: D, config: IndexConfig) -> Result<Self>

Open an existing index for writing

Source

pub async fn open_with_config( directory: D, config: IndexConfig, builder_config: SegmentBuilderConfig, ) -> Result<Self>

Open an existing index with custom builder config

Source

pub fn schema(&self) -> &Schema

Get the schema

Source

pub fn set_tokenizer<T: Tokenizer>(&mut self, field: Field, tokenizer: T)

Set tokenizer for a field

Source

pub async fn add_document(&self, doc: Document) -> Result<DocId>

Add a document

Documents are distributed randomly across multiple builders for parallel indexing. Random distribution avoids atomic contention and provides better load balancing. When a builder reaches max_docs_per_segment, it is committed and a new one starts.

Source

pub fn pending_build_count(&self) -> usize

Get the number of pending background builds

Source

pub fn pending_merge_count(&self) -> usize

Get the number of pending background merges

Source

pub async fn maybe_merge(&self)

Check merge policy and spawn background merges if needed

This is called automatically after segment builds complete via SegmentManager. Can also be called manually to trigger merge checking.

Source

pub async fn wait_for_merges(&self)

Wait for all pending merges to complete

Source

pub async fn cleanup_orphan_segments(&self) -> Result<usize>

Clean up orphan segment files that are not registered

This can happen if the process halts after segment files are written but before they are registered in segments.json. Call this after opening an index to reclaim disk space from incomplete operations.

Returns the number of orphan segments deleted.

Source

pub async fn get_builder_stats(&self) -> Option<SegmentBuilderStats>

Get current builder statistics for debugging (aggregated from all builders)

Source

pub async fn flush(&self) -> Result<()>

Flush current builders to background processing (non-blocking)

This takes all current builders with documents and spawns background tasks to build them. Returns immediately - use commit() for durability. New documents can continue to be added while segments are being built.

Source

pub async fn commit(&self) -> Result<()>

Commit all pending segments to disk and wait for completion

This flushes any current builders and waits for ALL background builds to complete. Provides durability guarantees - all data is persisted.

Source

pub async fn force_merge(&self) -> Result<()>

Force merge all segments into one

Auto Trait Implementations§

§

impl<D> !Freeze for IndexWriter<D>

§

impl<D> !RefUnwindSafe for IndexWriter<D>

§

impl<D> Send for IndexWriter<D>

§

impl<D> Sync for IndexWriter<D>

§

impl<D> Unpin for IndexWriter<D>

§

impl<D> !UnwindSafe for IndexWriter<D>

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

Source§

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset. Read more
Source§

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset T (and can be converted to it).
Source§

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.
Source§

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V