Struct IndexWriter 

pub struct IndexWriter<D: DirectoryWriter + 'static> { /* private fields */ }

Async IndexWriter for adding documents and committing segments.

Backpressure: add_document() is synchronous and O(1). It returns Error::QueueFull when the shared queue is at capacity; the caller must back off and retry.

Two-phase commit:

  • prepare_commit() → PreparedCommit::commit() or PreparedCommit::abort()
  • commit() is a convenience that runs both phases in one call.
  • Between prepare and commit, the caller can do external work (WAL, sync, etc.) knowing that abort is possible if something fails.
  • Dropping PreparedCommit without calling commit/abort auto-aborts.
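
The two-phase flow above can be modeled in a few lines. This is a minimal sketch with stand-in types: Writer, Prepared, and commit_with_external_step are hypothetical names, not this crate's API. It only illustrates how a guard that aborts on drop lets an external step fail safely between the two phases.

```rust
// Hypothetical model of a two-phase commit guard; not the crate's actual types.
#[derive(Debug, Default)]
struct Writer {
    pending: Vec<String>,   // segments flushed to disk but not yet registered
    committed: Vec<String>, // segments registered in metadata
}

struct Prepared<'a> {
    writer: &'a mut Writer,
    finished: bool,
}

impl Writer {
    /// Phase one: hand out a guard that owns the decision to commit or abort.
    fn prepare_commit(&mut self) -> Prepared<'_> {
        Prepared { writer: self, finished: false }
    }
}

impl<'a> Prepared<'a> {
    /// Phase two, success path: register the pending segments.
    fn commit(mut self) {
        let pending = std::mem::take(&mut self.writer.pending);
        self.writer.committed.extend(pending);
        self.finished = true;
    }

    /// Phase two, failure path: discard the pending segments.
    #[allow(dead_code)]
    fn abort(mut self) {
        self.writer.pending.clear();
        self.finished = true;
    }
}

impl Drop for Prepared<'_> {
    fn drop(&mut self) {
        // Dropping the guard without commit/abort behaves like abort.
        if !self.finished {
            self.writer.pending.clear();
        }
    }
}

/// Runs `external` between the two phases and commits only if it succeeds.
fn commit_with_external_step(
    writer: &mut Writer,
    external: impl FnOnce() -> Result<(), String>,
) -> Result<(), String> {
    let prepared = writer.prepare_commit();
    external()?; // on Err, `prepared` is dropped here, which auto-aborts
    prepared.commit();
    Ok(())
}
```

With the real API the shape would presumably be similar: await prepare_commit(), do the external work, then call PreparedCommit::commit(). The signatures of PreparedCommit::commit() and PreparedCommit::abort() are not shown on this page, so treat that as an assumption.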

Implementations

impl<D: DirectoryWriter + 'static> IndexWriter<D>

pub async fn build_vector_index(&self) -> Result<()>

Train vector index from accumulated Flat vectors (manual, not auto-triggered).

  1. Acquires a snapshot (segments safe to read)
  2. Collects vectors for training
  3. Trains centroids/codebooks
  4. Updates metadata (marks fields as Built)
  5. Publishes to ArcSwap — merges will use these automatically

Existing flat segments gain approximate-nearest-neighbor (ANN) structures during normal merges, so no separate rebuild is needed.
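
As a rough illustration of steps 2 to 5, the sketch below trains a toy "centroid" (just the mean of the vectors) and publishes the result behind a shared pointer. Everything here is an assumption made for illustration: VectorState, Published, train_centroid, and build_and_publish are hypothetical names, and a std RwLock<Arc<_>> stands in for ArcSwap so the example has no external dependencies.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-ins; the crate's real state types are not shown on this page.
#[derive(Debug, Clone, PartialEq)]
enum VectorState {
    Flat,
    Built { centroid: Vec<f32> },
}

struct Published {
    // Stand-in for ArcSwap: readers clone the Arc, the trainer replaces it.
    state: RwLock<Arc<VectorState>>,
}

impl Published {
    fn new() -> Self {
        Published { state: RwLock::new(Arc::new(VectorState::Flat)) }
    }

    /// Readers take a cheap reference-counted handle to the current state.
    fn load(&self) -> Arc<VectorState> {
        Arc::clone(&self.state.read().unwrap())
    }

    /// The trainer swaps in a new state; existing readers keep their old handle.
    fn store(&self, next: VectorState) {
        *self.state.write().unwrap() = Arc::new(next);
    }
}

/// Toy "training": the single centroid is the component-wise mean.
/// Assumes all vectors share the dimension of the first one.
fn train_centroid(vectors: &[Vec<f32>]) -> Option<Vec<f32>> {
    let dim = vectors.first()?.len();
    let mut sum = vec![0.0f32; dim];
    for v in vectors {
        for (s, x) in sum.iter_mut().zip(v) {
            *s += *x;
        }
    }
    let n = vectors.len() as f32;
    Some(sum.into_iter().map(|s| s / n).collect())
}

/// Trains on the snapshot and publishes Built; leaves Flat in place if there is no data.
fn build_and_publish(published: &Published, snapshot: &[Vec<f32>]) -> bool {
    match train_centroid(snapshot) {
        Some(centroid) => {
            published.store(VectorState::Built { centroid });
            true
        }
        None => false,
    }
}
```

The point of the shared-pointer publish is that later merges can pick up the trained structures by calling load(), without coordinating with the training task.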

pub async fn rebuild_vector_index(&self) -> Result<()>

Rebuild vector index by retraining centroids/codebooks.

Resets Built state to Flat, clears trained structures, then trains fresh.

impl<D: DirectoryWriter + 'static> IndexWriter<D>

pub async fn create( directory: D, schema: Schema, config: IndexConfig, ) -> Result<Self>

Create a new index in the directory

pub async fn create_with_config( directory: D, schema: Schema, config: IndexConfig, builder_config: SegmentBuilderConfig, ) -> Result<Self>

Create a new index with custom builder config

pub async fn open(directory: D, config: IndexConfig) -> Result<Self>

Open an existing index for writing

pub async fn open_with_config( directory: D, config: IndexConfig, builder_config: SegmentBuilderConfig, ) -> Result<Self>

Open an existing index with custom builder config

pub fn from_index(index: &Index<D>) -> Self

Create an IndexWriter from an existing Index. Shares the SegmentManager for consistent segment lifecycle management.

pub fn schema(&self) -> &Schema

Get the schema

pub fn set_tokenizer<T: Tokenizer>(&mut self, field: Field, tokenizer: T)

Set tokenizer for a field. The change is propagated to worker threads and takes effect for the next SegmentBuilder each worker creates.

pub fn add_document(&self, doc: Document) -> Result<()>

Add a document to the indexing queue (sync, O(1), lock-free).

Document is moved into the channel (zero-copy). Workers compete to pull it. Returns Error::QueueFull when the queue is at capacity — caller must back off.
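
One way a caller might "back off" is a bounded retry loop with exponential delay. The sketch below is an assumption-laden illustration, not part of this crate: EnqueueError and add_with_backoff are hypothetical names, and try_add is a stand-in closure for whatever call enqueues a document. Because the enqueue call takes the document by value, the sketch clones it for each attempt.

```rust
use std::thread;
use std::time::Duration;

// Hypothetical error type mirroring the QueueFull / Closed cases described on this page.
#[derive(Debug, PartialEq)]
enum EnqueueError {
    QueueFull,
    Closed,
}

/// Retries `try_add` while it reports QueueFull, sleeping 1 ms, 2 ms, 4 ms, ...
/// (capped at 50 ms) between attempts. Returns the number of attempts used.
fn add_with_backoff<T: Clone>(
    doc: &T,
    mut try_add: impl FnMut(T) -> Result<(), EnqueueError>,
    max_attempts: u32,
) -> Result<u32, EnqueueError> {
    let mut delay = Duration::from_millis(1);
    for attempt in 1..=max_attempts {
        match try_add(doc.clone()) {
            Ok(()) => return Ok(attempt),
            // Queue is full and attempts remain: wait, then try again.
            Err(EnqueueError::QueueFull) if attempt < max_attempts => {
                thread::sleep(delay);
                delay = (delay * 2).min(Duration::from_millis(50));
            }
            // Out of attempts, or a non-retryable error such as Closed.
            Err(e) => return Err(e),
        }
    }
    Err(EnqueueError::QueueFull)
}
```

In async code, a runtime timer would normally replace thread::sleep so the executor thread is not blocked.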

pub fn add_documents(&self, documents: Vec<Document>) -> Result<usize>

Add multiple documents to the indexing queue.

Returns the number of documents successfully queued. Stops at the first QueueFull and returns the count queued so far.
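
The "stop at the first QueueFull" behavior can be modeled with a bounded channel from the standard library. This is only a sketch of the semantics described above: add_documents_model and QueueClosed are hypothetical names, and the crate's actual queue is not necessarily a std sync_channel.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

// Hypothetical error for the case where the receiving side has gone away.
#[derive(Debug, PartialEq)]
struct QueueClosed;

/// Pushes documents in order until the bounded queue is full,
/// then returns how many were accepted.
fn add_documents_model<T>(tx: &SyncSender<T>, docs: Vec<T>) -> Result<usize, QueueClosed> {
    let mut queued = 0;
    for doc in docs {
        match tx.try_send(doc) {
            Ok(()) => queued += 1,
            // Queue at capacity: stop and report the count so far.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => return Err(QueueClosed),
        }
    }
    Ok(queued)
}
```

Since the signature takes the Vec by value and returns only a count, a caller that wants to retry the remainder presumably needs to keep its own copy of the batch and resume from index `queued`.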

pub async fn maybe_merge(&self)

Check merge policy and spawn a background merge if needed.

pub async fn wait_for_merging_thread(&self)

Wait for the in-flight background merge to complete (if any).

pub async fn wait_for_all_merges(&self)

Wait for all eligible merges to complete, including cascading merges.

pub fn tracker(&self) -> Arc<SegmentTracker>

Get the segment tracker for sharing with readers.

pub async fn acquire_snapshot(&self) -> SegmentSnapshot

Acquire a snapshot of current segments for reading.

pub async fn cleanup_orphan_segments(&self) -> Result<usize>

Clean up orphan segment files not registered in metadata.
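
Conceptually, an orphan is a segment that exists on disk but is absent from the registered set. The sketch below shows that set difference in isolation; find_orphans and the segment id strings are hypothetical, and the actual criteria used by cleanup_orphan_segments are not documented on this page.

```rust
use std::collections::HashSet;

/// Returns the segment ids present on disk but absent from metadata, sorted.
fn find_orphans(on_disk: &[&str], registered: &[&str]) -> Vec<String> {
    let known: HashSet<&str> = registered.iter().copied().collect();
    let mut orphans: Vec<String> = on_disk
        .iter()
        .filter(|id| !known.contains(*id))
        .map(|id| (*id).to_string())
        .collect();
    orphans.sort();
    orphans
}
```

A cleanup routine built on this would delete the files for each returned id and report how many it removed, which would match the Result<usize> return type.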

pub async fn prepare_commit(&mut self) -> Result<PreparedCommit<'_, D>>

Prepare commit — signal workers to flush, wait for completion, collect segments.

All documents sent via add_document before this call are guaranteed to be written to segment files on disk. Segments are NOT yet registered in metadata — call PreparedCommit::commit() for that.

Workers are NOT destroyed — they flush their builders and wait for resume_workers() to give them a new channel.

add_document returns a Closed error until commit or abort resumes the workers.

pub async fn commit(&mut self) -> Result<()>

Commit (convenience): prepare_commit + commit in one call.

Guarantees all prior add_document calls are committed. Vector training is decoupled — call build_vector_index() manually.

pub async fn force_merge(&mut self) -> Result<()>

Force merge all segments into one.

Trait Implementations

impl<D: DirectoryWriter + 'static> Drop for IndexWriter<D>

fn drop(&mut self)

Executes the destructor for this type.

Auto Trait Implementations

impl<D> Freeze for IndexWriter<D>

impl<D> !RefUnwindSafe for IndexWriter<D>

impl<D> Send for IndexWriter<D>

impl<D> Sync for IndexWriter<D>

impl<D> Unpin for IndexWriter<D>

impl<D> !UnwindSafe for IndexWriter<D>
