pub struct IndexWriter { /* private fields */ }

IndexWriter is the user entry point for adding documents to an index.

It manages a small number of indexing threads, as well as a shared indexing queue. Each indexing thread builds its own independent Segment, via a SegmentWriter object.

Implementations

Accessor to the index.

If there are some merging threads, blocks until they all finish their work and then drops the IndexWriter.

Creates a new segment.

This method is useful only for users performing complex operations, such as converting an index from one format to another.

It is safe to start writing files associated with the new Segment. These will not be garbage collected as long as a SegmentMeta instance associated with the new Segment is “alive”.

Accessor to the merge policy.

Setter for the merge policy.

Detects and removes the files that are not used by the index anymore.

Deletes all documents from the index.

Requires committing. Enables users to rebuild the index by clearing it and resubmitting the necessary documents.

use tantivy::collector::TopDocs;
use tantivy::query::QueryParser;
use tantivy::schema::*;
use tantivy::{doc, Index};

fn main() -> tantivy::Result<()> {
    let mut schema_builder = Schema::builder();
    let title = schema_builder.add_text_field("title", TEXT | STORED);
    let schema = schema_builder.build();

    let index = Index::create_in_ram(schema.clone());

    let mut index_writer = index.writer_with_num_threads(1, 50_000_000)?;
    index_writer.add_document(doc!(title => "The modern Promotheus"))?;
    index_writer.commit()?;

    index_writer.delete_all_documents()?;
    // A commit is required; until then the deleted documents remain searchable.
    index_writer.commit()?;

    let searcher = index.reader()?.searcher();
    let query_parser = QueryParser::for_index(&index, vec![title]);
    let query_promo = query_parser.parse_query("Promotheus")?;
    let top_docs_promo = searcher.search(&query_promo, &TopDocs::with_limit(1))?;

    assert!(top_docs_promo.is_empty());
    Ok(())
}

Merges a given list of segments.

segment_ids is required to be non-empty.

Rolls back to the last commit.

This cancels all of the updates that happened after the last commit. After calling rollback, the index is in the same state as it was after the last commit.

The opstamp at the last commit is returned.

Prepares a commit.

Calling prepare_commit() will cut the indexing queue. All pending documents will be sent to the indexing workers. The workers will then terminate, regardless of the size of their current segment, and flush their work to disk.

Once a commit is “prepared”, you can either call

  • .commit(): to accept this commit
  • .abort(): to cancel this commit.

In the current implementation, PreparedCommit borrows the IndexWriter mutably, so no new document can be added until the prepared commit is either committed or dropped.

It is also possible to add a payload to the commit using this API. See PreparedCommit::set_payload().

Commits all of the pending changes.

A call to commit blocks. After it returns, all of the documents that were added since the last commit are published and persisted.

In case of a crash or a hardware failure (as long as the hard disk is spared), it will be possible to resume indexing from this point.

commit() returns the opstamp of the last document that made it into the commit.

Deletes all documents containing a given term.

A delete operation only affects documents that were added in previous commits, and documents that were added earlier in the same commit.

Like adds, the deletion itself will be visible only after calling commit().

Returns the opstamp of the last successful commit.

This is, for instance, the opstamp the index will roll back to after a failure such as a power surge.

This is also the opstamp of the commit that is currently available for searchers.

Adds a document.

If the indexing pipeline is full, this call may block.

The opstamp is an increasing u64 that can be used by the client to align commits with its own document queue.

Runs a group of document operations, ensuring that the operations are assigned contiguous u64 opstamps and that add operations of the same group are flushed into the same segment.

If the indexing pipeline is full, this call may block.

Each operation of the given user_operations will receive an in-order, contiguous u64 opstamp. The entire batch itself is also given an opstamp that is 1 greater than the last given operation. This batch_opstamp is the return value of run. An empty group of user_operations, an empty Vec<UserOperation>, still receives a valid opstamp even though no changes were actually made to the index.

Like adds and deletes (see IndexWriter::add_document and IndexWriter::delete_term), the changes made by calling run will be visible to readers only after calling commit().

Trait Implementations

Executes the destructor for this type.

Blanket Implementations

Gets the TypeId of self.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Convert Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>. Box<dyn Any> can then be further downcast into Box<ConcreteType> where ConcreteType implements Trait.

Convert Rc<Trait> (where Trait: Downcast) to Rc<Any>. Rc<Any> can then be further downcast into Rc<ConcreteType> where ConcreteType implements Trait.

Convert &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot generate &Any’s vtable from &Trait’s.

Convert &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot generate &mut Any’s vtable from &mut Trait’s.

Convert Arc<Trait> (where Trait: Downcast) to Arc<Any>. Arc<Any> can then be further downcast into Arc<ConcreteType> where ConcreteType implements Trait.

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The alignment of pointer.

The type for initializers.

Initializes a with the given initializer.

Dereferences the given pointer.

Mutably dereferences the given pointer.

Drops the object pointed to by the given pointer.

The type returned in the event of a conversion error.

Performs the conversion.