Struct tantivy::IndexWriter

pub struct IndexWriter { /* fields omitted */ }

IndexWriter is the user entry point for adding documents to an index.

It manages a small number of indexing threads, as well as a shared indexing queue. Each indexing thread builds its own independent Segment via a SegmentWriter object.
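The threading model described above can be pictured with a small standard-library sketch. It is not tantivy's implementation: ToySegment and index_with_workers are hypothetical names, and a mutex-guarded std::sync::mpsc channel is assumed here as a stand-in for the shared indexing queue. Each worker pulls documents from the queue and accumulates them in its own toy segment.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a segment: the documents one worker indexed.
pub type ToySegment = Vec<String>;

/// Feeds `docs` through a shared queue to `num_threads` workers and
/// returns one toy segment per worker.
pub fn index_with_workers(docs: Vec<String>, num_threads: usize) -> Vec<ToySegment> {
    let (sender, receiver) = mpsc::channel::<String>();
    // std's Receiver is single-consumer, so the workers share it behind a mutex.
    let receiver = Arc::new(Mutex::new(receiver));

    let handles: Vec<_> = (0..num_threads)
        .map(|_| {
            let receiver = Arc::clone(&receiver);
            thread::spawn(move || {
                let mut segment = ToySegment::new();
                loop {
                    // The lock is released at the end of this statement.
                    let next = receiver.lock().unwrap().recv();
                    match next {
                        Ok(doc) => segment.push(doc),
                        Err(_) => break, // queue closed and drained
                    }
                }
                segment
            })
        })
        .collect();

    for doc in docs {
        sender.send(doc).unwrap();
    }
    // Closing the queue lets every worker finish its segment.
    drop(sender);

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Which worker receives which document is not deterministic in this sketch; only the union of all toy segments is guaranteed to equal the input.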

Methods

impl IndexWriter[src]

pub fn wait_merging_threads(self) -> Result<()>[src]

If any merging threads are running, blocks until they have all finished their work, then drops the IndexWriter.

pub fn new_segment(&self) -> Segment[src]

Creates a new segment.

This method is useful only for users performing complex operations, such as converting an index from one format to another.

It is safe to start writing files associated with the new Segment. These files will not be garbage collected as long as a SegmentMeta instance associated with the new Segment is "alive".

pub fn get_merge_policy(&self) -> Arc<Box<dyn MergePolicy>>[src]

Accessor to the merge policy.

pub fn set_merge_policy(&self, merge_policy: Box<dyn MergePolicy>)[src]

Setter for the merge policy.

pub fn garbage_collect_files(
    &self
) -> impl Future<Output = Result<GarbageCollectionResult>>
[src]

Detects and removes the files that are no longer used by the index.

pub fn delete_all_documents(&self) -> Result<Opstamp>[src]

Deletes all documents from the index.

The deletion takes effect only after a commit. This lets users rebuild the index by clearing it and resubmitting the necessary documents.

use tantivy::collector::TopDocs;
use tantivy::query::QueryParser;
use tantivy::schema::*;
use tantivy::{doc, Index};

fn main() -> tantivy::Result<()> {
    let mut schema_builder = Schema::builder();
    let title = schema_builder.add_text_field("title", TEXT | STORED);
    let schema = schema_builder.build();

    let index = Index::create_in_ram(schema.clone());

    let mut index_writer = index.writer_with_num_threads(1, 50_000_000)?;
    index_writer.add_document(doc!(title => "The modern Prometheus"));
    index_writer.commit()?;

    // The returned opstamp is not needed here, so it is discarded.
    index_writer.delete_all_documents()?;
    // A commit is required; until then the deleted documents remain searchable.
    index_writer.commit()?;

    let searcher = index.reader()?.searcher();
    let query_parser = QueryParser::for_index(&index, vec![title]);
    let query_promo = query_parser.parse_query("Prometheus")?;
    let top_docs_promo = searcher.search(&query_promo, &TopDocs::with_limit(1))?;

    assert!(top_docs_promo.is_empty());
    Ok(())
}

pub fn merge(
    &mut self,
    segment_ids: &[SegmentId]
) -> impl Future<Output = Result<SegmentMeta>>
[src]

Merges the given list of segments.

segment_ids must be non-empty.

pub fn rollback(&mut self) -> Result<Opstamp>[src]

Rolls back to the last commit.

This cancels all of the updates that happened after the last commit. After calling rollback, the index is in the same state as it was after the last commit.

The opstamp at the last commit is returned.

pub fn prepare_commit(&mut self) -> Result<PreparedCommit>[src]

Prepares a commit.

Calling prepare_commit() will cut the indexing queue. All pending documents will be sent to the indexing workers. The workers will then terminate, regardless of the size of their current segment, and flush their work to disk.

Once a commit is "prepared", you can either call

  • .commit(): to accept this commit
  • .abort(): to cancel this commit.

In the current implementation, PreparedCommit borrows the IndexWriter mutably, so it is guaranteed that no new document can be added until the PreparedCommit is committed or dropped.

It is also possible to add a payload to the commit using this API. See PreparedCommit::set_payload().
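The mutable-borrow guarantee can be illustrated with a minimal sketch. ToyWriter and ToyPreparedCommit are hypothetical types invented for this illustration; they only mimic the shape of the API described above and say nothing about how tantivy stores documents.

```rust
/// Hypothetical stand-in for IndexWriter; it only tracks document titles.
#[derive(Default)]
pub struct ToyWriter {
    committed: Vec<String>,
    pending: Vec<String>,
}

/// Hypothetical stand-in for PreparedCommit. It holds `&mut ToyWriter`, so
/// the writer cannot be used until this value is consumed or dropped.
pub struct ToyPreparedCommit<'a> {
    writer: &'a mut ToyWriter,
}

impl ToyWriter {
    /// Queues a document; it is not visible until a commit is accepted.
    pub fn add(&mut self, title: &str) {
        self.pending.push(title.to_string());
    }

    /// Starts a commit and mutably borrows the writer for its duration.
    pub fn prepare_commit(&mut self) -> ToyPreparedCommit<'_> {
        ToyPreparedCommit { writer: self }
    }

    /// Documents made visible by accepted commits.
    pub fn committed(&self) -> &[String] {
        &self.committed
    }
}

impl<'a> ToyPreparedCommit<'a> {
    /// Accepts the commit: pending documents become visible.
    pub fn commit(self) {
        let writer = self.writer;
        writer.committed.append(&mut writer.pending);
    }

    /// Cancels the commit: pending documents are discarded.
    pub fn abort(self) {
        let writer = self.writer;
        writer.pending.clear();
    }
}
```

With these toy types, calling writer.add(..) while a ToyPreparedCommit is still alive is rejected by the borrow checker, which is the compile-time guarantee the paragraph above refers to.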

pub fn commit(&mut self) -> Result<Opstamp>[src]

Commits all of the pending changes.

A call to commit blocks. After it returns, all of the documents that were added since the last commit are published and persisted.

In case of a crash or a hardware failure (as long as the hard disk is spared), it will be possible to resume indexing from this point.

Commit returns the opstamp of the last document that made it into the commit.

pub fn delete_term(&self, term: Term) -> Opstamp[src]

Deletes all documents containing a given term.

The delete operation affects only documents that were added in previous commits and documents that were added earlier in the same commit.

Like adds, the deletion itself will be visible only after calling commit().
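The ordering rule for deletes can be sketched as a sequential replay of operations. This is a simplified model under stated assumptions, not tantivy's code: ToyOp and surviving_docs are hypothetical names, documents are plain strings, a "term" is a whitespace-separated word, and the position in the slice plays the role of the opstamp.

```rust
/// Hypothetical operation log entry for this sketch.
pub enum ToyOp {
    /// Adds a document, represented by its text.
    Add(String),
    /// Deletes every earlier document containing this word.
    DeleteTerm(String),
}

/// Replays `ops` in order and returns the documents that survive.
pub fn surviving_docs(ops: &[ToyOp]) -> Vec<String> {
    let mut docs: Vec<String> = Vec::new();
    for op in ops {
        match op {
            ToyOp::Add(text) => docs.push(text.clone()),
            ToyOp::DeleteTerm(term) => {
                // Only documents added before this point are in `docs`,
                // so later adds are never affected by this delete.
                docs.retain(|text| !text.split_whitespace().any(|word| word == term.as_str()));
            }
        }
    }
    docs
}
```

In this model, a document added after the delete survives even if it contains the deleted term.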

pub fn commit_opstamp(&self) -> Opstamp[src]

Returns the opstamp of the last successful commit.

This is, for instance, the opstamp the index will roll back to if there is a failure such as a power surge.

This is also the opstamp of the commit that is currently available for searchers.

pub fn add_document(&self, document: Document) -> Opstamp[src]

Adds a document.

If the indexing pipeline is full, this call may block.

The opstamp is an increasing u64 that can be used by the client to align commits with its own document queue.
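One way a client might align its own queue with commits is sketched below. The acknowledge helper and the (opstamp, payload) queue layout are assumptions made for this illustration and are not part of tantivy. The idea is that the client records the opstamp returned for each added document and, once commit() returns an opstamp, drops every queued entry that the commit covers.

```rust
use std::collections::VecDeque;

/// Removes every queued entry whose opstamp is at most `commit_opstamp`
/// and returns the removed payloads in queue order.
pub fn acknowledge<T>(queue: &mut VecDeque<(u64, T)>, commit_opstamp: u64) -> Vec<T> {
    let mut acknowledged = Vec::new();
    // Opstamps are increasing, so the covered entries form a prefix of the queue.
    while queue.front().map_or(false, |entry| entry.0 <= commit_opstamp) {
        if let Some((_, payload)) = queue.pop_front() {
            acknowledged.push(payload);
        }
    }
    acknowledged
}
```

Entries left in the queue after the call are those that would need to be resubmitted if the process crashed before the next commit.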

pub fn run(&self, user_operations: Vec<UserOperation>) -> Opstamp[src]

Runs a group of document operations, ensuring that the operations are assigned contiguous u64 opstamps and that add operations of the same group are flushed into the same segment.

If the indexing pipeline is full, this call may block.

Each operation of the given user_operations will receive an in-order, contiguous u64 opstamp. The entire batch itself is also given an opstamp that is 1 greater than the last given operation. This batch_opstamp is the return value of run. An empty group of user_operations, an empty Vec<UserOperation>, still receives a valid opstamp even though no changes were actually made to the index.
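The opstamp arithmetic can be worked through with a small sketch. assign_batch_opstamps is a hypothetical helper, not a tantivy function, and it assumes that an empty batch simply receives the next available opstamp.

```rust
/// Assigns opstamps to a batch of `batch_len` operations, starting at
/// `next_opstamp`. Returns the per-operation opstamps and the batch opstamp.
pub fn assign_batch_opstamps(next_opstamp: u64, batch_len: u64) -> (Vec<u64>, u64) {
    // Each operation gets an in-order, contiguous opstamp.
    let per_operation: Vec<u64> = (next_opstamp..next_opstamp + batch_len).collect();
    // The batch opstamp is 1 greater than the last operation's opstamp.
    // Assumption: for an empty batch it is just the next available opstamp.
    let batch_opstamp = next_opstamp + batch_len;
    (per_operation, batch_opstamp)
}
```

For example, if the next available opstamp is 10 and the batch holds three operations, they receive 10, 11 and 12, and the batch opstamp is 13.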

Like adds and deletes (see IndexWriter::add_document and IndexWriter::delete_term), the changes made by calling run will be visible to readers only after calling commit().

Trait Implementations

impl Drop for IndexWriter[src]

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized
[src]

impl<T> Borrow<T> for T where
    T: ?Sized
[src]

impl<T> BorrowMut<T> for T where
    T: ?Sized
[src]

impl<T> Downcast for T where
    T: Any
[src]

impl<T> DowncastSync for T where
    T: Send + Sync + Any
[src]

impl<T> Erased for T[src]

impl<T> From<T> for T[src]

impl<T> Fruit for T where
    T: Send + Downcast
[src]

impl<T, U> Into<U> for T where
    U: From<T>, 
[src]

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 
[src]

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 
[src]

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

impl<V, T> VZip<V> for T where
    V: MultiLane<T>,