Struct tantivy::IndexWriter
pub struct IndexWriter { /* fields omitted */ }
IndexWriter is the user entry point for adding documents to an index.
It manages a small number of indexing threads, as well as a shared indexing queue.
Each indexing thread builds its own independent Segment, via a SegmentWriter object.
Implementations
If there are any merging threads, blocks until they have all finished their work, then drops the IndexWriter.
Creates a new segment.
This method is useful only for users performing complex operations, such as converting an index from one format to another.
It is safe to start writing files associated with the new Segment. These files will not be garbage collected as long as an instance of the SegmentMeta object associated with the new Segment is “alive”.
Accessor to the merge policy.
Setter for the merge policy.
Detects and removes the files that are not used by the index anymore.
Deletes all documents from the index.
Requires committing.
Enables users to rebuild the index by clearing it and resubmitting the necessary documents.
use tantivy::collector::TopDocs;
use tantivy::query::QueryParser;
use tantivy::schema::*;
use tantivy::{doc, Index};

fn main() -> tantivy::Result<()> {
    // Build a schema with a single stored text field.
    let mut schema_builder = Schema::builder();
    let title = schema_builder.add_text_field("title", TEXT | STORED);
    let schema = schema_builder.build();

    let index = Index::create_in_ram(schema.clone());
    let mut index_writer = index.writer_with_num_threads(1, 50_000_000)?;

    // Index one document and publish it.
    index_writer.add_document(doc!(title => "The modern Prometheus"));
    index_writer.commit()?;

    // Clear the index. A commit is required; until then the deleted
    // documents stay searchable.
    index_writer.delete_all_documents()?;
    index_writer.commit()?;

    // The document can no longer be found.
    let searcher = index.reader()?.searcher();
    let query_parser = QueryParser::for_index(&index, vec![title]);
    let query_promo = query_parser.parse_query("Prometheus")?;
    let top_docs_promo = searcher.search(&query_promo, &TopDocs::with_limit(1))?;
    assert!(top_docs_promo.is_empty());
    Ok(())
}
Merges a given list of segments.
segment_ids is required to be non-empty.
Rolls back to the last commit.
This cancels all of the updates that happened after the last commit. After calling rollback, the index is in the same state as it was after the last commit.
The opstamp at the last commit is returned.
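The rollback semantics described above can be illustrated with a small std-only toy model. This is only a sketch, not tantivy's implementation: ToyIndexState and all of its methods are hypothetical names, and the toy assumes the commit opstamp is the opstamp of the last document added before the commit.

```rust
/// Hypothetical toy model of commit/rollback bookkeeping (not tantivy code).
pub struct ToyIndexState {
    committed: Vec<String>,
    pending: Vec<String>,
    last_opstamp: u64,
    commit_opstamp: u64,
}

impl ToyIndexState {
    pub fn new() -> Self {
        ToyIndexState {
            committed: Vec::new(),
            pending: Vec::new(),
            last_opstamp: 0,
            commit_opstamp: 0,
        }
    }

    /// Queues a document and returns its opstamp (an increasing u64).
    pub fn add(&mut self, doc: &str) -> u64 {
        self.last_opstamp += 1;
        self.pending.push(doc.to_string());
        self.last_opstamp
    }

    /// Publishes all pending documents and records the commit opstamp.
    pub fn commit(&mut self) -> u64 {
        self.committed.append(&mut self.pending);
        self.commit_opstamp = self.last_opstamp;
        self.commit_opstamp
    }

    /// Discards everything added since the last commit and returns
    /// the opstamp of that commit.
    pub fn rollback(&mut self) -> u64 {
        self.pending.clear();
        self.commit_opstamp
    }

    pub fn num_committed(&self) -> usize {
        self.committed.len()
    }

    pub fn num_pending(&self) -> usize {
        self.pending.len()
    }
}
```

In this toy, adding two documents after a commit and then calling rollback leaves the committed set untouched and returns the same opstamp the commit returned.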
Prepares a commit.
Calling prepare_commit() will cut the indexing queue. All pending documents will be sent to the indexing workers. They will then terminate, regardless of the size of their current segment, and flush their work to disk.
Once a commit is “prepared”, you can either call
- .commit(): to accept this commit
- .abort(): to cancel this commit.
In the current implementation, PreparedCommit borrows the IndexWriter mutably, so we are guaranteed that no new document can be added until the prepared commit is either committed or dropped.
It is also possible to add a payload to the commit using this API.
See PreparedCommit::set_payload().
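The way a mutable borrow enforces the two-phase commit can be sketched with plain Rust types. This is a toy under stated assumptions, not tantivy's code: ToyWriter and ToyPreparedCommit are hypothetical stand-ins, and abort here simply discards the staged documents.

```rust
/// Hypothetical stand-in for the writer (not tantivy's IndexWriter).
pub struct ToyWriter {
    pending: Vec<String>,
    committed: Vec<String>,
}

/// Hypothetical prepared commit. It holds `&mut ToyWriter`, so the
/// borrow checker rejects any use of the writer while it is alive.
pub struct ToyPreparedCommit<'a> {
    writer: &'a mut ToyWriter,
    staged: Vec<String>,
}

impl ToyWriter {
    pub fn new() -> Self {
        ToyWriter { pending: Vec::new(), committed: Vec::new() }
    }

    pub fn add_document(&mut self, doc: &str) {
        self.pending.push(doc.to_string());
    }

    /// Cuts the queue: every pending document is moved into the
    /// prepared commit.
    pub fn prepare_commit(&mut self) -> ToyPreparedCommit<'_> {
        let staged = std::mem::take(&mut self.pending);
        ToyPreparedCommit { writer: self, staged }
    }

    pub fn committed(&self) -> &[String] {
        &self.committed
    }
}

impl<'a> ToyPreparedCommit<'a> {
    /// Accepts the commit and returns the number of committed documents.
    pub fn commit(self) -> usize {
        let ToyPreparedCommit { writer, staged } = self;
        writer.committed.extend(staged);
        writer.committed.len()
    }

    /// Cancels the commit: the staged documents are dropped.
    pub fn abort(self) {}
}
```

With these types, calling add_document on the writer between prepare_commit() and commit() or abort() does not compile, because the writer is still mutably borrowed by the prepared commit.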
Commits all of the pending changes.
A call to commit blocks. After it returns, all of the documents that were added since the last commit are published and persisted.
In case of a crash or a hardware failure (as long as the hard disk is spared), it will be possible to resume indexing from this point.
Commit returns the opstamp of the last document that made it into the commit.
Deletes all documents containing a given term.
A delete operation only affects documents that were added in previous commits, and documents that were added earlier in the same commit.
Like adds, the deletion itself will be visible only after calling commit().
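The ordering rule for deletes can be illustrated with a std-only toy. This is a sketch, not tantivy's deletion logic: ToyOp and surviving_docs are hypothetical names, each document carries a single term, and the position of an operation in the slice plays the role of its opstamp.

```rust
/// Hypothetical operation type; each document is reduced to one term.
pub enum ToyOp {
    Add(String),
    Delete(String),
}

/// Replays the operations in order and returns the positions of the
/// documents that survive. A delete only removes documents that were
/// added before it.
pub fn surviving_docs(ops: &[ToyOp]) -> Vec<usize> {
    let mut alive: Vec<(usize, &str)> = Vec::new();
    for (position, op) in ops.iter().enumerate() {
        match op {
            ToyOp::Add(term) => alive.push((position, term.as_str())),
            ToyOp::Delete(term) => alive.retain(|(_, t)| *t != term.as_str()),
        }
    }
    alive.into_iter().map(|(position, _)| position).collect()
}
```

For the sequence add "x", delete "x", add "x", only the last document survives, since it was added after the delete.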
Returns the opstamp of the last successful commit.
This is, for instance, the opstamp the index will roll back to if there is a failure such as a power surge.
This is also the opstamp of the commit that is currently available for searchers.
Adds a document.
If the indexing pipeline is full, this call may block.
The opstamp is an increasing u64 that can be used by the client to align commits with its own document queue.
Runs a group of document operations, ensuring that the operations are assigned contiguous u64 opstamps and that add operations of the same group are flushed into the same segment.
If the indexing pipeline is full, this call may block.
Each operation of the given user_operations will receive an in-order, contiguous u64 opstamp. The entire batch itself is also given an opstamp that is 1 greater than the opstamp of its last operation. This batch_opstamp is the return value of run. An empty group of user_operations (an empty Vec<UserOperation>) still receives a valid opstamp, even though no changes were actually made to the index.
Like adds and deletes (see IndexWriter::add_document and IndexWriter::delete_term), the changes made by calling run will be visible to readers only after calling commit().
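The opstamp numbering described for run can be sketched as follows. This is a toy illustration only: ToyStamper and ToyOperation are hypothetical names, not tantivy types, and the starting opstamp is an arbitrary assumption.

```rust
/// Hypothetical stand-in for UserOperation, with string payloads.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyOperation {
    Add(String),
    Delete(String),
}

/// Hypothetical generator of increasing u64 opstamps.
pub struct ToyStamper {
    next: u64,
}

impl ToyStamper {
    pub fn new(first: u64) -> Self {
        ToyStamper { next: first }
    }

    /// Gives each operation an in-order, contiguous opstamp, then gives
    /// the batch an opstamp that is 1 greater than the last operation's.
    /// Returns the stamped operations and the batch opstamp.
    pub fn run(&mut self, ops: Vec<ToyOperation>) -> (Vec<(u64, ToyOperation)>, u64) {
        let mut stamped = Vec::with_capacity(ops.len());
        for op in ops {
            stamped.push((self.next, op));
            self.next += 1;
        }
        let batch_opstamp = self.next;
        self.next += 1;
        (stamped, batch_opstamp)
    }
}
```

Starting from opstamp 10, a batch of two operations is stamped 10 and 11 and the batch itself receives 12. An empty batch run next still consumes one opstamp and returns 13.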
Trait Implementations
Auto Trait Implementations
impl !RefUnwindSafe for IndexWriter
impl Send for IndexWriter
impl Sync for IndexWriter
impl Unpin for IndexWriter
impl !UnwindSafe for IndexWriter
Blanket Implementations
Mutably borrows from an owned value.
Convert Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>. Box<dyn Any> can then be further downcast into Box<ConcreteType> where ConcreteType implements Trait.
Convert Rc<Trait> (where Trait: Downcast) to Rc<Any>. Rc<Any> can then be further downcast into Rc<ConcreteType> where ConcreteType implements Trait.
Convert &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot generate &Any’s vtable from &Trait’s.
Convert &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot generate &mut Any’s vtable from &mut Trait’s.