Struct tantivy_fst::raw::Builder

pub struct Builder<W> { /* fields omitted */ }

A builder for creating a finite state transducer.

This is not your average everyday builder. It has two important qualities that set it apart from what you might expect:

  1. All keys must be added in lexicographic order. Adding a key out of order results in an error. Adding a duplicate key with an output value also results in an error: once a key is associated with a value, that association can never be modified or deleted.
  2. The representation of an fst is streamed to any io::Write as it is built. For an in-memory representation, this can be a Vec<u8>.

Point (2) is especially important because it means that an fst can be constructed without storing the entire fst in memory. In particular, because the builder works with any io::Write, the fst can be streamed directly to a file.
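The streaming idea in point (2) can be sketched with the standard library alone. This is a minimal model, not this crate's implementation: `stream_keys` and its length-prefixed framing are hypothetical, chosen only to show that code written against io::Write can target a Vec<u8> or any other writer without change.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a streaming builder: each key is written to
/// `wtr` as soon as it arrives instead of being collected in memory first.
fn stream_keys<W: Write>(mut wtr: W, keys: &[&[u8]]) -> io::Result<W> {
    for key in keys {
        // Toy framing (an assumption of this sketch): one length byte,
        // followed by the key bytes. Keys longer than 255 bytes are rejected.
        if key.len() > 255 {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "key too long"));
        }
        wtr.write_all(&[key.len() as u8])?;
        wtr.write_all(key)?;
    }
    wtr.flush()?;
    Ok(wtr)
}

fn main() -> io::Result<()> {
    let keys: [&[u8]; 3] = [b"bar", b"baz", b"foo"];

    // In-memory destination: Vec<u8> implements io::Write.
    let bytes = stream_keys(Vec::new(), &keys)?;
    println!("wrote {} bytes to memory", bytes.len());

    // Any other writer works the same way; io::sink() discards its input.
    // A std::fs::File could be passed here instead.
    stream_keys(io::sink(), &keys)?;
    Ok(())
}
```

Because the function only requires `W: Write`, the caller decides where the bytes end up, which is the property the builder relies on.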

With that said, the builder does use memory, but memory usage is bounded to a constant size. The amount of memory used trades off with the compression ratio. Currently, the implementation hard-codes this trade-off, which can result in about 5-20MB of heap usage during construction. (N.B. Guaranteeing a maximal compression ratio requires memory proportional to the size of the fst, which defeats some of the benefit of streaming it to disk. In practice, a small bounded amount of memory achieves a compression ratio close to that maximum.)

The algorithmic complexity of fst construction is O(n) where n is the number of elements added to the fst.

Methods

impl Builder<Vec<u8>>

pub fn memory() -> Self

Create a builder that builds an fst in memory.

impl<W: Write> Builder<W>

pub fn new(wtr: W) -> Result<Builder<W>>

Create a builder that builds an fst by writing it to wtr in a streaming fashion.

pub fn new_type(wtr: W, ty: FstType) -> Result<Builder<W>>

The same as new, except it sets the type of the fst to the type given.

pub fn add<B>(&mut self, bs: B) -> Result<()> where
    B: AsRef<[u8]>,

Adds a byte string to this FST with a zero output value.

pub fn insert<B>(&mut self, bs: B, val: u64) -> Result<()> where
    B: AsRef<[u8]>,

Insert a new key-value pair into the fst.

Keys must be convertible to byte strings. Values must be a u64, which is a restriction of the current implementation of finite state transducers. (Values may one day be expanded to other types.)

If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.
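The ordering rule can be checked up front before feeding keys to the builder. The sketch below is a standalone model of the rule as stated here (each key must be strictly greater than the previous one, compared as bytes); `first_out_of_order` is a hypothetical helper and not part of this crate.

```rust
/// Returns the index of the first key that is less than or equal to its
/// predecessor, or None if every key is strictly greater than the one
/// before it. Keys are compared as byte strings.
fn first_out_of_order<K: AsRef<[u8]>>(keys: &[K]) -> Option<usize> {
    keys.windows(2)
        .position(|pair| pair[0].as_ref() >= pair[1].as_ref())
        .map(|i| i + 1)
}

fn main() {
    // Sorted and unique: acceptable.
    assert_eq!(first_out_of_order(&["bar", "baz", "foo"]), None);

    // "baz" sorts before "foo", so the key at index 2 would be rejected.
    assert_eq!(first_out_of_order(&["bar", "foo", "baz"]), Some(2));

    // A duplicate counts as "less than or equal" and is rejected as well.
    assert_eq!(first_out_of_order(&["a", "a"]), Some(1));

    // Comparison is on bytes, so uppercase ASCII sorts before lowercase.
    assert_eq!(first_out_of_order(&["Zed", "apple"]), None);
}
```

Sorting by bytes rather than by a locale-aware collation matters for non-ASCII or mixed-case keys; a list sorted case-insensitively may still fail this check.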

pub fn extend_iter<T, I>(&mut self, iter: I) -> Result<()> where
    T: AsRef<[u8]>,
    I: IntoIterator<Item = (T, Output)>,

Calls insert on each item in the iterator.

If an error occurs while adding an element, processing stops and the error is returned.

If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.
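The stop-at-first-error behaviour can be modelled with a toy type. Everything below is hypothetical: `ToyBuilder` only remembers the last accepted key, uses a plain u64 where the real signature takes Output, and reports errors as String. It shows the control flow described above, not this crate's internals.

```rust
/// Hypothetical toy builder that enforces strictly increasing keys.
struct ToyBuilder {
    last: Option<Vec<u8>>,
    accepted: usize,
}

impl ToyBuilder {
    fn new() -> Self {
        ToyBuilder { last: None, accepted: 0 }
    }

    /// Rejects any key that is less than or equal to the previous key.
    fn insert<B: AsRef<[u8]>>(&mut self, key: B, _val: u64) -> Result<(), String> {
        let key = key.as_ref();
        if let Some(ref last) = self.last {
            if key <= last.as_slice() {
                return Err(format!(
                    "key {:?} is not greater than the previous key",
                    String::from_utf8_lossy(key)
                ));
            }
        }
        self.last = Some(key.to_vec());
        self.accepted += 1;
        Ok(())
    }

    /// Calls insert on each item; the first error ends the loop.
    fn extend_iter<T, I>(&mut self, iter: I) -> Result<(), String>
    where
        T: AsRef<[u8]>,
        I: IntoIterator<Item = (T, u64)>,
    {
        for (key, val) in iter {
            self.insert(key, val)?;
        }
        Ok(())
    }
}

fn main() {
    let mut builder = ToyBuilder::new();
    // "b" arrives after "c", so processing stops there and "d" is never seen.
    let result = builder.extend_iter(vec![("a", 1), ("c", 2), ("b", 3), ("d", 4)]);
    assert!(result.is_err());
    assert_eq!(builder.accepted, 2);
}
```

In this model the items before the failing one have already been accepted when the error is returned, so a caller that needs all-or-nothing behaviour would validate the input first.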

pub fn extend_stream<'f, I, S>(&mut self, stream: I) -> Result<()> where
    I: for<'a> IntoStreamer<'a, Into = S, Item = (&'a [u8], Output)>,
    S: 'f + for<'a> Streamer<'a, Item = (&'a [u8], Output)>,

Calls insert on each item in the stream.

Note that unlike extend_iter, this is not generic over the items in the stream.

If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.

pub fn finish(self) -> Result<()>

Finishes the construction of the fst and flushes the underlying writer. After completion, the data written to W may be read using one of Fst's constructor methods.

pub fn into_inner(self) -> Result<W>

Just like finish, except it returns the underlying writer after flushing it.
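The "flush, then hand back the writer" pattern also appears in the standard library as std::io::BufWriter::into_inner. The sketch below uses BufWriter purely as an analogy; whether Builder buffers its output in a comparable way is an assumption, and `write_and_recover` is a hypothetical helper.

```rust
use std::io::{BufWriter, Write};

/// Writes `data` through a buffering wrapper, then flushes the wrapper and
/// returns the wrapped Vec<u8>.
fn write_and_recover(data: &[u8]) -> Vec<u8> {
    let mut wtr = BufWriter::new(Vec::new());
    wtr.write_all(data).expect("writing to a Vec cannot fail");
    // into_inner flushes pending bytes before returning the wrapped writer.
    wtr.into_inner().expect("flushing to a Vec cannot fail")
}

fn main() {
    let mut wtr = BufWriter::new(Vec::new());
    wtr.write_all(b"abc").unwrap();

    // Until the wrapper is flushed, the wrapped writer may not have
    // received everything that was written so far.
    println!("visible before flush: {} bytes", wtr.get_ref().len());

    let bytes = wtr.into_inner().unwrap();
    println!("visible after into_inner: {} bytes", bytes.len());

    assert_eq!(write_and_recover(b"abc"), b"abc".to_vec());
}
```

The practical consequence in this analogy is that the output should only be treated as complete after the consuming call has returned successfully.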

pub fn get_ref(&self) -> &W

Gets a reference to the underlying writer.

pub fn bytes_written(&self) -> u64

Returns the number of bytes written to the underlying writer.
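One common way to provide such a count is to wrap the writer and add up what passes through it. The sketch below is a generic illustration under that assumption; `CountingWriter` here is a hypothetical type written for this example, and nothing on this page confirms how Builder tracks the number internally.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper that forwards writes and counts the bytes accepted.
struct CountingWriter<W> {
    inner: W,
    count: u64,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        CountingWriter { inner, count: 0 }
    }

    /// Number of bytes the wrapped writer has accepted so far.
    fn bytes_written(&self) -> u64 {
        self.count
    }

    /// Shared access to the wrapped writer.
    fn get_ref(&self) -> &W {
        &self.inner
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Count only what the inner writer reports as written, which may
        // be less than buf.len() for a partial write.
        let n = self.inner.write(buf)?;
        self.count += n as u64;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

fn main() -> io::Result<()> {
    let mut wtr = CountingWriter::new(Vec::new());
    wtr.write_all(b"hello")?;
    wtr.write_all(b", world")?;
    assert_eq!(wtr.bytes_written(), 12);
    assert_eq!(wtr.get_ref().len(), 12);
    Ok(())
}
```

A running count like this is useful for progress reporting or for recording offsets when the destination, such as a network stream, cannot report its own length.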

Auto Trait Implementations

impl<W> RefUnwindSafe for Builder<W> where
    W: RefUnwindSafe

impl<W> Send for Builder<W> where
    W: Send

impl<W> Sync for Builder<W> where
    W: Sync

impl<W> Unpin for Builder<W> where
    W: Unpin

impl<W> UnwindSafe for Builder<W> where
    W: UnwindSafe

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>,

impl<T, U> TryFrom<U> for T where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.