Struct fst::MapBuilder

pub struct MapBuilder<W>(_);

A builder for creating a map.

This is not your average everyday builder. It has two important qualities that set it apart from what you might expect:

  1. All keys must be added in lexicographic order. Adding a key out of order, or adding a duplicate key, results in an error. That is, once a key is associated with a value, that association can never be modified or deleted.
  2. The representation of a map is streamed to any io::Write as it is built. For an in-memory representation, this can be a Vec<u8>.

Point (2) is especially important because it means that a map can be constructed without storing the entire map in memory. Namely, since it works with any io::Write, it can be streamed directly to a file.

With that said, the builder does use memory, but memory usage is bounded to a constant size. The amount of memory used trades off against the compression ratio. Currently, the implementation hard-codes this trade-off, which can result in about 5-20MB of heap usage during construction. (N.B. Guaranteeing a maximal compression ratio requires memory proportional to the size of the map, which defeats some of the benefit of streaming it to disk. In practice, a small bounded amount of memory yields a map that is close to minimal in size.)

The algorithmic complexity of map construction is O(n) where n is the number of elements added to the map.
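The two qualities described above (keys in strictly increasing order, output streamed to any io::Write) can be illustrated with a toy builder. This is only a sketch under stated assumptions: `OrderedWriter` and its length-prefixed record layout are hypothetical and are not the fst format or the MapBuilder implementation. It shows only why enforcing order needs nothing more than the previous key in memory.

```rust
use std::io::{self, Write};

/// Toy stand-in for a streaming builder. NOT the fst format or implementation.
pub struct OrderedWriter<W: Write> {
    wtr: W,
    last: Option<Vec<u8>>, // only the previous key is kept in memory
    written: u64,
}

impl<W: Write> OrderedWriter<W> {
    pub fn new(wtr: W) -> Self {
        OrderedWriter { wtr, last: None, written: 0 }
    }

    /// Writes one record immediately. Rejects any key that is not strictly
    /// greater (in byte order) than the previously inserted key.
    pub fn insert(&mut self, key: &[u8], val: u64) -> io::Result<()> {
        if let Some(prev) = &self.last {
            if key <= prev.as_slice() {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidInput,
                    "key is out of order or duplicated",
                ));
            }
        }
        // Hypothetical record layout: key length, key bytes, value.
        self.wtr.write_all(&(key.len() as u64).to_le_bytes())?;
        self.wtr.write_all(key)?;
        self.wtr.write_all(&val.to_le_bytes())?;
        self.written += 8 + key.len() as u64 + 8;
        self.last = Some(key.to_vec());
        Ok(())
    }

    /// Number of bytes handed to the underlying writer so far.
    pub fn bytes_written(&self) -> u64 {
        self.written
    }

    /// Flushes the underlying writer and returns it.
    pub fn into_inner(mut self) -> io::Result<W> {
        self.wtr.flush()?;
        Ok(self.wtr)
    }
}
```

Passing a `Vec<u8>` as the writer keeps everything in memory, while passing a buffered `File` streams records to disk. The ordering check behaves the same either way.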

Example: build in memory

This shows how to use the builder to construct a map in memory. Note that Map::from_iter is a convenience function that achieves the same goal without needing to use MapBuilder explicitly.

use fst::{IntoStreamer, Streamer, Map, MapBuilder};

let mut build = MapBuilder::memory();
build.insert("bruce", 1).unwrap();
build.insert("clarence", 2).unwrap();
build.insert("stevie", 3).unwrap();

// You could also call `finish()` here, but since we're building the map in
// memory, there would be no way to get the `Vec<u8>` back.
let bytes = build.into_inner().unwrap();

// At this point, the map has been constructed, but here's how to read it.
let map = Map::from_bytes(bytes).unwrap();
let mut stream = map.into_stream();
let mut kvs = vec![];
while let Some((k, v)) = stream.next() {
    kvs.push((k.to_vec(), v));
}
assert_eq!(kvs, vec![
    (b"bruce".to_vec(), 1),
    (b"clarence".to_vec(), 2),
    (b"stevie".to_vec(), 3),
]);

Example: stream to file

This shows how to stream the construction of a map to a file.

use std::fs::File;
use std::io;

use fst::{IntoStreamer, Streamer, Map, MapBuilder};

let wtr = io::BufWriter::new(File::create("map.fst").unwrap());
let mut build = MapBuilder::new(wtr).unwrap();
build.insert("bruce", 1).unwrap();
build.insert("clarence", 2).unwrap();
build.insert("stevie", 3).unwrap();

// If you want the writer back, then call `into_inner`. Otherwise, this
// will finish construction and call `flush`.
build.finish().unwrap();

// At this point, the map has been constructed, but here's how to read it.
let map = Map::from_path("map.fst").unwrap();
let mut stream = map.into_stream();
let mut kvs = vec![];
while let Some((k, v)) = stream.next() {
    kvs.push((k.to_vec(), v));
}
assert_eq!(kvs, vec![
    (b"bruce".to_vec(), 1),
    (b"clarence".to_vec(), 2),
    (b"stevie".to_vec(), 3),
]);

Methods

impl MapBuilder<Vec<u8>>

Create a builder that builds a map in memory.

impl<W: Write> MapBuilder<W>

Create a builder that builds a map by writing it to wtr in a streaming fashion.

Insert a new key-value pair into the map.

Keys must be convertible to byte strings. Values must be a u64, which is a restriction of the current implementation of finite state transducers. (Values may one day be expanded to other types.)

If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.
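Because keys are compared as byte strings, the required order is plain byte-wise lexicographic order, which can differ from what a human-oriented sort produces (for instance, "Z" sorts before "a"). A minimal sketch of a pre-flight check follows. The helper name `first_out_of_order` is hypothetical and not part of fst.

```rust
/// Hypothetical helper (not part of fst): returns the index of the first key
/// that is less than or equal to its predecessor in byte order, or `None`
/// if the keys are strictly increasing.
pub fn first_out_of_order<K: AsRef<[u8]>>(keys: &[K]) -> Option<usize> {
    keys.windows(2)
        // A pair is invalid when the earlier key is >= the later key.
        .position(|w| w[0].as_ref() >= w[1].as_ref())
        // `position` yields the index of the earlier key; report the later one.
        .map(|i| i + 1)
}
```

Running such a check before inserting makes it possible to report the offending position up front instead of discovering it partway through construction.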

Calls insert on each item in the iterator.

If an error occurred while adding an element, processing is stopped and the error is returned.

If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.
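When the input is not already sorted, one option is to sort and deduplicate it before handing it to the builder. The sketch below uses only the standard library. The helper name `sorted_unique` is hypothetical, and keeping the first value for a duplicated key is an assumption made for illustration, not something fst prescribes.

```rust
/// Hypothetical helper (not part of fst): sorts pairs by key bytes and, for
/// duplicate keys, keeps the value that appeared first in the input.
pub fn sorted_unique(mut pairs: Vec<(Vec<u8>, u64)>) -> Vec<(Vec<u8>, u64)> {
    // `sort_by` is stable, so equal keys keep their input order.
    pairs.sort_by(|a, b| a.0.cmp(&b.0));
    // `dedup_by` removes the later element of each equal-key pair.
    pairs.dedup_by(|later, earlier| later.0 == earlier.0);
    pairs
}
```

The result is strictly increasing in byte order, so every pair satisfies the ordering requirement described above. Note that this approach holds the whole input in memory, which gives up the streaming benefit for very large data sets.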

Calls insert on each item in the stream.

Note that unlike extend_iter, this is not generic on the items in the stream.

If a key is inserted that is less than or equal to any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.

Finishes the construction of the map and flushes the underlying writer. After completion, the data written to W may be read using one of Map's constructor methods.

Just like finish, except it returns the underlying writer after flushing it.

Gets a reference to the underlying writer.

Returns the number of bytes written to the underlying writer.