Struct Set

pub struct Set { /* private fields */ }

Fast and compact indexed string set using front coding.

This implements an indexed set of strings in a compressed format based on front coding. The n strings in the set are assigned integer ids in [0..n-1] in lexicographical order.
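
As an illustration of the idea behind front coding, here is a minimal sketch that stores each sorted key as the length of the prefix it shares with the previous key plus the remaining suffix. The functions `lcp`, `front_encode`, and `front_decode` are hypothetical names for this sketch; it does not reproduce the crate's actual bucketed memory layout.

```rust
/// Length of the longest common prefix of two byte strings.
fn lcp(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

/// Front-codes sorted keys: each entry holds the number of leading bytes
/// shared with the previous key, followed by the remaining suffix.
/// (Hypothetical helper, not part of fcsd.)
fn front_encode(keys: &[&[u8]]) -> Vec<(usize, Vec<u8>)> {
    let mut out = Vec::with_capacity(keys.len());
    let mut prev: &[u8] = &[];
    for &key in keys {
        let shared = lcp(prev, key);
        out.push((shared, key[shared..].to_vec()));
        prev = key;
    }
    out
}

/// Rebuilds the original keys by replaying the shared prefixes in order.
/// (Hypothetical helper, not part of fcsd.)
fn front_decode(encoded: &[(usize, Vec<u8>)]) -> Vec<Vec<u8>> {
    let mut out = Vec::with_capacity(encoded.len());
    let mut prev: Vec<u8> = Vec::new();
    for (shared, suffix) in encoded {
        // Keep the shared prefix of the previous key, then append the suffix.
        prev.truncate(*shared);
        prev.extend_from_slice(suffix);
        out.push(prev.clone());
    }
    out
}
```

With the keys ICDM, ICML, SIGIR, SIGKDD, SIGMOD, the second key is stored as (2, "ML") and the fourth as (3, "KDD"), which is where the space saving on sorted input comes from.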

§Supported queries

  • Locate gets the index of a string key.
  • Decode gets the string stored at an index.
  • Predict enumerates the strings that start with a prefix.
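
The semantics of the three queries can be sketched against a plain sorted slice, using only the standard library. The functions `locate`, `decode`, and `predict` below are hypothetical reference helpers for this sketch, not the crate's API; they work on uncompressed keys and only model what each query returns.

```rust
/// Locate: index of `key` in a sorted slice, if present.
fn locate(keys: &[&[u8]], key: &[u8]) -> Option<usize> {
    keys.binary_search(&key).ok()
}

/// Decode: the key stored at index `id`, if `id` is in range.
fn decode(keys: &[&[u8]], id: usize) -> Option<Vec<u8>> {
    keys.get(id).map(|k| k.to_vec())
}

/// Predict: (index, key) pairs for every key that starts with `prefix`.
fn predict(keys: &[&[u8]], prefix: &[u8]) -> Vec<(usize, Vec<u8>)> {
    // In sorted order, all keys with the prefix form one contiguous run
    // that begins at the first key not smaller than the prefix.
    let start = keys.partition_point(|k| *k < prefix);
    keys[start..]
        .iter()
        .take_while(|k| k.starts_with(prefix))
        .enumerate()
        .map(|(i, k)| (start + i, k.to_vec()))
        .collect()
}
```

A model like this can serve as an oracle in tests that compare against the compressed set.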

§Limitations

Input keys must not contain the \0 character because it is used as the terminator.
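
Callers that accept keys from untrusted input may want to check this limitation before building a set. A minimal sketch, where `first_key_with_nul` is a hypothetical helper and not part of the crate:

```rust
/// Returns the position of the first key that contains a `\0` byte, if any.
/// (Hypothetical pre-check, not part of fcsd.)
fn first_key_with_nul<K: AsRef<[u8]>>(keys: &[K]) -> Option<usize> {
    keys.iter().position(|k| k.as_ref().contains(&0))
}
```

A result of `None` means every key satisfies the limitation above; `Some(i)` points at the first offending key.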

§Example

use fcsd::Set;

// Input string keys should be sorted and unique.
let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];

// Builds an indexed set.
let set = Set::new(keys).unwrap();
assert_eq!(set.len(), keys.len());

// Gets indexes associated with given keys.
let mut locator = set.locator();
assert_eq!(locator.run(b"ICML"), Some(1));
assert_eq!(locator.run(b"SIGMOD"), Some(4));
assert_eq!(locator.run(b"SIGSPATIAL"), None);

// Decodes string keys from given indexes.
let mut decoder = set.decoder();
assert_eq!(decoder.run(0), b"ICDM".to_vec());
assert_eq!(decoder.run(3), b"SIGKDD".to_vec());

// Enumerates indexes and keys stored in the set.
let mut iter = set.iter();
assert_eq!(iter.next(), Some((0, b"ICDM".to_vec())));
assert_eq!(iter.next(), Some((1, b"ICML".to_vec())));
assert_eq!(iter.next(), Some((2, b"SIGIR".to_vec())));
assert_eq!(iter.next(), Some((3, b"SIGKDD".to_vec())));
assert_eq!(iter.next(), Some((4, b"SIGMOD".to_vec())));
assert_eq!(iter.next(), None);

// Enumerates indexes and keys starting with a prefix.
let mut iter = set.predictive_iter(b"SIG");
assert_eq!(iter.next(), Some((2, b"SIGIR".to_vec())));
assert_eq!(iter.next(), Some((3, b"SIGKDD".to_vec())));
assert_eq!(iter.next(), Some((4, b"SIGMOD".to_vec())));
assert_eq!(iter.next(), None);

// Serialization / Deserialization
let mut data = Vec::<u8>::new();
set.serialize_into(&mut data).unwrap();
assert_eq!(data.len(), set.size_in_bytes());
let other = Set::deserialize_from(&data[..]).unwrap();
assert_eq!(data.len(), other.size_in_bytes());

Implementations§

impl Set

pub fn new<I, P>(keys: I) -> Result<Self>
where I: IntoIterator<Item = P>, P: AsRef<[u8]>,

Builds a new Set from string keys.

§Arguments
  • keys: string keys that are unique and sorted.
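
If the input is not already in this shape, it can be prepared with the standard library before calling the constructor. The helpers `prepare_keys` and `is_sorted_unique` below are hypothetical names for this sketch, which assumes the required order is plain bytewise order.

```rust
/// Sorts keys in bytewise order and drops duplicates.
/// (Hypothetical helper, not part of fcsd.)
fn prepare_keys(mut keys: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    keys.sort_unstable();
    keys.dedup();
    keys
}

/// Checks that keys are strictly increasing in bytewise order,
/// i.e. sorted and free of duplicates.
fn is_sorted_unique<K: AsRef<[u8]>>(keys: &[K]) -> bool {
    keys.windows(2).all(|w| w[0].as_ref() < w[1].as_ref())
}
```

Rust compares `&str` values by their UTF-8 bytes, so sorting string keys with the default ordering gives the same bytewise order.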
§Notes

This sets the bucket size to DEFAULT_BUCKET_SIZE. To choose a different bucket size, use Set::with_bucket_size instead.

§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::new(keys).unwrap();
assert_eq!(set.len(), keys.len());

pub fn with_bucket_size<I, P>(keys: I, bucket_size: usize) -> Result<Self>
where I: IntoIterator<Item = P>, P: AsRef<[u8]>,

Builds a new Set from string keys with a specified bucket size.

§Arguments
  • keys: string keys that are unique and sorted.
  • bucket_size: The number of strings in each bucket, which must be a power of two.
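
The power-of-two requirement can be checked, or a requested size rounded up, before calling the constructor. A minimal sketch using standard integer methods; `is_valid_bucket_size` and `round_up_bucket_size` are hypothetical helpers, and how the crate itself reports an invalid size is not shown here.

```rust
/// Pre-check mirroring the documented requirement on `bucket_size`.
/// (Hypothetical helper, not part of fcsd.)
fn is_valid_bucket_size(bucket_size: usize) -> bool {
    // `is_power_of_two` returns false for 0, so zero is rejected as well.
    bucket_size.is_power_of_two()
}

/// Rounds a requested size up to the nearest power of two (at least 1).
fn round_up_bucket_size(requested: usize) -> usize {
    requested.max(1).next_power_of_two()
}
```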
§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::with_bucket_size(keys, 4).unwrap();
assert_eq!(set.len(), keys.len());

pub fn size_in_bytes(&self) -> usize

Returns the number of bytes that serialize_into writes for this set.

§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::new(keys).unwrap();
assert_eq!(set.size_in_bytes(), 110);

pub fn serialize_into<W>(&self, writer: W) -> Result<()>
where W: Write,

Serializes the set into a writer.

§Arguments
  • writer: Writable stream.
§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::new(keys).unwrap();

let mut data = Vec::<u8>::new();
set.serialize_into(&mut data).unwrap();
assert_eq!(data.len(), 110);

pub fn deserialize_from<R>(reader: R) -> Result<Self>
where R: Read,

Deserializes the set from a reader.

§Arguments
  • reader: Readable stream.
§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::new(keys).unwrap();

let mut data = Vec::<u8>::new();
set.serialize_into(&mut data).unwrap();
let other = Set::deserialize_from(&data[..]).unwrap();
assert_eq!(set.size_in_bytes(), other.size_in_bytes());
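
The example passes `&mut data` as the writer and `&data[..]` as the reader because `&mut Vec<u8>` implements `std::io::Write` and `&[u8]` implements `std::io::Read`. The sketch below shows the same generic-writer and generic-reader pattern with a made-up length-prefixed format; `write_keys`, `read_u64`, and `read_keys` are hypothetical, and the byte layout is an assumption for illustration, not the crate's serialization format.

```rust
use std::io::{self, Read, Write};

/// Writes a u64 key count, then each key as a u64 length followed by its bytes.
/// (Hypothetical format, little-endian, for illustration only.)
fn write_keys<W: Write>(keys: &[Vec<u8>], mut writer: W) -> io::Result<()> {
    writer.write_all(&(keys.len() as u64).to_le_bytes())?;
    for key in keys {
        writer.write_all(&(key.len() as u64).to_le_bytes())?;
        writer.write_all(key)?;
    }
    Ok(())
}

/// Reads one little-endian u64 from the reader.
fn read_u64<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    reader.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

/// Reads back the keys written by `write_keys`.
fn read_keys<R: Read>(mut reader: R) -> io::Result<Vec<Vec<u8>>> {
    let n = read_u64(&mut reader)? as usize;
    let mut keys = Vec::new();
    for _ in 0..n {
        let len = read_u64(&mut reader)? as usize;
        let mut key = vec![0u8; len];
        reader.read_exact(&mut key)?;
        keys.push(key);
    }
    Ok(keys)
}
```

Any other `Write` or `Read` implementor, such as a `File` wrapped in `BufWriter` or `BufReader`, can be passed in the same way.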

pub fn locator(&self) -> Locator<'_>

Creates a Locator for getting the ids of given string keys.

§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::new(keys).unwrap();

let mut locator = set.locator();
assert_eq!(locator.run(b"ICML"), Some(1));
assert_eq!(locator.run(b"SIGMOD"), Some(4));
assert_eq!(locator.run(b"SIGSPATIAL"), None);

pub fn decoder(&self) -> Decoder<'_>

Creates a Decoder for decoding the stored keys associated with given ids.

§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::new(keys).unwrap();

let mut decoder = set.decoder();
assert_eq!(decoder.run(0), b"ICDM".to_vec());
assert_eq!(decoder.run(3), b"SIGKDD".to_vec());

pub fn iter(&self) -> Iter<'_>

Creates an iterator that enumerates the ids and keys stored in the set.

The keys are reported in lexicographical order.

§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR"];
let set = Set::new(keys).unwrap();

let mut iter = set.iter();
assert_eq!(iter.next(), Some((0, b"ICDM".to_vec())));
assert_eq!(iter.next(), Some((1, b"ICML".to_vec())));
assert_eq!(iter.next(), Some((2, b"SIGIR".to_vec())));
assert_eq!(iter.next(), None);

pub fn predictive_iter<P>(&self, prefix: P) -> PredictiveIter<'_>
where P: AsRef<[u8]>,

Creates a predictive iterator that enumerates the ids and keys starting with a given prefix.

The keys are reported in lexicographical order.

§Arguments
  • prefix: Prefix of keys to be predicted.
§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::new(keys).unwrap();

let mut iter = set.predictive_iter(b"SIG");
assert_eq!(iter.next(), Some((2, b"SIGIR".to_vec())));
assert_eq!(iter.next(), Some((3, b"SIGKDD".to_vec())));
assert_eq!(iter.next(), Some((4, b"SIGMOD".to_vec())));
assert_eq!(iter.next(), None);

pub const fn len(&self) -> usize

Gets the number of stored keys.

§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::new(keys).unwrap();
assert_eq!(set.len(), keys.len());

pub const fn is_empty(&self) -> bool

Checks if the set is empty.

pub const fn num_buckets(&self) -> usize

Gets the number of defined buckets.

§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::with_bucket_size(keys, 4).unwrap();
assert_eq!(set.num_buckets(), 2);
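
The value 2 in this example is what ceiling division gives for 5 keys and a bucket size of 4. The sketch below encodes that arithmetic; `expected_num_buckets` is a hypothetical helper, and the formula is inferred from the example rather than taken from the crate's internals.

```rust
/// Number of buckets needed to hold `num_keys` keys, assuming every bucket
/// except possibly the last is full. Panics if `bucket_size` is zero.
/// (Hypothetical helper; the formula is an inference from the example.)
fn expected_num_buckets(num_keys: usize, bucket_size: usize) -> usize {
    (num_keys + bucket_size - 1) / bucket_size
}
```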

pub const fn bucket_size(&self) -> usize

Gets the bucket size.

§Example
use fcsd::Set;

let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];
let set = Set::with_bucket_size(keys, 4).unwrap();
assert_eq!(set.bucket_size(), 4);

Trait Implementations§

impl Clone for Set

fn clone(&self) -> Set

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

Auto Trait Implementations§

impl Freeze for Set

impl RefUnwindSafe for Set

impl Send for Set

impl Sync for Set

impl Unpin for Set

impl UnwindSafe for Set

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.