Crate grenad

source ·
Expand description

This library provides tools to sort, merge, write, and read immutable key-value pairs. The entries in the grenad files are immutable and the only way to modify them is by creating a new file with the changes.

Example: Use the Writer and Reader structs

You can use the Writer struct to store key-value pairs into the specified std::io::Write type. The Reader type can then be used to read the entries.

The entries provided to the Writer struct must be given in lexicographic order.

use std::io::Cursor;

use grenad::{Reader, Writer};

let mut writer = Writer::memory();

// We insert our key-value pairs in lexicographic order.
writer.insert("first-counter", 119_u32.to_ne_bytes())?;
writer.insert("second-counter", 384_u32.to_ne_bytes())?;

// We create a reader from our writer.
let cursor = writer.into_inner().map(Cursor::new)?;
let mut cursor = Reader::new(cursor)?.into_cursor()?;

// We can see that the sum of u32s is valid here.
assert_eq!(cursor.move_on_next()?, Some((&b"first-counter"[..], &119_u32.to_ne_bytes()[..])));
assert_eq!(cursor.move_on_next()?, Some((&b"second-counter"[..], &384_u32.to_ne_bytes()[..])));
assert_eq!(cursor.move_on_next()?, None);

// We can also jum on any given entry.
assert_eq!(cursor.move_on_key_greater_than_or_equal_to("first")?, Some((&b"first-counter"[..], &119_u32.to_ne_bytes()[..])));
assert_eq!(cursor.move_on_key_equal_to("second-counter")?, Some((&b"second-counter"[..], &384_u32.to_ne_bytes()[..])));
assert_eq!(cursor.move_on_key_lower_than_or_equal_to("abracadabra")?, None);

Example: Use the Merger struct

In this example we show how you can merge multiple Readers by using a merge function when a conflict is encountered.

The entries yielded by the Merger struct are returned in lexicographic order, a good way to write them back into a new Writer.

use std::array::TryFromSliceError;
use std::borrow::Cow;
use std::convert::TryInto;
use std::io::Cursor;

use grenad::{MergerBuilder, Reader, Writer};

// This merge function:
//  - parses u32s from native-endian bytes,
//  - wrapping sums them and,
//  - outputs the result as native-endian bytes.
fn wrapping_sum_u32s<'a>(
 _key: &[u8],
 values: &[Cow<'a, [u8]>],
) -> Result<Cow<'a, [u8]>, TryFromSliceError>
{
    let mut output: u32 = 0;
    for bytes in values.iter().map(AsRef::as_ref) {
        let num = bytes.try_into().map(u32::from_ne_bytes)?;
        output = output.wrapping_add(num);
    }
    Ok(Cow::Owned(output.to_ne_bytes().to_vec()))
}

// We create our writers in memory to insert our key-value pairs.
let mut writera = Writer::memory();
let mut writerb = Writer::memory();
let mut writerc = Writer::memory();

// We insert our key-value pairs in lexicographic order
// and mix them between our writers.
writera.insert("first-counter", 32_u32.to_ne_bytes())?;
writera.insert("second-counter", 64_u32.to_ne_bytes())?;
writerb.insert("first-counter", 23_u32.to_ne_bytes())?;
writerb.insert("second-counter", 320_u32.to_ne_bytes())?;
writerc.insert("first-counter", 64_u32.to_ne_bytes())?;

// We create readers from our writers.
let cursora = writera.into_inner().map(Cursor::new)?;
let cursorb = writerb.into_inner().map(Cursor::new)?;
let cursorc = writerc.into_inner().map(Cursor::new)?;
let readera = Reader::new(cursora)?.into_cursor()?;
let readerb = Reader::new(cursorb)?.into_cursor()?;
let readerc = Reader::new(cursorc)?.into_cursor()?;

// We create a merger that will sum our u32s when necessary,
// and we add our readers to the list of readers to merge.
let merger_builder = MergerBuilder::new(wrapping_sum_u32s);
let merger = merger_builder.add(readera).add(readerb).add(readerc).build();

// We can iterate over the entries in key-order.
let mut iter = merger.into_stream_merger_iter()?;

// We can see that the sum of u32s is valid here.
assert_eq!(iter.next()?, Some((&b"first-counter"[..], &119_u32.to_ne_bytes()[..])));
assert_eq!(iter.next()?, Some((&b"second-counter"[..], &384_u32.to_ne_bytes()[..])));
assert_eq!(iter.next()?, None);

Example: Use the Sorter struct

In this example we show how by defining a merge function, we can insert multiple entries with the same key and output them in lexicographic order.

The Sorter accepts the entries in any given order, will reorder them in-memory and merge them with the merge function when required. It is authorized to have a memory budget during its construction and will try to follow it as closely as possible.

use std::array::TryFromSliceError;
use std::borrow::Cow;
use std::convert::TryInto;

use grenad::{CursorVec, SorterBuilder};

// This merge function:
//  - parses u32s from native-endian bytes,
//  - wrapping sums them and,
//  - outputs the result as native-endian bytes.
fn wrapping_sum_u32s<'a>(
 _key: &[u8],
 values: &[Cow<'a, [u8]>],
) -> Result<Cow<'a, [u8]>, TryFromSliceError>
{
    let mut output: u32 = 0;
    for bytes in values.iter().map(AsRef::as_ref) {
        let num = bytes.try_into().map(u32::from_ne_bytes)?;
        output = output.wrapping_add(num);
    }
    Ok(Cow::Owned(output.to_ne_bytes().to_vec()))
}

// We create a sorter that will sum our u32s when necessary.
let mut sorter = SorterBuilder::new(wrapping_sum_u32s).chunk_creator(CursorVec).build();

// We insert multiple entries with the same key but different values
// in arbitrary order, the sorter will take care of merging them for us.
sorter.insert("first-counter", 32_u32.to_ne_bytes())?;
sorter.insert("first-counter", 23_u32.to_ne_bytes())?;
sorter.insert("first-counter", 64_u32.to_ne_bytes())?;

sorter.insert("second-counter", 320_u32.to_ne_bytes())?;
sorter.insert("second-counter", 64_u32.to_ne_bytes())?;

// We can iterate over the entries in key-order.
let mut iter = sorter.into_stream_merger_iter()?;

// We can see that the sum of u32s is valid here.
assert_eq!(iter.next()?, Some((&b"first-counter"[..], &119_u32.to_ne_bytes()[..])));
assert_eq!(iter.next()?, Some((&b"second-counter"[..], &384_u32.to_ne_bytes()[..])));
assert_eq!(iter.next()?, None);

Structs

A ChunkCreator that generates Vec of bytes wrapped by a Cursor for chunks.
A struct you can use to merge entries from the sources by using the merge function provided to merge values when entries have the same key.
A struct that is used to configure a Merger with the sources to merge.
An iterator that yield the merged entries in key-order.
An iterator that is able to yield all the entries with a key that starts with a given prefix.
An iterator that is able to yield all the entries lying in a specified range.
A struct that is able to read a grenad file that has been created by a crate::Writer.
A cursor that can move forward backward and move on a specified key.
An iterator that is able to yield all the entries with a key that starts with a given prefix in reverse order.
An iterator that is able to yield all the entries lying in a specified range in reverse order.
A struct you can use to automatically sort and merge duplicate entries.
A struct that is used to configure a Sorter to better fit your needs.
A ChunkCreator that generates temporary Files for chunks.
A struct you can use to write entries into any io::Write type, entries must be inserted in key-order.
A struct that is used to configure a Writer.

Enums

The different supported types of compression.
Represents an error that can occur while using this library.
The format version of this file.
The kind of sort algorithm used by the sorter to sort its internal vector.

Traits

A trait that represent a ChunkCreator.

Type Definitions

The default chunk creator.