1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
//! # Banyan trees
//!
//! ![](https://i.imgur.com/6icLbdz.jpg)
//!
//! Banyan trees in nature are the trees with the widest canopy. They also sprout new roots, which after some time can become indistinguishable from the original root.
//!
//! The banyan trees in this library are persistent trees with a high branching factor (wide canopy) that can be used to efficiently store and query large sequences of key value pairs.
//!
//! Banyan trees are optimized for *appending* new entries and filtered, streaming reading of values. They naturally support addressing elements by index.
//! They **do not** support arbitrary insertion or deletion of elements, so they are **not** a general purpose map data structure.
//!
//! They are not balanced trees, but they share some properties with [B-Trees].
//!
//! ## Persistence
//!
//! Banyan trees are persistent, using a content-addressed storage system such as [ipfs] or a key value store.
//! Data is [CBOR] encoded and [zstd] compressed for space efficient persistent storage and replication. It is also encrypted using the [salsa20] stream cipher.
//!
//! # Indexing
//!
//! Each banyan tree entry consists of a key part and a value part.
//! The value part is an opaque value that can be serialized as [cbor]. This can not be used for filtering in queries.
//! The key part can be used for efficient querying by defining how to compute summaries from keys.
//! These summaries will be stored higher up in the tree to allow for efficient querying and filtered streaming.
//!
//! In addition to querying by key content, elements can always be efficiently accessed by offset.
//!
//! # Sealing and packing
//!
//! ## Sealing
//!
//! A leaf node of the tree is considered *sealed* as soon as the compressed data exceeds a certain size.
//! This depends on configuration and compression rate.
//! A branch node is considered *sealed* as soon as it has a certain, configurable number of child nodes.
//!
//! ## Packing
//!
//! Banyan trees are not balanced, since they do not have to support arbitrary insertion and deletion.
//! There is a concept somewhat related to balancing called packing. A tree is considered packed
//! if all but the last leaf node is packed, and branch nodes are packed to the left as much as possible.
//! This is usually the most space efficient representation of the sequence of key value pairs, and also
//! the most efficient one for subsequent appending.
//!
//! ## Unpacked trees
//!
//! Packing a tree after each append of a single entry or a small number of entries would be inefficient, since
//! it would involve recreating a new branch node for each level and a leaf node.
//!
//! To allow for quick append, it is possible to add elements without packing. This creates a linked list of trees,
//! and in the case of adding individual elements, a degenerate tree resembling a linked list.
//!
//! Unpacked trees can be packed at any time without changing the content.
//!
//! [CBOR]: https://en.wikipedia.org/wiki/CBOR
//! [zstd]: https://en.wikipedia.org/wiki/Zstandard
//! [salsa20]: https://en.wikipedia.org/wiki/Salsa20
//! [ipfs]: https://ipfs.io/
//! [B-Trees]: https://en.wikipedia.org/wiki/B-tree
pub mod forest;
pub mod index;
pub mod query;
pub mod store;
mod stream;
pub mod tree;
mod util;
mod zstd_array;