monotree 0.1.5

Rust implementation of an optimized Sparse Merkle Tree
Documentation
<p align="center"> <img src="https://raw.githubusercontent.com/thyeem/monotree/master/monotree.png" height="220"/></p>


[![Build Status](https://api.travis-ci.com/thyeem/monotree.svg?token=QYwxZ27j8uz6zsrzY6bk&branch=master)](https://app.travis-ci.com/thyeem/monotree)
[![Crates.io](https://img.shields.io/crates/v/monotree.svg)](https://crates.io/crates/monotree)

# Monotree
Rust implementation of an optimized Sparse Merkle Tree.
This is a kind of binary-radix tree based on bitwise branching, _currently_, no nibble of bit.
For now, branching unit is just ___a single bit___, _neither a 4-bit nor a byte nibble_.

## Features
- Very __simple__ and __lightweight__, but __fast__ and __robust__.
- __Fully featured__ Sparse Merkle Tree (SMT) as a storage
- <u>This includes: __non-inclusion proof__, as well as __inclusion proof__, and its verification.</u>
- Again, _NOT verbose_ at all.

This library mostly relies on the _Rust standard library only_ except for `database APIs` and `hashers`.
Currently, `monotree` supports these databases and hash functions following, but is designed to be super easy to customize and add:

_Databases include_:
- [`HashMap`]https://lib.rs/crates/hashbrown
- [`RocksDB`]https://lib.rs/crates/rocksdb
- [`Sled`]https://lib.rs/crates/sled

_Hashers include_:
- [`Blake3`]https://lib.rs/crates/blake3
- [`Blake2s`]https://lib.rs/crates/blake2-rfc and [`Blake2b`]https://lib.rs/crates/blake2-rfc
- [`SHA-2`]https://lib.rs/crates/sha2
- [`SHA-3 (Keccak)`]https://lib.rs/crates/sha3

## Quick start
> _from `examples/basic.rs`_

> Regarding __non-inclusion proof__ and __inclusion proof__, See _Merkle proof_ section in <u>More Examples</u> below.


```rust

use monotree::{Monotree, Result};
use monotree::utils::random_hash;

fn main() -> Result<()> {
    // Init a monotree instance
    // by default, with 'HashMap' and 'Blake3' hash function
    let mut tree = Monotree::default();

    // It is natural the tree root initially has 'None'
    let root = None;

    // Prepare a random pair of key and leaf.
    // random_hash() gives a fixed length of random array,
    // where Hash -> [u8; HASH_LEN], HASH_LEN = 32
    let key = random_hash();
    let leaf = random_hash();

    // Insert the entry (key, leaf) into tree, yielding a new root of tree
    let root = tree.insert(root.as_ref(), &key, &leaf)?;
    assert_ne!(root, None);

    // Get the leaf inserted just before. Note that the last root was used.
    let found = tree.get(root.as_ref(), &key)?;
    assert_eq!(found, Some(leaf));

    // Remove the entry
    let root = tree.remove(root.as_ref(), &key)?;

    // surely, the tree has nothing and the root back to 'None'
    assert_eq!(tree.get(root.as_ref(), &key)?, None);
    assert_eq!(root, None);
    Ok(())
}
```

## Performance

<u>**All benchmarks were performed in a cumulative way**</u>, where the root resulting from an operation just before was reused for subsequent operations.
They were carried out <u>with randomly-generated bytes entries</u> and <u>from the root of an empty tree</u>. (Deletion starts from the last root.)

> Tested on: _Intel Core i5-3570 CPU @ 3.4GHz and 16 GB RAM / Ubuntu 18.04_ with `rustc stable 1.42.0`
> As a hasher, `Blake3` was used in this benchmark.

|  Op.   | DB used | #Entries | Time Measured | Std Error | per Op.  |
| :----: | :-----: | -------: | :-----------: | :-------: | :------: |
| Insert | HashMap |       10 |    8.50 us    |  0.01 us  | 0.85 us  |
| Insert | HashMap |      100 |   165.71 us   |  0.14 us  | 1.66 us  |
| Insert | HashMap |     1000 |    2.32 ms    |  0.01 ms  | 2.32 us  |
| Insert | HashMap |    10000 |   37.39 ms    |  0.02 ms  | 3.74 us  |
|  Get   | HashMap |       10 |    7.82 us    |  0.01 us  | 0.78 us  |
|  Get   | HashMap |      100 |   114.51 us   |  0.14 us  | 1.15 us  |
|  Get   | HashMap |     1000 |    1.52 ms    |  0.01 ms  | 1.52 us  |
|  Get   | HashMap |    10000 |   20.40 ms    |  0.01 ms  | 2.40 us  |
| Remove | HashMap |       10 |   15.65 us    |  0.01 us  | 1.57 us  |
| Remove | HashMap |      100 |   282.68 us   |  0.09 us  | 2.83 us  |
| Remove | HashMap |     1000 |    4.23 ms    |  0.01 ms  | 4.23 us  |
| Remove | HashMap |    10000 |   69.15 ms    |  0.07 ms  | 6.92 us  |
|        |         |          |               |           |          |
| Insert | RocksDB |       10 |   34.80 us    |  0.27 us  | 3.48 us  |
| Insert | RocksDB |      100 |   602.55 us   |  2.72 us  | 6.03 us  |
| Insert | RocksDB |     1000 |   10.35 ms    |  0.06 ms  | 10.35 us |
| Insert | RocksDB |    10000 |   145.83 ms   |  0.20 ms  | 14.58 us |
|  Get   | RocksDB |       10 |   10.20 us    |  0.06 us  | 1.02 us  |
|  Get   | RocksDB |      100 |   142.20 us   |  0.79 us  | 1.42 us  |
|  Get   | RocksDB |     1000 |    1.74 ms    |  0.01 ms  | 1.74 us  |
|  Get   | RocksDB |    10000 |   22.94 ms    |  0.03 ms  | 2.29 us  |
| Remove | RocksDB |       10 |   58.33 us    |  0.50 us  | 5.83 us  |
| Remove | RocksDB |      100 |    1.02 ms    |  6.40 us  | 10.20 us |
| Remove | RocksDB |     1000 |   20.22 ms    |  0.10 ms  | 20.22 us |
| Remove | RocksDB |    10000 |   394.95 ms   |  1.46 ms  | 39.50 us |



## Integration tests and benchmark
performs integration tests with full combinations of operations and tree types consisting of _Databases_ and _Hashers_ included.

```bash
    ## Some tests are time consuming.
    ## --release is optional, but without it, it will take a longer time to complete the tests
    $ cargo test --release --features "db_rocksdb, db_sled"
```

performs a micro-benchmark based on [`Criterion`](https://crates.io/crates/criterion), with full combinations of operations and tree types consisting of _Databases_ and _Hashers_ included.

```bash
    $ cargo bench --features "db_rocksdb, db_sled"
```

and macroscopic-time benchmarks (but rather wider error bars) with all tree types were also prepared in _`examples/perf.rs`_.

```bash
    $ cargo run --release --example perf --features "db_rocksdb, db_sled"
```

and some examples describing how to manipulate `monotree`

```bash
    $ cargo run --example basic
    $ cargo run --example advanced --features "db_rocksdb"
```

## Further improvement

`monotree` is a special case among the generalized binary radix trees, I'd like to call it `PoT (Power Of Two) radix tree`.
<u>If `monotree` were generalized with `pow(2, n)` of nibbles as a branching unit</u>, _there would have been room for further performance improvement_.

This generalization is now on progress.

## More Examples

> _from `examples/advanced.rs`_

```rust

use monotree::database::*;
use monotree::hasher::*;
use monotree::utils::*;
use monotree::*;

fn main() -> Result<()> {
    // Init a monotree instance:
    // manually select a db and a hasher as your preference
    // Monotree::<DATABASE, HASHER>::new(DB_PATH)
    // where DATABASE = {MemoryDB, RocksDB, Sled}
    //         HASHER = {Blake3, Blake2s, Blake2b, Sha2, Sha3}
    let mut tree = Monotree::<RocksDB, Blake2b>::new("/tmp/monotree");

    // It is natural the tree root initially has 'None'
    let root = None;

    // Prepare 500 random pairs of key and leaf.
    // random_hashes() gives Vec<Hash>
    // where Hash is a fixed length of random array or [u8; HASH_LEN]
    let n = 500;
    let keys = random_hashes(n);
    let leaves = random_hashes(n);

    // Insert a bunch of entries of (key, leaf) into tree.
    // looks quite similar with 'monotree::insert()', but for insertion using batch.
    // 'inserts()' is much faster than 'insert()' since it's based on the following:
    // (1) DB batch-write, (2) sorting keys before insertion, and (3) mem-cache.
    let root = tree.inserts(root.as_ref(), &keys, &leaves)?;
    assert_ne!(root, None);

    // Similarly, there are methods 'gets()' and 'removes()' for batch use of
    // 'get()' and 'remove()', respectively.
    let result = tree.gets(root.as_ref(), &keys)?;
    assert_eq!(result.len(), keys.len());

    let root = tree.removes(root.as_ref(), &keys)?;
    // surely, the tree has nothing nad the root back to 'None'
    assert_eq!(root, None);

    /////////////////////////////////////////////////////////////////////
    // `Merkle proof` secion: verifying inclusion of data (inclusion proof)

    // `Monotree` has compressed representation, but it fully retains
    // the properties of the Sparse Merkle Tree (SMT).
    // Thus, `non-inclusion proof` is quite straightforward. Just go walk down
    // the tree with a key (or a path) given. If we cannot successfully get a leaf,
    // we can assure that the leaf is not a part of the tree.
    // The process of inclusion proof is below:

    // random pre-insertion for Merkle proof test
    let root = tree.inserts(root.as_ref(), &keys, &leaves)?;

    // pick a random key from keys among inserted just before
    let key = keys[99];

    // generate the Merkle proof for the root and the key
    let proof = tree.get_merkle_proof(root.as_ref(), &key)?;

    // To verify the proof correctly, you need to provide a hasher matched
    // Previously the tree was initialized with `Blake2b`
    let hasher = Blake2b::new();

    // get a leaf matched with the key: where the Merkle proof verification starts off
    let leaf = leaves[99];

    // verify the Merkle proof using all those above
    let verified = verify_proof(&hasher, root.as_ref(), &leaf, proof.as_ref());
    assert_eq!(verified, true);

    /////////////////////////////////////////////////////////////////////
    // Usage examples with some functional tests
    // Carefully trace the variable `root` as they are frequently shadowed.

    let mut tree = Monotree::default();
    let mut root = None;
    let hasher = Blake3::new();

    //--- insert/get and gen_proof/verify_proof over iterator
    for (i, (key, value)) in keys.iter().zip(leaves.iter()).enumerate() {
        // insert a key into tree
        root = tree.insert(root.as_ref(), key, value)?;

        // inserted a key and yields a root, where cumulative check-up goes on
        for (k, v) in keys.iter().zip(leaves.iter()).take(i + 1) {
            // check if the key-value pair was correctly inserted so far
            assert_eq!(tree.get(root.as_ref(), k)?, Some(*v));

            // generates a Merkle proof with all keys so far
            let proof = tree.get_merkle_proof(root.as_ref(), k)?;

            // verify the Merkle proof with all keys so far
            assert_eq!(
                verify_proof(&hasher, root.as_ref(), v, proof.as_ref()),
                true
            );
        }
    }
    assert_ne!(root, None);

    //--- insert/get and gen_proof/verify_proof after each deletion of entry
    for (i, (key, _)) in keys.iter().zip(leaves.iter()).enumerate() {
        assert_ne!(root, None);

        // assert that all other values are fine after each deletion
        for (k, v) in keys.iter().zip(leaves.iter()).skip(i) {
            // check in the same way as the previous example
            assert_eq!(tree.get(root.as_ref(), k)?, Some(*v));
            let proof = tree.get_merkle_proof(root.as_ref(), k)?;
            assert_eq!(
                verify_proof(&hasher, root.as_ref(), v, proof.as_ref()),
                true
            );
        }
        // delete a key and check if it was correctly removed
        root = tree.remove(root.as_ref(), key)?;
        assert_eq!(tree.get(root.as_ref(), key)?, None);
    }
    // must be back to inital state of tree
    assert_eq!(root, None);

    //--- faster way to insert/remove entries
    // Now tree is empty, and root is back to `None` again
    // Redo all those above using methods supporting batch operations
    root = tree.inserts(root.as_ref(), &keys, &leaves)?;
    assert_ne!(root, None);

    // Even if we shuffle the keys when removing,
    shuffle(&mut keys);

    // there's no difference. Back to `None` of root and empty tree again.
    // that's why the `root` plays a role as "state index of tree"
    root = tree.removes(root.as_ref(), &keys)?;
    assert_eq!(root, None);

    Ok(())
}
```