// infinitree/lib.rs
//! Infinitree is a versioned, embedded database that uses uniform,
//! encrypted blobs to store data.
//!
//! Infinitree is based on a set of lockless and locking data
//! structures that you can use in your application like a regular map
//! or list.
//!
//! Data structures:
//!
//! * [`fields::VersionedMap`]: A lockless HashMap that tracks incremental changes
//! * [`fields::Map`]: A lockless HashMap
//! * [`fields::LinkedList`]: Linked list that tracks incremental changes
//! * [`fields::List`]: A simple `RwLock<Vec<_>>` alias
//! * [`fields::Serialized`]: A wrapper for any type that implements [`serde::Serialize`]
//!
//! Tight control over resources allows you to use it in situations
//! where memory is scarce, and fall back to querying from slower
//! storage.
//!
//! Infinitree is also useful for securely storing and sharing
//! any [`serde`](https://docs.rs/serde)-serializable application
//! state, or for dumping and loading application state changes through
//! commits. This is similar to [Git](https://git-scm.com).
//!
//! If you need to store large binary blobs, you
//! can open a [`BufferedSink`][object::BufferedSink], which supports
//! `std::io::Write`, and store arbitrary byte streams in the tree.
//!
//! ## Features
//!
//! * Encrypt all on-disk data, and only decrypt on use
//! * Transparently handle hot/warm/cold storage tiers; currently S3-compatible backends are supported
//! * Versioned data structures allow you to save/load/fork application state safely
//! * Thread-safe by default
//! * Iterate over random segments of data without loading it into memory in full
//! * Focus on performance and fine-grained control of memory use
//! * Extensible for custom data types, storage backends, and serialization
//!
//! ## Example use
//!
//! ```
//! use infinitree::{
//!     *,
//!     crypto::UsernamePassword,
//!     backends::Directory,
//!     fields::VersionedMap,
//! };
//! use serde::{Serialize, Deserialize};
//!
//! fn main() -> anyhow::Result<()> {
//!     // Create an empty tree backed by a local directory.
//!     let mut tree = Infinitree::<VersionedMap<String, usize>>::empty(
//!         Directory::new("../test_data")?,
//!         UsernamePassword::with_credentials("username".to_string(),
//!                                            "password".to_string())?
//!     ).unwrap();
//!
//!     // The index is the `VersionedMap` itself; use it like a map.
//!     tree.index().insert("sample_size".into(), 1234);
//!
//!     // Persist the current state as a new version.
//!     tree.commit("first measurement! yay!");
//!     Ok(())
//! }
//! ```
//!
//! ## Core concepts
//!
//! [`Infinitree`] is the first entry point to the library. It
//! creates, saves, and queries various versions of your [`Index`].
//!
//! There are two types of interaction with an infinitree: one that
//! goes through an [`Index`], and one that accesses the [`object`]
//! structure directly.
//!
//! Any data stored in infinitree objects will receive a `ChunkPointer`,
//! which _must_ be stored somewhere to retrieve the data. Hence the
//! need for an index.
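//!
//! The relationship between stored data, pointers, and an index can be
//! sketched with a toy model that uses only the standard library.
//! `ToyPointer` and `ToyStore` are hypothetical names made up for this
//! illustration; they are not part of the Infinitree API and say nothing
//! about how `ChunkPointer` is actually laid out.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Hypothetical stand-in for a chunk pointer: where a value lives in a blob.
//! #[derive(Clone, Copy, Debug, PartialEq, Eq)]
//! pub struct ToyPointer {
//!     pub offset: usize,
//!     pub len: usize,
//! }
//!
//! /// Hypothetical append-only blob plus a name-to-pointer index.
//! #[derive(Default)]
//! pub struct ToyStore {
//!     blob: Vec<u8>,
//!     index: HashMap<String, ToyPointer>,
//! }
//!
//! impl ToyStore {
//!     /// Append `data` to the blob and record its pointer under `name`.
//!     pub fn write(&mut self, name: &str, data: &[u8]) -> ToyPointer {
//!         let ptr = ToyPointer { offset: self.blob.len(), len: data.len() };
//!         self.blob.extend_from_slice(data);
//!         self.index.insert(name.to_string(), ptr);
//!         ptr
//!     }
//!
//!     /// Look up the pointer in the index, then slice the blob with it.
//!     pub fn read(&self, name: &str) -> Option<&[u8]> {
//!         let ptr = self.index.get(name)?;
//!         self.blob.get(ptr.offset..ptr.offset + ptr.len)
//!     }
//! }
//! ```
//!
//! In this sketch, the bytes in `blob` cannot be found again unless the
//! pointer returned by `write` is kept, which is the job of the index.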
//!
//! An index can be any struct that implements the [`Index`]
//! trait. There's also a [derive macro](derive@Index) that
//! helps you do this. An index consists of various fields, which
//! act like regular Rust types, but need to implement a few
//! traits to support serialization.
//!
//! ### Index
//!
//! You can think of your `Index` as a schema, or as application
//! state on steroids.
//!
//! In a more abstract sense, the [`Index`] trait and corresponding
//! [derive macro](derive@Index) represent a view into a single
//! version of your database. Using an [`Infinitree`] you can swap
//! between, and mix and match data from, various versions of an
//! `Index` state.
//!
//! ### Fields
//!
//! An `Index` contains serializable fields. These are thread-safe
//! data structures with internal mutation, which support some kind of
//! serialization [`Strategy`].
//!
//! You can use any type that implements [`serde::Serialize`] as a
//! field through the `fields::Serialized` wrapper type. There are also
//! [incremental hash map][fields::VersionedMap] and
//! [list-like][fields::LinkedList] types that track and save only
//! the changes between versions of your data.
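//!
//! As a rough illustration of what "saving only the changes between
//! versions" can mean, here is a minimal standard-library sketch.
//! `ToyVersionedMap` is a hypothetical type invented for this example;
//! it is an assumption about the general idea, not a description of how
//! `fields::VersionedMap` is implemented.
//!
//! ```rust
//! use std::collections::{BTreeSet, HashMap};
//!
//! /// Hypothetical map that remembers which keys changed since the last commit.
//! #[derive(Default)]
//! pub struct ToyVersionedMap {
//!     current: HashMap<String, u64>,
//!     dirty: BTreeSet<String>,
//! }
//!
//! impl ToyVersionedMap {
//!     /// Insert or overwrite a value and mark its key as changed.
//!     pub fn insert(&mut self, key: &str, value: u64) {
//!         self.current.insert(key.to_string(), value);
//!         self.dirty.insert(key.to_string());
//!     }
//!
//!     /// Return only the entries changed since the previous commit,
//!     /// sorted by key, and reset the change tracking.
//!     pub fn commit(&mut self) -> Vec<(String, u64)> {
//!         let dirty = std::mem::take(&mut self.dirty);
//!         dirty
//!             .into_iter()
//!             .map(|key| {
//!                 let value = self.current[&key];
//!                 (key, value)
//!             })
//!             .collect()
//!     }
//! }
//! ```
//!
//! In this sketch, a second `commit` after changing one key returns just
//! that key, so the amount written scales with the change, not with the
//! size of the map.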
//!
//! Persisting and loading fields is done using an [`Intent`]. If you
//! use the [`Index`][derive@Index] macro, it will automatically
//! create accessor functions for each field in an index, and return
//! an `Intent`-wrapped strategy.
//!
//! Intents elide the specific types of the fields and allow
//! batch operations, e.g. calling [`Infinitree::commit`] on an index
//! whose fields each use a different strategy.
//!
//! ### Strategy
//!
//! To tell Infinitree how to serialize a field, you can use different
//! strategies. A [`Strategy`] has full control over how a data structure
//! is serialized in the object system.
//!
//! Every strategy receives an `Index` transaction, and an
//! [`object::Reader`] or [`object::Writer`]. It is the responsibility
//! of the strategy to store [references](ChunkPointer) so you can
//! load back the data once persisted.
//!
//! There are two strategies in the base library:
//!
//! * [`LocalField`]: Serialize all data in a single stream.
//! * [`SparseField`]: Serialize keys and values of a Map in separate
//!   streams. Useful for quickly iterating over key indexes when
//!   querying. Currently only supports values smaller than 4 MB.
//!
//! Deciding which strategy is best for your use case may mean you
//! have to run some experiments and benchmarks.
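//!
//! To make the difference between the two layouts more concrete, here is
//! a minimal standard-library sketch. `single_stream`, `split_streams`,
//! and `KeyEntry` are hypothetical helpers, and the length-prefixed
//! encoding is an assumption made for illustration; none of this is the
//! actual `LocalField` or `SparseField` implementation.
//!
//! ```rust
//! use std::collections::BTreeMap;
//!
//! /// Hypothetical key-stream record: a key plus where its value lives.
//! #[derive(Debug, PartialEq, Eq)]
//! pub struct KeyEntry {
//!     pub key: String,
//!     pub offset: usize,
//!     pub len: usize,
//! }
//!
//! /// Append a little-endian `u32` length prefix followed by the bytes.
//! fn put(out: &mut Vec<u8>, bytes: &[u8]) {
//!     out.extend_from_slice(&(bytes.len() as u32).to_le_bytes());
//!     out.extend_from_slice(bytes);
//! }
//!
//! /// Single stream: keys and values interleaved in one buffer.
//! pub fn single_stream(map: &BTreeMap<String, Vec<u8>>) -> Vec<u8> {
//!     let mut out = Vec::new();
//!     for (key, value) in map {
//!         put(&mut out, key.as_bytes());
//!         put(&mut out, value);
//!     }
//!     out
//! }
//!
//! /// Split streams: a compact key stream that points into a value stream.
//! pub fn split_streams(map: &BTreeMap<String, Vec<u8>>) -> (Vec<KeyEntry>, Vec<u8>) {
//!     let mut keys = Vec::new();
//!     let mut values = Vec::new();
//!     for (key, value) in map {
//!         keys.push(KeyEntry {
//!             key: key.clone(),
//!             offset: values.len(),
//!             len: value.len(),
//!         });
//!         values.extend_from_slice(value);
//!     }
//!     (keys, values)
//! }
//! ```
//!
//! With the split layout in this sketch, a query can scan the key stream
//! alone and read a value only when its key matches, while the single
//! stream has to be walked past every value.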
//!
//! See the documentation for the [`Index`][derive@Index] macro to learn
//! how to use strategies.
//!
//! [`Intent`]: fields::Intent
//! [`Strategy`]: fields::Strategy
//! [`Load`]: fields::Load
//! [`Store`]: fields::Store
//! [`LocalField`]: fields::LocalField
//! [`SparseField`]: fields::SparseField
//!
//! ## Cryptographic design
//!
//! To read more about how the object system keeps your data safe,
//! please see the
//! [DESIGN.md](https://github.com/symmetree-labs/infinitree/blob/main/DESIGN.md)
//! file in the main repository.

#![deny(
    arithmetic_overflow,
    future_incompatible,
    nonstandard_style,
    rust_2018_idioms,
    trivial_casts,
    unused_crate_dependencies,
    unused_lifetimes,
    unused_qualifications,
    rustdoc::bare_urls,
    rustdoc::broken_intra_doc_links,
    rustdoc::invalid_codeblock_attributes,
    rustdoc::invalid_rust_codeblocks,
    rustdoc::private_intra_doc_links
)]
#![deny(clippy::all)]
#![allow(clippy::ptr_arg)]

#[cfg(any(test, doctest, bench))]
use criterion as _;

mod chunks;
mod compress;
mod id;

pub mod backends;
pub mod crypto;
pub mod fields;
pub mod index;
pub mod object;
pub mod tree;

pub use crate::chunks::ChunkPointer;
pub use crate::crypto::{Digest, Hasher, Key};
pub use crate::index::Index;
pub use crate::object::ObjectId;
pub use crate::tree::Infinitree;
pub use anyhow;

use backends::Backend;
use id::Id;

use rmp_serde::decode::from_slice as deserialize_from_slice;
use rmp_serde::encode::write as serialize_to_writer;
use rmp_serde::to_vec as serialize_to_vec;
use rmp_serde::Deserializer;

/// Size of a storage object unit, in bytes (4 MiB).
pub const BLOCK_SIZE: usize = 4 * 1024 * 1024;

pub use infinitree_macros::Index;

#[cfg(test)]
const TEST_DATA_DIR: &str = "../test_data";