fog_db_traits/
lib.rs

/*!
This crate defines the interface to a generic implementation of a fog-pack database (a FogDB).

The Database
------------

A FogDB database consists of a collection of
[Documents][fog_pack::document::Document], each of which is immutable and
referred to by its Hash. Documents can also link to other documents by those
same hashes. The database has a set of named "root documents" that it keeps
resident, and any documents that can be reached by following hash links from
those roots will also be kept resident in the database. In other words, if
you can reach a Document from a root, it stays in the database. If you
can't, it gets evicted from the database. These links can also be "weakened" in
a transaction, much as you can with most reference-tracking garbage collectors.
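
The retention rule above can be sketched as a reachability sweep. This is only
an illustration under stated assumptions, not how a FogDB implementation has to
work: `DocId`, `Store`, `reachable`, and `collect_garbage` are hypothetical
names, a `u64` stands in for a real Hash, and a weakened link is modelled by
simply leaving it out of a document's link list.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a document hash.
type DocId = u64;

/// Hypothetical in-memory store.
struct Store {
    /// Named root documents (name -> hash).
    roots: HashMap<String, DocId>,
    /// Every stored document, with the hashes it strongly links to.
    docs: HashMap<DocId, Vec<DocId>>,
}

impl Store {
    /// Return the set of stored documents reachable from the named roots.
    fn reachable(&self) -> HashSet<DocId> {
        let mut seen = HashSet::new();
        let mut stack: Vec<DocId> = self.roots.values().copied().collect();
        while let Some(id) = stack.pop() {
            // Skip hashes we do not hold, and ones we have already visited.
            if !self.docs.contains_key(&id) || !seen.insert(id) {
                continue;
            }
            stack.extend(self.docs[&id].iter().copied());
        }
        seen
    }

    /// Evict every document that cannot be reached from a root.
    fn collect_garbage(&mut self) {
        let keep = self.reachable();
        self.docs.retain(|id, _| keep.contains(id));
    }
}
```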

Documents can adhere to a [Schema][fog_pack::schema::Schema], which constrains
a document's format and provides hints on how to compress it for storage.
Schemas let one pre-verify that a document can be deserialized into a data
structure, and let systems know ahead of time what type of data is in a
document.

Now, if the database were just immutable documents, it would be quite difficult
to deal with. That's why every document adhering to a schema can also have
[Entries][fog_pack::entry::Entry], which are essentially smaller documents
attached to a parent document under a key prefix. These entries are not looked
up by their Hash, but are found by running a [Query][fog_pack::query::Query] on
the parent document - in a FogDB, this query will return a sequence of matching
entries, and will remain active in case more entries are found in the future.

The format of entries is also constrained by the parent document's schema,
which puts them in an interesting position for a database, and is what makes
FogDB multi-modal:

- From a document-oriented view, they're a collection of documents all matching the same schema.
- From a relational database view, the parent document & entry key is a table
    reference, and the entries are records (or *entries*, get it?) in the table.
- From a graph database view, the documents are nodes, and the entries are edges.

Rather than provide the expected access APIs for all of these, FogDB provides a
base over which such APIs can be built.

Transactions: Modifying the Database
-----

There are three ways to modify the database:
- Modify the set of named root documents by changing a name-to-hash mapping.
- Modify the set of stored schemas by adding or removing a schema document.
- Execute a transaction on the database.

Transactions are the most common way to change the database. They follow ACID
properties, so when a transaction is committed, either all parts of the
transaction complete simultaneously or the whole transaction is rejected. Most
commonly, a transaction fails because it attempts to delete an entry that has
already been removed - this is how compare-and-swap type transactions can be
done on the database.
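
To make the compare-and-swap idea concrete, here is a minimal sketch of an
all-or-nothing commit. It assumes a toy model where the database is just a set
of live entries; `EntryId`, `Txn`, and this free-standing `commit` function are
hypothetical and are not the [`transaction`] API. A swap is "delete the old
entry, add the new one"; if another writer already removed the old entry, the
whole change is rejected and nothing is modified.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an entry reference.
type EntryId = u64;

/// Hypothetical change set: entries to delete and entries to add.
struct Txn {
    delete: Vec<EntryId>,
    add: Vec<EntryId>,
}

/// Apply a transaction atomically to the set of live entries.
/// On failure, returns the first missing entry and leaves `live` untouched.
fn commit(live: &mut HashSet<EntryId>, txn: &Txn) -> Result<(), EntryId> {
    // Validate first: every entry to be deleted must still exist.
    if let Some(missing) = txn.delete.iter().find(|id| !live.contains(*id)) {
        return Err(*missing);
    }
    // Only now mutate, so a rejected commit changes nothing.
    for id in &txn.delete {
        live.remove(id);
    }
    live.extend(txn.add.iter().copied());
    Ok(())
}
```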

Transactions can do the following:
- Add a document to the database
- Weaken/strengthen document hash links
- Add an entry to the database, optionally setting a time-to-live or an access
    policy
- Modify an entry's time-to-live or its access policy
- Delete an entry from the database

Documents cannot be deleted directly; instead, when they are no longer reachable
from the named root documents, they are automatically garbage-collected.

Note that all transactions will only execute on the local FogDB instance; this
follows the rule that a system can only modify itself, and it is up to other
database nodes to modify themselves to match as they desire.

Cursors: Reading the Database
------

The database is accessed through the [Cursor][cursor] interface. A
[cursor][cursor::Cursor] can be opened either on a [Group][group::Group::cursor]
(see [Connecting to Other Databases](#groups-connecting-to-other-databases)) or
on the [database][Db::cursor]. A cursor must start from some specific Document,
and can be thought of as always being "over" a document. Each document can
contain hashes of other documents; the cursor can follow these with a "forward"
function call. Alternatively, a new cursor can be "forked" off to the linked
document. In this way, many cursors can be created for quicker traversal of a
Document tree.
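
The difference between "forward" and "fork" can be sketched with a toy cursor
over an in-memory link table. Everything here is an assumption made for
illustration: this `Cursor` struct and its `forward` and `fork` methods are
hypothetical and much simpler than the real [`cursor`] module, which is
asynchronous and may reach out to remote databases.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a document hash.
type DocId = u64;

/// Hypothetical cursor: always positioned "over" exactly one document.
#[derive(Clone, Debug, PartialEq)]
struct Cursor {
    at: DocId,
}

impl Cursor {
    /// Move this cursor to a document linked from the current one.
    /// Returns false, and stays put, if the current document has no such link.
    fn forward(&mut self, docs: &HashMap<DocId, Vec<DocId>>, to: DocId) -> bool {
        let linked = docs
            .get(&self.at)
            .map_or(false, |links| links.contains(&to));
        if linked {
            self.at = to;
        }
        linked
    }

    /// Create a second cursor over a linked document, leaving this one in place.
    fn fork(&self, docs: &HashMap<DocId, Vec<DocId>>, to: DocId) -> Option<Cursor> {
        let mut child = self.clone();
        if child.forward(docs, to) {
            Some(child)
        } else {
            None
        }
    }
}
```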

A cursor can also be used to make a query, which consumes the cursor and turns it
into a [CursorQuery][cursor::CursorQuery] (which can be backed out of to get the
cursor back). This yields a stream of entries from the document the cursor is over.

A query is just a fog-pack [Query][fog_pack::query::Query] with an optional
preferred ordering for the returned Entry [results][cursor::QueryResult]. If an
entry has hash links to documents, new cursors can be forked off to them using
the included [`ForkSpawner`][cursor::ForkSpawner].

Here's where it gets interesting: if a cursor was opened up on a
[Group][group::Group], then any remote databases meeting the group's
requirements can also be read by a cursor. In this way, many databases at once
can be used to simultaneously retrieve documents and give query results, which
is why each query result includes the source database it was retrieved from.

This means that document retrieval is near-instant when the local database has
the document, but a cursor can keep searching remote databases indefinitely
for one that has the requested document. By forking off many
cursors at once, the network can use an entire swarm of remote databases to
retrieve the documents.

Groups: Connecting to other Databases
-----

Each FogDB instance exists as a single Node, which may use any number of
network protocols to communicate with other Nodes. This lets the [cursor]
interface use many remote databases at once to retrieve documents and get query
results, and lets portions of the database be exposed to other nodes in turn.

Connecting to other nodes is done by [opening a group][Db::group] using a [group
specification][group::GroupSpec]. This specification limits the network types
over which the group will find other nodes, how it can find and connect to them,
and whether the nodes must identify themselves as part of a Policy (see [Policies and
Certificates](#policies-and-certificates)).

Node discovery can be limited to these approximate network classes:

- Machine: communication with other running FogDB instances on the same
    computer.
- Direct: direct machine-to-machine networking, with no switches or routers
    present. The primary example is WiFi Direct.
- Local: local networks. LANs, ad-hoc networks, and other physically close
    networking systems fall under this category.
- Regional: a collection of local networks that isn't the internet. Campus
    networks and metropolitan area networks fall under this category. The IPv6
    "organization-level" multicast scope also fits.
- Global: the global internet.
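
As a rough sketch of how a group might apply such a limit, the filter below
keeps one flag per network class and checks a discovered node against it. The
names `NetClass`, `Allowed`, and `permits` are hypothetical, chosen only for
this example; the real limits are expressed through
[`GroupSpec`][group::GroupSpec].

```rust
/// Hypothetical mirror of the network classes listed above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum NetClass {
    Machine,
    Direct,
    Local,
    Regional,
    Global,
}

/// Hypothetical per-group filter: one flag per network class.
/// The default permits nothing, so classes must be opted into.
#[derive(Clone, Copy, Debug, Default)]
struct Allowed {
    machine: bool,
    direct: bool,
    local: bool,
    regional: bool,
    global: bool,
}

impl Allowed {
    /// Should a node discovered over `class` be considered for the group?
    fn permits(&self, class: NetClass) -> bool {
        match class {
            NetClass::Machine => self.machine,
            NetClass::Direct => self.direct,
            NetClass::Local => self.local,
            NetClass::Regional => self.regional,
            NetClass::Global => self.global,
        }
    }
}
```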

Once a group is opened, the various underlying network protocols will attempt to
establish a collection of nodes that fit the group's specification, and will
work to set up and maintain node discovery mechanisms for the group.

Gates: Making the Database Remotely Available
-----

Establishing a group is not, by itself, enough to communicate between
database nodes. Each node must choose what parts of the database to expose to
remote nodes, and this is done by creating a [Gate][gate::Gate]. A gate allows
remote nodes to open a database cursor starting at a specific document, given
when the gate is [opened][group::Group::gate].

Gates provide the means to easily scope access to the database: anything that
can be reached from the starting document is fair game for access by a remote
node in the Group. Queries can also be made on reached documents. Entries can
have additional access policies that a node must match in order to be given the
entry; otherwise it is skipped over.
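
The "skipped over" behaviour can be pictured as a filter applied to query
results before they leave the gate. This is a sketch under assumptions: the
`Entry` struct and `visible` function are hypothetical, a `u32` stands in for a
node's Identity, and an access policy is reduced to a plain allow-list.

```rust
/// Hypothetical entry: a payload plus an optional access policy,
/// modelled here as a list of permitted node identities.
struct Entry {
    data: &'static str,
    allowed: Option<Vec<u32>>,
}

/// Yield only the entries a remote node may see.
/// Entries whose policy the node does not match are silently skipped.
fn visible<'a>(entries: &'a [Entry], node: u32) -> impl Iterator<Item = &'a Entry> + 'a {
    entries.iter().filter(move |e| match &e.allowed {
        // No policy attached: open to every node in the group.
        None => true,
        // Policy attached: the node must be on the list.
        Some(ids) => ids.contains(&node),
    })
}
```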

When a query is made on a particular document reached through a gate, you can
optionally [hook into the query][gate::Gate::query_hook] and manually provide
query results. This allows for dynamic generation of query responses, and can be
used to build RPC-like mechanisms from the query system.

Policies and Certificates
-------------------------

Policies are FogDB's way of scoping access to a database, and make use of
fog-pack [Identities][fog_pack::types::Identity] to do so. An Identity is a
public-private keypair, which can be used to sign documents and entries, and
generally establish a unique identity.

Nodes can identify themselves on the network using these long-term signing keys.
A full [Node Address][NodeAddr] consists of a long-term key like this, and an
ephemeral key pair that is regenerated by each network protocol every time a
group is created. Not all nodes will have these Identities, but they're
required when joining any group with a policy in place.

Identities can be used to sign a special document called a
[Certificate][cert::Cert]. Certificates are identified by their signer, the
subject Identity, a Hash value acting as a context, and a key string.
Certificates are immutable, but new ones with the same
signer/subject/context/key combination can be made in order to replace previous
ones - this also serves as a way to revoke certificates. See [the
documentation][cert::Cert] for more info.

Certificates on their own do nothing, but with a [Policy][cert::Policy] they can
delegate access permissions. A policy can be as simple as a list of permitted
Identities, but it can also include [Policy Chains][cert::PolicyChain], which
allow certificates to be used to establish permission.
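
A heavily simplified sketch of that delegation follows. It assumes a policy
with a direct allow-list plus a single one-link chain, and it skips signature
checking and certificate replacement entirely. The `Id` alias and these toy
`Cert` and `Policy` structs are hypothetical; the real rules live in the
[`cert`] module and are likely richer than this.

```rust
use std::collections::HashSet;

/// Stand-in for an Identity (a real one is a public key).
type Id = u32;

/// Hypothetical certificate record, named by the four fields described above.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Cert {
    signer: Id,
    subject: Id,
    context: u64,
    key: String,
}

/// Hypothetical policy: a direct allow-list, plus one single-link chain that
/// accepts any subject certified by `chain_signer` under `context` and `key`.
struct Policy {
    allowed: HashSet<Id>,
    chain_signer: Id,
    context: u64,
    key: String,
}

impl Policy {
    /// Check the allow-list first, then look for a matching certificate.
    fn permits(&self, node: Id, certs: &HashSet<Cert>) -> bool {
        if self.allowed.contains(&node) {
            return true;
        }
        let wanted = Cert {
            signer: self.chain_signer,
            subject: node,
            context: self.context,
            key: self.key.clone(),
        };
        certs.contains(&wanted)
    }
}
```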

Policies and Certificates are not automatically propagated through databases; they
must be actively retrieved or exchanged as part of a network protocol. FogDB
doesn't specify any particular mechanism for this, leaving it up to applications
and network protocols to propagate certificates and set policies. It's assumed
that, as part of a FogDB setup, certificates will be stored in the database and
be used to check policies.

*/

use std::{collections::{HashMap, BTreeMap}, error::Error, sync::Arc};

use async_trait::async_trait;
use cursor::{DbQuery, CursorQuery};
use fog_pack::{entry::EntryRef, error::Error as FogError, schema::Schema, types::*, document::Document};
use group::GroupSpec;
use thiserror::Error;

pub mod gate;
pub mod cert;
pub mod group;
pub mod transaction;
pub mod cursor;

/// Network connection information
pub struct NetInfo {
    /// Local database connection
    pub db: bool,
    /// Network within the currently running machine
    pub machine: bool,
    /// Direct machine-to-machine communication
    pub direct: bool,
    /// Local network
    pub local: bool,
    /// Regional (municipal, large corporate, etc.) network
    pub regional: bool,
    /// The global internet
    pub global: bool,
    /// Some other, specific network, with optional additional network information
    pub other: BTreeMap<String, BTreeMap<String, String>>,
}

/// Information about a connecting node. Includes the source network type from
/// which the connection was made, and optionally the Identities used by the
/// node.
pub struct NodeInfo {
    /// The network info for this node
    pub net: NetType,
    /// Long-term Identity, notionally tied to the user of the node
    pub perm_id: Option<Identity>,
    /// Ephemeral Identity, notionally tied to the node itself
    pub eph_id: Option<Identity>,
}

/// An origin address for a database node on the network.
///
/// This address is generally unique, and at the very least the node's intent is
/// to act as though it is unique.
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct NodeAddr {
    /// Long-term Identity, notionally tied to the user of the node
    pub perm_id: Identity,
    /// Ephemeral Identity, notionally tied to the node itself
    pub eph_id: Identity,
}

/// An error from trying to convert a [`NodeInfo`] into a [`NodeAddr`].
#[derive(Clone, Debug, PartialEq, Eq, Error)]
pub enum NodeConvertError {
    #[error("Missing permanent ID")]
    MissingPermId,
    #[error("Missing ephemeral ID")]
    MissingEphId,
}

impl TryFrom<NodeInfo> for NodeAddr {
    type Error = NodeConvertError;

    fn try_from(value: NodeInfo) -> Result<Self, Self::Error> {
        let perm_id = value.perm_id.ok_or(NodeConvertError::MissingPermId)?;
        let eph_id = value.eph_id.ok_or(NodeConvertError::MissingEphId)?;
        Ok(Self {
            perm_id,
            eph_id,
        })
    }
}

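/// A network type
pub enum NetType {
    /// Local database connection
    Db,
    /// Network within the currently running machine
    Machine,
    /// Direct machine-to-machine communication
    Direct,
    /// Local network
    Local,
    /// Regional (municipal, large corporate, etc.) network
    Regional,
    /// The global internet
    Global,
    /// Some other, specific network, identified by name
    Other(String),
}
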
/// A fundamental database error has occurred. Usually means the database must
/// be closed and access halted.
#[non_exhaustive]
pub enum DbError {
    /// Internal Database error
    Internal(Box<dyn Error>),
    /// Error occurred while handling a fog-pack document
    FogDoc {
        context: String,
        doc: Hash,
        err: FogError,
    },
    /// Error occurred while handling a fog-pack entry
    FogEntry {
        context: String,
        entry: EntryRef,
        err: FogError,
    },
    /// Some other fog-pack related error occurred
    FogOther { context: String, err: FogError },
}

/// Result alias used throughout this crate, with a boxed [`DbError`] as the
/// error type.
type DbResult<T> = Result<T, Box<DbError>>;

/// An implementation of a fog-pack database. Provides cursor, transaction,
/// schema, group, and name access.
///
/// - Transactions may be started by calling [`Db::txn`].
/// - Groups may be opened through the database by calling [`Db::group`].
/// - Schemas may be added, retrieved, and removed from the database.
/// - Name-to-Document mappings may be added, retrieved, and removed from the
///     database. These mappings function as the roots of the database's
///     Document tree, pinning documents to the database.
pub trait Db {
    /// Start a new transaction with this database
    fn txn(&self) -> transaction::Transaction;

    /// Open a new group through this database
    fn group(&self, spec: GroupSpec) -> Box<dyn group::Group>;

    /// Open a local cursor on this database
    fn cursor(&self) -> cursor::NewCursor;

    /// Get a document directly from the database
    fn doc_get(&self, doc: &Hash) -> DbResult<Option<Arc<Document>>>;

    /// Make a query directly on the database
    fn query(&self, doc: &Hash, query: DbQuery) -> Box<dyn CursorQuery>;

    /// Get a schema in the database
    fn schema_get(&self, schema: &Hash) -> DbResult<Option<Arc<Schema>>>;

    /// Add a schema to the database. Fails if the schema document wasn't valid.
    fn schema_add(&self, schema: Arc<Document>) -> DbResult<Result<Arc<Schema>, FogError>>;

    /// Remove a schema from the database. Returns false if the schema wasn't in the database.
    fn schema_del(&self, schema: &Hash) -> DbResult<bool>;

    /// Get a list of all schemas in the database.
    fn schema_list(&self) -> Vec<Hash>;

    /// Get a hash associated with a name in the database.
    fn name_get(&self, name: &str) -> DbResult<Option<Hash>>;

    /// Add a name-to-hash mapping to the database. This pins the document
    /// inside the database, once it's been added. This should be done before
    /// adding the document in a transaction. Returns the previous hash, if
    /// there was one.
    fn name_add(&self, name: &str, hash: &Hash) -> DbResult<Option<Hash>>;

    /// Remove a name-to-hash mapping from the database, returning the removed
    /// hash, or None if there wasn't one stored.
    fn name_del(&self, name: &str) -> DbResult<Option<Hash>>;

    /// Get a list of all named documents in the database.
    fn name_list(&self) -> Vec<(String, Hash)>;
}

/// A connection to the database through which a transaction can be committed.
#[async_trait]
pub trait DbCommit {
    /// Commit the given document and entry changes as a single transaction.
    /// The outer result reports a database failure; the inner result reports
    /// a rejected commit.
    async fn commit(
        self: Box<Self>,
        docs: HashMap<Hash, transaction::DocChange>,
        entries: HashMap<EntryRef, transaction::EntryChange>,
    ) -> DbResult<Result<(), transaction::CommitErrors>>;

    /// Get a schema in the database
    fn schema_get(&self, schema: &Hash) -> DbResult<Option<Arc<Schema>>>;

    /// Get a document directly from the database
    fn doc_get(&self, doc: &Hash) -> DbResult<Option<Arc<Document>>>;
}