/*!
This crate defines the interface to a generic implementation of a fog-pack database (a FogDB).

The Database
------------

A FogDB database consists of a collection of
[Documents][fog_pack::document::Document], each of which is immutable and
referred to by its Hash. Documents can also link to other documents by those
same hashes. The database has a set of named "root documents" that it keeps
resident, and any documents that can be reached by following hash links from
those roots will also be kept resident in the database. In other words, if
you can reach a Document from a root, it stays in the database. If you
can't, it gets evicted from the database. These links can also be "weakened" in
a transaction, much as you can with most reference-tracking garbage collectors.
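
As a rough illustration of the retention rule above, the sketch below computes
which documents stay resident. It is a toy model, not FogDB's implementation:
`DocId`, `Store`, `reachable`, and `example_store` are hypothetical names,
document hashes are stood in for by `u32`, and a weakened link is modeled
simply as a link that is not followed.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-in for a document Hash.
type DocId = u32;

// Toy store: each document lists its links as (target, is_strong) pairs.
struct Store {
    links: HashMap<DocId, Vec<(DocId, bool)>>,
    roots: Vec<DocId>,
}

// Collect every stored document reachable from the roots via strong links.
fn reachable(store: &Store) -> HashSet<DocId> {
    let mut seen = HashSet::new();
    let mut stack: Vec<DocId> = store.roots.clone();
    while let Some(id) = stack.pop() {
        // Skip documents that are not stored or were already visited.
        if !store.links.contains_key(&id) || !seen.insert(id) {
            continue;
        }
        for &(target, strong) in &store.links[&id] {
            if strong {
                stack.push(target);
            }
        }
    }
    seen
}

// Example data: 1 -> 2 (strong), 1 -> 3 (weak), and 4 is not linked at all.
fn example_store() -> Store {
    let mut links = HashMap::new();
    links.insert(1, vec![(2, true), (3, false)]);
    links.insert(2, vec![]);
    links.insert(3, vec![]);
    links.insert(4, vec![]);
    Store { links, roots: vec![1] }
}
```

In this model, documents 1 and 2 are retained, while 3 (only weakly linked) and
4 (unlinked) would be candidates for eviction.
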
Documents can adhere to a [Schema][fog_pack::schema::Schema], which constrains
a document's format and provides hints on how to compress it for storage. These
schemas let one pre-verify that a document can be deserialized into a data
structure, and let systems know ahead of time what type of data is in a
document.
Now, if the database were just immutable documents, it would be quite difficult
to deal with. That's why every document adhering to a schema can also have
[Entries][fog_pack::entry::Entry], which are essentially smaller documents
attached to a parent document under a key prefix. These entries are not looked
up by their Hash, but are found by running a [Query][fog_pack::query::Query] on
the parent document - in a FogDB, this query will return a sequence of matching
entries, and will remain active in case more entries are found in the future.
The format of entries is also constrained by the parent document's schema,
which puts them in an interesting position for a database, and is what makes
FogDB multi-modal:
- From a document-oriented view, they're a collection of documents all matching the same schema.
- From a relational database view, the parent document & entry key is a table
reference, and the entries are records (or *entries*, get it?) in the table.
- From a graph database view, the documents are nodes, and the entries are edges.
Rather than provide the expected access APIs for all of these, FogDB provides a
base over which such APIs can be built.

Transactions: Modifying the Database
------------------------------------

There are three ways to modify the database:
- Modify the set of named root documents by changing a name-to-hash mapping.
- Modify the set of stored schemas by adding or removing a schema document.
- Execute a transaction on the database.

Transactions are the most common way to change the database. They follow ACID
properties, so when a transaction is committed, either all parts of the
transaction complete simultaneously or the whole transaction is rejected. Most
commonly, the transaction might fail if attempting to delete an entry that has
already been removed - this is how compare-and-swap type transactions can be
done to the database.
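
The compare-and-swap pattern described above can be sketched with a toy
in-memory model. Everything here is an assumption made for illustration:
`ToyDb`, `Txn`, `commit`, and `compare_and_swap` are hypothetical names, and
entries are represented by plain strings rather than fog-pack entries.

```rust
use std::collections::HashSet;

// Toy database: entries are identified by plain strings here.
#[derive(Default)]
struct ToyDb {
    entries: HashSet<String>,
}

// A transaction is a set of deletions and additions applied together.
#[derive(Default)]
struct Txn {
    delete: Vec<String>,
    add: Vec<String>,
}

impl ToyDb {
    // Apply every change or none: a missing entry rejects the whole txn.
    fn commit(&mut self, txn: Txn) -> Result<(), String> {
        // Validate first, so nothing is modified if the txn is rejected.
        for e in &txn.delete {
            if !self.entries.contains(e) {
                return Err(format!("entry already removed: {}", e));
            }
        }
        for e in &txn.delete {
            self.entries.remove(e);
        }
        self.entries.extend(txn.add);
        Ok(())
    }
}

// Swap `old` for `new`, succeeding only if `old` is still present.
fn compare_and_swap(db: &mut ToyDb, old: &str, new: &str) -> Result<(), String> {
    db.commit(Txn {
        delete: vec![old.to_string()],
        add: vec![new.to_string()],
    })
}
```

A second writer that still believes `old` is current has its transaction
rejected, and none of its additions are applied.
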
Transactions can do the following:
- Add a document to the database
- Weaken/strengthen document hash links
- Add an entry to the database, optionally setting a time-to-live or an access
policy
- Modify an entry's time-to-live or its access policy
- Delete an entry from the database
Documents cannot be deleted directly; instead, when they are no longer reachable
from the named root documents, they are automatically garbage-collected.
Note that transactions execute only on the local FogDB instance. This follows
the rule that a system can only modify itself; it is up to other database nodes
to modify themselves to match, as they desire.

Cursors: Reading the Database
-----------------------------

The database is accessed through the [Cursor][cursor] interface. A
[cursor][cursor::Cursor] can be opened either on a [Group][group::Group::cursor]
(see [Connecting to Other Databases](#groups-connecting-to-other-databases)) or
on the [database][Db::cursor]. A cursor must start from some specific Document,
and can be thought of as always being "over" a document. Each document can
contain hashes of other documents; the cursor can follow these with a "forward"
function call. Alternately, a new cursor can be "forked" off to the linked
document. In this way, many cursors can be created for quicker traversal of a
Document tree.
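
The difference between moving a cursor forward and forking a new one can be
sketched with a toy model. This is not the [`cursor::Cursor`] API: `ToyCursor`,
`Graph`, `can_follow`, and `example_graph` are hypothetical, and document
hashes are stood in for by `u32`.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a document Hash.
type DocId = u32;

// Toy document graph: each document lists the documents it links to.
type Graph = HashMap<DocId, Vec<DocId>>;

// A cursor is always "over" exactly one document.
#[derive(Clone)]
struct ToyCursor<'a> {
    graph: &'a Graph,
    at: DocId,
}

impl<'a> ToyCursor<'a> {
    // True if the current document links to `target` and `target` is stored.
    fn can_follow(&self, target: DocId) -> bool {
        self.graph.get(&self.at).map_or(false, |l| l.contains(&target))
            && self.graph.contains_key(&target)
    }

    // Move this cursor onto a linked document; returns false if it cannot.
    fn forward(&mut self, target: DocId) -> bool {
        let ok = self.can_follow(target);
        if ok {
            self.at = target;
        }
        ok
    }

    // Leave this cursor in place and create a new one over the linked document.
    fn fork(&self, target: DocId) -> Option<ToyCursor<'a>> {
        let mut child = self.clone();
        if child.forward(target) {
            Some(child)
        } else {
            None
        }
    }
}

// Example data: document 1 links to documents 2 and 3.
fn example_graph() -> Graph {
    HashMap::from([(1, vec![2, 3]), (2, vec![]), (3, vec![])])
}
```

Forking leaves the original cursor where it was, so several cursors can walk
different branches of the same tree independently.
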
A cursor can also be used to make a query, which uses up the cursor and turns it
into a [CursorQuery][cursor::CursorQuery] (which can be backed out of to get the
cursor back). This yields a stream of entries from the document the cursor is over.
A query is just a fog-pack [Query][fog_pack::query::Query] with an optional
preferred ordering to the returned Entry [results][cursor::QueryResult]. If an
entry has hash links to documents, new cursors can be forked off to them using
the included [`ForkSpawner`][cursor::ForkSpawner].
Here's where it gets interesting: if a cursor was opened up on a
[Group][group::Group], then any remote databases meeting the group's
requirements can also be read by a cursor. In this way, many databases at once
can be used to simultaneously retrieve documents and give query results, which
is why each query result includes the source database it was retrieved from.
This means that document retrieval is near-instant when the local database has
the document, but a cursor can indefinitely go searching through remote
databases in search of one that has the requested document. By forking off many
cursors at once, the network can use an entire swarm of remote databases to
retrieve the documents.
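
A minimal sketch of why results carry their source: the lookup below tries the
local store first and then each remote in turn, tagging the result with where
it came from. The names `Source`, `Node`, `fetch`, and `node` are hypothetical,
and the remotes are searched sequentially here, whereas a real group would
presumably search them concurrently.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a document Hash.
type DocId = u32;

// Where a document was found; the remote label is a hypothetical node name.
#[derive(Debug, PartialEq)]
enum Source {
    Local,
    Remote(String),
}

// Toy node: a name and the documents it holds.
struct Node {
    name: String,
    docs: HashMap<DocId, String>,
}

// Try the local node first, then each remote in order, tagging the result
// with the source that supplied it.
fn fetch<'a>(local: &'a Node, remotes: &'a [Node], id: DocId) -> Option<(Source, &'a str)> {
    if let Some(doc) = local.docs.get(&id) {
        return Some((Source::Local, doc.as_str()));
    }
    remotes.iter().find_map(|node| {
        node.docs
            .get(&id)
            .map(|doc| (Source::Remote(node.name.clone()), doc.as_str()))
    })
}

// Helper for building example nodes.
fn node(name: &str, docs: &[(DocId, &str)]) -> Node {
    Node {
        name: name.to_string(),
        docs: docs.iter().map(|(id, d)| (*id, d.to_string())).collect(),
    }
}
```
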

Groups: Connecting to Other Databases
-------------------------------------

Each FogDB instance exists as a single Node, which may use any number of
network protocols to communicate with other Nodes. This lets the [cursor]
interface use many remote databases at once to retrieve documents and get query
results, and lets portions of the database be exposed to other nodes in turn.
Connecting to other nodes is done by [opening a group][Db::group] using a [group
specification][group::GroupSpec]. This specification limits the network types
over which the group will find other nodes, how it can find and connect to them,
and if the nodes must identify themselves as part of a Policy (see [Policies and
Certificates](#policies-and-certificates)).
Node discovery can be limited to these approximate network classes:
- Machine: communication between other running FogDB instances on the same
computer.
- Direct: Direct machine-to-machine networking, with no switches or routers
present. Primary example is WiFi Direct.
- Local: local networks. LANs, ad-hoc networks, and other physically close
networking systems fall under this category.
- Regional: A collection of local networks that isn't the internet. Campus
networks and Metropolitan area networks fall under this category. The IPv6
"organization-level" multicast scope also fits.
- Global: the global internet.
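
One way to picture limiting discovery by network class is a set of permitted
classes. The sketch below is an assumption, not the [`group::GroupSpec`] API:
`NetClass`, `DiscoveryLimit`, `up_to`, and `permits` are hypothetical, and
treating the classes as ordered from narrowest to widest is a simplification
made only for this example.

```rust
// Hypothetical ordering of the network classes, narrowest scope first.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum NetClass {
    Machine,
    Direct,
    Local,
    Regional,
    Global,
}

// A toy discovery limit: the set of classes a group may search.
struct DiscoveryLimit {
    allowed: Vec<NetClass>,
}

impl DiscoveryLimit {
    // Allow every class up to and including `widest`.
    fn up_to(widest: NetClass) -> Self {
        let all = [
            NetClass::Machine,
            NetClass::Direct,
            NetClass::Local,
            NetClass::Regional,
            NetClass::Global,
        ];
        Self {
            allowed: all.iter().copied().filter(|c| *c <= widest).collect(),
        }
    }

    // True if nodes found on this class of network may join the group.
    fn permits(&self, class: NetClass) -> bool {
        self.allowed.contains(&class)
    }
}
```
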
Once a group is opened, the various underlying network protocols will attempt to
establish a collection of nodes that fit the group's specification, and will
work to set up and maintain node discovery mechanisms for the group.

Gates: Making the Database Remotely Available
---------------------------------------------

Establishing a group is not, by itself, enough for database nodes to
communicate. Each node must choose what parts of the database to expose to
remote nodes, and this is done by creating a [Gate][gate::Gate]. A gate allows
remote nodes to open a database cursor starting at a specific document, which
is given when the gate is [opened][group::Group::gate].
Gates provide the means to easily scope access to the database: anything that
can be reached from the starting document is fair game for access by a remote
node in the Group. Queries can also be made on reached documents. Entries can
have additional access policies that a node must match in order to be given the
entry; otherwise it is skipped over.
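
The "skipped over" behavior for entries with access policies can be sketched as
a simple filter. This is a toy model under stated assumptions: `ToyEntry`,
`NodeId`, and `visible_entries` are hypothetical, identities are plain strings,
and a policy is reduced to a set of permitted identities.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for a node's Identity.
type NodeId = &'static str;

// Toy entry: a payload plus an optional access policy listing who may see it.
struct ToyEntry {
    data: &'static str,
    allowed: Option<HashSet<NodeId>>,
}

// Return the entries a given node may receive. Entries whose policy the node
// does not match are silently skipped rather than reported as errors.
fn visible_entries<'a>(entries: &'a [ToyEntry], node: NodeId) -> Vec<&'a str> {
    entries
        .iter()
        .filter(|e| e.allowed.as_ref().map_or(true, |set| set.contains(node)))
        .map(|e| e.data)
        .collect()
}
```

An entry without a policy is visible to every node that can reach its parent
document through the gate; an entry with one is only visible to matching nodes.
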
When a query is made on a particular document reached through a gate, you can
optionally [hook into the query][gate::Gate::query_hook] and manually provide
query results. This allows for dynamic generation of query responses, and can be
used to build RPC-like mechanisms from the query system.

Policies and Certificates
-------------------------

Policies are FogDB's way of scoping access to a database, and make use of
fog-pack [Identities][fog_pack::types::Identity] to do so. An Identity is a
public-private keypair, which can be used to sign documents and entries, and
generally establish a unique identity.
Nodes can identify themselves on the network using these long-term signing keys.
A full [Node Address][NodeAddr] consists of a long-term key like this, and an
ephemeral key pair that is regenerated by each network protocol every time a
group is created. Not all nodes will have these Identities, but they're
required when joining any group with a policy in place.
Identities can be used to sign a special document called a
[Certificate][cert::Cert]. Certificates are identified by their signer, the
subject Identity, a Hash value acting as a context, and a key string.
Certificates are immutable, but new ones with the same
signer/subject/context/key combination can be made in order to replace previous
ones - this also serves as a way to revoke certificates. See [the
documentation][cert::Cert] for more info.
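
The replace-to-revoke idea can be sketched as a map keyed by the
signer/subject/context/key combination. This is only an illustration and makes
several assumptions that the real [`cert::Cert`] may not share: `CertKey`,
`ToyCert`, and `CertStore` are hypothetical, identities and hashes are plain
strings, an `issued` counter decides which certificate is newer, and a
revocation is modeled as a certificate with no value.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins: identities and context hashes are plain strings.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct CertKey {
    signer: String,
    subject: String,
    context: String,
    key: String,
}

// Toy certificate: `issued` orders certificates, `value: None` is a revocation.
#[derive(Clone, Debug, PartialEq)]
struct ToyCert {
    id: CertKey,
    issued: u64,
    value: Option<String>,
}

#[derive(Default)]
struct CertStore {
    current: HashMap<CertKey, ToyCert>,
}

impl CertStore {
    // Keep only the newest certificate for each signer/subject/context/key.
    fn add(&mut self, cert: ToyCert) {
        let is_newer = self
            .current
            .get(&cert.id)
            .map_or(true, |old| cert.issued > old.issued);
        if is_newer {
            self.current.insert(cert.id.clone(), cert);
        }
    }

    // Look up the value currently in force, if it has not been revoked.
    fn lookup(&self, id: &CertKey) -> Option<&str> {
        self.current.get(id).and_then(|c| c.value.as_deref())
    }
}
```

Because an older certificate never displaces a newer one in this model,
replaying a superseded certificate does not undo a revocation.
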
Certificates on their own do nothing, but with a [Policy][cert::Policy] they can
delegate access permissions. A policy can be as simple as a list of permitted
Identities, but it can also include [Policy Chains][cert::PolicyChain], which
allow certificates to be used to establish permission.
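
To make the delegation idea concrete, here is a toy policy check. It does not
reflect how [`cert::PolicyChain`] is actually defined: `ToyPolicy`, `Vouch`,
`is_allowed`, and the `max_depth` rule are all assumptions made for this
sketch, in which a permitted identity can vouch for another identity through a
bounded number of certificate hops.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for an Identity.
type Id = &'static str;

// Toy certificate: `signer` vouches for `subject`.
struct Vouch {
    signer: Id,
    subject: Id,
}

// Toy policy: a list of permitted identities, plus a hypothetical chain rule
// allowing delegation through at most `max_depth` certificates.
struct ToyPolicy {
    permitted: HashSet<Id>,
    max_depth: usize,
}

// True if `who` is listed directly, or is vouched for through a chain of
// certificates that starts at a listed identity and is short enough.
fn is_allowed(policy: &ToyPolicy, certs: &[Vouch], who: Id) -> bool {
    let mut frontier: HashSet<Id> = policy.permitted.clone();
    if frontier.contains(who) {
        return true;
    }
    for _ in 0..policy.max_depth {
        // Everyone vouched for by the identities reached in the last step.
        let next: HashSet<Id> = certs
            .iter()
            .filter(|c| frontier.contains(c.signer))
            .map(|c| c.subject)
            .collect();
        if next.contains(who) {
            return true;
        }
        frontier = next;
    }
    false
}
```
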
Policies and Certificates are not automatically propagated through databases;
they must be actively retrieved or exchanged as part of a network protocol.
FogDB doesn't specify any particular mechanism for this, leaving it up to
applications and network protocols to propagate certificates and set policies.
It's assumed that, as part of a FogDB setup, certificates will be stored in the
database and be used to check policies.
*/
use std::{collections::{HashMap, BTreeMap}, error::Error, sync::Arc};
use async_trait::async_trait;
use cursor::{DbQuery, CursorQuery};
use fog_pack::{entry::EntryRef, error::Error as FogError, schema::Schema, types::*, document::Document};
use group::GroupSpec;
use thiserror::Error;
pub mod gate;
pub mod cert;
pub mod group;
pub mod transaction;
pub mod cursor;
/// Network connection information
pub struct NetInfo {
/// Local database connection
pub db: bool,
/// Network within the currently running machine
pub machine: bool,
/// Direct machine-to-machine communication
pub direct: bool,
/// Local network
pub local: bool,
/// Regional (municipal, large corporate, etc.) network
pub regional: bool,
/// The global internet
pub global: bool,
/// Some other, specific network, with optional additional network information
pub other: BTreeMap<String, BTreeMap<String, String>>,
}
/// Information about a connecting node. Includes the source network type from
/// which the connection was made, and optionally the Identities used by the
/// node.
pub struct NodeInfo {
/// The network info for this node
pub net: NetType,
/// Long-term Identity, notionally tied to the user of the node
pub perm_id: Option<Identity>,
/// Ephemeral Identity, notionally tied to the node itself
pub eph_id: Option<Identity>,
}
/// An origin address for a database node on the network.
///
/// This address is generally unique, and at the very least the node's intent is
/// to act as though it is unique.
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub struct NodeAddr {
/// Long-term Identity, notionally tied to the user of the node
pub perm_id: Identity,
/// Ephemeral Identity, notionally tied to the node itself
pub eph_id: Identity,
}
/// An error from trying to convert a [`NodeInfo`] into a [`NodeAddr`].
#[derive(Clone, Debug, PartialEq, Eq, Error)]
pub enum NodeConvertError {
#[error("Missing permanent ID")]
MissingPermId,
#[error("Missing ephemeral ID")]
MissingEphId,
}
impl TryFrom<NodeInfo> for NodeAddr {
type Error = NodeConvertError;
fn try_from(value: NodeInfo) -> Result<Self, Self::Error> {
let perm_id = value.perm_id.ok_or(NodeConvertError::MissingPermId)?;
let eph_id = value.eph_id.ok_or(NodeConvertError::MissingEphId)?;
Ok(Self {
perm_id,
eph_id
})
}
}
/// A network type
pub enum NetType {
Db,
Machine,
Direct,
Local,
Regional,
Global,
Other(String),
}
/// A fundamental database error has occurred. Usually means the database must
/// be closed and access halted.
#[non_exhaustive]
pub enum DbError {
/// Internal Database error
Internal(Box<dyn Error>),
/// Error occurred while handling a fog-pack document
FogDoc {
context: String,
doc: Hash,
err: FogError,
},
/// Error occurred while handling a fog-pack entry
FogEntry {
context: String,
entry: EntryRef,
err: FogError,
},
/// Some other fog-pack related error occurred
FogOther { context: String, err: FogError },
}
type DbResult<T> = Result<T, Box<DbError>>;
/// An implementation of a fog-pack database. Provides cursor, transaction,
/// schema, group, and name access.
///
/// - Transactions may be started by calling [`Db::txn`].
/// - Groups may be opened through the database by calling [`Db::group`].
/// - Schemas may be added, retrieved, and removed from the database.
/// - Name-to-Document mappings may be added, retrieved, and removed from the
/// database. These mappings function as the roots of the database's
/// Document tree, pinning documents to the database.
pub trait Db {
/// Start a new transaction with this database
fn txn(&self) -> transaction::Transaction;
/// Open a new group through this database
fn group(&self, spec: GroupSpec) -> Box<dyn group::Group>;
/// Open a local cursor on this database
fn cursor(&self) -> cursor::NewCursor;
/// Get a document directly from the database
fn doc_get(&self, doc: &Hash) -> DbResult<Option<Arc<Document>>>;
/// Make a query directly on the database
fn query(&self, doc: &Hash, query: DbQuery) -> Box<dyn CursorQuery>;
/// Get a schema in the database
fn schema_get(&self, schema: &Hash) -> DbResult<Option<Arc<Schema>>>;
/// Add a schema to the database. Fails if the schema document wasn't valid.
fn schema_add(&self, schema: Arc<Document>) -> DbResult<Result<Arc<Schema>, FogError>>;
/// Remove a schema from the database. Returns false if the schema wasn't in the database.
fn schema_del(&self, schema: &Hash) -> DbResult<bool>;
/// Get a list of all schemas in the database.
fn schema_list(&self) -> Vec<Hash>;
/// Get a hash associated with a name in the database.
fn name_get(&self, name: &str) -> DbResult<Option<Hash>>;
/// Add a name-to-hash mapping to the database. This pins the document
/// inside the database, once it's been added. This should be done before
/// adding the document in a transaction. Returns the previous hash, if
/// there was one.
fn name_add(&self, name: &str, hash: &Hash) -> DbResult<Option<Hash>>;
/// Remove a name-to-hash mapping from the database, returning None if there
/// wasn't one stored.
fn name_del(&self, name: &str) -> DbResult<Option<Hash>>;
/// Get a list of all named documents in the database.
fn name_list(&self) -> Vec<(String, Hash)>;
}
/// A connection to the database through which a transaction can be committed.
#[async_trait]
pub trait DbCommit {
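/// Commit a set of document and entry changes as a single transaction. The
/// outer `DbResult` reports a fundamental database error; the inner `Result`
/// reports whether the transaction itself was rejected.
async fn commit(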
self: Box<Self>,
docs: HashMap<Hash, transaction::DocChange>,
entries: HashMap<EntryRef, transaction::EntryChange>,
) -> DbResult<Result<(), transaction::CommitErrors>>;
/// Get a schema in the database
fn schema_get(&self, schema: &Hash) -> DbResult<Option<Arc<Schema>>>;
/// Get a document directly from the database
fn doc_get(&self, doc: &Hash) -> DbResult<Option<Arc<Document>>>;
}