//! The `dbsp` crate implements a computational engine for continuous analysis
//! of changing data.  With DBSP, a programmer writes code in terms of
//! computations on a complete data set, but DBSP evaluates it incrementally,
//! meaning that processing a change to the data set takes time proportional to
//! the size of the change rather than the size of the data set.  This is a major
//! advantage for applications that work with large data sets that change
//! frequently in small ways.
//!
//! The [`tutorial`] is a good place to start for a guided tour.  After that, if
//! you want to look through the API on your own, the [`circuit`] module is a
//! reasonable starting point.  For complete examples, visit the [`examples`][1]
//! directory in the DBSP repository.
//!
//! [1]: https://github.com/feldera/feldera/tree/main/crates/dbsp/examples
//!
//! # Theory
//!
//! DBSP is underpinned by a formal theory:
//!
//! - [Budiu, Chajed, McSherry, Ryzhyk, Tannen. DBSP: Automatic Incremental View
//!   Maintenance for Rich Query Languages, Conference on Very Large Databases, August
//!   2023, Vancouver, Canada](https://www.feldera.com/vldb23.pdf)
//!
//! - Here is [a presentation about DBSP](https://www.youtube.com/watch?v=iT4k5DCnvPU)
//!   at the 2023 Apache Calcite Meetup.
//!
//! The model provides two things:
//!
//! 1. **Semantics.** DBSP defines a formal language of streaming operators and
//!    queries built out of these operators, and precisely specifies how these
//!    queries must transform input streams to output streams.
//!
//! 2. **Algorithm.** DBSP also gives an algorithm that takes an arbitrary query
//!    and generates an incremental dataflow program that implements this query
//!    correctly (in accordance with its formal semantics) and efficiently.
//!    Efficiency here means, in a nutshell, that the cost of processing a set of
//!    input events is proportional to the size of that set rather than to the
//!    size of the entire state of the database.
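//!
//! The efficiency property can be illustrated with a minimal hand-written sketch.
//! It uses only the standard library and is not the DBSP API: the names `Delta`
//! and `EvenSum` are hypothetical, and a real DBSP program would derive the
//! incremental logic automatically from a query.  The sketch maintains the sum of
//! all even records, touching only the records in each change.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// A change to a collection: each record maps to a signed weight
//! /// (+1 for an insertion, -1 for a deletion).  Hypothetical type.
//! type Delta = HashMap<i64, i64>;
//!
//! /// Hypothetical incrementally maintained view: the sum of all even records.
//! struct EvenSum {
//!     sum: i64,
//! }
//!
//! impl EvenSum {
//!     fn new() -> Self {
//!         Self { sum: 0 }
//!     }
//!
//!     /// Applies one batch of changes and returns the updated sum.
//!     /// The loop visits only the records in `delta`, so the cost depends
//!     /// on the size of the change, not on how much data was seen before.
//!     fn step(&mut self, delta: &Delta) -> i64 {
//!         for (&value, &weight) in delta {
//!             if value % 2 == 0 {
//!                 self.sum += value * weight;
//!             }
//!         }
//!         self.sum
//!     }
//! }
//!
//! fn main() {
//!     let mut view = EvenSum::new();
//!
//!     // Insert 2, 3, and 4: the even records sum to 6.
//!     let inserts: Delta = HashMap::from([(2, 1), (3, 1), (4, 1)]);
//!     assert_eq!(view.step(&inserts), 6);
//!
//!     // Delete 2 and insert 10: only these two records are processed.
//!     let update: Delta = HashMap::from([(2, -1), (10, 1)]);
//!     assert_eq!(view.step(&update), 14);
//! }
//! ```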
//!
//! # Crate overview
//!
//! This crate consists of several layers.
//!
//! * [`dynamic`] - Types and traits that support dynamic dispatch.  We rely heavily on
//!   dynamic dispatch to limit the amount of monomorphization performed by the compiler when
//!   building complex dataflow graphs, balancing compilation speed and runtime performance.
//!   This module implements the type machinery necessary to support this architecture.
//!
//! * [`typed_batch`] - Strongly typed wrappers around dynamically typed batches and traces.
//!
//! * [`trace`] - This module implements batches and traces, which are core DBSP data structures
//!   that represent tables, indexes, and changes to tables and indexes.  We provide both in-memory
//!   and persistent batch and trace implementations.
//!
//! * [`operator::dynamic`] - Dynamically typed operator API.  Operators transform data streams
//!   (usually carrying data in the form of batches and traces).  DBSP provides many relational
//!   operators, such as map, filter, aggregate, join, etc.  The operator API in this module is
//!   dynamically typed and unsafe.
//!
//! * [`operator`] - Statically typed wrappers around the dynamic API in [`operator::dynamic`].

#![allow(clippy::type_complexity)]

// Allow referring to self as `::dbsp` so that macros work universally
// (from this crate and from others).
// See https://github.com/rust-lang/rust/issues/54647
extern crate self as dbsp;

pub mod dynamic;
mod error;
mod hash;
mod num_entries;
//mod ref_pair;

pub mod typed_batch;

#[macro_use]
pub mod circuit;
pub mod algebra;
pub mod ir;
pub mod mimalloc;
pub mod monitor;
pub mod operator;
pub mod profile;
pub mod storage;
pub mod time;
pub mod trace;
pub mod utils;

#[cfg(feature = "backend-mode")]
pub mod mono;

pub use crate::{
    error::{DetailedError, Error},
    hash::{default_hash, default_hasher},
    num_entries::NumEntries,
};
// // pub use crate::ref_pair::RefPair;
pub use crate::time::Timestamp;

pub use algebra::{DynZWeight, ZWeight};

pub use circuit::{
    ChildCircuit, Circuit, CircuitHandle, DBSPHandle, NestedCircuit, RootCircuit, Runtime,
    RuntimeError, SchedulerError, Stream, WeakRuntime,
};
#[cfg(not(feature = "backend-mode"))]
pub use operator::FilterMap;
pub use operator::{
    CmpFunc, OrdPartitionedIndexedZSet, OutputHandle,
    input::{IndexedZSetHandle, InputHandle, MapHandle, SetHandle, ZSetHandle},
};
pub use trace::{DBData, DBWeight, cursor::Position};
pub use typed_batch::{
    Batch, BatchReader, FallbackKeyBatch, FallbackValBatch, FallbackWSet, FallbackZSet,
    FileIndexedWSet, FileIndexedZSet, FileKeyBatch, FileValBatch, FileWSet, FileZSet, IndexedZSet,
    IndexedZSetReader, OrdIndexedWSet, OrdIndexedZSet, OrdWSet, OrdZSet, Trace, TypedBox, ZSet,
};

#[cfg(doc)]
pub mod tutorial;

// TODO: import from `circuit`.
pub type Scope = u16;