inc_complete/lib.rs
//! # Inc-Complete
//!
//! inc-complete is a library for incremental compilation supporting serialization from the ground
//! up.
//!
//! In inc-complete, a central `Db` object is used to query and cache the results of _pure_
//! functions. Purity is key: if a function performs side effects, those effects will not be
//! re-performed when the computation's result is later returned from the cache.
//!
//! Before we create the `Db` object, however, we need to define the storage type to store all
//! the computations we want to cache. In inc-complete, each computation is its own type and is
//! either an input (if it has no dependencies) or an intermediate computation. For this example
//! we're going to model the following spreadsheet:
//!
//! ```text
//!       [ A  ] [ B        ]
//! [ 1 ] [ 12 ] [ =A1 + 8  ]
//! [ 2 ] [ 4  ] [ =B1 + A2 ]
//! ```
//!
//! We will have two inputs, `A1` and `A2`, and two intermediates, `B1` and `B2`, where
//! `B1` depends on `A1`, and `B2` depends on `B1` and `A2` directly, and on `A1` transitively.
//! Every computation in inc-complete corresponds to a struct representing that computation.
//! This struct often holds the arguments of the function it corresponds to, but in
//! this case none of our cells need any parameters besides the values of the other cells they query.
//!
//! For each input type we'll derive `Input` and give it a unique id, along with
//! the output type of the function retrieving the input value and the storage type we'll store
//! all the metadata in - we'll define this storage type later on.
//!
//! Let's start by defining our input types:
//!
//! ```
//! # struct MySpreadsheet;
//! use inc_complete::{ Input, define_input };
//!
//! ##[derive(Input, Clone)]
//! ##[inc_complete(id = 0, output = i32, storage = MySpreadsheet)]
//! struct A1;
//!
//! ##[derive(Input, Clone)]
//! ##[inc_complete(id = 1, output = i32, storage = MySpreadsheet)]
//! struct A2;
//! ```
//!
//! Next, let's define the intermediate computations and the functions to compute them.
//! For these we just need to define the computation type and use the `intermediate` macro
//! on a function to perform that computation. Note that this function should have
//! exactly two arguments: the computation type itself (which may be a struct containing
//! more arguments to use if needed), and a `DbHandle` containing your storage type:
//!
//! ```
//! # use inc_complete::{ Input, define_input, Storage, impl_storage };
//! # #[derive(Input, Clone)]
//! # #[inc_complete(id = 0, output = i32, storage = MySpreadsheet)]
//! # struct A1;
//! # #[derive(Input, Clone)]
//! # #[inc_complete(id = 1, output = i32, storage = MySpreadsheet)]
//! # struct A2;
//! # #[derive(Clone)]
//! # struct B1;
//! # use inc_complete::storage::SingletonStorage;
//! # #[derive(Storage, Default)]
//! # struct MySpreadsheet {
//! #     a1: SingletonStorage<A1>,
//! #     a2: SingletonStorage<A2>,
//! #     b1: SingletonStorage<B1>,
//! #     b2: SingletonStorage<B2>,
//! #     #[inc_complete(skip)]
//! #     my_metadata: String,
//! # }
//! use inc_complete::{ intermediate, define_intermediate };
//!
//! ##[intermediate(id = 2)]
//! fn compute_b1(_ctx: &B1, db: &inc_complete::DbHandle<MySpreadsheet>) -> i32 {
//!     // These functions should be pure but we're going to cheat here to
//!     // make it obvious when a function is recomputed
//!     println!("Computing B1!");
//!     db.get(A1) + 8
//! }
//!
//! ##[derive(Clone)]
//! struct B2;
//!
//! // Each computation needs a unique id, so B2 gets `3` here.
//! ##[intermediate(id = 3)]
//! fn compute_b2(_ctx: &B2, db: &inc_complete::DbHandle<MySpreadsheet>) -> i32 {
//!     println!("Computing B2!");
//!     db.get(B1) + db.get(A2)
//! }
//! ```
//!
//! Ceremony aside, this code should be relatively straightforward. We `get` the value of
//! any sub-computations we need, and the `DbHandle` object automatically gives us the most
//! up-to-date version of those computations - we'll examine this claim a bit more closely later.
//!
//! Now we can define the actual storage type to hold all our computations.
//!
//! ```
//! # #[derive(Clone)]
//! # struct A1;
//! # #[derive(Clone)]
//! # struct A2;
//! # #[derive(Clone, PartialEq, Eq, Hash)]
//! # struct B1;
//! # #[derive(Clone, PartialEq, Eq, Hash)]
//! # struct B2;
//! # use inc_complete::storage::SingletonStorage;
//! # use inc_complete::define_input;
//! # define_input!(0, A1 -> i64, MySpreadsheet);
//! # define_input!(1, A2 -> i64, MySpreadsheet);
//! # use inc_complete::define_intermediate;
//! # define_intermediate!(2, B1 -> i64, MySpreadsheet, |_b1, db| {
//! #     // These functions should be pure but we're going to cheat here to
//! #     // make it obvious when a function is recomputed
//! #     println!("Computing B1!");
//! #     db.get(A1) + 8
//! # });
//! # // Larger programs may wish to pass an existing function instead of a closure
//! # define_intermediate!(3, B2 -> i64, MySpreadsheet, |_b2, db| {
//! #     println!("Computing B2!");
//! #     db.get(B1) + db.get(A2)
//! # });
//! use inc_complete::{ Storage, impl_storage };
//!
//! ##[derive(Storage, Default)]
//! struct MySpreadsheet {
//!     a1: SingletonStorage<A1>,
//!     a2: SingletonStorage<A2>,
//!     b1: SingletonStorage<B1>,
//!     b2: SingletonStorage<B2>,
//!
//!     // If you need to store non-computation data you can use the `skip` attribute.
//!     // Just be careful: changes to this data will not be tracked by inc-complete,
//!     // so it is possible to break incrementality! To avoid this, avoid mutating any `skip`
//!     // fields in the middle of an incremental computation.
//!     #[inc_complete(skip)]
//!     my_metadata: String,
//! }
//! ```
//!
//! In this example we're using `SingletonStorage` for all of our computations because
//! `A1`, `A2`, `B1`, and `B2` are all singleton types like `()`, with only a single value in
//! their type. This lets us store them in an `Option<T>` instead of a `HashMap<K, V>`.
//! If you are unsure which storage type to choose, `HashMapStorage<T>` is a good default.
//! Even if used on singletons it will give you correct behavior, just with slightly worse
//! performance than `SingletonStorage<T>`.
//!
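//! For comparison, here is a rough sketch of the same storage type with every field switched to
//! the more general `HashMapStorage` (note that the `Fibonacci` computation further below, which
//! is stored in a `HashMapStorage`, also derives `PartialEq`, `Eq`, and `Hash`):
//!
//! ```ignore
//! use inc_complete::storage::HashMapStorage;
//!
//! // Same fields as before, only the backing storage changes.
//! ##[derive(Storage, Default)]
//! struct MySpreadsheet {
//!     a1: HashMapStorage<A1>,
//!     a2: HashMapStorage<A2>,
//!     b1: HashMapStorage<B1>,
//!     b2: HashMapStorage<B2>,
//! }
//! ```
//!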
//! With that out of the way, we can finally create our `Db`, set the initial values for our
//! inputs, and run our program:
//!
//! ```
//! # #[derive(Clone)]
//! # struct A1;
//! # #[derive(Clone)]
//! # struct A2;
//! # #[derive(Clone, PartialEq, Eq, Hash)]
//! # struct B1;
//! # #[derive(Clone, PartialEq, Eq, Hash)]
//! # struct B2;
//! # use inc_complete::storage::SingletonStorage;
//! # #[derive(Default)]
//! # struct MySpreadsheet {
//! #     a1: SingletonStorage<A1>,
//! #     a2: SingletonStorage<A2>,
//! #     b1: SingletonStorage<B1>,
//! #     b2: SingletonStorage<B2>,
//! # }
//! # use inc_complete::{ define_input, impl_storage, define_intermediate };
//! # // This macro may be replaced with a derive proc macro in the future
//! # impl_storage!(MySpreadsheet,
//! #     a1: A1,
//! #     a2: A2,
//! #     b1: B1,
//! #     b2: B2,
//! # );
//! # define_input!(0, A1 -> i64, MySpreadsheet);
//! # define_input!(1, A2 -> i64, MySpreadsheet);
//! # define_intermediate!(2, B1 -> i64, MySpreadsheet, |_b1, db| {
//! #     // These functions should be pure but we're going to cheat here to
//! #     // make it obvious when a function is recomputed
//! #     println!("Computing B1!");
//! #     db.get(A1) + 8
//! # });
//! # // Larger programs may wish to pass an existing function instead of a closure
//! # define_intermediate!(3, B2 -> i64, MySpreadsheet, |_b2, db| {
//! #     println!("Computing B2!");
//! #     db.get(B1) + db.get(A2)
//! # });
//! type SpreadsheetDb = inc_complete::Db<MySpreadsheet>;
//!
//! fn main() {
//!     let mut db = SpreadsheetDb::new();
//!     db.update_input(A1, 12);
//!     db.update_input(A2, 4);
//!
//!     // Output:
//!     // Computing B2!
//!     // Computing B1!
//!     let b2 = db.get(B2);
//!     assert_eq!(b2, 24);
//!
//!     // No output, the result of B2 is cached
//!     let b2 = db.get(B2);
//!     assert_eq!(b2, 24);
//!
//!     // Now let's update an input
//!     db.update_input(A2, 10);
//!
//!     // B2 is now stale and gets recomputed, but crucially B1
//!     // does not depend on A2 and does not get recomputed.
//!     // Output:
//!     // Computing B2!
//!     let b2 = db.get(B2);
//!     assert_eq!(b2, 30);
//! }
//! ```
//!
//! ...And that's it for basic usage! If you want to delve deeper you can manually implement
//! `Storage` for your storage type, or implement `StorageFor` to define a new storage type for a
//! single computation (like the `SingletonStorage` and `HashMapStorage` types that inc-complete
//! defines).
//!
//! This example did not show it, but you can also use structs with fields in your computations, e.g.:
//!
//! ```
//! use inc_complete::{ storage::HashMapStorage, impl_storage, define_intermediate };
//!
//! #[derive(Default)]
//! struct MyStorageType {
//!     fibs: HashMapStorage<Fibonacci>,
//! }
//!
//! // The trailing `,` is required after each field type
//! impl_storage!(MyStorageType, fibs: Fibonacci,);
//!
//! // a fibonacci function with cached sub-results
//! #[derive(Clone, PartialEq, Eq, Hash)]
//! struct Fibonacci(u32);
//!
//! define_intermediate!(0, Fibonacci -> u32, MyStorageType, |fib, db| {
//!     let x = fib.0;
//!     if x <= 1 {
//!         x
//!     } else {
//!         // Not exponential time since each sub-computation will be cached!
//!         // In addition to `db.get(mytype)` we can also do `mytype.get(db)`
//!         Fibonacci(x - 1).get(db) + Fibonacci(x - 2).get(db)
//!     }
//! });
//! ```
//!
//! These fields often correspond to parameters of the function being modeled, in
//! this case the integer input to `fibonacci`.
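//!
//! As a rough usage sketch (reusing the definitions above and the same `Db` API shown in the
//! spreadsheet example), computing a Fibonacci number through the database might look like this:
//!
//! ```ignore
//! // Create a database over our storage type and ask for the 10th Fibonacci number.
//! // Sub-results such as Fibonacci(9), Fibonacci(8), ... are cached along the way.
//! let mut db = inc_complete::Db::<MyStorageType>::new();
//! assert_eq!(db.get(Fibonacci(10)), 55);
//! ```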
//!
//! ## Serialization
//!
//! Serialization can be done by serializing the `Db<S>` object. All cached computations
//! are stored in the provided storage type `S`, so it is up to users to decide how they
//! want to serialize this storage. As a starting point, it is recommended to tag newly
//! added fields with `#[serde(default)]` to keep serialization backwards-compatible
//! when deserializing previous versions of `S`. See the source file `tests/serialize.rs`
//! for an example of this.
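//!
//! As a rough sketch, assuming your storage type and its fields implement serde's `Serialize`
//! and `Deserialize` and that a format crate such as `serde_json` is available, saving and
//! restoring a database might look like this:
//!
//! ```ignore
//! // Persist the whole database, including every cached computation in MySpreadsheet.
//! let json = serde_json::to_string(&db).unwrap();
//! std::fs::write("cache.json", &json).unwrap();
//!
//! // Later (e.g. on the next run), restore it and keep querying incrementally.
//! let json = std::fs::read_to_string("cache.json").unwrap();
//! let mut db: inc_complete::Db<MySpreadsheet> = serde_json::from_str(&json).unwrap();
//! ```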
mod cell;
#[macro_use]
pub mod storage;
pub mod accumulate;
mod db;
mod interned;

pub use cell::Cell;
pub use db::{Db, DbGet, DbHandle};
pub use storage::{ComputationId, OutputType, Run, Storage, StorageFor};

pub use inc_complete_derive::{Input, Storage, intermediate};