inc_complete/
lib.rs

1//! # Inc-Complete
2//!
3//! inc-complete is a library for incremental compilation supporting serialization from the ground
4//! up.
5//!
6//! In inc-complete, a central `Db` object is used to query and cache the result of _pure_
7//! functions. The functions being pure is key. If there are side-effects performed then
8//! they will not be re-performed when the computation's result is later cached and returned again.
9//!
10//! Before we create the `Db` object however, we need to define the storage type to store all
11//! the computations we want to cache. In inc-complete, each computation is its own type and is
12//! either an input (if it has no dependencies) or an intermediate computation. For this example
13//! we're going to model the following spreadsheet:
14//!
15//! ```text
16//!       [  A  ] [     B    ]
17//! [ 1 ] [ 12  ] [ =A1 + 8  ]
18//! [ 2 ] [  4  ] [ =B1 + A2 ]
19//! ```
20//!
21//! We will have two inputs: `A1` and `A2`, and two intermediates: `B1` and `B2` where
22//! `B1` depends on `A1` and `B2` depends on `B1` and `A2` directly, and `A1` transitively.
23//! Let's start by defining these types:
24//!
25//! ```
26//! #[derive(Clone, Debug)]
27//! struct A1;
28//!
29//! #[derive(Clone, Debug)]
30//! struct A2;
31//!
32//! #[derive(Clone, Debug)]
33//! struct B1;
34//!
35//! #[derive(Clone, Debug)]
36//! struct B2;
37//! ```
38//!
39//! The derives are all necessary for some traits we'll implement later.
40//!
41//! Now we can define the actual storage type for all our computations. We need to
42//! call `impl_storage!` to implement a few traits for us. These can also be
43//! implemented manually, but there is little advantage in doing so.
44//!
45//! ```
46//! # #[derive(Clone, Debug)]
47//! # struct A1;
48//! # #[derive(Clone, Debug)]
49//! # struct A2;
50//! # #[derive(Clone, PartialEq, Eq, Hash)]
51//! # struct B1;
52//! # #[derive(Clone, PartialEq, Eq, Hash)]
53//! # struct B2;
54//! # use inc_complete::storage::SingletonStorage;
55//! # use inc_complete::define_input;
56//! # define_input!(0, A1 -> i64, Spreadsheet);
57//! # define_input!(1, A2 -> i64, Spreadsheet);
58//! # use inc_complete::define_intermediate;
59//! # define_intermediate!(2, B1 -> i64, Spreadsheet, |_b1, db| {
60//! #     // These functions should be pure but we're going to cheat here to
61//! #     // make it obvious when a function is recomputed
62//! #     println!("Computing B1!");
63//! #     *db.get(A1) + 8
64//! # });
65//! # // Larger programs may wish to pass an existing function instead of a closure
66//! # define_intermediate!(3, B2 -> i64, Spreadsheet, |_b2, db| {
67//! #     println!("Computing B2!");
68//! #     *db.get(B1) + *db.get(A2)
69//! # });
70//! use inc_complete::impl_storage;
71//!
72//! ##[derive(Default)]
73//! struct Spreadsheet {
74//!     a1: SingletonStorage<A1>,
75//!     a2: SingletonStorage<A2>,
76//!     b1: SingletonStorage<B1>,
77//!     b2: SingletonStorage<B2>,
78//! }
79//!
80//! // This macro may be replaced with a derive proc macro in the future
81//! impl_storage!(Spreadsheet,
82//!     a1: A1,
83//!     a2: A2,
84//!     b1: B1,
85//!     b2: B2,
86//! );
87//! ```
88//!
89//! In this example, we're using `SingletonStorage` for all of our
90//! computations because all of `A1`, `A2`, `B1`, and `B2` are singleton values like `()` with
91//! only a single value in their type. This lets us store them with an `Option<T>` instead of a
92//! `HashMap<K, V>`. If you are unsure which storage type to choose, `HashMapStorage<T>` or
93//! `BTreeMapStorage<T>` are good defaults. Even if used on singletons they will give you correct
94//! behavior, just with slightly worse performance than `SingletonStorage<T>`.
95//!
96//! Next, for `Input` types we now need to define:
97//! 1. What type the input is - for this spreadsheet example all our types are `i64`.
98//! 2. A unique computation id. This id uniquely identifies the particular computation type we
99//!    want to cache. If there are any duplicates inc-complete may call the wrong `run` function
100//!    to update a computation! These are also part of the serialized output so they are important
101//!    to keep stable across changes if you want your serialization to remain backwards-compatible.
102//!
103//! For both of these, we can use the `define_input!` macro:
104//! ```
105//! # #[derive(Clone, Debug)]
106//! # struct A1;
107//! # #[derive(Clone, Debug)]
108//! # struct A2;
109//! # #[derive(Clone, PartialEq, Eq, Hash)]
110//! # struct B1;
111//! # #[derive(Clone, PartialEq, Eq, Hash)]
112//! # struct B2;
113//! # use inc_complete::storage::SingletonStorage;
114//! # use inc_complete::{ impl_storage, DbHandle };
115//! # #[derive(Default)]
116//! # struct Spreadsheet {
117//! #     a1: SingletonStorage<A1>,
118//! #     a2: SingletonStorage<A2>,
119//! #     b1: SingletonStorage<B1>,
120//! #     b2: SingletonStorage<B2>,
121//! # }
122//! # // This macro may be replaced with a derive proc macro in the future
123//! # impl_storage!(Spreadsheet,
124//! #     a1: A1,
125//! #     a2: A2,
126//! #     b1: B1,
127//! #     b2: B2,
128//! # );
129//! # use inc_complete::define_intermediate;
130//! # define_intermediate!(2, B1 -> i64, Spreadsheet, |_b1: &B1, db: &mut DbHandle<Spreadsheet>| {
131//! #     // These functions should be pure but we're going to cheat here to
132//! #     // make it obvious when a function is recomputed
133//! #     println!("Computing B1!");
134//! #     *db.get(A1) + 8
135//! # });
136//! # // Larger programs may wish to pass an existing function instead of a closure
137//! # define_intermediate!(3, B2 -> i64, Spreadsheet, |_b2, db| {
138//! #     println!("Computing B2!");
139//! #     *db.get(B1) + *db.get(A2)
140//! # });
141//! use inc_complete::define_input;
142//!
143//! define_input!(0, A1 -> i64, Spreadsheet);
144//! define_input!(1, A2 -> i64, Spreadsheet);
145//! ```
146//!
147//! For intermediate computations we need to provide all of the above, along with a `run` function
148//! to compute their result. This function will have access to the computation type itself
149//! (which often store parameters as data) and a `DbHandle` object to query sub-computations with:
150//!
151//! ```
152//! # #[derive(Clone, Debug)]
153//! # struct A1;
154//! # #[derive(Clone, Debug)]
155//! # struct A2;
156//! # #[derive(Clone, PartialEq, Eq, Hash)]
157//! # struct B1;
158//! # #[derive(Clone, PartialEq, Eq, Hash)]
159//! # struct B2;
160//! # use inc_complete::storage::SingletonStorage;
161//! # #[derive(Default)]
162//! # struct Spreadsheet {
163//! #     a1: SingletonStorage<A1>,
164//! #     a2: SingletonStorage<A2>,
165//! #     b1: SingletonStorage<B1>,
166//! #     b2: SingletonStorage<B2>,
167//! # }
168//! # // This macro may be replaced with a derive proc macro in the future
169//! # impl_storage!(Spreadsheet,
170//! #     a1: A1,
171//! #     a2: A2,
172//! #     b1: B1,
173//! #     b2: B2,
174//! # );
175//! # use inc_complete::{ define_input, impl_storage };
176//! # define_input!(0, A1 -> i64, Spreadsheet);
177//! # define_input!(1, A2 -> i64, Spreadsheet);
178//! use inc_complete::{ define_intermediate, DbHandle };
179//!
180//! define_intermediate!(2, B1 -> i64, Spreadsheet, |_b1: &B1, db: &mut DbHandle<Spreadsheet>| {
181//!     // These functions should be pure but we're going to cheat here to
182//!     // make it obvious when a function is recomputed
183//!     println!("Computing B1!");
184//!     *db.get(A1) + 8
185//! });
186//!
187//! // Larger programs may wish to pass an existing function instead of a closure
188//! define_intermediate!(3, B2 -> i64, Spreadsheet, |_b2, db| {
189//!     println!("Computing B2!");
190//!     *db.get(B1) + *db.get(A2)
191//! });
192//! ```
193//!
194//! Ceremony aside - this code should be relatively straight-forward. We `get` the value of
195//! any sub-computations we need and the `DbHandle` object automatically gives us the most
196//! up to date version of those computations - we'll examine this claim a bit closer later.
197//!
198//! With that out of the way though, we can finally create our `Db`, set the initial values for our
199//! inputs, and run our program:
200//!
201//! ```
202//! # #[derive(Clone, Debug)]
203//! # struct A1;
204//! # #[derive(Clone, Debug)]
205//! # struct A2;
206//! # #[derive(Clone, PartialEq, Eq, Hash)]
207//! # struct B1;
208//! # #[derive(Clone, PartialEq, Eq, Hash)]
209//! # struct B2;
210//! # use inc_complete::storage::SingletonStorage;
211//! # #[derive(Default)]
212//! # struct Spreadsheet {
213//! #     a1: SingletonStorage<A1>,
214//! #     a2: SingletonStorage<A2>,
215//! #     b1: SingletonStorage<B1>,
216//! #     b2: SingletonStorage<B2>,
217//! # }
218//! # use inc_complete::{ define_input, impl_storage, define_intermediate };
219//! # // This macro may be replaced with a derive proc macro in the future
220//! # impl_storage!(Spreadsheet,
221//! #     a1: A1,
222//! #     a2: A2,
223//! #     b1: B1,
224//! #     b2: B2,
225//! # );
226//! # define_input!(0, A1 -> i64, Spreadsheet);
227//! # define_input!(1, A2 -> i64, Spreadsheet);
228//! # define_intermediate!(2, B1 -> i64, Spreadsheet, |_b1, db| {
229//! #     // These functions should be pure but we're going to cheat here to
230//! #     // make it obvious when a function is recomputed
231//! #     println!("Computing B1!");
232//! #     *db.get(A1) + 8
233//! # });
234//! # // Larger programs may wish to pass an existing function instead of a closure
235//! # define_intermediate!(3, B2 -> i64, Spreadsheet, |_b2, db| {
236//! #     println!("Computing B2!");
237//! #     *db.get(B1) + *db.get(A2)
238//! # });
239//! type SpreadsheetDb = inc_complete::Db<Spreadsheet>;
240//!
241//! fn main() {
242//!     let mut db = SpreadsheetDb::new();
243//!     db.update_input(A1, 12);
244//!     db.update_input(A2, 4);
245//!
246//!     // Output:
247//!     // Computing B2!
248//!     // Computing B1!
249//!     let b2 = *db.get(B2);
250//!     assert_eq!(b2, 24);
251//!
252//!     // No output, result of B2 is cached
253//!     let b2 = *db.get(B2);
254//!     assert_eq!(b2, 24);
255//!
256//!     // Now lets update an input
257//!     db.update_input(A2, 10);
258//!
259//!     // B2 is now stale and gets recomputed, but crucially B1
260//!     // does not depend on A2 and does not get recomputed.
261//!     // Output:
262//!     // Computing B2!
263//!     let b2 = *db.get(B2);
264//!     assert_eq!(b2, 30);
265//! }
266//! ```
267//!
268//! ...And that's it for basic usage! If you want to delve deeper you can manually implement
269//! `Storage` for your storage type or `StorageFor` to define a new storage type for a single input
270//! (like `SingletonStorage`, `HashMapStorage`, or `BTreeMapStorage` which inc-complete defines).
271//!
272//! This example did not show it but you can also use structs with fields in your computations, e.g:
273//!
274//! ```
275//! use inc_complete::{ storage::HashMapStorage, impl_storage, define_intermediate };
276//!
277//! #[derive(Default)]
278//! struct MyStorageType {
279//!     fibs: HashMapStorage<Fibonacci>,
280//! }
281//!
282//! impl_storage!(MyStorageType, fibs: Fibonacci);
283//!
284//! // a fibonacci function with cached sub-results
285//! #[derive(Clone, PartialEq, Eq, Hash)]
286//! struct Fibonacci(u32);
287//!
288//! define_intermediate!(0, Fibonacci -> u32, MyStorageType, |fib, db| {
289//!     let x = fib.0;
290//!     if x <= 1 {
291//!         x
292//!     } else {
293//!         // Not exponential time since each sub-computation will be cached!
294//!         *db.get(Fibonacci(x - 1)) + db.get(Fibonacci(x - 2))
295//!     }
296//! });
297//! ```
298//!
299//! These fields often correspond to parameters of the function being modeled, in
300//! this case the integer input to `fibonacci`.
301//!
302//! ## Serialization
303//!
304//! Serialization can be done by serializing the `Db<S>` object. All cached computations
305//! are stored in the provided storage type `S` so it is up to users to decide how they
306//! want to serialize this storage. As a starting point, it is recommended to tag a field
307//! with `#[serde(default)]` when a new field is added to keep serialization backwards-compatible
308//! when deserializing previous versions of `S`. See the source file `tests/serialize.rs`
309//! as an example of this.
310mod cell;
311#[macro_use]
312pub mod storage;
313mod db;
314mod interned;
315
316pub use cell::Cell;
317pub use db::{Db, DbHandle};
318pub use storage::{ComputationId, OutputType, Run, Storage, StorageFor};
inc_complete/lib.rs

inc_complete/
lib.rs