inc_complete/
lib.rs

1//! # Inc-Complete
2//!
3//! inc-complete is a library for incremental compilation supporting serialization from the ground
4//! up.
5//!
6//! In inc-complete, a central `Db` object is used to query and cache the result of _pure_
7//! functions. The functions being pure is key. If there are side-effects performed then
8//! they will not be re-performed when the computation's result is later cached and returned again.
9//!
10//! Before we create the `Db` object however, we need to define the storage type to store all
11//! the computations we want to cache. In inc-complete, each computation is its own type and is
12//! either an input (if it has no dependencies) or an intermediate computation. For this example
13//! we're going to model the following spreadsheet:
14//!
15//! ```text
16//!       [  A  ] [     B    ]
17//! [ 1 ] [ 12  ] [ =A1 + 8  ]
18//! [ 2 ] [  4  ] [ =B1 + A2 ]
19//! ```
20//!
21//! We will have two inputs: `A1` and `A2`, and two intermediates: `B1` and `B2` where
22//! `B1` depends on `A1` and `B2` depends on `B1` and `A2` directly, and `A1` transitively.
23//! All computations in inc-complete correspond to a struct representing that computation.
24//! This struct often holds the arguments for the corresponding function to perform but in
25//! this case none of our cells need any parameters besides the values of other cells to query.
26//!
27//! For each input type we'll derive `Input` and give it a unique id, along with 
28//! the output type of the function retrieving the input value, and the storage type we'll store
29//! all the metadata in - we'll define this type later on.
30//! 
31//! Let's start by defining our input types:
32//!
33//! ```
34//! # struct MySpreadsheet;
35//! use inc_complete::{ Input, define_input };
36//!
37//! ##[derive(Input, Clone)]
38//! ##[inc_complete(id = 0, output = i32, storage = MySpreadsheet)]
39//! struct A1;
40//!
41//! ##[derive(Input, Clone)]
42//! ##[inc_complete(id = 1, output = i32, storage = MySpreadsheet)]
43//! struct A2;
44//! ```
45//!
46//! Next, let's define the intermediate computations and the functions to compute them.
47//! For these we just need to define the computation type and use the `intermediate` macro 
48//! on a function to perform that computation. Note that this function should have
49//! exactly two arguments: the computation type itself (which may be a struct containing
50//! more arguments to use if needed), and a `DbHandle` containing your storage type:
51//!
52//! ```
53//! # use inc_complete::{ Input, define_input, Storage, impl_storage };
54//! # #[derive(Input, Clone)]
55//! # #[inc_complete(id = 0, output = i32, storage = MySpreadsheet)]
56//! # struct A1;
57//! # #[derive(Input, Clone)]
58//! # #[inc_complete(id = 1, output = i32, storage = MySpreadsheet)]
59//! # struct A2;
60//! # #[derive(Clone)]
61//! # struct B1;
62//! # use inc_complete::storage::SingletonStorage;
63//! # #[derive(Storage, Default)]
64//! # struct MySpreadsheet {
65//! #     a1: SingletonStorage<A1>,
66//! #     a2: SingletonStorage<A2>,
67//! #     b1: SingletonStorage<B1>,
68//! #     b2: SingletonStorage<B2>,
69//! #     #[inc_complete(skip)]
70//! #     my_metadata: String,
71//! # }
72//! use inc_complete::{ intermediate, define_intermediate };
73//!
74//! ##[intermediate(id = 2)]
75//! fn compute_b1(_ctx: &B1, db: &inc_complete::DbHandle<MySpreadsheet>) -> i32 {
76//!     // These functions should be pure but we're going to cheat here to
77//!     // make it obvious when a function is recomputed
78//!     println!("Computing B1!");
79//!     db.get(A1) + 8
80//! }
81//!
82//! ##[derive(Clone)]
83//! struct B2;
84//!
85//! ##[intermediate(id = 2)]
86//! fn compute_b2(_ctx: &B2, db: &inc_complete::DbHandle<MySpreadsheet>) -> i32 {
87//!     println!("Computing B2!");
88//!     db.get(B1) + db.get(A2)
89//! }
90//! ```
91//!
92//! Ceremony aside - this code should be relatively straight-forward. We `get` the value of
93//! any sub-computations we need and the `DbHandle` object automatically gives us the most
94//! up to date version of those computations - we'll examine this claim a bit closer later.
95//!
96//! Now we can define the actual storage type to hold all our computations. 
97//!
98//! ```
99//! # #[derive(Clone)]
100//! # struct A1;
101//! # #[derive(Clone)]
102//! # struct A2;
103//! # #[derive(Clone, PartialEq, Eq, Hash)]
104//! # struct B1;
105//! # #[derive(Clone, PartialEq, Eq, Hash)]
106//! # struct B2;
107//! # use inc_complete::storage::SingletonStorage;
108//! # use inc_complete::define_input;
109//! # define_input!(0, A1 -> i64, MySpreadsheet);
110//! # define_input!(1, A2 -> i64, MySpreadsheet);
111//! # use inc_complete::define_intermediate;
112//! # define_intermediate!(2, B1 -> i64, MySpreadsheet, |_b1, db| {
113//! #     // These functions should be pure but we're going to cheat here to
114//! #     // make it obvious when a function is recomputed
115//! #     println!("Computing B1!");
116//! #     db.get(A1) + 8
117//! # });
118//! # // Larger programs may wish to pass an existing function instead of a closure
119//! # define_intermediate!(3, B2 -> i64, MySpreadsheet, |_b2, db| {
120//! #     println!("Computing B2!");
121//! #     db.get(B1) + db.get(A2)
122//! # });
123//! use inc_complete::{ Storage, impl_storage };
124//!
125//! ##[derive(Storage, Default)]
126//! struct MySpreadsheet {
127//!     a1: SingletonStorage<A1>,
128//!     a2: SingletonStorage<A2>,
129//!     b1: SingletonStorage<B1>,
130//!     b2: SingletonStorage<B2>,
131//!
132//!     // If you need to store non-computation data you can use the `skip` attribute.
133//!     // Just be careful since changes to this data will not be tracked by inc-complete,
134//!     // it is possible to break incrementality! To avoid this, avoid mutating any `skip`
135//!     // fields in the middle of an incremental computation.
136//!     #[inc_complete(skip)]
137//!     my_metadata: String,
138//! }
139//! ```
140//!
141//! In this example, we're using `SingletonStorage` for all of our
142//! computations because all of `A1`, `A2`, `B1`, and `B2` are singleton values like `()` with
143//! only a single value in their type. This lets us store them with an `Option<T>` instead of a
144//! `HashMap<K, V>`. If you are unsure which storage type to choose, `HashMapStorage<T>`
145//! is a good default. Even if used on singletons it will give you correct
146//! behavior, just with slightly worse performance than `SingletonStorage<T>`.
147//!
148//! With that out of the way though, we can finally create our `Db`, set the initial values for our
149//! inputs, and run our program:
150//!
151//! ```
152//! # #[derive(Clone)]
153//! # struct A1;
154//! # #[derive(Clone)]
155//! # struct A2;
156//! # #[derive(Clone, PartialEq, Eq, Hash)]
157//! # struct B1;
158//! # #[derive(Clone, PartialEq, Eq, Hash)]
159//! # struct B2;
160//! # use inc_complete::storage::SingletonStorage;
161//! # #[derive(Default)]
162//! # struct MySpreadsheet {
163//! #     a1: SingletonStorage<A1>,
164//! #     a2: SingletonStorage<A2>,
165//! #     b1: SingletonStorage<B1>,
166//! #     b2: SingletonStorage<B2>,
167//! # }
168//! # use inc_complete::{ define_input, impl_storage, define_intermediate };
169//! # // This macro may be replaced with a derive proc macro in the future
170//! # impl_storage!(MySpreadsheet,
171//! #     a1: A1,
172//! #     a2: A2,
173//! #     b1: B1,
174//! #     b2: B2,
175//! # );
176//! # define_input!(0, A1 -> i64, MySpreadsheet);
177//! # define_input!(1, A2 -> i64, MySpreadsheet);
178//! # define_intermediate!(2, B1 -> i64, MySpreadsheet, |_b1, db| {
179//! #     // These functions should be pure but we're going to cheat here to
180//! #     // make it obvious when a function is recomputed
181//! #     println!("Computing B1!");
182//! #     db.get(A1) + 8
183//! # });
184//! # // Larger programs may wish to pass an existing function instead of a closure
185//! # define_intermediate!(3, B2 -> i64, MySpreadsheet, |_b2, db| {
186//! #     println!("Computing B2!");
187//! #     db.get(B1) + db.get(A2)
188//! # });
189//! type SpreadsheetDb = inc_complete::Db<MySpreadsheet>;
190//!
191//! fn main() {
192//!     let mut db = SpreadsheetDb::new();
193//!     db.update_input(A1, 12);
194//!     db.update_input(A2, 4);
195//!
196//!     // Output:
197//!     // Computing B2!
198//!     // Computing B1!
199//!     let b2 = db.get(B2);
200//!     assert_eq!(b2, 24);
201//!
202//!     // No output, result of B2 is cached
203//!     let b2 = db.get(B2);
204//!     assert_eq!(b2, 24);
205//!
206//!     // Now lets update an input
207//!     db.update_input(A2, 10);
208//!
209//!     // B2 is now stale and gets recomputed, but crucially B1
210//!     // does not depend on A2 and does not get recomputed.
211//!     // Output:
212//!     // Computing B2!
213//!     let b2 = db.get(B2);
214//!     assert_eq!(b2, 30);
215//! }
216//! ```
217//!
218//! ...And that's it for basic usage! If you want to delve deeper you can manually implement
219//! `Storage` for your storage type or `StorageFor` to define a new storage type for a single input
220//! (like `SingletonStorage` or `HashMapStorage` which inc-complete defines).
221//!
222//! This example did not show it but you can also use structs with fields in your computations, e.g:
223//!
224//! ```
225//! use inc_complete::{ storage::HashMapStorage, impl_storage, define_intermediate };
226//!
227//! #[derive(Default)]
228//! struct MyStorageType {
229//!     fibs: HashMapStorage<Fibonacci>,
230//! }
231//!
232//! // The trailing `,` is required after each field type
233//! impl_storage!(MyStorageType, fibs: Fibonacci,);
234//!
235//! // a fibonacci function with cached sub-results
236//! #[derive(Clone, PartialEq, Eq, Hash)]
237//! struct Fibonacci(u32);
238//!
239//! define_intermediate!(0, Fibonacci -> u32, MyStorageType, |fib, db| {
240//!     let x = fib.0;
241//!     if x <= 1 {
242//!         x
243//!     } else {
244//!         // Not exponential time since each sub-computation will be cached!
245//!         // In addition to `db.get(mytype)` we can also do `mytype.get(db)`
246//!         Fibonacci(x - 1).get(db) + Fibonacci(x - 2).get(db)
247//!     }
248//! });
249//! ```
250//!
251//! These fields often correspond to parameters of the function being modeled, in
252//! this case the integer input to `fibonacci`.
253//!
254//! ## Serialization
255//!
256//! Serialization can be done by serializing the `Db<S>` object. All cached computations
257//! are stored in the provided storage type `S` so it is up to users to decide how they
258//! want to serialize this storage. As a starting point, it is recommended to tag a field
259//! with `#[serde(default)]` when a new field is added to keep serialization backwards-compatible
260//! when deserializing previous versions of `S`. See the source file `tests/serialize.rs`
261//! as an example of this.
262mod cell;
263#[macro_use]
264pub mod storage;
265pub mod accumulate;
266mod db;
267mod interned;
268
269pub use cell::Cell;
270pub use db::{Db, DbGet, DbHandle};
271pub use storage::{ComputationId, OutputType, Run, Storage, StorageFor};
272
273pub use inc_complete_derive::{Input, Storage, intermediate};