// Copyright 2019-2025 ChainSafe Systems
// SPDX-License-Identifier: Apache-2.0, MIT

//! This is an empty module for documentation purposes.
//!
//! Documentation of core concepts belongs in-tree (in `/src`).
//!
//! Unless it better fits on a specific component, this module is the place for
//! documentation about:
//! - The current behavior of Forest
//! - Filecoin concepts
//!
//! Documentation that doesn't fit the above should be in `/documentation`:
//! - User-facing guides.
//! - Developer _process_ guides, e.g. memory profiling, release checklists.

/// This is a ground-up introduction to the different kinds of snapshot files,
/// covering:
/// 1. [Actors in Filecoin](#actors)
/// 2. [The Filecoin Blockchain](#the-filecoin-blockchain)
/// 3. [The Filecoin State Tree](#the-filecoin-state-tree)
/// 4. (Finally) [snapshots](#snapshots)
///
/// # Actors
///
/// The Filecoin Virtual Machine (FVM) hosts a number of _actors_.
/// These are objects that maintain and mutate internal state, and communicate
/// by passing messages.
///
/// An example of an actor is the [`cron`](fil_actors_shared::v11::runtime::builtins::Type::Cron)
/// actor.
/// Its [internal state](fil_actor_cron_state::v11::State) is a to-do list of
/// other actors to invoke every epoch.
///
/// See [the Filecoin docs](https://docs.filecoin.io/basics/the-blockchain/actors)
/// for more information about actors.
///
/// # The Filecoin blockchain
///
/// Filecoin consists of a blockchain of `messages`.
/// Listed below are the core objects for the blockchain.
/// Each one can be addressed by a [`Cid`](cid::Cid).
///
/// - [`Message`](crate::shim::message::Message)s are records of communication
///   between actors.
///   They describe and (equivalently) represent a change in _the state tree_ (see below).
///   See [`apply_block_messages`](crate::state_manager::apply_block_messages) to learn
///   more.
///   Messages may be [signed](crate::message::SignedMessage).
/// - `Message`s are grouped into [`Block`](crate::blocks::Block)s, each with a single
///   [`header`](crate::blocks::RawBlockHeader).
///   Blocks are what miners mine to earn `FIL` (money).
///   Each block defines an [_epoch_](crate::blocks::RawBlockHeader::epoch) and a
///   [_parent tipset_](crate::blocks::RawBlockHeader::parents).
///   The _epoch_ is a monotonically increasing number from `0` (genesis).
/// - `Block`s are grouped into [`Tipset`](crate::blocks::Tipset)s.
///   All blocks in a tipset share the same `epoch`.
///
/// ```text
///    ┌───────────────────────────────┐
///    │ BlockHeader { epoch:  0, .. } │ // The genesis block/tipset
/// ┌● └───────────────────────────────┘
/// ~
/// └──┬───────────────────────────────┐
///    │ BlockHeader { epoch: 10, .. } │ // The epoch 10 tipset - one block with two messages
/// ┌● └┬──────────────────────────────┘
/// │   │
/// │   │ "I contain the following messages..."
/// │   │
/// │   ├──────────────────┐
/// │   │ ┌──────────────┐ │ ┌───────────────────┐
/// │   └►│ Message:     │ └►│ Message:          │
/// │     │  Afri -> Bob │   │  Charlie -> David │
/// │     └──────────────┘   └───────────────────┘
/// │
/// │ "my parent is..."
/// │
/// └──┬───────────────────────────────┐
///    │ BlockHeader { epoch: 11, .. } │ // The epoch 11 tipset - one block with one message
/// ┌● └┬──────────────────────────────┘
/// │   │ ┌────────────────┐
/// │   └►│ Message:       │
/// │     │  Eric -> Frank │
/// │     └────────────────┘
/// │
/// │ // the epoch 12 tipset - two blocks, with a total of 3 messages
/// │
/// ├────────────────────────────────────┐
/// └──┬───────────────────────────────┐ └─┬───────────────────────────────┐
///    │ BlockHeader { epoch: 12, .. } │   │ BlockHeader { epoch: 12, .. } │
/// ┌● └┬──────────────────────────────┘   └┬─────────────────────┬────────┘
/// ~   │ ┌───────────────────────┐         │ ┌─────────────────┐ │ ┌──────────────┐
///     └►│ Message:              │         └►│ Message:        │ └►│ Message:     │
///       │  Guillaume -> Hailong │           │  Hubert -> Ivan │   │  Josh -> Kai │
///       └───────────────────────┘           └─────────────────┘   └──────────────┘
/// ```
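///
/// The grouping drawn above can be sketched with toy types.
/// This is only an illustration: `ToyHeader` and `ToyTipset` are hypothetical
/// names, not Forest's real `Block` or `Tipset` API, and plain integers stand
/// in for CIDs.
/// The sketch models a single rule from the text: all blocks in a tipset share
/// the same epoch.
///
/// ```rust
/// // Hypothetical stand-in for a block header: only the fields discussed above.
/// #[derive(Debug, Clone, PartialEq)]
/// struct ToyHeader {
///     epoch: u64,
///     // Integers standing in for the CIDs of the parent tipset's blocks.
///     parents: Vec<u64>,
/// }
///
/// // Hypothetical stand-in for a tipset: a non-empty group of headers.
/// #[derive(Debug)]
/// struct ToyTipset {
///     headers: Vec<ToyHeader>,
/// }
///
/// impl ToyTipset {
///     // Returns `None` for an empty group or for headers with mixed epochs.
///     fn new(headers: Vec<ToyHeader>) -> Option<Self> {
///         let epoch = headers.first()?.epoch;
///         if headers.iter().all(|h| h.epoch == epoch) {
///             Some(ToyTipset { headers })
///         } else {
///             None
///         }
///     }
///
///     fn epoch(&self) -> u64 {
///         self.headers[0].epoch
///     }
/// }
///
/// fn main() {
///     // Two blocks at epoch 12 with the same parent, as in the diagram.
///     let a = ToyHeader { epoch: 12, parents: vec![11] };
///     let b = ToyHeader { epoch: 12, parents: vec![11] };
///     let tipset = ToyTipset::new(vec![a.clone(), b]).expect("same epoch");
///     assert_eq!(tipset.epoch(), 12);
///
///     // A block from another epoch cannot join the same tipset.
///     let c = ToyHeader { epoch: 13, parents: vec![12] };
///     assert!(ToyTipset::new(vec![a, c]).is_none());
/// }
/// ```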
///
/// The [`ChainMuxer`](crate::chain_sync::ChainMuxer) receives two kinds of [messages](crate::libp2p::PubsubMessage)
/// from peers:
/// - [`GossipBlock`](crate::blocks::GossipBlock)s are descriptions of a single block, with the `BlockHeader` and `Message` CIDs.
/// - [`SignedMessage`](crate::message::SignedMessage)s.
///
/// It assembles these messages into a chain reaching back to genesis.
///
/// Filecoin implementations store all the above in the `ChainStore`, per
/// [the spec](https://github.com/filecoin-project/specs/blob/936f07f9a444036fe86442c919940ea0e4fb0a0b/content/systems/filecoin_nodes/repository/ipldstore/_index.md?plain=1#L43-L50).
///
/// # The Filecoin state tree
///
/// `Message`s describe/represent mutations in the [`StateTree`](crate::shim::state_tree::StateTree),
/// which is a representation of all Filecoin state at a point in time.
/// For each actor, the `StateTree` holds the CID for its state: [`ActorState.state`](fvm4::state_tree::ActorState::state).
///
/// Actor state is serialized and stored as [`Ipld`](ipld_core::ipld::Ipld).
/// Think of this as "JSON with links ([`Cid`](cid::Cid)s)".
/// So the `cron` actor's state mentioned above will ultimately be serialized into `Ipld`
/// and stored in the `StateStore`, per
/// [the spec](https://github.com/filecoin-project/specs/blob/936f07f9a444036fe86442c919940ea0e4fb0a0b/content/systems/filecoin_nodes/repository/ipldstore/_index.md?plain=1#L43-L50).
///
/// It isn't feasible to create a new copy of actor states whenever they change.
/// That is, in a fictional[^1] example of a `cron` actor, starting with a [`crontab`](https://man7.org/linux/man-pages/man5/crontab.5.html)
/// with 10 items, mutation of the state should _not_ simply duplicate the state:
/// ```text
/// Previous state             Current state
/// ┌───────────────────────┐  ┌───────────────────────┐
/// │Crontab                │  │Crontab                │
/// │1. Get out of bed      │  │1. Get out of bed      │
/// │2. Shower              │  │2. Shower              │
/// │...                    │  │...                    │
/// │10. Take over the world│  │10. Take over the world│
/// └───────────────────────┘  │11. Throw a party      │
///                            └───────────────────────┘
/// ```
/// But should instead be able to refer to the previous state:
/// ```text
/// Previous state             Current state
/// ┌───────────────────────┐  ┌─────────────────┐
/// │Crontab                │◄─┤(See CID...)     │
/// │1. Get out of bed      │  ├─────────────────┤
/// │2. Shower              │  │11. Throw a party│
/// │...                    │  └─────────────────┘
/// │10. Take over the world│
/// └───────────────────────┘
/// ```
/// And removal of e.g. the latest entry works similarly, _orphaning_ the removed
/// item.
/// ```text
/// Previous state             Orphaned item        Current state
/// ┌───────────────────────┐                       ┌────────────┐
/// │Crontab                │◄──────────────────────┤(See CID...)│
/// │1. Get out of bed      │  ┌─────────────────┐  └────────────┘
/// │2. Shower              │  │11. Throw a party│
/// │...                    │  └─────────────────┘
/// │10. Take over the world│
/// └───────────────────────┘
/// ```
///
/// [^1]: The real `cron` actor doesn't mutate state like this.
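///
/// The sharing shown in the diagrams above can be sketched with a toy
/// content-addressed store.
/// Everything here is an assumption made for illustration: `ToyStore`, `Node`
/// and `ToyCid` are hypothetical names, a `u64` hash stands in for a real
/// [`Cid`](cid::Cid), and the "append node" layout is not how any real
/// Filecoin data structure encodes its data.
/// The point is only that a new state can link to the previous one instead of
/// copying it, and that "removing" an entry leaves an orphan behind.
///
/// ```rust
/// use std::collections::hash_map::DefaultHasher;
/// use std::collections::HashMap;
/// use std::hash::{Hash, Hasher};
///
/// // Stand-in for a CID: a hash of the node's content (not a real CID).
/// type ToyCid = u64;
///
/// // Either the original list, or one new entry plus a link to the previous state.
/// #[derive(Hash, Clone, Debug)]
/// enum Node {
///     Base(Vec<String>),
///     Append { previous: ToyCid, entry: String },
/// }
///
/// #[derive(Default)]
/// struct ToyStore {
///     blocks: HashMap<ToyCid, Node>,
/// }
///
/// impl ToyStore {
///     // Store a node under the hash of its content and return that hash.
///     fn put(&mut self, node: Node) -> ToyCid {
///         let mut hasher = DefaultHasher::new();
///         node.hash(&mut hasher);
///         let cid = hasher.finish();
///         self.blocks.insert(cid, node);
///         cid
///     }
///
///     // Follow the links back to the base list to materialize all entries.
///     // Returns `None` if any link cannot be resolved in this store.
///     fn entries(&self, root: ToyCid) -> Option<Vec<String>> {
///         match self.blocks.get(&root)? {
///             Node::Base(items) => Some(items.clone()),
///             Node::Append { previous, entry } => {
///                 let mut items = self.entries(*previous)?;
///                 items.push(entry.clone());
///                 Some(items)
///             }
///         }
///     }
/// }
///
/// fn main() {
///     let mut store = ToyStore::default();
///     let previous = store.put(Node::Base(vec![
///         "Get out of bed".to_string(),
///         "Shower".to_string(),
///     ]));
///
///     // Adding an entry stores one small node that links to the previous state.
///     let current = store.put(Node::Append {
///         previous,
///         entry: "Throw a party".to_string(),
///     });
///     assert_eq!(store.entries(current).unwrap().len(), 3);
///
///     // "Removing" the entry just moves the root back to `previous`.
///     // The append node is now orphaned, but it is still in the store.
///     let current = previous;
///     assert_eq!(store.entries(current).unwrap().len(), 2);
///     assert_eq!(store.blocks.len(), 2);
/// }
/// ```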
///
/// Data structures that reach into the past of the `StateStore` like this are:
/// - ["AMT"](fil_actors_shared::fvm_ipld_amt), a list.
/// - ["HAMT"](fil_actors_shared::fvm_ipld_hamt), a map.
///
/// Therefore, the Filecoin state is, indeed, a tree of IPLD data.
/// It can be addressed by the root of the tree, so it is often referred to as
/// the _state root_.
///
/// We will now introduce some new terminology given the above information.
///
/// With respect to a particular IPLD [`Blockstore`](fvm_ipld_blockstore::Blockstore):
/// - An item such as a list is _fully inhabited_ if all its recursive
///   [`Ipld::Link`](ipld_core::ipld::Ipld::Link)s exist in the blockstore.
/// - Otherwise, an item is only _partially inhabited_.
///   The links that are missing from the blockstore are said to be "dead links".
///
/// With respect to a particular `StateTree`:
/// - An item is _orphaned_ if it is not reachable from the current state tree
///   through any links.
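///
/// These definitions can be made concrete with a small sketch.
/// It assumes a toy blockstore in which every block is just the list of CIDs
/// it links to. `ToyCid`, `ToyBlockstore`, `walk`, `is_fully_inhabited` and
/// `orphans` are hypothetical names for this illustration, not part of the
/// real [`Blockstore`](fvm_ipld_blockstore::Blockstore) trait.
///
/// ```rust
/// use std::collections::{HashMap, HashSet};
///
/// // Toy stand-ins: a CID is an integer, a block is the list of its links.
/// type ToyCid = u32;
/// type ToyBlockstore = HashMap<ToyCid, Vec<ToyCid>>;
///
/// // Visit everything linked from `root`.
/// // Returns the CIDs that were visited and whether every link resolved.
/// fn walk(store: &ToyBlockstore, root: ToyCid) -> (HashSet<ToyCid>, bool) {
///     let mut seen = HashSet::new();
///     let mut all_links_resolved = true;
///     let mut stack = vec![root];
///     while let Some(cid) = stack.pop() {
///         if !seen.insert(cid) {
///             continue; // already visited
///         }
///         match store.get(&cid) {
///             Some(links) => stack.extend(links.iter().copied()),
///             None => all_links_resolved = false, // a dead link
///         }
///     }
///     (seen, all_links_resolved)
/// }
///
/// // Fully inhabited: every recursive link exists in the blockstore.
/// fn is_fully_inhabited(store: &ToyBlockstore, root: ToyCid) -> bool {
///     walk(store, root).1
/// }
///
/// // Orphaned: stored, but not reachable from `root` through any links.
/// fn orphans(store: &ToyBlockstore, root: ToyCid) -> Vec<ToyCid> {
///     let (reachable, _) = walk(store, root);
///     let mut out: Vec<ToyCid> = store
///         .keys()
///         .copied()
///         .filter(|cid| !reachable.contains(cid))
///         .collect();
///     out.sort();
///     out
/// }
///
/// fn main() {
///     let mut store = ToyBlockstore::new();
///     store.insert(1, vec![2, 3]); // root, links to 2 and 3
///     store.insert(2, vec![]);
///     store.insert(4, vec![]); // nothing links to 4
///
///     // Block 3 is missing, so the root is only partially inhabited.
///     assert!(!is_fully_inhabited(&store, 1));
///     assert_eq!(orphans(&store, 1), vec![4]);
///
///     // Once block 3 is stored, the root becomes fully inhabited.
///     store.insert(3, vec![]);
///     assert!(is_fully_inhabited(&store, 1));
/// }
/// ```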
///
/// # Snapshots
///
/// Recall that for each message execution, the state tree is mutated.
/// Therefore, each epoch is associated with a state tree after execution,
/// and a [_parent state tree_](crate::blocks::RawBlockHeader::state_root).
///
/// ```text
///                                       // state after execution of
///                                       // all messages in that epoch
///    ┌───────────────────────────────┐  ┌────────────┐
///    │ BlockHeader { epoch:  0, .. } │  │ state root ├──► initial actor states...
/// ┌● └───────────────────────────────┘  └────────────┘                    ▲   ▲
/// ~                                         // links to redundant data ─● │   │
/// └──┬───────────────────────────────┐  ┌────────────┐                    │   │
///    │ BlockHeader { epoch: 11, .. } │  │ state root ├─┬► actor state ─► AMT  │
/// ┌● └┬──────────────────────────────┘  └────────────┘ ~                      │
/// │   │ ┌─────────┐                                    └► actor state ─► HAMT ┘
/// │   └►│ Message │                                                       │
/// │     └─────────┘                                                       ▼
/// ├──┬───────────────────────────────┐      // new data in this epoch ─● IPLD
/// │  │ BlockHeader { epoch: 12, .. } │
/// │  └┬─────────────┬────────────────┘
/// │   │ ┌─────────┐ │ ┌─────────┐
/// │   └►│ Message │ └►│ Message │
/// │     └─────────┘   └─────────┘                                         ~   ~
/// └──┬───────────────────────────────┐  ┌────────────┐                    │   │
///    │ BlockHeader { epoch: 12, .. } │  │ state root ├─┬► actor state ─► AMT  │
/// ┌● └┬──────────────────────────────┘  └────────────┘ ~                      │
/// ~   │ ┌─────────┐                                    └► actor state ─► HAMT ┘
///     └►│ Message │
///       └─────────┘
/// ```
///
/// We are now ready to define the different snapshot types for a given epoch N.
/// - A _lite snapshot_ contains:
///   - All block headers from genesis to epoch N.
///   - For the last W (width) epochs:
///     - The _fully inhabited_ state trees.
///     - The messages.
///   - For epochs 0..N-W, the state trees will be dead or partially inhabited.
/// - A _full snapshot_ contains:
///   - All block headers from genesis to epoch N.
///   - The fully inhabited state trees for epochs 0..N.
/// - A _diff snapshot_ contains:
///   - For epochs N-W..N:
///     - The block headers.
///     - The messages.
///     - New data in that epoch, which will be partially inhabited.
///
/// Successive diff snapshots may be concatenated:
/// - From genesis, to produce a full snapshot.
/// - From a lite snapshot, to produce a successive lite snapshot.
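///
/// The concatenation rules can be sketched by tracking only which epochs a
/// snapshot covers.
/// This is a toy model under loud assumptions: `ToySnapshot`, `ToyDiff`,
/// `full`, `lite` and `concat` are hypothetical names, no real snapshot file
/// format is involved, and epoch ranges are half-open (so `0..n` means the
/// epochs before `n`), which is a modelling choice made here and not something
/// the definitions above specify.
///
/// ```rust
/// use std::ops::Range;
///
/// // Which epochs a toy snapshot covers (half-open ranges, by assumption).
/// #[derive(Debug, Clone, PartialEq)]
/// struct ToySnapshot {
///     headers: Range<u64>, // epochs whose block headers are present
///     states: Range<u64>,  // epochs whose state trees are fully inhabited
/// }
///
/// // A toy diff snapshot: headers, messages and new data for these epochs.
/// #[derive(Debug, Clone, PartialEq)]
/// struct ToyDiff {
///     epochs: Range<u64>,
/// }
///
/// // Full snapshot: headers and fully inhabited state trees for all epochs.
/// fn full(n: u64) -> ToySnapshot {
///     ToySnapshot { headers: 0..n, states: 0..n }
/// }
///
/// // Lite snapshot: all headers, but state trees only for the last `w` epochs.
/// fn lite(n: u64, w: u64) -> ToySnapshot {
///     ToySnapshot { headers: 0..n, states: n.saturating_sub(w)..n }
/// }
///
/// // Append a diff that starts exactly where the base snapshot ends.
/// // The diff's new data resolves against the state already in the base.
/// fn concat(base: &ToySnapshot, diff: &ToyDiff) -> Option<ToySnapshot> {
///     if diff.epochs.start != base.headers.end || diff.epochs.end < diff.epochs.start {
///         return None; // a gap, an overlap, or a malformed diff
///     }
///     Some(ToySnapshot {
///         headers: base.headers.start..diff.epochs.end,
///         states: base.states.start..diff.epochs.end,
///     })
/// }
///
/// fn main() {
///     // From genesis: successive diffs add up to a full snapshot.
///     let genesis = full(0);
///     let step = concat(&genesis, &ToyDiff { epochs: 0..10 }).unwrap();
///     let step = concat(&step, &ToyDiff { epochs: 10..20 }).unwrap();
///     assert_eq!(step, full(20));
///
///     // From a lite snapshot: the result is a later lite snapshot.
///     let later = concat(&lite(20, 5), &ToyDiff { epochs: 20..30 }).unwrap();
///     assert_eq!(later.headers, 0..30);
///     assert_eq!(later.states, 15..30);
///
///     // A diff that does not line up with the base cannot be concatenated.
///     assert!(concat(&lite(20, 5), &ToyDiff { epochs: 25..30 }).is_none());
/// }
/// ```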
mod snapshots {}