// Copyright 2019-2025 ChainSafe Systems
// SPDX-License-Identifier: Apache-2.0, MIT

//! This is an empty module for documentation purposes.
//!
//! Documentation of core concepts belongs in-tree (in `/src`).
//!
//! Unless it better fits on a specific component, this module is the place for
//! documentation about:
//! - The current behavior of Forest
//! - Filecoin concepts
//!
//! Documentation that doesn't fit the above should be in `/documentation`:
//! - User-facing guides.
//! - Developer _process_ guides, e.g. memory profiling, release checklists.

/// This is a ground-up introduction to the different kinds of snapshot files,
/// covering:
/// 1. [Actors in Filecoin](#actors)
/// 2. [The Filecoin Blockchain](#the-filecoin-blockchain)
/// 3. [The Filecoin State Tree](#the-filecoin-state-tree)
/// 4. (Finally) [snapshots](#snapshots)
///
/// # Actors
///
/// The Filecoin Virtual Machine (FVM) hosts a number of _actors_.
/// These are objects that maintain and mutate internal state, and communicate
/// by passing messages.
///
/// An example of an actor is the [`cron`](fil_actors_shared::v11::runtime::builtins::Type::Cron)
/// actor.
/// Its [internal state](fil_actor_cron_state::v11::State) is a to-do list of
/// other actors to invoke every epoch.
///
/// See [the Filecoin docs](https://docs.filecoin.io/basics/the-blockchain/actors)
/// for more information about actors.
///
/// # The Filecoin blockchain
///
/// Filecoin consists of a blockchain of `messages`.
/// Listed below are the core objects for the blockchain.
/// Each one can be addressed by a [`Cid`](cid::Cid).
///
/// - [`Message`](crate::shim::message::Message)s are the messages passed between
///   actors.
///   They describe and (equivalently) represent a change in _the state tree_ (see below).
///   See [`apply_block_messages`](crate::state_manager::apply_block_messages) to learn
///   more.
///   Messages may be [signed](crate::message::SignedMessage).
/// - `Message`s are grouped into [`Block`](crate::blocks::Block)s, each with a single
///   [`header`](crate::blocks::RawBlockHeader).
///   Miners mine these blocks to earn `FIL` (money).
///   Each block defines an [_epoch_](crate::blocks::RawBlockHeader::epoch) and a
///   [_parent tipset_](crate::blocks::RawBlockHeader::parents).
///   The _epoch_ is a monotonically increasing number, starting from `0` (genesis).
/// - `Block`s are grouped into [`Tipset`](crate::blocks::Tipset)s.
///   All blocks in a tipset share the same `epoch`.
///
/// ```text
///      ┌───────────────────────────────┐
///      │ BlockHeader { epoch:  0, .. } │ //  The genesis block/tipset
///   ┌● └───────────────────────────────┘
///   ~
///   └──┬───────────────────────────────┐
///      │ BlockHeader { epoch: 10, .. } │ // The epoch 10 tipset - one block with two messages
///   ┌● └┬──────────────────────────────┘
///   │   │
///   │   │ "I contain the following messages..."
///   │   │
///   │   ├──────────────────┐
///   │   │ ┌──────────────┐ │ ┌───────────────────┐
///   │   └►│ Message:     │ └►│ Message:          │
///   │     │  Afri -> Bob │   │  Charlie -> David │
///   │     └──────────────┘   └───────────────────┘
///   │
///   │ "my parent is..."
///   │
///   └──┬───────────────────────────────┐
///      │ BlockHeader { epoch: 11, .. } │ // The epoch 11 tipset - one block with one message
///   ┌● └┬──────────────────────────────┘
///   │   │ ┌────────────────┐
///   │   └►│ Message:       │
///   │     │  Eric -> Frank │
///   │     └────────────────┘
///   │
///   │ // the epoch 12 tipset - two blocks, with a total of 3 messages
///   │
///   ├────────────────────────────────────┐
///   └──┬───────────────────────────────┐ └─┬───────────────────────────────┐
///      │ BlockHeader { epoch: 12, .. } │   │ BlockHeader { epoch: 12, .. } │
///   ┌● └┬──────────────────────────────┘   └┬─────────────────────┬────────┘
///   ~   │ ┌───────────────────────┐         │ ┌─────────────────┐ │ ┌──────────────┐
///       └►│ Message:              │         └►│ Message:        │ └►│ Message:     │
///         │  Guillaume -> Hailong │           │  Hubert -> Ivan │   │  Josh -> Kai │
///         └───────────────────────┘           └─────────────────┘   └──────────────┘
/// ```
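///
/// The grouping of messages into blocks, and of blocks into tipsets, can be
/// sketched in plain Rust.
/// This is only an illustration: `ToyMessage`, `ToyBlock`, `ToyTipset` and
/// `toy_block` are hypothetical names, not Forest's real types, and the
/// constructor checks nothing beyond the shared-epoch rule stated above.
///
/// ```rust
/// // Hypothetical stand-ins for `Message`, `Block` and `Tipset`.
/// #[derive(Debug, Clone, PartialEq)]
/// struct ToyMessage {
///     from: String,
///     to: String,
/// }
///
/// #[derive(Debug, Clone, PartialEq)]
/// struct ToyBlock {
///     epoch: u64,
///     messages: Vec<ToyMessage>,
/// }
///
/// #[derive(Debug, Clone, PartialEq)]
/// struct ToyTipset {
///     blocks: Vec<ToyBlock>,
/// }
///
/// impl ToyTipset {
///     // Returns `None` for an empty tipset, or if the blocks disagree on the epoch.
///     fn new(blocks: Vec<ToyBlock>) -> Option<Self> {
///         let epoch = blocks.first()?.epoch;
///         if blocks.iter().all(|block| block.epoch == epoch) {
///             Some(Self { blocks })
///         } else {
///             None
///         }
///     }
///
///     // The epoch shared by every block (the tipset is non-empty by construction).
///     fn epoch(&self) -> u64 {
///         self.blocks[0].epoch
///     }
///
///     // Total number of messages across all blocks in the tipset.
///     fn message_count(&self) -> usize {
///         self.blocks.iter().map(|block| block.messages.len()).sum()
///     }
/// }
///
/// // Convenience constructor: a block at `epoch` holding `(from, to)` messages.
/// fn toy_block(epoch: u64, messages: &[(&str, &str)]) -> ToyBlock {
///     ToyBlock {
///         epoch,
///         messages: messages
///             .iter()
///             .map(|(from, to)| ToyMessage {
///                 from: from.to_string(),
///                 to: to.to_string(),
///             })
///             .collect(),
///     }
/// }
/// ```
///
/// Building the epoch 12 tipset from the diagram (one block with one message,
/// one block with two) gives a `ToyTipset` with `epoch() == 12` and
/// `message_count() == 3`, while mixing an epoch 11 block with an epoch 12 block
/// is rejected.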
///
/// The [`ChainMuxer`](crate::chain_sync::ChainMuxer) receives two kinds of [messages](crate::libp2p::PubsubMessage)
/// from peers:
/// - [`GossipBlock`](crate::blocks::GossipBlock)s are descriptions of a single block, with the `BlockHeader` and `Message` CIDs.
/// - [`SignedMessage`](crate::message::SignedMessage)s
///
/// It assembles these messages into a chain to genesis.
///
/// Filecoin implementations store all the above in the `ChainStore`, per
/// [the spec](https://github.com/filecoin-project/specs/blob/936f07f9a444036fe86442c919940ea0e4fb0a0b/content/systems/filecoin_nodes/repository/ipldstore/_index.md?plain=1#L43-L50).
///
/// # The Filecoin state tree
///
/// `Message`s describe/represent mutations in the [`StateTree`](crate::shim::state_tree::StateTree),
/// which is a representation of all Filecoin state at a point in time.
/// For each actor, the `StateTree` holds the CID for its state: [`ActorState.state`](fvm4::state_tree::ActorState::state).
///
/// Actor state is serialized and stored as [`Ipld`](ipld_core::ipld::Ipld).
/// Think of this as "JSON with links ([`Cid`](cid::Cid)s)".
/// So the `cron` actor's state mentioned above will be ultimately serialized into `Ipld`
/// and stored in the `StateStore`, per
/// [the spec](https://github.com/filecoin-project/specs/blob/936f07f9a444036fe86442c919940ea0e4fb0a0b/content/systems/filecoin_nodes/repository/ipldstore/_index.md?plain=1#L43-L50).
///
/// It isn't feasible to create a new copy of actor states whenever they change.
/// That is, in a fictional [^1] example of a `cron` actor, starting with a [`crontab`](https://man7.org/linux/man-pages/man5/crontab.5.html)
/// with 10 items, mutation of the state should _not_ simply duplicate the state:
/// ```text
/// Previous state             Current state
/// ┌───────────────────────┐  ┌───────────────────────┐
/// │Crontab                │  │Crontab                │
/// │1. Get out of bed      │  │1. Get out of bed      │
/// │2. Shower              │  │2. Shower              │
/// │...                    │  │...                    │
/// │10. Take over the world│  │10. Take over the world│
/// └───────────────────────┘  │11. Throw a party      │
///                            └───────────────────────┘
/// ```
/// Instead, the current state should be able to refer to the previous state:
/// ```text
/// Previous state             Current state
/// ┌───────────────────────┐  ┌─────────────────┐
/// │Crontab                │◄─┤(See CID...)     │
/// │1. Get out of bed      │  ├─────────────────┤
/// │2. Shower              │  │11. Throw a party│
/// │...                    │  └─────────────────┘
/// │10. Take over the world│
/// └───────────────────────┘
/// ```
/// Removal of e.g. the latest entry works similarly, _orphaning_ the removed
/// item.
/// ```text
/// Previous state             Orphaned item        Current state
/// ┌───────────────────────┐                       ┌────────────┐
/// │Crontab                │◄──────────────────────┤(See CID...)│
/// │1. Get out of bed      │  ┌─────────────────┐  └────────────┘
/// │2. Shower              │  │11. Throw a party│
/// │...                    │  └─────────────────┘
/// │10. Take over the world│
/// └───────────────────────┘
/// ```
///
/// [^1]: The real `cron` actor doesn't mutate state like this.
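///
/// The "refer to the previous state" idea can be sketched with a toy
/// content-addressed store.
/// Everything here is an assumption made for illustration: `ToyCid`, `ToyNode`
/// and `ToyStore` are hypothetical, a `ToyCid` is a non-cryptographic 64-bit
/// hash rather than a real [`Cid`](cid::Cid), and a linked "append" node is much
/// simpler than the real AMT and HAMT layouts.
///
/// ```rust
/// use std::collections::hash_map::DefaultHasher;
/// use std::collections::HashMap;
/// use std::hash::{Hash, Hasher};
///
/// // Toy content identifier: a hash of the node's contents.
/// type ToyCid = u64;
///
/// #[derive(Debug, Clone, Hash, PartialEq)]
/// enum ToyNode {
///     // A self-contained list of entries.
///     Entries(Vec<String>),
///     // "See CID...", plus the entries added on top of that base.
///     Append { base: ToyCid, added: Vec<String> },
/// }
///
/// #[derive(Debug, Default)]
/// struct ToyStore {
///     blocks: HashMap<ToyCid, ToyNode>,
/// }
///
/// impl ToyStore {
///     // Stores a node under the hash of its contents and returns that hash.
///     fn put(&mut self, node: ToyNode) -> ToyCid {
///         let mut hasher = DefaultHasher::new();
///         node.hash(&mut hasher);
///         let cid = hasher.finish();
///         self.blocks.insert(cid, node);
///         cid
///     }
///
///     // Resolves a node to its full list of entries by following `base` links.
///     // Returns `None` if any node on the way is missing from the store.
///     fn read(&self, cid: ToyCid) -> Option<Vec<String>> {
///         match self.blocks.get(&cid)? {
///             ToyNode::Entries(entries) => Some(entries.clone()),
///             ToyNode::Append { base, added } => {
///                 let mut all = self.read(*base)?;
///                 all.extend(added.iter().cloned());
///                 Some(all)
///             }
///         }
///     }
/// }
/// ```
///
/// Appending "Throw a party" stores one small `Append` node next to the
/// untouched previous list, so both the previous and the current state stay
/// readable.
/// Removing that entry again just means using the previous `ToyCid` as the
/// current state; the `Append` node remains in the store but is no longer
/// referenced.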
///
/// Data structures that reach into the past of the `StateStore` like this are:
/// - ["AMT"](fil_actors_shared::fvm_ipld_amt), a list.
/// - ["HAMT"](fil_actors_shared::fvm_ipld_hamt), a map.
///
/// Therefore, the Filecoin state is, indeed, a tree of IPLD data.
/// It can be addressed by the root of the tree, so it is often referred to as
/// the _state root_.
///
/// We will now introduce some new terminology given the above information.
///
/// With respect to a particular IPLD [`Blockstore`](fvm_ipld_blockstore::Blockstore):
/// - An item such as a list is _fully inhabited_ if all its recursive
///   [`Ipld::Link`](ipld_core::ipld::Ipld::Link)s exist in the blockstore.
/// - Otherwise, an item is only _partially inhabited_.
///   The links that do not exist in the blockstore are said to be "dead links".
///
/// With respect to a particular `StateTree`:
/// - An item is _orphaned_ if it is not reachable from the current state tree
///   through any links.
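///
/// These definitions can be made concrete with a small sketch.
/// It assumes a toy blockstore that maps each CID to the links contained in
/// that block; `ToyCid`, `ToyBlockstore`, `walk`, `is_fully_inhabited` and
/// `orphans` are hypothetical names and not part of the real
/// [`Blockstore`](fvm_ipld_blockstore::Blockstore) API.
///
/// ```rust
/// use std::collections::{HashMap, HashSet};
///
/// type ToyCid = u32;
/// // Toy blockstore: CID -> the links contained in that block.
/// type ToyBlockstore = HashMap<ToyCid, Vec<ToyCid>>;
///
/// // Follows all recursive links from `root`.
/// // Returns the CIDs that were found in the store, and the dead links.
/// fn walk(store: &ToyBlockstore, root: ToyCid) -> (HashSet<ToyCid>, HashSet<ToyCid>) {
///     let mut present = HashSet::new();
///     let mut dead = HashSet::new();
///     let mut stack = vec![root];
///     while let Some(cid) = stack.pop() {
///         match store.get(&cid) {
///             Some(links) => {
///                 // Only descend into blocks that have not been visited yet.
///                 if present.insert(cid) {
///                     stack.extend(links.iter().copied());
///                 }
///             }
///             None => {
///                 dead.insert(cid);
///             }
///         }
///     }
///     (present, dead)
/// }
///
/// // An item is fully inhabited if following its links finds no dead link.
/// fn is_fully_inhabited(store: &ToyBlockstore, root: ToyCid) -> bool {
///     walk(store, root).1.is_empty()
/// }
///
/// // Blocks in the store that are not reachable from `state_root`.
/// fn orphans(store: &ToyBlockstore, state_root: ToyCid) -> HashSet<ToyCid> {
///     let (reachable, _) = walk(store, state_root);
///     store
///         .keys()
///         .copied()
///         .filter(|cid| !reachable.contains(cid))
///         .collect()
/// }
/// ```
///
/// With a store holding blocks `1 -> [2, 3]`, `2` and `4`, the item rooted at
/// `1` is only partially inhabited (`3` is a dead link), and block `4` is
/// orphaned with respect to state root `1`.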
///
/// # Snapshots
///
/// Recall that for each message execution, the state tree is mutated.
/// Therefore, each epoch is associated with a state tree after execution,
/// and a [_parent state tree_](crate::blocks::RawBlockHeader::state_root).
///
/// ```text
///                                            // state after execution of
///                                            // all messages in that epoch
///      ┌───────────────────────────────┐ ┌────────────┐
///      │ BlockHeader { epoch:  0, .. } │ │ state root ├──► initial actor states...
///   ┌● └───────────────────────────────┘ └────────────┘                    ▲   ▲
///   ~                                        // links to redundant data ─● │   │
///   └──┬───────────────────────────────┐ ┌────────────┐                    │   │
///      │ BlockHeader { epoch: 11, .. } │ │ state root ├─┬► actor state ─► AMT  │
///   ┌● └┬──────────────────────────────┘ └────────────┘ ~                      │
///   │   │ ┌─────────┐                                   └► actor state ─► HAMT ┘
///   │   └►│ Message │                                                      │
///   │     └─────────┘                                                      ▼
///   ├──┬───────────────────────────────┐     // new data in this epoch ─● IPLD
///   │  │ BlockHeader { epoch: 12, .. } │
///   │  └┬─────────────┬────────────────┘
///   │   │ ┌─────────┐ │ ┌─────────┐
///   │   └►│ Message │ └►│ Message │
///   │     └─────────┘   └─────────┘                                        ~   ~
///   └──┬───────────────────────────────┐ ┌────────────┐                    │   │
///      │ BlockHeader { epoch: 12, .. } │ │ state root ├─┬► actor state ─► AMT  │
///   ┌● └┬──────────────────────────────┘ └────────────┘ ~                      │
///   ~   │ ┌─────────┐                                   └► actor state ─► HAMT ┘
///       └►│ Message │
///         └─────────┘
/// ```
///
/// We are now ready to define the different snapshot types for a given epoch N.
/// - A _lite snapshot_ contains:
///   - All block headers from genesis to epoch N.
///   - For the last W (width) epochs:
///     - The _fully inhabited_ state trees.
///     - The messages.
///   - For epochs 0..N-W, the state trees will be dead or partially inhabited.
/// - A _full snapshot_ contains:
///   - All block headers from genesis to epoch N.
///   - The fully inhabited state trees for epochs 0..N.
/// - A _diff snapshot_ contains:
///   - For epochs N-W..N:
///     - The block headers.
///     - The messages.
///     - New data in that epoch, which will be partially inhabited.
///
/// Successive diff snapshots may be concatenated:
/// - From genesis, to produce a full snapshot.
/// - From a lite snapshot, to produce a successive lite snapshot.
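///
/// Why concatenation works can be sketched by modelling a snapshot as nothing
/// more than the set of blocks it carries.
/// This is an assumption made for illustration, not a description of the real
/// snapshot file format; `ToySnapshot`, `concat` and `count_dead_links` are
/// hypothetical names, and the toy data is assumed to be acyclic, as
/// content-addressed data is.
///
/// ```rust
/// use std::collections::HashMap;
///
/// type ToyCid = u32;
/// // Toy snapshot: CID -> the links contained in that block.
/// type ToySnapshot = HashMap<ToyCid, Vec<ToyCid>>;
///
/// // Concatenation keeps every block carried by any of the parts.
/// fn concat(parts: &[ToySnapshot]) -> ToySnapshot {
///     let mut combined = ToySnapshot::new();
///     for part in parts {
///         for (cid, links) in part {
///             combined.insert(*cid, links.clone());
///         }
///     }
///     combined
/// }
///
/// // Counts the links followed from `root` whose target block is missing.
/// // A missing `root` counts as one dead link.
/// fn count_dead_links(snapshot: &ToySnapshot, root: ToyCid) -> usize {
///     match snapshot.get(&root) {
///         None => 1,
///         Some(links) => links
///             .iter()
///             .map(|link| count_dead_links(snapshot, *link))
///             .sum(),
///     }
/// }
/// ```
///
/// Suppose the genesis data holds state root `1 -> [10, 11]`, and a later diff
/// holds only the new state root `2 -> [10, 12]` and the new block `12 -> [11]`.
/// On its own, the diff has two dead links (to `10` and `11`), so its state is
/// partially inhabited; after concatenating it onto the genesis data, the state
/// rooted at `2` has none.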
mod snapshots {}