soil_network/
lib.rs

// This file is part of Soil.

// Copyright (C) Soil contributors.
// Copyright (C) Parity Technologies (UK) Ltd.
// SPDX-License-Identifier: GPL-3.0-or-later WITH Classpath-exception-2.0

#![warn(unused_extern_crates)]
#![warn(missing_docs)]

//! Substrate-specific P2P networking.
//!
//! **Important**: This crate is unstable and the API and usage may change.
//!
//! # Node identities and addresses
//!
//! In a decentralized network, each node possesses a network private key and a network public key.
//! In Substrate, the keys are based on the ed25519 curve.
//!
//! From a node's public key, we can derive its *identity*. In Substrate and libp2p, a node's
//! identity is represented with the [`PeerId`] struct. All network communications between nodes on
//! the network use encryption derived from both sides' keys, which means that **identities cannot
//! be faked**.
//!
//! A node's identity uniquely identifies a machine on the network. If you start two or more
//! clients using the same network key, they will interfere heavily with each other.
//!
//! # Substrate's network protocol
//!
//! Substrate's networking protocol is based upon libp2p. It is at the moment not possible and not
//! planned to permit using anything other than the libp2p network stack and the rust-libp2p
//! library. However, the libp2p framework is very flexible and the rust-libp2p library could be
//! extended to support a wider range of protocols than what is offered by libp2p.
//!
//! ## Discovery mechanisms
//!
//! In order for our node to join a peer-to-peer network, it has to know a list of nodes that are
//! part of said network. This includes the nodes' identities and their addresses (how to reach
//! them). Building such a list is called the **discovery** mechanism. There are three mechanisms
//! that Substrate uses:
//!
//! - Bootstrap nodes. These are hard-coded node identities and addresses passed alongside
//! the network configuration.
//! - mDNS. We perform a UDP broadcast on the local network. Nodes that listen may respond with
//! their identity. More info [here](https://github.com/libp2p/specs/blob/master/discovery/mdns.md).
//! mDNS can be disabled in the network configuration.
//! - Kademlia random walk. Once connected, we perform random Kademlia `FIND_NODE` requests on the
//! configured Kademlia DHTs (one per configured chain protocol) in order for nodes to propagate to
//! us their view of the network. More information about Kademlia can be found [on
//! Wikipedia](https://en.wikipedia.org/wiki/Kademlia).
//!
//! ## Connection establishment
//!
//! When node Alice knows node Bob's identity and address, it can establish a connection with Bob.
//! All connections must always use encryption and multiplexing. While some node addresses (e.g.
//! addresses using `/quic`) already imply which encryption and/or multiplexing to use, for others
//! the **multistream-select** protocol is used in order to negotiate an encryption layer and/or a
//! multiplexing layer.
//!
//! The connection establishment mechanism is called the **transport**.
//!
//! As of the writing of this documentation, the following base-layer protocols are supported by
//! Substrate:
//!
//! - TCP/IP for addresses of the form `/ip4/1.2.3.4/tcp/5`. Once the TCP connection is open, an
//! encryption and a multiplexing layer are negotiated on top.
//! - WebSockets for addresses of the form `/ip4/1.2.3.4/tcp/5/ws`. A TCP/IP connection is opened
//! and the WebSockets protocol is negotiated on top. Communications then happen inside WebSockets
//! data frames. Encryption and multiplexing are additionally negotiated again inside this channel.
//! - DNS for addresses of the form `/dns/example.com/tcp/5` or `/dns/example.com/tcp/5/ws`. A
//! node's address can contain a domain name.
//! - (All of the above using IPv6 instead of IPv4.)
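//!
//! As a rough illustration of the address forms listed above, the sketch below maps the textual
//! form of an address to the base-layer protocol it implies. The `base_layer` function and the
//! labels it returns are hypothetical and exist for this example only; this crate represents
//! addresses with the [`Multiaddr`] type rather than plain strings, and the sketch does not
//! validate the host or port components.
//!
//! ```rust
//! /// Hypothetical helper: names the base-layer protocol implied by an address string.
//! fn base_layer(addr: &str) -> Option<&'static str> {
//!     // Split `/ip4/1.2.3.4/tcp/5` into `["ip4", "1.2.3.4", "tcp", "5"]`.
//!     let parts: Vec<&str> = addr.split('/').filter(|p| !p.is_empty()).collect();
//!     match parts.as_slice() {
//!         ["ip4" | "ip6", _, "tcp", _] => Some("tcp"),
//!         ["ip4" | "ip6", _, "tcp", _, "ws"] => Some("websocket"),
//!         ["dns", _, "tcp", _] => Some("dns+tcp"),
//!         ["dns", _, "tcp", _, "ws"] => Some("dns+websocket"),
//!         // Any other shape is not one of the forms listed above.
//!         _ => None,
//!     }
//! }
//!
//! assert_eq!(base_layer("/ip4/1.2.3.4/tcp/5"), Some("tcp"));
//! assert_eq!(base_layer("/ip4/1.2.3.4/tcp/5/ws"), Some("websocket"));
//! assert_eq!(base_layer("/dns/example.com/tcp/5"), Some("dns+tcp"));
//! assert_eq!(base_layer("/ip4/1.2.3.4/udp/5"), None);
//! ```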
//!
//! On top of the base-layer protocol, the [Noise](https://noiseprotocol.org/) protocol is
//! negotiated and applied. The exact handshake protocol is experimental and is subject to change.
//!
//! The following multiplexing protocols are supported:
//!
//! - [Yamux](https://github.com/hashicorp/yamux/blob/master/spec.md).
//!
//! ## Substreams
//!
//! Once a connection has been established and uses multiplexing, substreams can be opened. When
//! a substream is open, the **multistream-select** protocol is used to negotiate which protocol
//! to use on that given substream.
//!
//! Protocols that are specific to a certain chain have a `<protocol-id>` in their name. This
//! "protocol ID" is defined in the chain specifications. For example, the protocol ID of Polkadot
//! is "dot". In the protocol names below, `<protocol-id>` must be replaced with the corresponding
//! protocol ID.
//!
//! > **Note**: It is possible for the same connection to be used for multiple chains. For example,
//! > one can use both the `/dot/sync/2` and `/sub/sync/2` protocols on the same
//! > connection, provided that the remote supports them.
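//!
//! The substitution described above can be sketched as plain string replacement. The
//! `protocol_name` function is a hypothetical helper for illustration only and is not part of
//! this crate's API.
//!
//! ```rust
//! /// Hypothetical helper: substitutes a chain's protocol ID into a protocol name template.
//! fn protocol_name(template: &str, protocol_id: &str) -> String {
//!     template.replace("<protocol-id>", protocol_id)
//! }
//!
//! // The same template yields one protocol name per chain.
//! assert_eq!(protocol_name("/<protocol-id>/sync/2", "dot"), "/dot/sync/2");
//! assert_eq!(protocol_name("/<protocol-id>/sync/2", "sub"), "/sub/sync/2");
//! ```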
//!
//! Substrate uses the following standard libp2p protocols:
//!
//! - **`/ipfs/ping/1.0.0`**. We periodically open an ephemeral substream in order to ping the
//! remote and check whether the connection is still alive. Failure of the remote to reply leads
//! to a disconnection.
//! - **[`/ipfs/id/1.0.0`](https://github.com/libp2p/specs/tree/master/identify)**. We
//! periodically open an ephemeral substream in order to request information from the remote.
//! - **[`/<protocol-id>/kad`](https://github.com/libp2p/specs/pull/108)**. We periodically open
//! ephemeral substreams for Kademlia random walk queries. Each Kademlia query is done in a
//! separate substream.
//!
//! Additionally, Substrate uses the following non-libp2p-standard protocols:
//!
//! - **`/substrate/<protocol-id>/<version>`** (where `<protocol-id>` must be replaced with the
//! protocol ID of the targeted chain, and `<version>` is a number between 2 and 6). For each
//! connection we optionally keep an additional substream for all Substrate-based communications
//! alive. This protocol is considered legacy, and is progressively being replaced with
//! alternatives. This is designated as "The legacy Substrate substream" in this documentation. See
//! below for more details.
//! - **`/<protocol-id>/sync/2`** is a request-response protocol (see below) that lets one perform
//! requests for information about blocks. Each request is the encoding of a `BlockRequest` and
//! each response is the encoding of a `BlockResponse`, as defined in the `api.v1.proto` file in
//! this source tree.
//! - **`/<protocol-id>/light/2`** is a request-response protocol (see below) that lets one perform
//! light-client-related requests for information about the state. Each request is the encoding of
//! a `light::Request` and each response is the encoding of a `light::Response`, as defined in the
//! `light.v1.proto` file in this source tree.
//! - **`/<protocol-id>/transactions/1`** is a notifications protocol (see below) where
//! transactions are pushed to other nodes. The handshake is empty on both sides. The message
//! format is a SCALE-encoded list of transactions, where each transaction is an opaque list of
//! bytes.
//! - **`/<protocol-id>/block-announces/1`** is a notifications protocol (see below) where
//! block announces are pushed to other nodes. The handshake is empty on both sides. The message
//! format is a SCALE-encoded tuple containing a block header followed by an opaque list of
//! bytes containing some data associated with this block announcement, e.g. a candidate message.
//! - Notifications protocols that are registered using
//! `NetworkConfiguration::notifications_protocols`. For example: `/paritytech/grandpa/1`. See
//! below for more information.
//!
//! ## The legacy Substrate substream
//!
//! Substrate uses a component named the **peerset manager (PSM)**. Through the discovery
//! mechanism, the PSM is aware of the nodes that are part of the network and decides which nodes
//! we should perform Substrate-based communications with. For these nodes, we open a connection
//! if necessary and open a unique substream for Substrate-based communications. If the PSM decides
//! that we should disconnect a node, then that substream is closed.
//!
//! For more information about the PSM, see the *sc-peerset* crate.
//!
//! Note that at the moment there is no mechanism in place to solve the issues that arise when the
//! two sides of a connection open the unique substream simultaneously. In order to not run into
//! issues, only the dialer of a connection is allowed to open the unique substream. When the
//! substream is closed, the entire connection is closed as well. This is a bug that will be
//! resolved by deprecating the protocol entirely.
//!
//! Within the unique Substrate substream, messages encoded using
//! [*parity-scale-codec*](https://github.com/paritytech/parity-scale-codec) are exchanged.
//! The details of these messages are not fully settled, but they can be found in the
//! `message.rs` file.
//!
//! Once the substream is open, the first step is an exchange of a *status* message from both
//! sides, containing information such as the chain root hash, head of chain, and so on.
//!
//! Communications within this substream include:
//!
//! - Syncing. Blocks are announced and requested from other nodes.
//! - Light-client requests. When a light client requires information, a random node we have a
//! substream open with is chosen, and the information is requested from it.
//! - Gossiping. Used for example by grandpa.
//!
//! ## Request-response protocols
//!
//! A so-called request-response protocol is defined as follows:
//!
//! - When a substream is opened, the opening side sends a message whose content is
//! protocol-specific. The message must be prefixed with an
//! [LEB128-encoded number](https://en.wikipedia.org/wiki/LEB128) indicating its length. After the
//! message has been sent, the writing side is closed.
//! - The remote sends back the response prefixed with a LEB128-encoded length, and closes its
//! side as well.
//!
//! Each request is performed in a new separate substream.
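//!
//! The length prefix can be illustrated with a minimal, standard-library-only sketch of unsigned
//! LEB128 framing. The `leb128_encode`, `leb128_decode`, `frame` and `unframe` functions are
//! hypothetical helpers written for this example; they are not the codec this crate uses
//! internally, and they operate on complete in-memory buffers rather than on a substream.
//!
//! ```rust
//! use std::convert::TryFrom;
//!
//! /// Appends `value` to `out` as an unsigned LEB128 number.
//! fn leb128_encode(mut value: u64, out: &mut Vec<u8>) {
//!     loop {
//!         // Take the low 7 bits; the high bit signals that more bytes follow.
//!         let byte = (value & 0x7f) as u8;
//!         value >>= 7;
//!         if value == 0 {
//!             out.push(byte);
//!             return;
//!         }
//!         out.push(byte | 0x80);
//!     }
//! }
//!
//! /// Reads an unsigned LEB128 number, returning it with the number of bytes consumed.
//! fn leb128_decode(input: &[u8]) -> Option<(u64, usize)> {
//!     let mut value = 0u64;
//!     // A `u64` needs at most 10 LEB128 bytes.
//!     for (i, &byte) in input.iter().enumerate().take(10) {
//!         // The tenth byte can only contribute the single remaining bit.
//!         if i == 9 && byte > 1 {
//!             return None;
//!         }
//!         value |= u64::from(byte & 0x7f) << (7 * i);
//!         if byte & 0x80 == 0 {
//!             return Some((value, i + 1));
//!         }
//!     }
//!     None
//! }
//!
//! /// Prefixes `message` with its LEB128-encoded length.
//! fn frame(message: &[u8]) -> Vec<u8> {
//!     let mut out = Vec::with_capacity(message.len() + 10);
//!     leb128_encode(message.len() as u64, &mut out);
//!     out.extend_from_slice(message);
//!     out
//! }
//!
//! /// Returns the message carried by a frame, or `None` if the frame is truncated.
//! fn unframe(input: &[u8]) -> Option<&[u8]> {
//!     let (len, used) = leb128_decode(input)?;
//!     let end = used.checked_add(usize::try_from(len).ok()?)?;
//!     input.get(used..end)
//! }
//!
//! // An empty message is framed as a single `0` byte.
//! assert_eq!(frame(b""), vec![0]);
//! assert_eq!(frame(b"hi"), vec![2, b'h', b'i']);
//! // A length of 300 needs two prefix bytes.
//! let big = vec![0u8; 300];
//! assert_eq!(frame(&big)[..2], [0xac, 0x02]);
//! assert_eq!(unframe(&frame(&big)), Some(&big[..]));
//! ```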
//!
//! ## Notifications protocols
//!
//! A so-called notifications protocol is defined as follows:
//!
//! - When a substream is opened, the opening side sends a handshake message whose content is
//! protocol-specific. The handshake message must be prefixed with an
//! [LEB128-encoded number](https://en.wikipedia.org/wiki/LEB128) indicating its length. The
//! handshake message can be of length 0, in which case the sender has to send a single `0`.
//! - The receiver then either immediately closes the substream, or answers with its own
//! LEB128-prefixed protocol-specific handshake response. The message can be of length 0, in which
//! case a single `0` has to be sent back.
//! - Once the handshake has completed, the notifications protocol is unidirectional. Only the
//! node which initiated the substream can push notifications. If the remote wants to send
//! notifications as well, it has to open its own unidirectional substream.
//! - Each notification must be prefixed with an LEB128-encoded length. The encoding of the
//! messages is specific to each protocol.
//! - Either party can signal that it doesn't want a notifications substream anymore by closing
//! its writing side. The other party should respond by closing its own writing side soon after.
//!
//! The API of `soil-network` allows one to register user-defined notification protocols.
//! `soil-network` automatically tries to open a substream towards each node for which the legacy
//! Substrate substream is open. The handshake is then performed automatically.
//!
//! For example, the `soil-grandpa` crate registers the `/paritytech/grandpa/1`
//! notifications protocol.
//!
//! At the moment, for backwards-compatibility, notification protocols are tied to the legacy
//! Substrate substream. Additionally, the handshake message is hardcoded to be a single 8-bit
//! integer representing the role of the node:
//!
//! - 1 for a full node.
//! - 2 for a light node.
//! - 4 for an authority.
//!
//! In the future, though, these restrictions will be removed.
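//!
//! As an illustration of the one-byte handshake described above, the sketch below decodes the
//! role byte. The `Role` enum and `decode_role` function are hypothetical stand-ins written for
//! this example; the crate has its own role types, and rejecting every value other than 1, 2 and
//! 4 is an assumption of this sketch.
//!
//! ```rust
//! /// Hypothetical stand-in for the role of a node.
//! #[derive(Debug, PartialEq)]
//! enum Role {
//!     Full,
//!     Light,
//!     Authority,
//! }
//!
//! /// Decodes a handshake that is expected to be exactly one byte long.
//! fn decode_role(handshake: &[u8]) -> Option<Role> {
//!     match handshake {
//!         [1] => Some(Role::Full),
//!         [2] => Some(Role::Light),
//!         [4] => Some(Role::Authority),
//!         // Unknown values, and empty or multi-byte handshakes, are rejected here.
//!         _ => None,
//!     }
//! }
//!
//! assert_eq!(decode_role(&[1]), Some(Role::Full));
//! assert_eq!(decode_role(&[2]), Some(Role::Light));
//! assert_eq!(decode_role(&[4]), Some(Role::Authority));
//! assert_eq!(decode_role(&[3]), None);
//! ```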
//!
//! # Usage
//!
//! Using the `soil-network` crate is done through the [`NetworkWorker`] struct. Create this
//! struct by passing a [`config::Params`], then poll it as if it were a `Future`. You can extract
//! an `Arc<NetworkService>` from the `NetworkWorker`, which can be shared across multiple places
//! in order to issue commands to the networking layer.
//!
//! See the [`config`] module for more information about how to configure the networking.
//!
//! After the `NetworkWorker` has been created, the important things to do are:
//!
//! - Calling `NetworkWorker::poll` in order to advance the network. This can be done by
//! dispatching a background task with the [`NetworkWorker`].
//! - Calling `on_block_import` whenever a block is added to the client.
//! - Calling `on_block_finalized` whenever a block is finalized.
//! - Calling `trigger_repropagate` when a transaction is added to the pool.
//!
//! More precise usage details are still being worked on and will likely change in the future.

extern crate self as soil_network;

mod behaviour;
mod bitswap;
pub mod common;
pub mod light;
mod litep2p;
pub mod mixnet;
mod protocol;

#[cfg(test)]
mod mock;

pub mod config;
pub mod discovery;
pub mod error;
pub mod event;
pub mod gossip;
pub mod network_state;
pub mod peer_info;
pub mod peer_store;
pub mod protocol_controller;
pub mod request_responses;
pub mod service;
pub mod statement;
pub mod statement_store;
pub mod sync;
pub mod transactions;
pub mod transport;
pub mod types;
pub mod utils;

pub use crate::litep2p::Litep2pNetworkBackend;
pub use crate::types::{
	multiaddr::{self, Multiaddr},
	PeerId,
};
pub use common::{
	role::{ObservedRole, Roles},
	types::ReputationChange,
};
pub use event::{DhtEvent, Event};
#[doc(inline)]
pub use request_responses::{Config, IfDisconnected, RequestFailure};
pub use service::{
	metrics::NotificationMetrics,
	signature::Signature,
	traits::{
		KademliaKey, MessageSink, NetworkBackend, NetworkBlock, NetworkDHTProvider,
		NetworkEventStream, NetworkPeers, NetworkRequest, NetworkSigner, NetworkStateInfo,
		NetworkStatus, NetworkStatusProvider, NetworkSyncForkRequest, NotificationConfig,
		NotificationSender as NotificationSenderT, NotificationSenderError,
		NotificationSenderReady, NotificationService,
	},
	DecodingError, Keypair, NetworkService, NetworkWorker, NotificationSender, OutboundFailure,
	PublicKey,
};
pub use types::ProtocolName;

/// Log target for `soil-network`.
const LOG_TARGET: &str = "sub-libp2p";

/// The maximum allowed number of established connections per peer.
///
/// Typically, and by design of the network behaviours in this crate,
/// there is a single established connection per peer. However, to
/// avoid unnecessary and nondeterministic connection closure in
/// case of (possibly repeated) simultaneous dialing attempts between
/// two peers, the per-peer connection limit is not set to 1 but 2.
const MAX_CONNECTIONS_PER_PEER: usize = 2;

/// The maximum number of concurrent established connections that were incoming.
const MAX_CONNECTIONS_ESTABLISHED_INCOMING: u32 = 10_000;

/// Maximum response size limit (16 MiB).
pub const MAX_RESPONSE_SIZE: u64 = 16 * 1024 * 1024;