malachitebft-core-consensus
Core consensus algorithm for the Malachite BFT consensus engine.
Description
This crate provides a coroutine-based effect system that allows the core consensus algorithm to yield control when it needs external resources, and resume when the environment is ready to provide those resources.
Key Components
Inputenum: A type that represents all possible inputs that can be processed by the consensus coroutine.Effectenum: A type that represents all possible interactions the consensus coroutine might need from its environment.Resumeenum: A type that represents all possible ways to resume the consensus coroutine after handling an effect.Resumabletrait: A trait that connects each effect with its correspondingResumetype.process!macro: A macro that handles starting the coroutine, processing an input, yielding effects, and resuming consensus with appropriate values.
See Appendix A for a detailed explanation of the underlying implementation of coroutine-based effect system using these components.
Input
The Input enum represents all possible inputs that can be processed by the consensus coroutine.
Effect
The Effect enum represents all possible operations that the consensus coroutine might need from the environment.
Resume
The Resume enum represents all possible ways to resume the consensus coroutine after handling an effect.
Values of this type cannot be constructed directly, they can only be created by calling the resume_with method on a Resumable type.
Resumable
The Resumable trait allows creating a Resume value after having processed an effect.
process!
The process! macro is the entry point for feeding inputs to consensus.
We omit its implementation here for brevity, but it handles starting the coroutine, processing an input, yielding effects, and resuming consensus with appropriate values.
One can think of it as a function with the following signature, depending on whether the effect handler is synchronous or asynchronous:
// If the effect handler is synchronous
// If the effect handler is asynchronous
async
Flow
- The application calls
process!with an input, consensus state, metrics, and an effect handler. - This initializes a coroutine which immediately starts processing the input.
- The coroutine runs until it needs something from the environment.
- At that point, the coroutine yields an
Effect(likeSignVoteorGetValue). - The effect handler performs the requested operation.
- For synchronous effects (like
SignVote), the handler immediately resumes the coroutine with the result. - For asynchronous effects (like
GetValue), the handler immediately resumes the coroutine with a()(unit) value, and will typically schedule a background task to provide the result later by feeding it as a new input back to consensus via theprocess!macro.
Example
This example demonstrates a comprehensive integration of Malachite within an asynchronous application architecture using Tokio. It showcases how to handle both synchronous and asynchronous effects while maintaining a clean separation between the consensus algorithm and its environment.
The example implements a consensus node that:
- Listens for network events from peers
- Processes incoming consensus messages
- Handles consensus effects, including asynchronous value building
- Uses a message queue to feed back asynchronous results to the consensus engine
Main loop
The main function establishes:
- A Tokio channel for queueing inputs to be processed by consensus
- A network service for receiving external messages
- The consensus state initialization with application-specific context
The main loop uses tokio::select! to concurrently handle two event sources:
- Incoming network messages (votes, proposals, etc.)
- Internally queued consensus inputs (like asynchronously produced values)
Input processing
The process_input function serves as the entry point for all consensus inputs, whether from the network or internal queues. It:
- Takes the input and consensus state
- Invokes the
process!macro to run the consensus algorithm - Handles any effects yielded by the consensus algorithm using
handle_effect
Effect handling
The handle_effect function demonstrates handling both synchronous and asynchronous effects:
-
Synchronous effects (
SignVote,VerifySignature):- Perform the operation immediately
- Resume consensus with the result directly
-
Asynchronous effects (
GetValue):- Resume consensus immediately without a result to allow it to continue
- Spawn a background task to perform the longer-running operation
- Queue the result as a new input to be processed by consensus later
-
Network communication (
Publish):- Uses the network service to broadcast messages to peers
- Waits for the operation to complete using
.await
use Arc;
use ;
use ;
use ;
async
// Function to process an input for consensus
pub async process!
}
// Method for handling effects
async
Notes
Async/await
The example demonstrates how to integrate Malachite's effect system with Rust's async/await:
- The effect handler is an async function
- Network operations can be awaited
- Long-running operations can be spawned as background tasks
Input queue
The input queue (tx_queue/rx_queue) serves as a crucial mechanism for:
- Feeding asynchronously produced results back to consensus
- Ensuring consensus processes inputs sequentially, even when they're produced concurrently
- Decoupling background tasks from the consensus state machine
Effect handling
The handle_effect function shows:
- Clear pattern matching on different effect types
- Proper resumption with appropriate values
- Background task spawning for asynchronous operations
- Error handling for operations that might fail
Handling of the GetValue effect
The GetValue effect handling is particularly noteworthy:
- It immediately resumes consensus with
()(allowing consensus to continue) - It spawns a background task that:
- Builds a value with a timeout
- When complete, queues a
ProposeValueinput
- The main loop will eventually receive this input from the queue and process it
This pattern allows consensus to make progress while waiting for potentially long-running operations like transaction execution and block construction.
Sync vs async boundary
The example elegantly handles the boundary between:
- The synchronous consensus algorithm (which yields effects and expects results)
- The asynchronous application environment (which processes effects using async operations)
This is achieved without requiring the consensus algorithm itself to be aware of async/await or any specific runtime.
Appendix A: Details of the coroutine-based effect system
Let's pretend that we are writing a program that needs to read a file from disk and then broadcast its contents over the network. We will call these operations effects.
If we were expressing this as an interface we might have a Broadcast trait:
and a FileSystem trait:
And our program would look like:
async
The downside of this approach is that we are enforcing the use of async for all effects, which might not be desirable in all cases. Moreover, we are introducing a trait for each effect, which might be cumbersome to maintain. Alternatively, we could use a single trait with multiple methods, but this would make the trait less focused and harder to implement and mock, as we would have to implement all methods even if we only need one.
Instead, let's model our effects as data, and define an Effect enum with a variant per effect:
To model the return value of each effect, we define a Resume enum:
Now, by defining an appropriate perform! macro, we can write a pure version of our program and choose how we want to interpret each effect later:
async
The perform!(effect, pat => expr) macro yields an effect to be performed by the caller (handler) and eventually resumes the program with the value expr extracted by from the Resume value by the pattern pat.
We can now choose how we want interpret each of these effects when we run our program.
For instance, we could actually perform these effects against the network and the filesystem:
async
async
Or we can perform these effects against a mock file system and network, and for this we don't need to use async at all:
Here we see one other advantage of modeling effects this way over using traits: **we can leave it up to the caller to decide whether or not to perform each effect in a sync or async context, instead of enforcing either with a trait (as methods in traits cannot be both sync and async).
The main drawback of this approach is that it is possible to resume the program using the wrong type of data:
This program will crash at runtime with UnexpectedResume error telling us that the ReadFile effect expected to be resumed with FileContents and not Sent.
To mitigate this issue, we can define a Resumable trait that connects each effect with its corresponding Resume type:
We then define a new type per resume type and implement Resumable for each:
We can now embed these types in each variant of the Effect enum:
Note that these resume_with types are private and cannot be constructed directly, they can only be accessed by extracting them from an Effect variant.
In the effect handler, we can now use the Resumable::resume_with method to resume the program with the correct type:
We can now make the Resume type private so that it is impossible to construct it directly, and only the Resumable trait can be used to create it, effectively making it impossible to resume the program with the wrong type of data.