zlayer-consensus
Shared Raft consensus library built on openraft 0.9 for ZLayer (container orchestration) and Zatabase (database replication).
Architecture
                     +-------------------------------+
                     |       Application Layer       |
                     | (ZLayer scheduler / Zatabase) |
                     +------+----------+-------------+
                            |          |
                     +------v---+ +----v-----------+
                     | propose  | | read_state()   |
                     | (write)  | | (linearizable) |
                     +------+---+ +----+-----------+
                            |          |
              +-------------v----------v-------------+
              |          ConsensusNode<TC>           |
              | - bootstrap / add_voter / shutdown   |
              | - wraps openraft::Raft<TC>           |
              +---+-------------+--+-----------------+
                  |             |  |
        +---------v--+   +------v--v--------+
        | Log Store  |   | State Machine    |
        | (v2 API)   |   | (v2 API)         |
        +-----+------+   +--------+---------+
              |                   |
    +---------+-------+   +-------+----------+
    | MemLogStore     |   | MemStateMachine  |  <-- mem-store (default)
    | RedbLogStore    |   | RedbStateMachine |  <-- redb-store (optional)
    +-----------------+   +------------------+
              |
    +---------v------------+
    | HttpNetwork          |  <-- postcard2 over HTTP
    | (RaftNetworkFactory) |
    +----------------------+
              |
    +---------v-----------+
    | raft_service_router |  <-- Axum endpoints
    | POST /raft/vote     |
    | POST /raft/append   |
    | POST /raft/snapshot |
    +---------------------+
Quick Start (MemStore for testing)
use Cursor;
use ;
use ;
use HttpNetwork;
use ;
// 1. Define your app types
// 2. Declare the TypeConfig
declare_raft_types!;
// 3. Create stores + network
let log_store = new;
let sm = new;
let network = new;
// 4. Build node
let node = new
.with_config
.build_with
.await?;
// 5. Bootstrap (first node only)
node.bootstrap.await?;
// 6. Propose writes
node.propose.await?;
Production Setup (RedbStore)
Enable the redb-store feature:
[dependencies]
zlayer-consensus = { path = "../zlayer-consensus", features = ["redb-store"] }
use ;
let log_store = new?;
let sm = new?;
Defining Your TypeConfig
Use openraft::declare_raft_types! to define the type configuration:
declare_raft_types!;
Requirements for D (request type):
Clone + Debug + Default + Serialize + Deserialize + Send + Sync + 'static
Requirements for R (response type):
Clone + Debug + Default + Serialize + Deserialize + Send + Sync + 'static
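As a rough illustration of these bounds, the sketch below defines a pair of hypothetical request/response types (`KvRequest` and `KvResponse` are invented for this example, not part of the crate) and checks the non-serde bounds at compile time. It uses only the standard library, so the `Serialize + Deserialize` bounds are not checked here; in a real project they would normally come from serde's derive macros.

```rust
use std::fmt::Debug;

/// Compile-time check for the non-serde bounds listed above.
/// Serialize/Deserialize are intentionally omitted (they need the serde crate).
fn assert_app_type<T: Clone + Debug + Default + Send + Sync + 'static>() {}

/// Hypothetical request type for a small key-value store.
#[derive(Clone, Debug, Default, PartialEq)]
enum KvRequest {
    /// `Default` needs a designated variant; a no-op is a natural choice.
    #[default]
    Noop,
    Set { key: String, value: String },
    Delete { key: String },
}

/// Hypothetical response type: the value that was overwritten or removed.
#[derive(Clone, Debug, Default, PartialEq)]
struct KvResponse {
    previous: Option<String>,
}

/// Fails to compile if either type misses one of the checked bounds.
fn check_types() {
    assert_app_type::<KvRequest>();
    assert_app_type::<KvResponse>();
}
```

A type that holds an `Rc` or a borrowed reference would fail this check, because it would not be `Send + Sync + 'static`.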
Configuration Tuning
| Parameter | Default | Description |
|---|---|---|
| election_timeout_min_ms | 1500 | Min election timeout (7.5x heartbeat) |
| election_timeout_max_ms | 3000 | Max election timeout (15x heartbeat) |
| heartbeat_interval_ms | 200 | Leader heartbeat interval |
| snapshot_logs_since_last | 10,000 | Entries before triggering snapshot |
| max_payload_entries | 300 | Max entries per AppendEntries RPC |
| enable_prevote | true | Prevents partitioned node disruption |
| rpc_timeout | 5s | Timeout for vote/append RPCs |
| snapshot_timeout | 60s | Timeout for snapshot transfers |
WAN deployments: Multiply all timeouts by 3-5x.
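The WAN rule of thumb can be sketched as a uniform scaling of the timeout fields. The `Timeouts` struct below is a hypothetical mirror of the table's timeout rows, written for this example only; it is not the crate's actual configuration type, and the field names are simply taken from the table.

```rust
use std::time::Duration;

/// Hypothetical mirror of the timeout rows in the tuning table.
#[derive(Debug, Clone, PartialEq)]
struct Timeouts {
    election_timeout_min_ms: u64,
    election_timeout_max_ms: u64,
    heartbeat_interval_ms: u64,
    rpc_timeout: Duration,
    snapshot_timeout: Duration,
}

impl Default for Timeouts {
    /// Defaults copied from the table above.
    fn default() -> Self {
        Self {
            election_timeout_min_ms: 1500,
            election_timeout_max_ms: 3000,
            heartbeat_interval_ms: 200,
            rpc_timeout: Duration::from_secs(5),
            snapshot_timeout: Duration::from_secs(60),
        }
    }
}

impl Timeouts {
    /// Scale every timeout by the same integer factor (3 to 5 for WAN).
    fn scaled(&self, factor: u32) -> Self {
        let f = u64::from(factor);
        Self {
            election_timeout_min_ms: self.election_timeout_min_ms * f,
            election_timeout_max_ms: self.election_timeout_max_ms * f,
            heartbeat_interval_ms: self.heartbeat_interval_ms * f,
            rpc_timeout: self.rpc_timeout * factor,
            snapshot_timeout: self.snapshot_timeout * factor,
        }
    }

    /// Ratio of the minimum election timeout to the heartbeat interval.
    fn election_to_heartbeat_ratio(&self) -> f64 {
        self.election_timeout_min_ms as f64 / self.heartbeat_interval_ms as f64
    }
}
```

Scaling all fields by the same factor keeps the election-to-heartbeat ratio at 7.5x, so a follower still tolerates the same number of missed heartbeats before starting an election.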
Performance Characteristics
- Serialization: postcard2 (70-90% smaller than JSON, 4x faster)
- Persistent storage: redb (~15K durable writes/sec on SSD)
- In-memory storage: Limited only by memory bandwidth
- Network: HTTP/1.1 with connection pooling (reqwest)
- PreVote: Enabled by default to prevent term inflation from partitioned nodes
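To make the PreVote point concrete, here is a deliberately simplified model of a candidate's term handling. The `Candidate` type and its method are hypothetical and written for this example; the model assumes every reachable peer would grant its vote, and it ignores log comparison and leader leases, which a real Raft implementation also checks.

```rust
/// Number of votes needed to win in a cluster of `cluster_size` voters.
fn majority(cluster_size: usize) -> usize {
    cluster_size / 2 + 1
}

/// Hypothetical, minimal election state of a single node.
#[derive(Debug)]
struct Candidate {
    current_term: u64,
}

impl Candidate {
    /// Called each time the election timeout fires.
    /// Returns true if the node would win the election.
    fn on_election_timeout(
        &mut self,
        reachable_peers: usize,
        cluster_size: usize,
        enable_prevote: bool,
    ) -> bool {
        // The node always counts its own vote.
        let votes = reachable_peers + 1;
        let has_quorum = votes >= majority(cluster_size);

        if enable_prevote && !has_quorum {
            // The pre-vote round failed, so the term is left untouched.
            return false;
        }

        // A real election bumps the term before requesting votes.
        self.current_term += 1;
        has_quorum
    }
}

/// Simulate `timeouts` consecutive election timeouts on a node that
/// cannot reach any peer, and return its final term.
fn term_after_partition(timeouts: u32, cluster_size: usize, enable_prevote: bool) -> u64 {
    let mut node = Candidate { current_term: 1 };
    for _ in 0..timeouts {
        node.on_election_timeout(0, cluster_size, enable_prevote);
    }
    node.current_term
}
```

Without PreVote, the isolated node's term grows with every timeout, and on rejoining it would force the healthy leader to step down. With PreVote, the term stays where it was.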
Storage API
This crate uses the openraft v2 storage API (RaftLogStorage + RaftStateMachine)
which splits log and state machine operations into separate traits for better concurrency.
This is NOT the deprecated v1 RaftStorage + Adaptor pattern.
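The idea behind the split can be illustrated with a toy model. The two traits below are hypothetical, synchronous stand-ins invented for this example; openraft's real `RaftLogStorage` and `RaftStateMachine` traits are async and considerably richer. The sketch only shows why keeping the log and the state machine as separate objects lets each be owned and driven independently.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the log half of the storage API.
trait LogStore {
    fn append(&mut self, index: u64, payload: String);
    fn entries_from(&self, start: u64) -> Vec<(u64, String)>;
}

/// Hypothetical stand-in for the state machine half.
trait StateMachine {
    fn apply(&mut self, index: u64, payload: &str);
    fn last_applied(&self) -> u64;
}

/// In-memory log keyed by index.
#[derive(Default)]
struct MemLog {
    entries: BTreeMap<u64, String>,
}

impl LogStore for MemLog {
    fn append(&mut self, index: u64, payload: String) {
        self.entries.insert(index, payload);
    }

    fn entries_from(&self, start: u64) -> Vec<(u64, String)> {
        self.entries
            .range(start..)
            .map(|(index, payload)| (*index, payload.clone()))
            .collect()
    }
}

/// In-memory state machine that records applied payloads in order.
#[derive(Default)]
struct MemSm {
    last_applied: u64,
    applied: Vec<String>,
}

impl StateMachine for MemSm {
    fn apply(&mut self, index: u64, payload: &str) {
        self.applied.push(payload.to_string());
        self.last_applied = index;
    }

    fn last_applied(&self) -> u64 {
        self.last_applied
    }
}

/// Apply every log entry past the state machine's last applied index.
/// The log is only read here, so appends and applies never share a lock.
fn catch_up(log: &impl LogStore, sm: &mut impl StateMachine) {
    for (index, payload) in log.entries_from(sm.last_applied() + 1) {
        sm.apply(index, &payload);
    }
}
```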
Future: QUIC Upgrade
The HTTP transport is suitable for LAN and moderate-WAN deployments. For high-throughput WAN scenarios, a QUIC-based transport will provide:
- Multiplexed streams (no head-of-line blocking)
- 0-RTT connection resumption for peers that have connected before
- Built-in encryption (TLS 1.3 is integrated into the QUIC handshake, so there is no separate TLS round trip)
- Better congestion control for lossy networks
This is planned as an additional network backend alongside HTTP.
License
Apache-2.0