§Three-Node Cluster Example
§Overview
This example demonstrates a production-ready three-node d-engine cluster with automatic failover and data replication. All benchmark data in benches/standalone-bench/reports/ is generated using this configuration.
Key characteristics:
- High availability: Automatic failover on node failure
- Data safety: Replicated across all nodes
- Fault tolerance: Tolerates 1 node failure (quorum = 2/3)
- Performance baseline: Reference configuration for all published benchmarks
§When to Use
Use this example for:
- Production deployments: Mission-critical applications requiring high availability
- Performance benchmarking: Comparing d-engine against other distributed systems
- Load testing: Validating cluster behavior under high throughput
- Development: Testing multi-node features (replication, failover, leader election)
§Quick Start
cd examples/three-nodes-standalone
# Build and start cluster
make start-cluster
# Monitor logs (in separate terminals)
tail -f logs/1/d.log
tail -f logs/2/d.log
tail -f logs/3/d.log
The cluster starts with all three nodes running simultaneously:
- Node 1 at 0.0.0.0:9081 (metrics: 8081)
- Node 2 at 0.0.0.0:9082 (metrics: 8082)
- Node 3 at 0.0.0.0:9083 (metrics: 8083)
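Before sending traffic, it can help to confirm that each node bound its client and metrics ports. A minimal sketch using lsof (the same tool the Troubleshooting section suggests), with the port list taken from above:
# Confirm that every node is listening on its client and metrics ports
for port in 9081 9082 9083 8081 8082 8083; do
  lsof -i :"$port" > /dev/null && echo "port $port: listening" || echo "port $port: not bound yet"
done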
§Configuration
Each node uses an identical configuration structure in config/n{1,2,3}.toml:
[cluster]
node_id = 1
listen_address = "0.0.0.0:9081"
initial_cluster = [
{ id = 1, address = "0.0.0.0:9081", role = 1, status = 2 }, # Follower
{ id = 2, address = "0.0.0.0:9082", role = 1, status = 2 }, # Follower
{ id = 3, address = "0.0.0.0:9083", role = 1, status = 2 }, # Follower
]
[raft.read_consistency]
default_policy = "LeaseRead"
lease_duration_ms = 500
[raft.persistence]
strategy = "MemFirst"
flush_policy = { Batch = { threshold = 100, interval_ms = 20 } }
max_buffered_entries = 10000
flush_workers = 4
Key differences from single-node expansion:
- All nodes start with role = 1 (Follower) and status = 2 (Active)
- No initial_cluster_size = 1 field (normal 3-node cluster start)
- Leader election happens automatically via Raft
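To see these differences locally, a quick grep over the three config files prints the id/role/status fields declared per node. This is only a convenience check and assumes the config/n{1,2,3}.toml layout shown above:
# Show node ids, roles, and statuses declared in each node's config
grep -nE "id =|role =|status =" config/n1.toml config/n2.toml config/n3.toml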
§Testing Cluster Behavior
§Verify Leader Election
# Check which node is leader (look for "Became leader" in logs)
grep -r "Became leader" logs/§Test Write Operations
# Write to cluster (automatically routes to leader)
curl -X POST http://localhost:9081/put \
-H "Content-Type: application/json" \
-d '{"key": "test-key", "value": "test-value"}'§Test Failover
§Test Failover
# 1. Find current leader
grep -r "Became leader" logs/
# 2. Stop leader node (e.g., if Node 1 is leader)
pkill -f "CONFIG_PATH=config/n1"
# 3. Verify new leader election (within 1-2 seconds)
grep -r "Became leader" logs/
Expected behavior:
- Election timeout: 1-2 seconds (configured via election_timeout_min/max)
- New leader emerges: One of the remaining 2 nodes becomes leader
- Cluster remains operational: Accepts writes with 2/3 nodes available
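The failover drill above can be wrapped in a small script for repeated testing. This is a rough sketch, not part of the example's Makefile: it picks a log file containing a "Became leader" line (if several elections have already happened this may not be the current leader), stops that node, waits past the election timeout, and greps again:
#!/usr/bin/env bash
# Rough failover drill; assumes the logs/<id>/d.log layout and the
# CONFIG_PATH=config/n<id> process names used elsewhere in this example.
leader_log=$(grep -rl "Became leader" logs/ | tail -n 1)
leader_id=$(basename "$(dirname "$leader_log")")
echo "stopping leader candidate: node $leader_id"
pkill -f "CONFIG_PATH=config/n${leader_id}"
sleep 3   # comfortably longer than the 1-2 second election timeout
grep -r "Became leader" logs/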
§Test Read Consistency
# LeaseRead (default): Fast reads from leader (500ms lease)
curl "http://localhost:9081/get?key=test-key"
# LinearizableRead: Strongest consistency (requires quorum confirmation)
curl "http://localhost:9081/get?key=test-key&consistency=linearizable"
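A rough way to feel the difference between the two policies is to time both requests with curl's built-in timing; this is only an informal localhost comparison using the same /get endpoint as above:
# Compare request latency: lease read vs. linearizable read
curl -s -o /dev/null -w "lease read:        %{time_total}s\n" "http://localhost:9081/get?key=test-key"
curl -s -o /dev/null -w "linearizable read: %{time_total}s\n" "http://localhost:9081/get?key=test-key&consistency=linearizable"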
§Benchmark Configuration Reference
All performance reports in benches/standalone-bench/reports use this exact configuration:
Raft settings:
- Persistence: MemFirst with batched flush (100 entries, 20ms interval)
- Read consistency: LeaseRead (500ms lease duration)
- Replication: Batched append entries (5000 threshold, 0ms delay)
- Network: Tuned for high throughput (see config/n1.toml for details)
Test environment:
- 3-node cluster on localhost
- Benchmark client connects to all 3 nodes (ports 9081/9082/9083)
- Workload: Mixed read/write operations under various concurrency levels
See benches/standalone-bench/README.md for benchmark reproduction steps.
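Before reproducing a report, it can be useful to dump the raft-related sections of the node config and compare them with the values listed above. A convenience sketch; the section names are the ones shown under §Configuration:
# Print the [raft.*] sections of node 1's config for comparison with the report
grep -n -A 5 "^\[raft" config/n1.toml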
§Advanced Features
§Performance Profiling
Profile the cluster under load using Samply:
# Profile all 3 nodes simultaneously
make perf-cluster
# View flame graphs (open in Firefox)
firefox samply_reports/1.profile.json
§Tokio Console Monitoring
Monitor async runtime behavior:
# Terminal 1: Start cluster with tokio-console enabled
make tokio-console-cluster
# Terminal 2-4: Connect to each node
tokio-console http://127.0.0.1:6670 # Node 1
tokio-console http://127.0.0.1:6671 # Node 2
tokio-console http://127.0.0.1:6672 # Node 3
§Expand to 5-Node Cluster
Add learner nodes for additional fault tolerance:
# Start 3-node cluster first
make start-cluster
# Add learner nodes (in separate terminals)
make join-node4
make join-node5
# Promote learners to voters (via API)
curl -X POST http://localhost:9081/promote -d '{"node_id": 4}'
curl -X POST http://localhost:9081/promote -d '{"node_id": 5}'5-node cluster tolerates 2 simultaneous node failures (quorum = 3/5).
§Common Operations
§Clean Environment
# Remove all build artifacts, logs, and databases
make clean
# Remove only logs and databases (keep build)
make clean-log-db
§Manual Node Control
Start nodes individually (useful for debugging):
# Terminal 1
make start-node1
# Terminal 2
make start-node2
# Terminal 3
make start-node3
§Change Log Level
# Run with debug logs
LOG_LEVEL=debug make start-cluster
# Run with info logs (default: warn)
LOG_LEVEL=info make start-node1
§Troubleshooting
Cluster won’t start:
- Verify target/release/demo exists (run make build)
- Check port availability: lsof -i :9081 (and 9082, 9083)
- Ensure no stale processes: pkill -f demo
Leader election fails:
- Verify all 3 nodes are running: ps aux | grep demo
- Check network connectivity in logs: grep "connection" logs/*/d.log
- Confirm config files have matching initial_cluster entries
Write operations time out:
- Verify at least 2/3 nodes are running (quorum requirement)
- Check leader exists: grep "Became leader" logs/*/d.log
- Monitor replication lag: grep "append_entries" logs/*/d.log
§Next Steps
- Single-node expansion: See single-node-expansion.md for dynamic 1→3 scaling
- Run benchmarks: See ../../../../benches/standalone-bench/README.md to reproduce performance tests