d-engine 0.2.3

Lightweight Raft consensus engine - recommended entry point for most users
# Scale from Single-Node to 3-Node Cluster

**Goal**: Dynamically expand your running d-engine from 1 node to 3 nodes without downtime.

---

## Why Scale to 3 Nodes?

| Single Node                        | 3-Node Cluster               |
| ---------------------------------- | ---------------------------- |
| No fault tolerance                 | Tolerates 1 node failure     |
| 1 server                           | 3 servers                    |
| If node crashes → data unavailable | Auto leader re-election      |
| No replication                     | Data replicated across nodes |

**Key principle**: 3 nodes = tolerate 1 failure (quorum = 2 out of 3).
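
The quorum arithmetic behind this principle can be sketched in a few lines (a minimal illustration of majority math, not part of d-engine's API):

```python
def quorum(n: int) -> int:
    """Votes needed for a majority in an n-node Raft cluster."""
    return n // 2 + 1

def fault_tolerance(n: int) -> int:
    """Nodes that can fail while a majority remains reachable."""
    return n - quorum(n)

for n in (1, 2, 3, 5):
    print(f"{n} nodes: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
# 3 nodes: quorum=2, tolerates 1 failure(s)
```

Note that 2 nodes tolerate zero failures (quorum is still 2), which is why the guide goes straight from 1 to 3.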

---

## Dynamic Expansion (Zero Downtime)

**Target**: 1 node → 3 nodes (skipping the 2-node stage, which has no fault tolerance)

**What happens**:

1. Node 1 running as single-node cluster (already has data)
2. Start Node 2 → joins as Learner → syncs data → auto-promotes to Voter
3. Start Node 3 immediately → joins and syncs (do NOT stop at 2 nodes)
4. Result: 3-node cluster, Node 1 never restarted

**Why not 2 nodes?** 2-node cluster has zero fault tolerance (quorum = 2, any failure = cluster down). Always use odd numbers: 1, 3, or 5 nodes.

**Example**: `examples/single-node-expansion/`

---

## Prerequisites

- Node 1 running in single-node mode
- 2 additional servers (or terminals for local testing)
- Network connectivity between nodes

---

## Step 1: Start Node 1 (Single-Node)

**Node 1 config** (`config/n1.toml`):

```toml
[cluster]
node_id = 1
listen_address = "0.0.0.0:9081"
initial_cluster = [
    { id = 1, address = "0.0.0.0:9081", role = 2, status = 2 }
]
db_root_dir = "./db"
```

**Config field reference**:

- `role = 2`: Leader (NodeRole: 0=Follower, 1=Candidate, 2=Leader, 3=Learner)
- `status = 2`: ACTIVE (NodeStatus: 0=PROMOTABLE, 1=READ_ONLY, 2=ACTIVE)

**Start**:

```bash
cd examples/single-node-expansion
make build
make start-node1
```

**Expected log**:

```text
[Node 1] Follower → Candidate (term 1)
[Node 1] Candidate → Leader (term 2)
```

Node 1 is now leader, accepting writes.

---

## Step 2: Join Node 2

**Node 2 config** (`config/n2.toml`):

```toml,ignore
[cluster]
node_id = 2
listen_address = "0.0.0.0:9082"
initial_cluster = [
    { id = 1, address = "0.0.0.0:9081", role = 2, status = 2 },  # Existing leader
    { id = 2, address = "0.0.0.0:9082", role = 3, status = 0 },  # Self: Learner
]
db_root_dir = "./db"
```

**Key fields**:

- `role = 3`: Learner (will auto-promote to Voter)
- `status = 0`: PROMOTABLE (learner eligible for promotion)

> **Note**: `status = 0` (PROMOTABLE) means this node is a learner that can be promoted to voter.  
> `status = 2` (ACTIVE) means the node is already a formal voting member (like Node 1).

**Why join as Learner (not Follower)?**

| Join Method              | Safety                                     | Quorum Impact                           |
| ------------------------ | ------------------------------------------ | --------------------------------------- |
| **Learner** (role=3) ✅  | Safe - doesn't affect quorum during sync   | None - promotes after catching up       |
| **Follower** (role=0) ⚠️ | Risky - immediately participates in quorum | High - can slow down writes if unstable |

**IMPORTANT**: Always join new nodes as Learner. Joining as Follower can impact cluster availability if the new node is slow or unstable.

**Start**:

```bash
make join-node2
```

**Expected log**:

```text
[Node 2] Learner → Follower (term 2)
🎊 NODE 2 PROMOTED TO VOTER!
```

**Sync mechanism**: InstallSnapshot (bulk data) + AppendEntries (incremental logs), then auto-promotes to Voter.
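
The catch-up flow can be illustrated with a toy model (generic Raft logic, not d-engine's actual implementation): the leader sends a snapshot when the learner is behind the leader's compaction point, then streams the remaining entries incrementally.

```python
def catch_up_plan(learner_last_index: int,
                  leader_snapshot_index: int,
                  leader_last_index: int) -> list[str]:
    """Decide how a leader brings a learner up to date (toy model)."""
    steps = []
    if learner_last_index < leader_snapshot_index:
        # Entries before the snapshot are compacted away: bulk transfer needed.
        steps.append("InstallSnapshot")
        learner_last_index = leader_snapshot_index
    if learner_last_index < leader_last_index:
        # Remaining entries are replicated incrementally.
        steps.append("AppendEntries")
    return steps

print(catch_up_plan(0, 100, 150))    # ['InstallSnapshot', 'AppendEntries']
print(catch_up_plan(120, 100, 150))  # ['AppendEntries']
```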

---

## Step 3: Join Node 3 (Immediately After Node 2)

**IMPORTANT**: Do NOT stop at 2 nodes. Start Node 3 right after Node 2.

**Node 3 config** (`config/n3.toml`):

```toml,ignore
[cluster]
node_id = 3
listen_address = "0.0.0.0:9083"
initial_cluster = [
    { id = 1, address = "0.0.0.0:9081", role = 2, status = 2 },  # Leader, ACTIVE
    { id = 2, address = "0.0.0.0:9082", role = 0, status = 2 },  # Follower (promoted), ACTIVE
    { id = 3, address = "0.0.0.0:9083", role = 3, status = 0 },  # Self: Learner, PROMOTABLE
]
db_root_dir = "./db"
```

**Key**: Node 2 is listed as `role = 0, status = 2` (Follower, ACTIVE per the field reference above), which assumes it has already been promoted from Learner.

> **Alternative (safer)**: If unsure about Node 2's promotion status, use conservative config:  
> `{ id = 2, ..., role = 3, status = 0 }` - System will auto-correct if Node 2 is already promoted.

**Start**:

```bash
make join-node3
```

**Result**: 3-node cluster with 1-failure tolerance. Node 1 never restarted.

---

## Verify Cluster

**Check cluster status**:

```bash
# All 3 nodes should be running
ps aux | grep demo
```

**Test replication**:

1. Write data via Node 1 (leader)
2. Read from Node 2 or Node 3
3. Data should be replicated

---

## Test Failover (Optional)

**Kill current leader**:

```bash
# List the node processes; check the logs to see which one is currently leader
ps aux | grep demo
kill <PID>
```

**Expected behavior** (Raft guarantees):

- Remaining 2 nodes detect leader failure (~1s)
- New leader elected via majority vote (2/3 quorum)
- Cluster continues accepting writes
- ~1-2s downtime during re-election
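
The short re-election window comes from Raft's randomized election timeouts: each surviving node waits a random interval before campaigning, so one node usually times out first and wins without a split vote. A generic sketch (the timeout range is the one suggested by the Raft paper, not d-engine's configured defaults):

```python
import random

ELECTION_TIMEOUT_MS = (150, 300)  # illustrative range, not d-engine's defaults

def election_timeout(rng: random.Random) -> int:
    """Pick a randomized election timeout to avoid simultaneous candidacies."""
    lo, hi = ELECTION_TIMEOUT_MS
    return rng.randint(lo, hi)

# The two survivors draw different timeouts; the lower one campaigns first.
rng = random.Random(42)
timeouts = {node: election_timeout(rng) for node in (2, 3)}
first = min(timeouts, key=timeouts.get)
print(f"node {first} times out first and starts the election")
```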

**Restart killed node**:

```bash
# If killed Node 1
make start-node1
# If killed Node 2/3, they will auto-rejoin
```

Node rejoins as follower, syncs missing data from new leader.

---

## Troubleshooting

**"Node won't join"**:

- Verify Node 1 is running and is leader
- Check network connectivity: `nc -zv 127.0.0.1 9081` (use the node's actual IP for remote servers)
- Check logs for errors

**"No leader elected"**:

- Ensure at least 2 nodes running (quorum)
- Check logs for errors
- Verify addresses in configs match actual IPs
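
A quick reachability check across all three ports can also be scripted with the standard library (the hosts and ports below match this guide's local setup; substitute real IPs in production):

```python
import socket

# Node addresses from this guide's local-testing setup
NODES = {1: ("127.0.0.1", 9081), 2: ("127.0.0.1", 9082), 3: ("127.0.0.1", 9083)}

def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for node_id, (host, port) in NODES.items():
    state = "up" if is_reachable(host, port) else "DOWN"
    print(f"node {node_id} ({host}:{port}): {state}")
```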

---

## Production Deployment

**For production servers**, update addresses:

```toml
# Node 2 on server 192.168.1.11
[cluster]
node_id = 2
listen_address = "192.168.1.11:9082"
initial_cluster = [
    { id = 1, address = "192.168.1.10:9081", role = 2, status = 2 },
    { id = 2, address = "192.168.1.11:9082", role = 3, status = 0 },
]
db_root_dir = "./db"
```

**Network requirements**:

- < 10ms latency between nodes
- Allow TCP ports 9081-9083
- Deploy across availability zones for fault tolerance

---

## Next Steps

- See `examples/single-node-expansion/README.md` for detailed architecture
- Review [Quick Start Guide](crate::docs::quick_start_5min) for embedded mode basics
- Check `examples/three-nodes-standalone/` for direct 3-node deployment

---

**Created**: 2025-12-03  
**Updated**: 2025-12-25  
**Example**: `examples/single-node-expansion/`