<?xml version="1.0" encoding="UTF-8" standalone="no" ?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>chapter-4</title>
<link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<div class="readable-text " id="p1">
<h1 class=" readable-text-h1"><span class="chapter-title-numbering"><span class="num-string">4</span></span> <span>Replication</span></h1>
</div>
<div class="readable-text" id="p85">
<h2 class=" readable-text-h2">4.8 Putting it together: Replicating a key-value store</h2>
</div>
<div class="readable-text " id="p86">
<p>In the previous chapter, we examined how colocating data with compute using SQLite reduces latency compared to a more traditional client-server approach with PostgreSQL. However, if you need multiple copies of the data, replication is necessary.</p>
</div>
<div class="readable-text intended-text" id="p87">
<p>In this section, we will implement a simple replicated in-memory key-value store using a primary-replica architecture. Writes occur only on the primary, which broadcasts each update to the replicas so that they can serve reads locally.</p>
</div>
<div class="readable-text intended-text" id="p88">
<p>We start by implementing an in-memory key-value store using a hash map:</p>
</div>
<div class="browsable-container listing-container" id="p89">
<div class="code-area-container">
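<pre class="code-area">use std::collections::HashMap;
use std::sync::Mutex;

pub struct KVStore {
    // The mutex lets multiple threads share one map safely.
    data: Mutex<HashMap<String, String>>,
}

impl KVStore {
    // Returns a copy of the value, so the lock is not held by the caller.
    pub fn get(&self, key: &str) -> Option<String> {
        self.data.lock().unwrap().get(key).cloned()
    }

    pub fn put(&self, key: String, value: String) {
        self.data.lock().unwrap().insert(key, value);
    }
}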
</pre>
</div>
</div>
<div class="readable-text " id="p90">
<p>The <kbd>KVStore</kbd> struct provides the backing store we use in both the primary and replica servers. In a real system, you could use a database that offers durability, such as SQLite or another embedded database.</p>
</div>
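<div class="readable-text">
<p>To see why the map sits behind a mutex, here is a minimal sketch in which a background thread writes to the store and the calling thread reads the value back. The <kbd>new</kbd> constructor and the <kbd>write_from_background_thread</kbd> helper are assumptions made for illustration; they are not part of the listing above.</p>
</div>

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Same shape as the listing above, plus a `new` constructor (an assumption).
pub struct KVStore {
    data: Mutex<HashMap<String, String>>,
}

impl KVStore {
    pub fn new() -> Self {
        KVStore { data: Mutex::new(HashMap::new()) }
    }

    pub fn get(&self, key: &str) -> Option<String> {
        self.data.lock().unwrap().get(key).cloned()
    }

    pub fn put(&self, key: String, value: String) {
        self.data.lock().unwrap().insert(key, value);
    }
}

/// Hypothetical helper: a background thread stands in for a network handler
/// that writes to the store, then the calling thread reads the value back.
pub fn write_from_background_thread() -> Option<String> {
    let storage = Arc::new(KVStore::new());
    let writer = {
        // Each thread gets its own handle to the same store.
        let storage = Arc::clone(&storage);
        thread::spawn(move || storage.put("a".to_string(), "1".to_string()))
    };
    // Wait for the write to finish before reading.
    writer.join().unwrap();
    storage.get("a")
}
```

<div class="readable-text">
<p>Because <kbd>get</kbd> and <kbd>put</kbd> take <kbd>&amp;self</kbd>, the store can be wrapped in an <kbd>Arc</kbd> and shared without any extra locking at the call site.</p>
</div>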
<div class="readable-text intended-text" id="p91">
<p>We then specify the replication protocol we use between the primary and the replica:</p>
</div>
<div class="browsable-container listing-container" id="p92">
<div class="code-area-container">
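<pre class="code-area">pub enum Message {
    // Primary to replica: apply a single write.
    Put { key: String, value: String },
    // Replica to primary: add me to the replication set.
    Join { replica_addr: String },
    // Primary to replica: the full contents of the store.
    Snapshot { entries: Vec<(String, String)> },
}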
</pre>
</div>
</div>
<div class="readable-text " id="p93">
<p>The primary broadcasts a <kbd>Put</kbd> message to the replicas to actively push each change. A replica joins the primary's replication set by sending a <kbd>Join</kbd> message with its network address. When the primary receives a <kbd>Join</kbd> message, it sends a snapshot of its data to the replica in a <kbd>Snapshot</kbd> message. The snapshot ensures that the replica starts with all the data the primary has, so it can then apply new updates incrementally.</p>
</div>
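<div class="readable-text">
<p>The message flow can be sketched in a single process. This is a simplified illustration, not the code from the example: standard library channels stand in for network connections, so <kbd>join</kbd> takes a channel instead of decoding a <kbd>Join</kbd> message with an address, and the <kbd>Primary</kbd> struct, <kbd>apply_pending</kbd>, and <kbd>replay_session</kbd> are hypothetical names.</p>
</div>

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    Put { key: String, value: String },
    Join { replica_addr: String },
    Snapshot { entries: Vec<(String, String)> },
}

/// Hypothetical primary: its data plus one outgoing channel per joined replica.
/// Each channel stands in for a network connection to a replica.
pub struct Primary {
    data: HashMap<String, String>,
    replicas: Vec<Sender<Message>>,
}

impl Primary {
    pub fn new() -> Self {
        Primary { data: HashMap::new(), replicas: Vec::new() }
    }

    /// Apply a write locally, then push it to every joined replica.
    pub fn put(&mut self, key: &str, value: &str) {
        self.data.insert(key.to_string(), value.to_string());
        let message = Message::Put { key: key.to_string(), value: value.to_string() };
        for replica in &self.replicas {
            // Send errors are ignored: this sketch assumes replicas never fail.
            let _ = replica.send(message.clone());
        }
    }

    /// Handle a join: send the snapshot first, then register for later Puts.
    pub fn join(&mut self, replica: Sender<Message>) {
        let entries = self.data.iter().map(|(k, v)| (k.clone(), v.clone())).collect();
        let _ = replica.send(Message::Snapshot { entries });
        self.replicas.push(replica);
    }
}

/// Replica side: drain pending messages and apply them to the local map.
pub fn apply_pending(inbox: &Receiver<Message>, local: &mut HashMap<String, String>) {
    for message in inbox.try_iter() {
        match message {
            Message::Snapshot { entries } => {
                local.clear();
                local.extend(entries);
            }
            Message::Put { key, value } => {
                local.insert(key, value);
            }
            // Replicas send Join; they do not expect to receive it.
            Message::Join { .. } => {}
        }
    }
}

/// Replays a short session: two writes, a join, then one more write.
pub fn replay_session() -> HashMap<String, String> {
    let mut primary = Primary::new();
    primary.put("a", "1");
    primary.put("b", "2");

    let (to_replica, inbox) = channel();
    let mut local = HashMap::new();
    primary.join(to_replica);
    primary.put("c", "3");

    apply_pending(&inbox, &mut local);
    local
}
```

<div class="readable-text">
<p>Sending the snapshot before registering the replica means the replica sees the snapshot first and every later <kbd>Put</kbd> after it, which is what lets it apply updates incrementally.</p>
</div>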
<div class="readable-text intended-text" id="p94">
<p>We also implement a shell for the primary server that users can use to manipulate the state:</p>
</div>
<div class="browsable-container listing-container" id="p95">
<div class="code-area-container">
<pre class="code-area">loop {
    match rl.readline("primary> ") {
        Ok(line) => {
            let parts: Vec<&str> = line.split_whitespace().collect();
            match parts.as_slice() {
                ["PUT", key, value] => {
                    // Apply the write locally, then push it to every replica.
                    storage.put(key.to_string(), value.to_string());
                    let message = Message::Put {
                        key: key.to_string(),
                        value: value.to_string(),
                    };
                    broadcast(&replicas, &message);
                    println!("OK {} = {}", key, value);
                }
                ["GET", key] => match storage.get(key) {
                    Some(value) => println!("{} -> {}", key, value),
                    None => println!("{} -> Not found", key),
                },
                ["EXIT"] => break,
                // Ignore anything else.
                _ => {}
            }
        }
        // Stop on end of input or a read error.
        Err(_) => break,
    }
}
</pre>
</div>
</div>
<div class="readable-text " id="p96">
<p>The replica has a similar shell, which supports local reads and joining a primary:</p>
</div>
<div class="browsable-container listing-container" id="p97">
<div class="code-area-container">
<pre class="code-area">loop {
    match rl.readline("replica> ") {
        Ok(line) => {
            let parts: Vec<&str> = line.split_whitespace().collect();
            match parts.as_slice() {
                // Reads are served from the replica's local copy.
                ["GET", key] => match storage.get(key) {
                    Some(value) => println!("{} -> {}", key, value),
                    None => println!("{} -> Not found", key),
                },
                ["JOIN", host_port] => {
                    // Tell the primary where to send updates for this replica.
                    let message = Message::Join {
                        replica_addr: replica_addr.clone(),
                    };
                    join_primary(host_port, &storage, &message);
                }
                ["EXIT"] => break,
                // Ignore anything else.
                _ => {}
            }
        }
        // Stop on end of input or a read error.
        Err(_) => break,
    }
}
</pre>
</div>
</div>
<div class="readable-text " id="p98">
<p>The complete example is available at <a href="https://github.com/penberg/latency-book/tree/main/chapter-04/rust/replication-kv">https://github.com/penberg/latency-book/tree/main/chapter-04/rust/replication-kv</a>.</p>
</div>
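<div class="readable-text">
<p>The listings call a <kbd>broadcast</kbd> helper without showing it. As a rough idea of what such a helper might do, the sketch below writes one text line per message to every replica connection. The line-based encoding, the <kbd>encode</kbd> function, and the signature of <kbd>broadcast</kbd> are all assumptions made for illustration; the repository's implementation may differ.</p>
</div>

```rust
use std::io::{self, Write};

pub enum Message {
    Put { key: String, value: String },
    Join { replica_addr: String },
    Snapshot { entries: Vec<(String, String)> },
}

/// Hypothetical text encoding: one message per line, fields separated by spaces.
/// It assumes keys and values contain no whitespace.
pub fn encode(message: &Message) -> String {
    match message {
        Message::Put { key, value } => format!("PUT {} {}\n", key, value),
        Message::Join { replica_addr } => format!("JOIN {}\n", replica_addr),
        Message::Snapshot { entries } => {
            let mut line = String::from("SNAPSHOT");
            for (key, value) in entries {
                line.push_str(&format!(" {} {}", key, value));
            }
            line.push('\n');
            line
        }
    }
}

/// Write the encoded message to every replica connection.
/// `W` could be a `TcpStream` in a networked setup; any `Write` works here.
pub fn broadcast<W: Write>(replicas: &mut [W], message: &Message) -> io::Result<()> {
    let line = encode(message);
    for replica in replicas.iter_mut() {
        replica.write_all(line.as_bytes())?;
        replica.flush()?;
    }
    Ok(())
}
```

<div class="readable-text">
<p>Keeping <kbd>broadcast</kbd> generic over <kbd>Write</kbd> makes it easy to exercise with in-memory buffers before wiring it to real sockets. Note that this version stops at the first failed write, which is consistent with the section's assumption that no errors occur.</p>
</div>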
<div class="readable-text intended-text" id="p99">
<p>First, let's start up the primary server:</p>
</div>
<div class="browsable-container listing-container" id="p100">
<div class="code-area-container">
<pre class="code-area">$ cargo run --bin primary
Primary server ready (port 8080)
Commands:
- PUT <key> <value>
- GET <key>
- EXIT
</pre>
</div>
</div>
<div class="readable-text " id="p101">
<p>We now have a shell that allows us to interact with the primary key-value store. The server is also listening on port 8080 for incoming messages from replicas.</p>
</div>
<div class="readable-text intended-text" id="p102">
<p>Now let's insert some data into the primary key-value store:</p>
</div>
<div class="browsable-container listing-container" id="p103">
<div class="code-area-container">
<pre class="code-area">primary> PUT a 1
OK a = 1
primary> PUT b 2
OK b = 2
</pre>
</div>
</div>
<div class="readable-text " id="p104">
<p>We now have (a, 1) and (b, 2) key-value pairs stored in the primary.</p>
</div>
<div class="readable-text intended-text" id="p105">
<p>Next, let's start up a replica server:</p>
</div>
<div class="browsable-container listing-container" id="p106">
<div class="code-area-container">
<pre class="code-area">$ cargo run --bin replica
Replica ready (port 8081)
Commands:
- GET <key>
- JOIN <host:port>
- EXIT
</pre>
</div>
</div>
<div class="readable-text " id="p107">
<p>The replica server has a similar shell to the primary server, but with slightly different commands. When the replica starts up, it is not automatically connected to the primary. Therefore, reading from the replica yields no results:</p>
</div>
<div class="browsable-container listing-container" id="p108">
<div class="code-area-container">
<pre class="code-area">replica> GET a
a -> Not found
replica> GET b
b -> Not found
</pre>
</div>
</div>
<div class="readable-text " id="p109">
<p>We can register the replica with the primary's replication set using the <kbd>JOIN</kbd> command:</p>
</div>
<div class="browsable-container listing-container" id="p110">
<div class="code-area-container">
<pre class="code-area">replica> JOIN localhost:8080
Snapshot received
</pre>
</div>
</div>
<div class="readable-text " id="p111">
<p>When the replica registers itself with the primary's replication set, it receives a snapshot of the primary's current key-value store. If we now read from the replica, we see the same values as on the primary:</p>
</div>
<div class="browsable-container listing-container" id="p112">
<div class="code-area-container">
<pre class="code-area">replica> GET a
a -> 1
replica> GET b
b -> 2
</pre>
</div>
</div>
<div class="readable-text " id="p113">
<p>The replica is now part of the primary's replication set, so new writes are pushed to it. Let's write another key on the primary:</p>
</div>
<div class="browsable-container listing-container" id="p114">
<div class="code-area-container">
<pre class="code-area">primary> PUT c 3
OK c = 3
</pre>
</div>
</div>
<div class="readable-text " id="p115">
<p>The primary pushes the write to the replica, where we can read it locally:</p>
</div>
<div class="browsable-container listing-container" id="p116">
<div class="code-area-container">
<pre class="code-area">replica> GET c
c -> 3
</pre>
</div>
</div>
<div class="readable-text " id="p117">
<p>That's it: we now have an example of a primary/replica architecture in action. Of course, the only guarantee we provide is eventual consistency, without any durability guarantees, and we assume that no errors occur. Despite these shortcomings, the example illustrates the basic flow of real-world replication systems, which additionally use techniques such as distributed consensus algorithms to provide fault tolerance and durability. Furthermore, if you are building an application, you will probably use the replication feature built into your database or key-value store.</p>
</div>
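<div class="readable-text">
<p>To make the eventual consistency point concrete, here is a minimal sketch in which the primary pushes an update without waiting for the replica. A channel again stands in for the network, and the <kbd>stale_then_fresh</kbd> function is a hypothetical name. Until the replica applies the pending update, a local read does not see the new value.</p>
</div>

```rust
use std::collections::HashMap;
use std::sync::mpsc::channel;

/// Returns what a replica read for key "c" before and after it applied
/// the update that the primary had already pushed.
pub fn stale_then_fresh() -> (Option<String>, Option<String>) {
    // The channel is a stand-in for the network link from primary to replica.
    let (to_replica, inbox) = channel::<(String, String)>();
    let mut primary: HashMap<String, String> = HashMap::new();
    let mut replica: HashMap<String, String> = HashMap::new();

    // The primary applies the write locally and pushes it without waiting.
    primary.insert("c".to_string(), "3".to_string());
    to_replica.send(("c".to_string(), "3".to_string())).unwrap();

    // The update is still in flight, so the replica has nothing for "c" yet.
    let before = replica.get("c").cloned();

    // Once the replica applies pending updates, it converges with the primary.
    for (key, value) in inbox.try_iter() {
        replica.insert(key, value);
    }
    let after = replica.get("c").cloned();

    (before, after)
}
```

<div class="readable-text">
<p>The window between the two reads is where a client of the replica can observe stale data. The replica converges only after the update is delivered and applied.</p>
</div>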
<div class="readable-text" id="p118">
<h2 class=" readable-text-h2"><span>4.9 Summary</span></h2>
</div>
<ul>
<li class="readable-text" id="p119">Replicating data has multiple benefits: reducing latency, improving reliability and availability, and helping prevent data loss due to hardware failures, power outages, or other unforeseen circumstances.</li>
<li class="readable-text" id="p120">Strong consistency (linearizability) is the gold standard of data access, whereas eventual consistency is the weakest consistency model. Between the two, there are other consistency levels such as causal and session consistency with different trade-offs.</li>
<li class="readable-text" id="p121">Replication can be single-leader, multi-leader, or leaderless, depending on the requirements. Replication can also be synchronous or asynchronous, which impacts latency and consistency.</li>
</ul>
</body>
</html>