pub struct Raft<C: RaftTypeConfig, N: RaftNetworkFactory<C>, S: RaftStorage<C>> { /* private fields */ }
Expand description
The Raft API.
This type implements the full Raft spec, and is the interface to a running Raft node. Applications building on top of Raft will use this to spawn a Raft task and interact with the spawned task.
For more information on the Raft protocol, see the specification here (pdf warning).
For details and discussion on this API, see the Raft API section of the guide.
clone
This type implements Clone
, and should be cloned liberally. The clone itself is very cheap
and helps to facilitate use with async workflows.
shutting down
If any of the interfaces returns a RaftError::ShuttingDown
, this indicates that the Raft node
is shutting down (potentially for data safety reasons due to a storage error), and the
shutdown
method should be called on this type to await the shutdown of the node. If the parent
application needs to shutdown the Raft node for any reason, calling shutdown
will do the
trick.
Implementations§
source§impl<C: RaftTypeConfig, N: RaftNetworkFactory<C>, S: RaftStorage<C>> Raft<C, N, S>
impl<C: RaftTypeConfig, N: RaftNetworkFactory<C>, S: RaftStorage<C>> Raft<C, N, S>
sourcepub async fn new(
id: C::NodeId,
config: Arc<Config>,
network: N,
storage: S
) -> Result<Self, Fatal<C::NodeId>>
pub async fn new( id: C::NodeId, config: Arc<Config>, network: N, storage: S ) -> Result<Self, Fatal<C::NodeId>>
Create and spawn a new Raft task.
id
The ID which the spawned Raft task will use to identify itself within the cluster. Applications must guarantee that the ID provided to this function is stable, and should be persisted in a well known location, probably alongside the Raft log and the application’s state machine. This ensures that restarts of the node will yield the same ID every time.
config
Raft’s runtime config. See the docs on the Config
object for more details.
network
An implementation of the RaftNetworkFactory
trait which will be used by Raft for sending
RPCs to peer nodes within the cluster. See the docs on the RaftNetworkFactory
trait
for more details.
storage
An implementation of the RaftStorage
trait which will be used by Raft for data storage.
See the docs on the RaftStorage
trait for more details.
sourcepub fn enable_tick(&self, enabled: bool)
pub fn enable_tick(&self, enabled: bool)
Enable or disable raft internal ticker.
The internal ticker triggers all timeout based event, e.g. election event or heartbeat event. By disabling the ticker, a follower will not enter candidate again, a leader will not send heartbeat.
pub fn enable_heartbeat(&self, enabled: bool)
pub fn enable_elect(&self, enabled: bool)
sourcepub async fn trigger_elect(&self) -> Result<(), Fatal<C::NodeId>>
pub async fn trigger_elect(&self) -> Result<(), Fatal<C::NodeId>>
Trigger election at once and return at once.
Returns error when RaftCore has Fatal error, e.g. shut down or having storage error.
It is not affected by Raft::enable_elect(false)
.
sourcepub async fn trigger_heartbeat(&self) -> Result<(), Fatal<C::NodeId>>
pub async fn trigger_heartbeat(&self) -> Result<(), Fatal<C::NodeId>>
Trigger a heartbeat at once and return at once.
Returns error when RaftCore has Fatal error, e.g. shut down or having storage error.
It is not affected by Raft::enable_heartbeat(false)
.
sourcepub async fn trigger_snapshot(&self) -> Result<(), Fatal<C::NodeId>>
pub async fn trigger_snapshot(&self) -> Result<(), Fatal<C::NodeId>>
Trigger to build a snapshot at once and return at once.
Returns error when RaftCore has Fatal error, e.g. shut down or having storage error.
sourcepub async fn append_entries(
&self,
rpc: AppendEntriesRequest<C>
) -> Result<AppendEntriesResponse<C::NodeId>, RaftError<C::NodeId>>
pub async fn append_entries( &self, rpc: AppendEntriesRequest<C> ) -> Result<AppendEntriesResponse<C::NodeId>, RaftError<C::NodeId>>
Submit an AppendEntries RPC to this Raft node.
These RPCs are sent by the cluster leader to replicate log entries (§5.3), and are also used as heartbeats (§5.2).
sourcepub async fn vote(
&self,
rpc: VoteRequest<C::NodeId>
) -> Result<VoteResponse<C::NodeId>, RaftError<C::NodeId>>
pub async fn vote( &self, rpc: VoteRequest<C::NodeId> ) -> Result<VoteResponse<C::NodeId>, RaftError<C::NodeId>>
Submit a VoteRequest (RequestVote in the spec) RPC to this Raft node.
These RPCs are sent by cluster peers which are in candidate state attempting to gather votes (§5.2).
sourcepub async fn install_snapshot(
&self,
rpc: InstallSnapshotRequest<C>
) -> Result<InstallSnapshotResponse<C::NodeId>, RaftError<C::NodeId, InstallSnapshotError>>
pub async fn install_snapshot( &self, rpc: InstallSnapshotRequest<C> ) -> Result<InstallSnapshotResponse<C::NodeId>, RaftError<C::NodeId, InstallSnapshotError>>
Submit an InstallSnapshot RPC to this Raft node.
These RPCs are sent by the cluster leader in order to bring a new node or a slow node up-to-speed with the leader (§7).
sourcepub async fn current_leader(&self) -> Option<C::NodeId>
pub async fn current_leader(&self) -> Option<C::NodeId>
Get the ID of the current leader from this Raft node.
This method is based on the Raft metrics system which does a good job at staying
up-to-date; however, the is_leader
method must still be used to guard against stale
reads. This method is perfect for making decisions on where to route client requests.
sourcepub async fn is_leader(
&self
) -> Result<(), RaftError<C::NodeId, CheckIsLeaderError<C::NodeId, C::Node>>>
pub async fn is_leader( &self ) -> Result<(), RaftError<C::NodeId, CheckIsLeaderError<C::NodeId, C::Node>>>
Check to ensure this node is still the cluster leader, in order to guard against stale reads (§8).
The actual read operation itself is up to the application, this method just ensures that the read will not be stale.
sourcepub async fn client_write(
&self,
app_data: C::D
) -> Result<ClientWriteResponse<C>, RaftError<C::NodeId, ClientWriteError<C::NodeId, C::Node>>>
pub async fn client_write( &self, app_data: C::D ) -> Result<ClientWriteResponse<C>, RaftError<C::NodeId, ClientWriteError<C::NodeId, C::Node>>>
Submit a mutating client request to Raft to update the state of the system (§5.1).
It will be appended to the log, committed to the cluster, and then applied to the application state machine. The result of applying the request to the state machine will be returned as the response from this method.
Our goal for Raft is to implement linearizable semantics. If the leader crashes after
committing a log entry but before responding to the client, the client may retry the
command with a new leader, causing it to be executed a second time. As such, clients
should assign unique serial numbers to every command. Then, the state machine should
track the latest serial number processed for each client, along with the associated
response. If it receives a command whose serial number has already been executed, it
responds immediately without re-executing the request (§8). The
RaftStorage::apply_entry_to_state_machine
method is the perfect place to implement
this.
These are application specific requirements, and must be implemented by the application which is being built on top of Raft.
sourcepub async fn initialize<T>(
&self,
members: T
) -> Result<(), RaftError<C::NodeId, InitializeError<C::NodeId, C::Node>>>where
T: IntoNodes<C::NodeId, C::Node> + Debug,
pub async fn initialize<T>( &self, members: T ) -> Result<(), RaftError<C::NodeId, InitializeError<C::NodeId, C::Node>>>where T: IntoNodes<C::NodeId, C::Node> + Debug,
Initialize a pristine Raft node with the given config.
This command should be called on pristine nodes — where the log index is 0 and the node is
in Learner state — as if either of those constraints are false, it indicates that the
cluster is already formed and in motion. If InitializeError::NotAllowed
is returned
from this function, it is safe to ignore, as it simply indicates that the cluster is
already up and running, which is ultimately the goal of this function.
This command will work for single-node or multi-node cluster formation. This command should be called with all discovered nodes which need to be part of cluster, and as such it is recommended that applications be configured with an initial cluster formation delay which will allow time for the initial members of the cluster to be discovered (by the parent application) for this call.
Once a node successfully initialized it will commit a new membership config log entry to store. Then it starts to work, i.e., entering Candidate state and try electing itself as the leader.
More than one node performing initialize()
with the same config is safe,
with different config will result in split brain condition.
sourcepub async fn add_learner(
&self,
id: C::NodeId,
node: C::Node,
blocking: bool
) -> Result<ClientWriteResponse<C>, RaftError<C::NodeId, ClientWriteError<C::NodeId, C::Node>>>
pub async fn add_learner( &self, id: C::NodeId, node: C::Node, blocking: bool ) -> Result<ClientWriteResponse<C>, RaftError<C::NodeId, ClientWriteError<C::NodeId, C::Node>>>
Add a new learner raft node, optionally, blocking until up-to-speed.
- Add a node as learner into the cluster.
- Setup replication from leader to it.
If blocking
is true
, this function blocks until the leader believes the logs on the new
node is up to date, i.e., ready to join the cluster, as a voter, by calling
change_membership
.
If blocking is false
, this function returns at once as successfully setting up the
replication.
If the node to add is already a voter or learner, it will still re-add it.
A node
stores the network address of a node. Thus an application does not
need another store for mapping node-id to ip-addr when implementing the RaftNetwork.
sourcepub async fn change_membership(
&self,
members: impl Into<ChangeMembers<C::NodeId, C::Node>>,
retain: bool
) -> Result<ClientWriteResponse<C>, RaftError<C::NodeId, ClientWriteError<C::NodeId, C::Node>>>
pub async fn change_membership( &self, members: impl Into<ChangeMembers<C::NodeId, C::Node>>, retain: bool ) -> Result<ClientWriteResponse<C>, RaftError<C::NodeId, ClientWriteError<C::NodeId, C::Node>>>
Propose a cluster configuration change.
A node in the proposed config has to be a learner, otherwise it fails with LearnerNotFound error.
Internally:
- It proposes a joint config.
- When the joint config is committed, it proposes a uniform config.
If retain
is true, then all the members which not exists in the new membership,
will be turned into learners, otherwise will be removed.
Example of retain
usage:
If the original membership is {“voter”:{1,2,3}, “learners”:{}}, and call
change_membership
with voters
{3,4,5}, then:
- If
retain
istrue
, the committed new membership is {“voters”:{3,4,5}, “learners”:{1,2}}. - Otherwise if
retain
isfalse
, then the new membership is {“voters”:{3,4,5}, “learners”:{}}, in which the voters not exists in the new membership just be removed from the cluster.
If it loses leadership or crashed before committing the second uniform config log, the cluster is left in the joint config.
sourcepub fn external_request<F: FnOnce(&RaftState<C::NodeId, C::Node>, &mut S, &mut N) + Send + 'static>(
&self,
req: F
)
pub fn external_request<F: FnOnce(&RaftState<C::NodeId, C::Node>, &mut S, &mut N) + Send + 'static>( &self, req: F )
Send a request to the Raft core loop in a fire-and-forget manner.
The request functor will be called with a mutable reference to both the state machine and the network factory and serialized with other Raft core loop processing (e.g., client requests or general state changes). The current state of the system is passed as well.
If a response is required, then the caller can store the sender of a one-shot channel in the closure of the request functor, which can then be used to send the response asynchronously.
If the API channel is already closed (Raft is in shutdown), then the request functor is destroyed right away and not called at all.
sourcepub fn metrics(&self) -> Receiver<RaftMetrics<C::NodeId, C::Node>>
pub fn metrics(&self) -> Receiver<RaftMetrics<C::NodeId, C::Node>>
Get a handle to the metrics channel.
sourcepub fn wait(&self, timeout: Option<Duration>) -> Wait<C::NodeId, C::Node>
pub fn wait(&self, timeout: Option<Duration>) -> Wait<C::NodeId, C::Node>
Get a handle to wait for the metrics to satisfy some condition.
let timeout = Duration::from_millis(200);
// wait for raft log-3 to be received and applied:
r.wait(Some(timeout)).log(Some(3), "log").await?;
// wait for ever for raft node's current leader to become 3:
r.wait(None).current_leader(2, "wait for leader").await?;
// wait for raft state to become a follower
r.wait(None).state(State::Follower, "state").await?;