Crate kube_coordinate


Coordination utilities built around the coordination.k8s.io/v1 API.

This implementation uses only Kubernetes coordination.k8s.io/v1/Lease objects for coordination. Every client running this LeaderElector task maintains a watch on the lease object, and will receive updates as the lease holder heartbeats the lease.

Applications using this API can use the handle returned from spawning the LeaderElector task to monitor lease state and to govern application behavior based on that state. For example:

// Spawn a leader elector task, and get a handle to the state channel.
let handle = LeaderElector::spawn(/* ... */);
let state_chan = handle.state();

// Before taking action as a leader, just check the channel to ensure
// the lease is currently held by this process.
if state_chan.borrow().is_leader() {
    // Only perform leader actions if in leader state.
}

// Or, for a more sophisticated pattern, watch the state channel for changes,
// and use it to drive your application's state machine.
use tokio_stream::StreamExt; // Brings `.next()` into scope for the stream.

let mut state_stream = tokio_stream::wrappers::WatchStream::new(state_chan);
loop {
    tokio::select! {
        Some(state) = state_stream.next() => match state {
            LeaderState::Leader => (), // Leader tasks.
            _ => (), // Non-leader tasks.
        },
        else => break, // The stream has ended; the elector task has shut down.
    }
}

Reference Implementation

This implementation is based upon the upstream Kubernetes client-go implementation, which can be found here: https://github.com/kubernetes/client-go/blob/2a6c116e406126324eee341e874612a5093bdbb0/tools/leaderelection/leaderelection.go

The following docs, adapted from the reference Go implementation, also apply here:

This implementation does not guarantee that only one client is acting as a leader (a.k.a. fencing).

A client only acts on timestamps captured locally to infer the state of the leader election. The client does not consider timestamps in the leader election record to be accurate because these timestamps may not have been produced by a local clock. The implementation does not depend on their accuracy and only uses their change to indicate that another client has renewed the leader lease. Thus the implementation is tolerant to arbitrary clock skew, but is not tolerant to arbitrary clock skew rate.

However, the level of tolerance to skew rate can be configured by setting renew_deadline and lease_duration appropriately. The tolerance, expressed as a maximum tolerated ratio of time passed on the fastest node to time passed on the slowest node, can be approximately achieved with a configuration that sets the same ratio of lease_duration to renew_deadline. For example, if a user wanted to tolerate some nodes progressing forward in time twice as fast as other nodes, the user could set lease_duration to 60 seconds and renew_deadline to 30 seconds.

While not required, some method of clock synchronization between nodes in the cluster is highly recommended. It’s important to keep in mind when configuring this client that the tolerance to skew rate varies inversely to master availability.

Larger clusters often have a more lenient SLA for API latency. This should be taken into account when configuring the client. The rate of leader transitions should be monitored and retry_period and lease_duration should be increased until the rate is stable and acceptably low. It’s important to keep in mind when configuring this client that the tolerance to API latency varies inversely to master availability.
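To make the ratio guidance above concrete, here is a small illustrative sketch (not part of this crate's API): the approximate tolerated clock-skew-rate ratio follows directly from the lease_duration and renew_deadline values chosen for the election configuration.

use std::time::Duration;

// Illustration only: the tolerated ratio of "fastest node clock" to
// "slowest node clock" is approximately lease_duration / renew_deadline.
let lease_duration = Duration::from_secs(60);
let renew_deadline = Duration::from_secs(30);
let tolerated_skew_rate = lease_duration.as_secs_f64() / renew_deadline.as_secs_f64();
assert_eq!(tolerated_skew_rate, 2.0); // Nodes may run up to ~2x faster than the slowest node.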

Structs

  • Configuration for leader election.
  • A task which is responsible for acquiring and maintaining a coordination.k8s.io/v1 Lease to establish leadership.
  • A handle to a leader elector task.

Enums

  • Coordination error variants.
  • Different states which a leader elector may be in.

Type Aliases

  • Coordination result type.