Struct deltalake::storage::object_store::aws::DynamoCommit

source ·
pub struct DynamoCommit { /* private fields */ }
Expand description

A DynamoDB-based commit protocol, used to provide conditional write support for S3

§Limitations

Only conditional operations, e.g. copy_if_not_exists will be synchronized, and can therefore race with non-conditional operations, e.g. put, copy, delete, or conditional operations performed by writers not configured to synchronize with DynamoDB.

Workloads making use of this mechanism must ensure:

  • Conditional and non-conditional operations are not performed on the same paths
  • Conditional operations are only performed via similarly configured clients

Additionally as the locking mechanism relies on timeouts to detect stale locks, performance will be poor for systems that frequently delete and then create objects at the same path, instead being optimised for systems that primarily create files with paths never used before, or perform conditional updates to existing files

§Commit Protocol

The DynamoDB schema is as follows:

  • A string partition key named "path"
  • A string sort key named "etag"
  • A numeric TTL attribute named "ttl"
  • A numeric attribute named "generation"
  • A numeric attribute named "timeout"

An appropriate DynamoDB table can be created with the CLI as follows:

$ aws dynamodb create-table --table-name <TABLE_NAME> --key-schema AttributeName=path,KeyType=HASH AttributeName=etag,KeyType=RANGE --attribute-definitions AttributeName=path,AttributeType=S AttributeName=etag,AttributeType=S
$ aws dynamodb update-time-to-live --table-name <TABLE_NAME> --time-to-live-specification Enabled=true,AttributeName=ttl

To perform a conditional operation on an object with a given path and etag (* if creating), the commit protocol is as follows:

  1. Perform HEAD request on path and error on precondition mismatch
  2. Create record in DynamoDB with given path and etag with the configured timeout
    1. On Success: Perform operation with the configured timeout
    2. On Conflict:
      1. Periodically re-perform HEAD request on path and error on precondition mismatch
      2. If timeout * max_skew_rate passed, replace the record incrementing the "generation"
        1. On Success: GOTO 2.1
        2. On Conflict: GOTO 2.2

Provided no writer modifies an object with a given path and etag without first adding a corresponding record to DynamoDB, we are guaranteed that only one writer will ever commit.

This is inspired by the DynamoDB Lock Client but simplified for the more limited requirements of synchronizing object storage. The major changes are:

  • Uses a monotonic generation count instead of a UUID rvn, as this is:
    • Cheaper to generate, serialize and compare
    • Cannot collide
    • More human readable / interpretable
  • Relies on TTL to eventually clean up old locks

It also draws inspiration from the DeltaLake S3 Multi-Cluster commit protocol, but generalised to not make assumptions about the workload and not rely on first writing to a temporary path.

Implementations§

source§

impl DynamoCommit

source

pub fn new(table_name: String) -> DynamoCommit

Create a new DynamoCommit with a given table name

source

pub fn with_timeout(self, millis: u64) -> DynamoCommit

Overrides the lock timeout.

A longer lock timeout reduces the probability of spurious commit failures and multi-writer races, but will increase the time that writers must wait to reclaim a lock lost. The default value of 20 seconds should be appropriate for must use-cases.

source

pub fn with_max_clock_skew_rate(self, rate: u32) -> DynamoCommit

The maximum clock skew rate tolerated by the system.

An environment in which the clock on the fastest node ticks twice as fast as the slowest node, would have a clock skew rate of 2. The default value of 3 should be appropriate for most environments.

source

pub fn with_ttl(self, ttl: Duration) -> DynamoCommit

The length of time a record should be retained in DynamoDB before being cleaned up

This should be significantly larger than the configured lock timeout, with the default value of 1 hour appropriate for most use-cases.

Trait Implementations§

source§

impl Clone for DynamoCommit

source§

fn clone(&self) -> DynamoCommit

Returns a copy of the value. Read more
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
source§

impl Debug for DynamoCommit

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter. Read more
source§

impl PartialEq for DynamoCommit

source§

fn eq(&self, other: &DynamoCommit) -> bool

This method tests for self and other values to be equal, and is used by ==.
1.0.0 · source§

fn ne(&self, other: &Rhs) -> bool

This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
source§

impl Eq for DynamoCommit

source§

impl StructuralPartialEq for DynamoCommit

Auto Trait Implementations§

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

source§

fn equivalent(&self, key: &K) -> bool

Checks if this value is equivalent to the given key. Read more
source§

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

source§

fn equivalent(&self, key: &K) -> bool

Compare self to key and return true if they are equal.
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T> Instrument for T

source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> IntoEither for T

source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
source§

impl<Unshared, Shared> IntoShared<Shared> for Unshared
where Shared: FromUnshared<Unshared>,

source§

fn into_shared(self) -> Shared

Creates a shared type from an unshared type.
source§

impl<T> Same for T

§

type Output = T

Should always be Self
source§

impl<T> ToOwned for T
where T: Clone,

§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

source§

fn vzip(self) -> V

source§

impl<T> WithSubscriber for T

source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
source§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

source§

impl<T> Ungil for T
where T: Send,