Crate arc_swap

source ·
Expand description

Making Arc itself atomic

This library provides the ArcSwapAny type (which you probably don’t want to use directly) and several type aliases that set it up for common use cases:

Note that as these are type aliases, the useful methods are defined on ArcSwapAny directly and you want to look at its documentation, not on the aliases.

This is similar to RwLock<Arc<T>>, but it is faster, the readers are never blocked (not even by writes) and it is more configurable.

Or, you can look at it this way. There’s Arc<T> ‒ it knows when it stops being used and therefore can clean up memory. But once there’s a Arc<T> somewhere, shared between threads, it has to keep pointing to the same thing. On the other hand, there’s AtomicPtr which can be changed even when shared between threads, but it doesn’t know when the data pointed to is no longer in use so it doesn’t clean up. This is a hybrid between the two.

Motivation

First, the C++ shared_ptr can act this way. The fact that it’s only the surface API and all the implementations I could find hide a mutex inside wasn’t known to me when I started working on this. So I decided Rust needs to keep up there.

Second, I like hard problems and this seems like an apt supply of them.

And third, I actually have few use cases for something like this.

Performance characteristics

It is optimised for read-heavy situations with only occasional writes. Few examples might be:

  • Global configuration data structure, which is updated once in a blue moon when an operator manually does some changes, but looked into through the whole program all the time. Looking into it should be cheap and multiple threads should be able to look into it at the same time.
  • Some in-memory database or maybe routing tables, where lookup latency matters. Updating the routing tables isn’t an excuse to stop processing packets even for a short while.

Lock-free readers

All the read operations are always lock-free. Most of the time, they are actually wait-free. The only one that is only lock-free is the first lease access in each thread (across all the pointers).

So, when the documentation talks about contention, it talks about multiple cores having to sort out who changes the bytes in a cache line first and who is next. This slows things down, but it still rolls forward and stop for no one, not like with the mutex-style contention when one holds the lock and others wait outside.

Unfortunately, there are cases where readers block writers from completion. It’s much more limited in scope than with Mutex or RwLock and steady stream of readers will not prevent an update from happening indefinitely (only a reader stuck in a critical section could, and when used according to recommendations, the critical sections contain no loops and are only several instructions short).

Speeds

The base line speed of read operations is similar to using an uncontended Mutex. However, lease and peek suffer no contention from any other read operations and only slight ones during updates. The load operation is additionally contended only on the reference count of the Arc inside ‒ so, in general, while Mutex rapidly loses its performance when being in active use by multiple threads at once and RwLock is slow to start with, ArcSwapAny mostly keeps its performance even when read by many threads in parallel.

Write operations are considered expensive. A write operation is more expensive than access to an uncontended Mutex and on some architectures even slower than uncontended RwLock. However, it is faster than either when contended.

There are some (very unscientific) benchmarks within the source code of the library.

The exact numbers are highly dependant on the machine used (both absolute numbers and relative between different data structures). Not only architectures have a huge impact (eg. x86 vs ARM), but even AMD vs. Intel or two different Intel processors. Therefore, if what matters is more the speed than the wait-free guarantees, you’re advised to do your own measurements.

Choosing the right reading operation

Performance is world of trade-offs. Therefore, the library offers several very similar methods to read the pointer. The default choice should nevertheless probably be lease.

Only one of them is functionally different ‒ peek_signal_safe. See below for signals, but in general, it is the only thing you want to use inside a signal handler and you don’t want to use it anywhere else.

  • load creates a full blown Arc. It’s the most heavy-weight around and while wait-free, it suffers from contention on the reference count in the Arc, so when used from too many threads at once, it’ll become slow. On the other hand, there’s no restriction on how long you can hold onto the result or how many of them you keep around, so this is appropriate if creating handles for long-term storage. It also provides a bridge to other algorithms which only take the Arc.
  • lease returns a proxy object that works as a pointer to the stored data and can be upgraded to full Arc later on if needed. There’s no limit on how long it can live around. However, it internally comes at two flavors, one cheap and one containing a full Arc in it. Each thread is entitled to only limited total number of cheap ones at a given time (currently 6) and if more are constructed, the others fall back on the full version (which then uses load internally). Therefore, lease can be fast (almost as fast as peek) but only as long as the thread calling it doesn’t have too many leases around at the time.
  • peek is the cheapest with the most predictable performance characteristics. However, as long as the returned guard object is alive, the internal generation lock is being held and that prevents write operations from completion and they’ll spin-wait for the unlock. By default, all the pointer instances share the same generation lock (and it’ll therefore prevent write operations even on other pointers from completion). However, the IndependentArcSwap uses a private generation lock for each instance. In general, this is suitable for very fast things ‒ like reading a single scalar value out of a configuration, but not keeping it around or doing expensive lookups in data.

RCU

This also offers an RCU implementation, for read-heavy situations. Note that the RCU update is considered relatively slow operation (slower than simple write). In case there’s only one update thread, using store is enough.

Atomic orderings

It is guaranteed each operation performs at least one SeqCst atomic read-write operation, therefore even operations on different instances have a defined global order of operations.

Unix signal handlers

Unix signals are hard to use correctly, partly because there is a very restricted set of functions one might use inside them. Specifically, it is not allowed to use mutexes inside them (because that could cause a deadlock).

On the other hand, it is possible to use peek_signal_safe (but not the others). Note that the signal handler is not allowed to allocate or deallocate memory, therefore it is not recommended to upgrade the returned guard (it is strictly speaking possible to use that safely, but it is hard and brings no benefit).

Customization

While the default ArcSwap and lease is probably good enough for most of the needs, the library allows a wide range of customizations:

  • It allows storing nullable (Option<Arc<_>>) and non-nullable pointers.
  • It is possible to store other reference counted pointers (eg. if you want to use it with a hypothetical Arc that doesn’t have weak counts), by implementing the RefCnt trait.
  • It allows choosing internal locking strategy by the LockStorage trait.

Examples

extern crate arc_swap;
extern crate crossbeam_utils;

use std::sync::Arc;

use arc_swap::ArcSwap;
use crossbeam_utils::thread;

fn main() {
    let config = ArcSwap::from(Arc::new(String::default()));
    thread::scope(|scope| {
        scope.spawn(|| {
            let new_conf = Arc::new("New configuration".to_owned());
            config.store(new_conf);
        });
        for _ in 0..10 {
            scope.spawn(|| {
                loop {
                    let cfg = config.lease();
                    if !cfg.is_empty() {
                        assert_eq!(*cfg, "New configuration");
                        return;
                    }
                }
            });
        }
    });
}

Alternatives

There are other means to get similar functionality you might want to consider:

Mutex<Arc<_>> and RwLock<Arc<_>>

They have significantly worse performance in the contented scenario but are comparable in uncontended cases. They are directly in the standard library, which means better testing and less dependencies.

The same, but with parking_lot

Parking lot contains alternative implementations of Mutex and RwLock that are faster than the standard library primitives. They still suffer from contention.

crossbeam::atomic::ArcCell

This internally contains a spin-lock equivalent and is very close to the characteristics of parking_lot::Mutex<Arc<_>>. This is unofficially deprecated. See the relevant issue.

crossbeam-arccell

It is mentioned here because of the name. Despite of the name, this does something very different (which might possibly solve similar problems). It’s API is not centric to Arc or any kind of pointer, but rather it has snapshots of its internal value that can be exchanged very fast.

AtomicArc

This one is probably the closest thing to ArcSwap on the API level. Both read and write operations are lock-free, but neither is wait-free, and the performance of reads and writes are more balanced ‒ while ArcSwap is optimized for reading, AtomicArc is „balanced“.

The biggest current downside is, it is in a prototype stage and not released yet.

Modules

Customization of where and how the generation lock works.

Structs

An atomic storage for a smart pointer like Arc or Option<Arc>.
A short-term proxy object from peek.
A temporary storage of the pointer.

Traits

A trait describing things that can be turned into a raw pointer.
A trait describing smart pointers that can’t hold NULL.
A trait describing smart reference counted pointers.

Functions

Comparison of two pointer-like things.

Type Definitions

An atomic storage for Arc.
An atomic storage for Option<Arc>.
An atomic storage that doesn’t share the internal generation locks with others.