Crate arc_swap

Making Arc itself atomic

This library provides the ArcSwapAny type (which you probably don't want to use directly) and several type aliases that set it up for common use cases:

  • ArcSwap, an atomic storage for Arc.
  • ArcSwapOption, an atomic storage for Option<Arc>.
  • IndependentArcSwap, an atomic storage that doesn't share the internal generation locks with others.

Note that as these are type aliases, the useful methods are defined on ArcSwapAny directly, so you want to look at its documentation, not at that of the aliases.

This is similar to RwLock<Arc<T>>, but it is faster, the readers are never blocked (not even by writes) and it is more configurable.

Or, you can look at it this way. There's Arc<T> ‒ it knows when it stops being used and therefore can clean up memory. But once there's a Arc<T> somewhere, shared between threads, it has to keep pointing to the same thing. On the other hand, there's AtomicPtr which can be changed even when shared between threads, but it doesn't know when the data pointed to is no longer in use so it doesn't clean up. This is a hybrid between the two.

Motivation

First, the C++ shared_ptr can act this way. When I started working on this, I didn't know that this holds only for the surface API and that all the implementations I could find hide a mutex inside. So I decided Rust needs to keep up there.

Second, I like hard problems and this seems like an apt supply of them.

And third, I actually have a few use cases for something like this.

Performance characteristics

It is optimised for read-heavy situations with only occasional writes. A few examples might be:

  • Global configuration data structure, which is updated once in a blue moon when an operator manually does some changes, but looked into through the whole program all the time. Looking into it should be cheap and multiple threads should be able to look into it at the same time.
  • Some in-memory database or maybe routing tables, where lookup latency matters. Updating the routing tables isn't an excuse to stop processing packets even for a short while.

Lock-free readers

All the read operations are always lock-free. Most of the time, they are actually wait-free. The only one that is only lock-free is the first load access in each thread (across all the pointers).

So, when the documentation talks about contention, it means multiple CPU cores having to sort out which one changes the bytes in a cache line first and which one goes next. This slows things down, but everything still rolls forward and stops for no one, unlike mutex-style contention, where one thread holds the lock and the other threads get parked.

Unfortunately, there are cases where readers block writers from completion. This is much more limited in scope than with Mutex or RwLock, and a steady stream of readers will not prevent an update from happening indefinitely (only a reader stuck in a critical section could, and when used according to recommendations, the critical sections contain no loops and are only several instructions short).

Speeds

The base line speed of read operations is similar to using an uncontended Mutex. However, load suffers no contention from any other read operations and only slight ones during updates. The load_full operation is additionally contended only on the reference count of the Arc inside ‒ so, in general, while Mutex rapidly loses its performance when being in active use by multiple threads at once and RwLock is slow to start with, ArcSwapAny mostly keeps its performance even when read by many threads in parallel.

Write operations are considered expensive. A write operation is more expensive than access to an uncontended Mutex and on some architectures even slower than uncontended RwLock. However, it is faster than either under contention.

There are some (very unscientific) benchmarks within the source code of the library.

The exact numbers are highly dependent on the machine used (both the absolute numbers and the relative ones between different data structures). Not only do architectures have a huge impact (eg. x86 vs ARM), but so do AMD vs. Intel, or even two different Intel processors. Therefore, if the speed matters more than the wait-free guarantees, you're advised to do your own measurements.

However, the intended selling point of the library is consistency of performance, not outperforming other locking primitives on average. If you do worry about the raw performance, you can have a look at Cache.

Choosing the right reading operation

There are several load operations available. While the general go-to one should be load, there may be situations in which the others are a better match.

The load usually only borrows the instance from the shared ArcSwapAny. This makes it faster, because different threads don't contend on the reference count. There are two situations when this borrow isn't possible. First, if the content of the ArcSwapAny gets changed, all existing Guards are promoted to contain an owned instance.

The other situation derives from the internal implementation. The number of borrows each thread can hold at any given time (across all Guards) is limited. If this limit is exceeded, an owned instance is created instead.

Therefore, if you intend to hold onto the loaded value for an extended time span, you may prefer load_full. It loads the pointer instance (Arc) without borrowing, which is slower (because of possible contention on the reference count), but doesn't consume one of the borrow slots, making it more likely for subsequent loads to have a slot available. Similarly, if some API needs an owned Arc, load_full is more convenient.

There's also load_signal_safe. This is the only method guaranteed to be safely usable inside a unix signal handler. It has no advantages outside of signal handlers, which makes it a kind of niche method.

Additionally, it is possible to use a Cache to get further speed improvement at the cost of less comfortable API and possibly keeping the older values alive for longer than necessary.

Atomic orderings

It is guaranteed that each operation performs at least one SeqCst atomic read-write operation; therefore, even operations on different instances have a defined global order of operations.

Customization

While the default ArcSwap and load is probably good enough for most of the needs, the library allows a wide range of customizations:

  • It allows storing nullable (Option<Arc<_>>) and non-nullable pointers.
  • It is possible to store other reference counted pointers (eg. if you want to use it with a hypothetical Arc that doesn't have weak counts), by implementing the RefCnt trait.
  • It allows choosing the internal fallback locking strategy via the LockStorage trait.

Examples

extern crate arc_swap;
extern crate crossbeam_utils;

use std::sync::Arc;

use arc_swap::ArcSwap;
use crossbeam_utils::thread;

fn main() {
    // Start with an empty configuration.
    let config = ArcSwap::from(Arc::new(String::default()));
    thread::scope(|scope| {
        // One writer thread publishes a new configuration.
        scope.spawn(|_| {
            let new_conf = Arc::new("New configuration".to_owned());
            config.store(new_conf);
        });
        // Ten reader threads spin until they observe the update.
        for _ in 0..10 {
            scope.spawn(|_| {
                loop {
                    let cfg = config.load();
                    if !cfg.is_empty() {
                        assert_eq!(**cfg, "New configuration");
                        return;
                    }
                }
            });
        }
    }).unwrap();
}

Alternatives

There are other means to get similar functionality you might want to consider:

Mutex<Arc<_>> and RwLock<Arc<_>>

They have significantly worse performance in the contended scenario but are comparable in uncontended cases. They are directly in the standard library, which means better testing and fewer dependencies.
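For comparison, the same publish-and-snapshot pattern built from the standard library alone might look like this (an illustrative sketch; every reader briefly takes the lock to clone the Arc):

```rust
use std::sync::{Arc, RwLock};

fn main() {
    let config = RwLock::new(Arc::new(String::from("initial")));

    // A reader holds the lock only long enough to clone the Arc,
    // then works with its private snapshot without the lock.
    let snapshot = Arc::clone(&config.read().unwrap());
    assert_eq!(*snapshot, "initial");

    // A writer swaps the whole Arc; existing snapshots stay valid.
    *config.write().unwrap() = Arc::new(String::from("updated"));
    assert_eq!(**config.read().unwrap(), "updated");
    assert_eq!(*snapshot, "initial"); // the old snapshot is unchanged
}
```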

The same, but with parking_lot

Parking lot contains alternative implementations of Mutex and RwLock that are faster than the standard library primitives. They still suffer from contention.

crossbeam::atomic::ArcCell

This internally contains a spin-lock equivalent and is very close to the characteristics of parking_lot::Mutex<Arc<_>>. This is unofficially deprecated. See the relevant issue.

crossbeam-arccell

It is mentioned here because of the name. Despite the name, this does something very different (which might possibly solve similar problems). Its API is not centred on Arc or any kind of pointer; rather, it has snapshots of its internal value that can be exchanged very fast.

AtomicArc

This one is probably the closest thing to ArcSwap on the API level. Both read and write operations are lock-free, but neither is wait-free, and the performance of reads and writes is more balanced ‒ while ArcSwap is optimized for reading, AtomicArc is „balanced“.

The biggest current downside is that it is still in a prototype stage and not released yet.

Features

The unstable-weak feature adds the ability to use arc-swap with the Weak pointer too, through the ArcSwapWeak type. This requires the nightly Rust compiler. Also, the interface and support is not part of API stability guarantees and may be arbitrarily changed or removed in future releases (it is mostly waiting for the weak_into_raw nightly feature to stabilize before stabilizing it in this crate).

Modules

gen_lock

Customization of where and how the generation lock works.

Structs

ArcSwapAny

An atomic storage for a reference counted smart pointer like Arc or Option<Arc>.

Cache

Caching handle for ArcSwapAny.

Guard

A temporary storage of the pointer.

Traits

RefCnt

A trait describing smart reference counted pointers.

Type Definitions

ArcSwap

An atomic storage for Arc.

ArcSwapOption

An atomic storage for Option<Arc>.

ArcSwapWeak

Arc swap for the Weak pointer.

IndependentArcSwap

An atomic storage that doesn't share the internal generation locks with others.