§Introduction to ArcShift
ArcShift is a data type similar to std::sync::Arc, except that it allows updating the value pointed to. It can be used as a replacement for std::sync::Arc<std::sync::RwLock<T>>, giving much faster read access. Updating the value in an ArcShift is significantly more expensive than writing to a std::sync::RwLock, so ArcShift is most suited to cases where updates are infrequent.
§Example
use arcshift::ArcShift;
use std::thread;

let mut arc = ArcShift::new("Hello".to_string());
let mut arc2 = arc.clone();

let j1 = thread::spawn(move || {
    println!("Value in thread 1: '{}'", arc.get()); // Prints 'Hello'
    arc.update("New value".to_string());
    println!("Updated value in thread 1: '{}'", arc.get()); // Prints 'New value'
});

let j2 = thread::spawn(move || {
    // Prints either 'Hello' or 'New value', depending on scheduling:
    println!("Value in thread 2: '{}'", arc2.get());
});

j1.join().unwrap();
j2.join().unwrap();
§Strong points
- Easy to use (similar to Arc)
- Extensively tested
- All functions are lock free ( https://en.wikipedia.org/wiki/Non-blocking_algorithm )
- For use cases where no updates occur, performance is very good (much better than RwLock or Mutex).
- Updates are reasonably fast (think, 15-100 nanoseconds), but much slower than Mutex- or RwLock-writes.
- The function ArcShift::shared_non_reloading_get allows access without any overhead compared to regular Arc (benchmarks show identical performance to Arc). A usage sketch follows this list.
- ArcShift does not rely on thread-local variables.
- Supports unsized types (i.e., you can use ArcShift<[u8]>).
- ArcShift is no_std compatible (though ‘alloc’ is required, since ArcShift is a heap data structure). Compile with “default-features=false” to enable no_std compatibility.
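A minimal sketch of the difference between a reloading and a non-reloading read, using only methods named in this documentation. The exact signatures (e.g. that shared_non_reloading_get takes &self and returns a reference to the payload) are assumptions here; see the method documentation for details.

use arcshift::ArcShift;

fn main() {
    let mut writer = ArcShift::new("v1".to_string());
    let mut reader = writer.clone();

    writer.update("v2".to_string());

    // Shared access without reloading: no overhead over a plain Arc,
    // but `reader` has not been reloaded, so this may still print "v1".
    println!("{}", reader.shared_non_reloading_get());

    // `get` takes unique (&mut) access and reloads, so it is guaranteed
    // to see the update that completed above; this prints "v2".
    println!("{}", reader.get());
}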
§Limitations
ArcShift achieves its performance at the expense of the following disadvantages:
- ArcShift’s performance relies on being able to update its pointer when new values are detected. This means that ArcShift is most efficient when each thread has a mutable ArcShift instance. This can often be achieved by cloning the ArcShift, and distributing one owned copy to every thread (these clones all point to the same inner value). ArcShift can still be used with only shared access (ArcShift::shared_get), and performance is still very good as long as the pointer is current. However, if the ArcShift instance is stale (needs reloading, because an update has occurred), reads will be approximately twice as costly as for RwLock.
- When modifying the value, the old version of the value lingers in memory until the last ArcShift that uses it has reloaded. Such a reload only happens when the ArcShift is accessed using a unique (&mut) access (like ArcShift::get or ArcShift::reload). This can be partially mitigated by using the ArcShiftWeak type for long-lived, never-reloaded instances. (A sketch follows this list.)
- Modifying the value is approximately 10x more expensive than modifying an Arc<Mutex<T>>. That said, if you’re storing anything significantly more complex than an integer, the overhead of ArcShift may be insignificant.
- When the value is modified, the next subsequent reload is slower than an Arc<RwLock<T>> access.
- ArcShift is its own datatype. It is in no way compatible with Arc<T>.
- At most usize::MAX/8 instances of ArcShift or ArcShiftWeak can be created for each value (this is because it uses some bits of its weak refcount to store metadata).
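As a sketch of the second point above (using only methods described in this documentation), the old value stays alive until every instance that could still read it has been reloaded:

use arcshift::ArcShift;

fn main() {
    let mut a = ArcShift::new(vec![0u8; 1024 * 1024]); // ~1 MiB payload
    let mut b = a.clone();

    // After this update, the old 1 MiB vector is still kept alive,
    // because `b` has not yet been reloaded and may still read it.
    a.update(vec![1u8; 1024 * 1024]);

    // A unique (&mut) access such as `get` or `reload` advances `b`
    // to the newest value; only then can the old vector be dropped.
    b.reload();
    assert_eq!(b.get()[0], 1);
}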
§Detailed performance characteristics
- ArcShift::get - Very good average performance. Checking for new values requires a single atomic operation, of the least expensive kind (Ordering::Relaxed). On x86_64, this is the exact same machine operation as a regular memory access, and on arm it is also not an expensive operation. The cost of such an access is much smaller than a mutex access, even an uncontended one. In the case where a reload is actually necessary, there is a significant performance impact (but still typically below 150 ns on modern machines (2025)). If other instances have made updates, subsequent accesses will have a penalty. This penalty can be significant, because previous values may have to be dropped. However, once any pending updates have been processed, subsequent accesses will be fast again. It is guaranteed that any update that completed before the execution of ArcShift::get started will be visible. (A sketch follows this list.)
- ArcShift::shared_get - Good performance as long as the value is not stale. If self points to a previous value, each call to shared_get will traverse the memory structures to find the most recent value. There are three cases:
  - The value is up-to-date. In this case, execution is very fast.
  - The value is stale, but no write is in progress. Expect a penalty equal to roughly twice the cost of an RwLock write.
  - The value is stale, and there is a write in progress. This is a rare race condition. Expect a severe performance penalty (~10-20x the cost of an RwLock write).
  shared_get also guarantees that any updates that completed before it was called will be visible.
- ArcShift::shared_non_reloading_get - No overhead compared to plain Arc. Will not reload, even if the ArcShift instance is stale. May thus return an old value. If shared_get has been used previously, this method may return an older value than what that shared_get returned.
- ArcShift::reload - Similar cost to ArcShift::get.
- ArcShift::clone - Fast. Requires a single atomic increment, and an atomic read. If the current instance is stale, the cloned value will be reloaded, with identical cost to ArcShift::get.
- Drop - Can be slow. The last remaining owner of a value will drop said value.
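The visibility guarantee for ArcShift::get mentioned above can be illustrated with a small sketch (the update is known to have completed because the updating thread is joined before get is called):

use arcshift::ArcShift;
use std::thread;

fn main() {
    let mut arc = ArcShift::new(1u32);
    let mut arc2 = arc.clone();

    let writer = thread::spawn(move || {
        arc2.update(2u32);
    });
    // The update completes before `get` starts ...
    writer.join().unwrap();

    // ... so `get` is guaranteed to observe it.
    assert_eq!(*arc.get(), 2);
}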
§Motivation
The primary raison d’être for ArcShift is to be a version of Arc which allows updating the stored value, with very little overhead over regular Arc for read-heavy loads.

One motivating use-case for ArcShift is hot-reloadable assets in computer games. During normal usage, assets do not change. All benchmarks and play experience will be dependent only on this baseline performance. Ideally, we therefore want to have a very small performance penalty for the case when assets are not updated, comparable to using regular std::sync::Arc.
During game development, artists may update assets, and hot-reload is a very time-saving feature. A performance hit during asset-reload is acceptable though. ArcShift prioritizes base performance, while accepting a penalty when updates are made.
ArcShift can, of course, be useful in other domains than computer games.
§Panicking drop methods
If a drop implementation panics, ArcShift will make sure that the internal data structures remain uncorrupted. When run without the std-library, some memory leakage will occur every time a drop method panics. With the std-library, only memory owned by the payload type might leak.
§No_std
By default, arcshift uses the Rust standard library, via the ‘std’ feature (enabled by default). ArcShift can also work without the full std library, at a slight performance cost. When the ‘std’ feature is enabled, catch_unwind is used to guard drop functions, to make sure memory structures are not corrupted if a user-supplied drop method panics. To provide the same guarantee when running without std, arcshift presently moves allocations to temporary boxes so that drop can run after all memory traversal is finished. This requires multiple allocations, which makes operation without ‘std’ slower. Panicking drop methods can also lead to memory leaks without std; the memory structures remain intact, and no undefined behavior occurs. The performance penalty is only present during updates.
If the overhead mentioned in the previous paragraph is unacceptable, and if the final binary is compiled with panic=abort, this extra cost can be mitigated. Enable the feature “nostd_unchecked_panics” to do this. This must never be done if the process will ever continue executing after a panic, since it can lead to memory reclamation essentially being disabled for any ArcShift-chain that has had a panicking drop. However, no UB will result, in any case.
§Implementation
The basic idea of ArcShift is that each ArcShift instance points to a small heap block that contains the pointee value of type T, three reference counts, and ‘prev’/‘next’ pointers. The ‘next’ pointer starts out as null, but when the value in an ArcShift is updated, the ‘next’ pointer is set to point to the updated value.
This means that each ArcShift-instance always points at a valid value of type T. No locking or synchronization is required to get at this value. This is why ArcShift instances are fast to use. There is the drawback that as long as an ArcShift-instance exists, whatever value it points to must be kept alive. Each time an ArcShift instance is accessed mutably, we have an opportunity to update its pointer to the ‘next’ value. The operation to update the pointer is called a ‘reload’.
When the last ArcShift-instance releases a particular value, it will be dropped.
ArcShiftWeak instances also keep pointers to the heap blocks mentioned above, but the value T in the block can be dropped while being held by an ArcShiftWeak. This means that an ArcShiftWeak instance only consumes std::mem::size_of::<T>() bytes plus 5 words of memory when the value it points to has been dropped. When the ArcShiftWeak instance is reloaded, or dropped, that memory is also released.
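As an illustration of the layout described above (this is not the crate’s actual definition; the names, field order and exact set of counters are invented here for clarity, and the real ArcShift also supports unsized T):

use std::sync::atomic::{AtomicPtr, AtomicUsize};

// Illustrative only: one heap block per value, holding the payload, the
// reference counts and the prev/next links. Because every ArcShift
// instance always points at such a block containing a valid T, readers
// never need a lock; an update allocates a new block and stores its
// address into `next` of the current one.
#[allow(dead_code)]
struct Node<T> {
    next: AtomicPtr<Node<T>>,   // null until an update links in a newer value
    prev: AtomicPtr<Node<T>>,   // link back to the previous (older) value
    strong_count: AtomicUsize,  // counts owning ArcShift instances
    weak_count: AtomicUsize,    // counts ArcShiftWeak instances (plus metadata bits)
    extra_count: AtomicUsize,   // placeholder for the third count mentioned above
    payload: T,                 // the user's value
}

fn main() {}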
§Prior Art
ArcShift is very much inspired by arc-swap. The two crates can be used for similar problems. They have slightly different APIs, one or the other may be a more natural fit depending on the problem. ArcShift may be faster for some problems, slower for others.
§Pitfall #1 - lingering memory usage
Be aware that ArcShift instances that are just “lying around” without ever being reloaded will keep old values around, taking up memory. This is a fundamental drawback of the approach taken by ArcShift. One workaround is to replace any long-lived, infrequently reloaded instances of ArcShift with ArcShiftWeak. This alleviates the problem, though heap storage of approximately size_of::<T>() + 5 words is still expended.
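If replacing a long-lived instance with ArcShiftWeak is not practical, another mitigation is to reload such instances from time to time. A sketch, using only ArcShift::reload (the surrounding Cache type and maintain method are made up for illustration):

use arcshift::ArcShift;

struct Cache {
    config: ArcShift<String>, // hypothetical long-lived field
}

impl Cache {
    // Call occasionally (e.g. once per maintenance tick). Reloading
    // advances this instance to the newest value, which allows
    // superseded values to be dropped instead of lingering in memory.
    fn maintain(&mut self) {
        self.config.reload();
    }
}

fn main() {
    let shared = ArcShift::new("initial".to_string());
    let mut cache = Cache { config: shared.clone() };
    // ... elsewhere, another owner of a clone may call `update` ...
    cache.maintain(); // prevents `cache.config` from pinning old values
}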
§Pitfall #2 - reference count limitations
ArcShift uses the usize data type for the reference counts. However, it reserves two bits for tracking some metadata. This leaves usize::MAX/4 as the maximum usable reference count. To avoid having to check the refcount twice (once before increasing the count), the limit is set at usize::MAX/8, and the count is checked after the atomic operation. This has the effect that if more than usize::MAX/8 threads clone the same ArcShift instance concurrently, unsoundness can occur. However, this is considered acceptable, because it exceeds the possible number of concurrent threads by a huge safety margin. Also note that usize::MAX/8 ArcShift instances would take up usize::MAX bytes of memory, which is very much impossible in practice. By leaking ArcShift instances in a tight loop it is still possible to achieve a weak count of usize::MAX/8, in which case ArcShift will panic.
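The arithmetic behind that margin, as a sketch (assumes a 64-bit target; the names here are not taken from the crate):

fn main() {
    // Two bits of the count are reserved for metadata, and the practical
    // limit is set one bit lower still, so the count can be checked after
    // the atomic increment rather than before it.
    let limit = usize::MAX / 8; // roughly 2^61 on a 64-bit target

    // Each ArcShift instance occupies at least one pointer-sized word, so
    // holding `limit` live instances would require on the order of
    // usize::MAX bytes of memory, which is unattainable in practice.
    let bytes_needed = limit.saturating_mul(std::mem::size_of::<usize>());
    println!("limit = {limit}, approx bytes needed = {bytes_needed}");
}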
§A larger example
use arcshift::ArcShift;

struct CharacterModel {
/* 3D model, textures, etc*/
}
struct World {
models: Vec<ArcShift<CharacterModel>>
}
/// Loads models. Regularly scans filesystem,
/// updates models when their files change on disk.
fn load_models() -> Vec<ArcShift<CharacterModel>> {
let models: Vec<ArcShift<CharacterModel>> = vec![];
/* Somehow load models */
let mut models_for_reloader = models.clone();
std::thread::spawn(move||{
loop {
/* detect file system changes*/
let changed_model = 0usize;
models_for_reloader[changed_model].update(CharacterModel{/* newly loaded*/});
}
});
models
}
fn run_game() {
let mut world = World {
models: load_models()
};
loop {
run_game_logic(&mut world);
}
}
fn run_game_logic(world: &mut World) {
/*
Do game logic, possibly in multiple threads, accessing different parts of World,
possibly cloning 'ArcShift' instances for use by other threads
*/
for model in world.models.iter_mut() {
// Accessing ArcShift using 'get' ensures
// old versions do not linger in RAM.
let model_ref : &CharacterModel = model.get();
// Do stuff with 'model_ref'
}
}
Modules§
- cell - Module with a convenient cell-like data structure for reloading ArcShift instances despite only having shared access.
Structs§
- ArcShift - Smart pointer with a similar use case as std::sync::Arc, but with the added ability to atomically replace the contents of the Arc. See the crate documentation for more information.
- ArcShiftWeak - ArcShiftWeak is a way to keep a pointer to an object without preventing said object from being deallocated. This can be useful when creating cyclic data structures, to avoid memory leaks.
- NoLongerAvailableMarker - This is a marker for methods that have been removed in the most recent version of ArcShift.
Enums§
- SharedGetGuard - Return value of ArcShift::shared_get.