§Introduction to ArcShift
ArcShift is a data type similar to std::sync::Arc, except that it allows updating
the value pointed to. The memory overhead is identical to that of Arc. ArcShift is mainly
intended for cases where updates are very infrequent. See the ‘Limitations’-heading
further down before using!
§Example
use arcshift::ArcShift;
use std::thread;
let mut arc = ArcShift::new("Hello".to_string());
let mut arc2 = arc.clone();
let j1 = thread::spawn(move || {
    println!("Value in thread 1: '{}'", *arc); // Prints 'Hello'
    arc.update("New value".to_string());
    println!("Updated value in thread 1: '{}'", *arc); // Prints 'New value'
});
let j2 = thread::spawn(move || {
    // Prints either 'Hello' or 'New value', depending on scheduling:
    println!("Value in thread 2: '{}'", *arc2);
});
j1.join().unwrap();
j2.join().unwrap();
When ArcShift values are updated, a linked list of all updates is formed. Whenever
an ArcShift instance is reloaded (using ArcShift::reload, ArcShift::get or
ArcShiftLight::reload), that instance advances along the linked list to the last
node in the list. When no instance exists pointing at a node in the list, it is dropped.
It is thus important to periodically call reload or get (unless the number of updates is
so low that the cost of traversing the linked list is acceptable).
§Strong points
- Easy to use (similar to Arc)
- All functions are lock free (see https://en.wikipedia.org/wiki/Non-blocking_algorithm )
- For use cases where no modification of values occurs, performance is very good (much better than RwLock or Mutex).
- Modifying values is reasonably fast (think, 10-50 nanoseconds).
- The function ArcShift::shared_non_reloading_get allows access almost without any overhead at all compared to regular Arc.
- ArcShift does not rely on any thread-local variables to achieve its performance.
§Limitations
ArcShift achieves its performance at the expense of the following disadvantages:
- When modifying the value, the old version of the value lingers in memory until the last ArcShift has been updated. Such an update only happens when the ArcShift is accessed using a unique (&mut) access (like ArcShift::get or ArcShift::reload). This can be partially mitigated by using the ArcShiftLight-type for long-lived never-reloaded instances.
- Modifying the value is approximately 10x more expensive than modifying Arc<RwLock<T>>.
- When the value is modified, the next subsequent access can be slower than an Arc<RwLock<T>> access.
- ArcShift is its own datatype. It is in no way compatible with Arc<T>.
- At most 524287 instances of ArcShiftLight can be created for each value.
- At most 35000000000000 instances of ArcShift can be created for each value.
- ArcShift does not support an analog to std::sync::Arc’s std::sync::Weak.
- ArcShift instances should ideally be owned (or be mutably accessible).
The last limitation might seem unacceptable, but for many applications it is not hard to make sure each thread/scope has its own instance of ArcShift pointing to the resource. Cloning ArcShift instances is reasonably fast.
§Motivation
The primary raison d’être for ArcShift is to be a version of Arc which allows
modifying the stored value, with very little overhead over regular Arc, as long as
updates are very infrequent.
The motivating use-case for ArcShift is reloadable assets in computer games.
During normal usage, assets do not change. All benchmarks and play experience will
be dependent only on this baseline performance. Ideally, we therefore want to have
a very small performance penalty for the case when assets are not updated, compared
to using regular std::sync::Arc.
During game development, artists may update assets, and hot-reload is a very time-saving feature. A performance hit during asset-reload is acceptable though. ArcShift prioritizes base performance, while accepting a penalty when updates are made. The penalty is that, under some circumstances described below, ArcShift can have a lingering performance hit until ‘reload’ is called. See documentation for the details.
ArcShift can, of course, be useful in other domains than computer games.
§Performance properties
Accessing the value stored in an ArcShift instance only requires a single atomic operation, of the least expensive kind (Ordering::Relaxed). On x86_64, this is the exact same machine operation as a regular memory access, and on ARM it is also not an expensive operation. The cost of such an access is much smaller than that of a mutex access, even an uncontended one.
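To make the comparison concrete, here is a minimal std-only sketch that contrasts a single Relaxed atomic load with an uncontended mutex access. The type Shared and the functions read_relaxed and read_locked are hypothetical names invented for this illustration. They are not part of ArcShift, and the sketch makes no claim about which field ArcShift actually loads.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for state that a reader has to consult.
/// Not an ArcShift type.
struct Shared {
    version: AtomicUsize,
    guarded: Mutex<usize>,
}

impl Shared {
    fn new(value: usize) -> Self {
        Shared {
            version: AtomicUsize::new(value),
            guarded: Mutex::new(value),
        }
    }
}

/// A single atomic load with the weakest ordering: no lock is taken
/// and no additional ordering between memory operations is requested.
fn read_relaxed(shared: &Shared) -> usize {
    shared.version.load(Ordering::Relaxed)
}

/// The mutex-based alternative: acquire the lock, read, release.
fn read_locked(shared: &Shared) -> usize {
    *shared.guarded.lock().unwrap()
}

fn main() {
    let shared = Shared::new(7);
    assert_eq!(read_relaxed(&shared), 7);
    assert_eq!(read_locked(&shared), 7);
}
```

Both functions return the same value here. The difference is in what the reader has to do to obtain it.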
§Implementation
The basic idea of ArcShift is that each ArcShift instance points to a small heap block, that contains the pointee value of type T, a reference count, and a ‘next’-pointer. The ‘next’-pointer starts out as null, but when the value in an ArcShift is updated, the ‘next’-pointer is set to point to the updated value.
This means that each ArcShift-instance always points at a valid value of type T. No locking or synchronization is required to get at this value. This is why ArcShift instances are fast to use. But it has the drawback that as long as an ArcShift-instance exists, whatever value it points to must be kept alive. Each time an ArcShift instance is accessed mutably, we have an opportunity to update its pointer to the ‘next’ value. When the last ArcShift-instance releases a particular value, it will be dropped. The operation to update the pointer is called a ‘reload’.
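The idea of a chain of versions with a ‘next’-pointer can be sketched using only the standard library. This is a simplified illustration, not ArcShift's actual implementation: it borrows std::sync::Arc for reference counting and std::sync::OnceLock for the ‘next’-pointer, so it has neither the memory layout nor the lock-free guarantees described above. The names Node, Versioned and shared_get are hypothetical.

```rust
use std::sync::{Arc, OnceLock};

/// One version of the value plus a write-once link to its successor.
/// Illustrative only; ArcShift's real heap block is laid out differently.
struct Node<T> {
    value: T,
    next: OnceLock<Arc<Node<T>>>,
}

/// Hypothetical handle type playing the role of an ArcShift instance.
struct Versioned<T> {
    current: Arc<Node<T>>,
}

impl<T> Clone for Versioned<T> {
    fn clone(&self) -> Self {
        Versioned { current: Arc::clone(&self.current) }
    }
}

impl<T> Versioned<T> {
    fn new(value: T) -> Self {
        Versioned { current: Arc::new(Node { value, next: OnceLock::new() }) }
    }

    /// Advance to the last node in the chain. Nodes left behind are
    /// dropped once no other handle points at them.
    fn reload(&mut self) {
        while let Some(next) = self.current.next.get().cloned() {
            self.current = next;
        }
    }

    /// Mutable access: reload first, then read the newest value.
    fn get(&mut self) -> &T {
        self.reload();
        &self.current.value
    }

    /// Shared access: read whatever node this handle points at, without reloading.
    fn shared_get(&self) -> &T {
        &self.current.value
    }

    /// Append a new version at the end of the chain, retrying if another
    /// handle appended one first.
    fn update(&mut self, value: T) {
        let mut new_node = Arc::new(Node { value, next: OnceLock::new() });
        loop {
            self.reload();
            match self.current.next.set(new_node) {
                Ok(()) => break,
                Err(rejected) => new_node = rejected,
            }
        }
        self.reload();
    }
}

fn main() {
    let mut a = Versioned::new("Hello".to_string());
    let mut b = a.clone();
    a.update("New value".to_string());
    // `b` has not been reloaded, so it still sees (and keeps alive) the old value.
    assert_eq!(b.shared_get().as_str(), "Hello");
    // A mutable access reloads `b` and releases the old node.
    assert_eq!(b.get().as_str(), "New value");
}
```

The sketch shows the trade-off described above: reading through a handle never waits for a writer, but a handle that is never reloaded keeps its old node alive.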
ArcShiftLight-instances also keep pointers to the heap blocks mentioned above, but value T
in the block can be dropped while being held by an ArcShiftLight. This means that ArcShiftLight-
instances only consume std::mem::size_of::<T>() bytes of memory, when the value they
point to has been dropped. When the ArcShiftLight-instance is reloaded, or dropped, that memory
is also released.
§Pitfall #1 - lingering memory usage
Be aware that ArcShift instances that are just “lying around” without ever being reloaded,
will keep old values around, taking up memory. This is a fundamental drawback of the approach
taken by ArcShift. One workaround is to replace long-lived non-reloaded instances of
ArcShift with ArcShiftLight. This alleviates the problem.
§Pitfall #2 - reference count limitations
ArcShift uses a single 64-bit reference counter to track both ArcShift and ArcShiftLight instance counts. This is achieved by giving each ArcShiftLight-instance a weight of 1, while each ArcShift-instance receives a weight of 524288. As a consequence, the maximum number of ArcShiftLight-instances (for the same value) is 524287. Because the counter is 64-bit, this leaves 2^64/524288 as the maximum number of ArcShift instances (for the same value). However, we leave some margin, to allow practically detecting any overflow, giving a maximum of 35000000000000. Since each ArcShift instance takes at least 8 bytes of space, it takes at least 280TB of memory to even be able to hit this limit. If the limit is somehow reached, there will be a best effort at detecting this and causing a panic. This is similar to how the Rust standard library handles overflow of the reference counter on std::sync::Arc. Just as with std::sync::Arc, the overflow will be detected in practice, though there is no guarantee. For ArcShift, the overflow will be detected as long as the machine has an even remotely fair scheduler, and less than 100 billion threads (though the conditions for detection of std::sync::Arc-overflow are even more assured).
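The arithmetic behind these limits can be checked with a short sketch. The two weights are taken from the text above. The function names, and the use of division and remainder to split the counter, are assumptions made for illustration and may not match how ArcShift decodes its counter internally.

```rust
/// Weights as stated in the text above.
const LIGHT_WEIGHT: u64 = 1;
const FULL_WEIGHT: u64 = 524_288; // 2^19

/// Pack both instance counts into one counter value
/// (hypothetical helper, no overflow checks).
fn combined_count(full: u64, light: u64) -> u64 {
    full * FULL_WEIGHT + light * LIGHT_WEIGHT
}

/// Split a counter value back into (ArcShift count, ArcShiftLight count),
/// assuming the light count stays below FULL_WEIGHT.
fn split_count(counter: u64) -> (u64, u64) {
    (counter / FULL_WEIGHT, counter % FULL_WEIGHT)
}

/// Theoretical ceiling on ArcShift instances: 2^64 / 2^19 = 2^45.
fn theoretical_max_full() -> u128 {
    (1u128 << 64) / FULL_WEIGHT as u128
}

fn main() {
    assert_eq!(FULL_WEIGHT, 1 << 19);
    // The light count must stay below one full weight: at most 524287.
    assert_eq!(FULL_WEIGHT - 1, 524_287);
    // 2^45, slightly above the documented limit of 35000000000000.
    assert_eq!(theoretical_max_full(), 35_184_372_088_832);
    let documented_limit: u128 = 35_000_000_000_000;
    assert!(documented_limit < theoretical_max_full());
    // At 8 bytes per instance, reaching the limit takes 280 TB.
    assert_eq!(documented_limit * 8, 280_000_000_000_000);
    // Round trip: 3 ArcShift instances and 5 ArcShiftLight instances.
    assert_eq!(split_count(combined_count(3, 5)), (3, 5));
}
```

The gap between 2^45 and the documented limit is the margin that the text says is reserved for detecting overflow.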
§A larger example
use arcshift::ArcShift;
struct CharacterModel {
    /* 3D model, textures, etc */
}
struct World {
    models: Vec<ArcShift<CharacterModel>>,
}
/// Loads models. Regularly scans filesystem,
/// updates models when their files change on disk.
fn load_models() -> Vec<ArcShift<CharacterModel>> {
    let models: Vec<ArcShift<CharacterModel>> = vec![];
    /* Somehow load models */
    let mut models_for_reloader = models.clone();
    std::thread::spawn(move || {
        loop {
            /* detect file system changes */
            let changed_model = 0usize;
            models_for_reloader[changed_model].update(CharacterModel { /* newly loaded */ });
        }
    });
    models
}
fn run_game() {
    let mut world = World {
        models: load_models(),
    };
    loop {
        run_game_logic(&mut world);
    }
}
fn run_game_logic(world: &mut World) {
    /*
    Do game logic, possibly in multiple threads, accessing different parts of World,
    possibly cloning 'ArcShift' instances for use by other threads
    */
    for model in world.models.iter_mut() {
        // Accessing ArcShift using 'get' ensures
        // old versions do not linger in RAM.
        let model_ref: &CharacterModel = model.get();
        // Do stuff with 'model_ref'
    }
}
Structs§
- Smart pointer with similar use case as std::sync::Arc, but with the added ability to atomically replace the contents of the Arc. See crate documentation for more information.
- ArcShiftCell is like an ArcShift, except that it can be reloaded without requiring ‘mut’-access. However, it is not ‘Sync’.
- A handle to the pointed-to value of an ArcShiftCell. This handle should not be leaked, but if it is leaked, the effect is that whatever value the ArcShiftCell-instance pointed to at that time will forever leak also. All the linked-list nodes from that entry and onward will also leak. So make sure to not leak the handle!
- ArcShiftLight is like ArcShift, except it does not provide overhead-free access. However, it has the advantage of not preventing old versions of the payload type from being freed.
- Error type representing the case that an operation was attempted from within a ‘get’-function closure.