`conc` — An efficient concurrent reclamation system

conc builds upon hazard pointers to create an extremely performant system for concurrently handling memory. It is more general and convenient than epoch-based reclamation, and often faster as well.

Overview

High-level API
- Atomic<T> for a lockless readable and writable container.
- sync for basic data structures implemented through conc.
  - Treiber<T> for concurrent stacks.
  - Stm<T> for a simple implementation of STM.
Low-level API
- add_garbage() for queuing destruction of garbage.
- Guard<T> for blocking destruction.
Runtime control
- gc() for collecting garbage to reduce memory.

Why?

aturon's blog post explains the issues of concurrent memory handling very well, although it is based on epoch-based reclamation, to which this crate is an alternative.

The gist is that most concurrent data structures need to delete objects (otherwise memory would leak), but cannot safely do so, as there is no way to know whether another thread is still accessing the object in question. This crate (like other reclamation systems) provides a solution to this problem.

Usage

While the low-level API is available, it is generally sufficient to use the conc::Atomic<T> abstraction. This acts much like familiar Rust APIs. It allows the programmer to concurrently access a value through references, as well as update it, and more. Refer to the respective docs for more information.

If you are interested in implementing your own structures with conc, you must learn how to use Guard<T> and add_garbage. In short,

- conc::add_garbage() adds a destructor with a pointer, which will be run eventually, when no one is reading the data anymore. In other words, it acts as a concurrent counterpart to Drop::drop().
- Guard<T> "protects" a pointer from being destroyed. That is, it delays destruction (which is planned by conc::add_garbage()) until the guard is gone.

See their respective API docs for details on usage and behavior.

Debugging

Enable feature debug-tools and set environment variable CONC_DEBUG_MODE. For example, CONC_DEBUG_MODE=1 cargo test --features debug-tools. To get stacktraces after each message, set environment variable CONC_DEBUG_STACKTRACE.

Why not crossbeam/epochs?

Epochs and classical hazard pointers are generally faster than this crate, but no matter how fast a system is, it has to be right.

The issue with most other, faster solutions is that if a non-trivial number of threads (say 16) are constantly reading or storing some pointer, the system never reaches a state where the garbage can be reclaimed.

In other words, given a sufficient number of threads and a sufficient access frequency, the gaps between reclamations might be very long, causing very high memory usage and potentially OOM crashes.

These issues are not hypothetical. It happened to me while testing the caching system of TFS. Essentially, the to-be-destroyed garbage accumulated to several gigabytes without ever becoming eligible for a collection cycle.

It reminds me of the MongoDB debate. It might very well be the fastest solution¹, but if it can't even ensure consistency, what is the point?

That being said, there are cases where this library is faster than the alternatives. Moreover, there are cases where the other libraries are fine (e.g. if you have a bounded number of threads and a medium-long interval between accesses).

¹If you want a super fast memory reclamation system, you should try NOP™, and not calling destructors.

Internals

It is based on hazard pointers, although there are several differences. The idea is essentially that the system keeps track of some number of "hazards". As long as a hazard protects some object, the object cannot be deleted.

Once in a while, a thread performs a garbage collection by scanning the hazards and finding the objects not currently protected by any hazard. These objects are then deleted.
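The scan described above can be modelled in a few lines. The following is only a toy sketch: `Registry`, `ToyGuard`, `protect` and `demo_scan` are hypothetical names invented for illustration, it uses plain mutexes instead of lock-free structures, and it is not how conc itself is implemented.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// A queued destructor (hypothetical alias, not a conc type).
type Dtor = Box<dyn FnOnce() + Send>;

/// Toy model of the hazard list and the garbage list.
struct Registry {
    /// Addresses currently protected by some hazard.
    hazards: Mutex<Vec<usize>>,
    /// (address, destructor) pairs waiting to be destroyed.
    garbage: Mutex<Vec<(usize, Dtor)>>,
}

/// While alive, keeps its address in the hazard list.
struct ToyGuard<'a> {
    registry: &'a Registry,
    addr: usize,
}

impl Drop for ToyGuard<'_> {
    fn drop(&mut self) {
        // Remove one hazard entry for this address.
        let mut hazards = self.registry.hazards.lock().unwrap();
        if let Some(i) = hazards.iter().position(|&a| a == self.addr) {
            hazards.swap_remove(i);
        }
    }
}

impl Registry {
    fn new() -> Self {
        Registry { hazards: Mutex::new(Vec::new()), garbage: Mutex::new(Vec::new()) }
    }

    /// Mark `addr` as in use until the returned guard is dropped.
    fn protect(&self, addr: usize) -> ToyGuard<'_> {
        self.hazards.lock().unwrap().push(addr);
        ToyGuard { registry: self, addr }
    }

    /// Queue a destructor to run once no hazard protects `addr`.
    fn add_garbage(&self, addr: usize, dtor: Dtor) {
        self.garbage.lock().unwrap().push((addr, dtor));
    }

    /// Scan the hazards, destroy every unprotected object, and
    /// return how many objects were destroyed.
    fn gc(&self) -> usize {
        let protected: HashSet<usize> =
            self.hazards.lock().unwrap().iter().copied().collect();
        let mut garbage = self.garbage.lock().unwrap();
        let (keep, free): (Vec<_>, Vec<_>) = garbage
            .drain(..)
            .partition(|(addr, _)| protected.contains(addr));
        *garbage = keep; // protected objects wait for a later cycle
        drop(garbage);
        let destroyed = free.len();
        for (_, dtor) in free {
            dtor();
        }
        destroyed
    }
}

/// Returns the number of objects destroyed by two successive cycles.
fn demo_scan() -> (usize, usize) {
    let reg = Registry::new();
    let addr = Box::into_raw(Box::new(String::from("node"))) as usize;
    let guard = reg.protect(addr);
    reg.add_garbage(
        addr,
        Box::new(move || unsafe { drop(Box::from_raw(addr as *mut String)) }),
    );
    let first = reg.gc(); // the guard is alive, so nothing is destroyed
    drop(guard);
    let second = reg.gc(); // now unprotected, so the object is destroyed
    (first, second)
}
```

Calling `demo_scan()` runs one cycle while the guard is alive and one after it has been dropped; only the second cycle destroys the object.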

To improve performance, we use a layered approach: Both garbage (objects to be deleted eventually) and hazards are cached thread locally. This reduces the number of atomic operations and cache misses.
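A minimal sketch of the garbage half of such a layered design is shown below. The names `LOCAL_CAPACITY`, `LOCAL_GARBAGE`, `GLOBAL_GARBAGE`, `add_garbage_cached` and `demo_cache` are assumptions made for illustration, as is the flush threshold; the text does not say when conc exports its local cache, and the real implementation may differ considerably.

```rust
use std::cell::RefCell;
use std::sync::Mutex;

/// Hypothetical flush threshold; the value conc uses is not stated.
const LOCAL_CAPACITY: usize = 4;

/// Shared garbage list; every access needs synchronization.
static GLOBAL_GARBAGE: Mutex<Vec<usize>> = Mutex::new(Vec::new());

thread_local! {
    /// Per-thread buffer; pushing here needs no synchronization.
    static LOCAL_GARBAGE: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

/// Queue `addr` locally and export the whole batch once the buffer is full.
fn add_garbage_cached(addr: usize) {
    LOCAL_GARBAGE.with(|local| {
        let mut local = local.borrow_mut();
        local.push(addr);
        if local.len() >= LOCAL_CAPACITY {
            // One lock acquisition pays for LOCAL_CAPACITY frees.
            GLOBAL_GARBAGE.lock().unwrap().append(&mut *local);
        }
    });
}

/// Returns the size of the global list after three and after four frees.
fn demo_cache() -> (usize, usize) {
    for addr in 0..3 {
        add_garbage_cached(addr);
    }
    let after_three = GLOBAL_GARBAGE.lock().unwrap().len();
    add_garbage_cached(3);
    let after_four = GLOBAL_GARBAGE.lock().unwrap().len();
    (after_three, after_four)
}
```

The first three frees stay in the thread-local buffer and never touch shared state; only the fourth takes the lock and exports the batch.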

Garbage collection

Garbage collection of the concurrently managed objects is done automatically every n frees, where n is chosen from some probability distribution.
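The text does not say which distribution is used. As one possible reading, the sketch below triggers a collection after each free with a fixed probability, which makes the gap n geometrically distributed. `GcTrigger`, `mean_gap`, `XorShift64` and `count_collections` are hypothetical names, and the small PRNG exists only so the example needs no external crates.

```rust
/// Minimal xorshift64 PRNG, used only to keep the sketch dependency-free.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Decides after each free whether a collection should run.
struct GcTrigger {
    rng: XorShift64,
    /// Hypothetical tuning knob: on average one collection per `mean_gap` frees.
    mean_gap: u64,
}

impl GcTrigger {
    fn new(seed: u64, mean_gap: u64) -> Self {
        // xorshift must not be seeded with zero, and a gap of zero is meaningless.
        GcTrigger { rng: XorShift64(seed.max(1)), mean_gap: mean_gap.max(1) }
    }

    /// Call once per free; true means "run a collection now".
    fn should_collect(&mut self) -> bool {
        self.rng.next() % self.mean_gap == 0
    }
}

/// Simulate `frees` frees and count how many collections were triggered.
fn count_collections(frees: u64, mean_gap: u64) -> u64 {
    let mut trigger = GcTrigger::new(0x9E37_79B9_7F4A_7C15, mean_gap);
    (0..frees).filter(|_| trigger.should_collect()).count() as u64
}
```

Randomizing the gap, rather than collecting at a fixed count, makes it less likely that many threads start a collection at the same moment; whether that is the motivation in conc is an assumption.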

Note that a garbage collection cycle might not clear all objects. For example, some objects could be protected by hazards. Others might not have been exported from the thread-local cache yet.

Performance

It is worth noting that atomic reads through this library usually require three atomic CPU instructions. This means that if you are traversing a list or a similar linked structure, this library might not be for you.
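The text does not list which three instructions are meant. For intuition, the sketch below shows a protected read in the classical hazard-pointer protocol, which also needs three atomic operations in the uncontended case; `protect` and `demo_protected_read` are hypothetical names, and conc's own read path may be structured differently.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Publish the pointer stored in `src` in the `hazard` slot and return it
/// once it is known to have been protected while still current.
fn protect<T>(src: &AtomicPtr<T>, hazard: &AtomicPtr<T>) -> *mut T {
    // Atomic op 1: read the pointer.
    let mut current = src.load(Ordering::Acquire);
    loop {
        // Atomic op 2: announce that `current` is in use.
        hazard.store(current, Ordering::SeqCst);
        // Atomic op 3: re-read to check that the pointer was not swapped
        // out before the announcement became visible.
        let check = src.load(Ordering::SeqCst);
        if check == current {
            return current;
        }
        current = check; // lost the race; retry with the new pointer
    }
}

/// Single-threaded demonstration of one protected read.
fn demo_protected_read() -> i32 {
    let value = Box::into_raw(Box::new(42));
    let shared = AtomicPtr::new(value);
    let hazard = AtomicPtr::new(std::ptr::null_mut());

    let p = protect(&shared, &hazard);
    // Sound here: the hazard is published and nothing has freed `value`.
    let read = unsafe { *p };

    // Release the protection, then free the allocation.
    hazard.store(std::ptr::null_mut(), Ordering::Release);
    unsafe { drop(Box::from_raw(value)) };
    read
}
```

When traversing a linked list, a sequence like this is paid on every hop from one node to the next, which is why long traversals are a poor fit.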
