Expand description

Reference cycle garbage collection inspired by cpython.

The type Cc<T> provides shared ownership of a value of type T, similar to std::rc::Rc<T>. Unlike Rc<T>, collect_thread_cycles can be used to drop unreachable values that form circular references.

Examples

Cloning references

Similar to Rc<T>, use clone() to get cloned references.

use jrsonnet_gcmodule::Cc;
let foo = Cc::new(vec![1, 2, 3]);
let foo_cloned = foo.clone();

// foo and foo_cloned both point to the same `vec![1, 2, 3]`.
assert!(std::ptr::eq(&foo[0], &foo_cloned[0]));

Collecting cycles

Use collect_thread_cycles() to collect thread-local garbage.

Use count_thread_tracked() to count how many objects are tracked by the collector.

use jrsonnet_gcmodule::{Cc, Trace};
use std::cell::RefCell;
{
    type List = Cc<RefCell<Vec<Box<dyn Trace>>>>;
    let a: List = Default::default();
    let b: List = Default::default();
    a.borrow_mut().push(Box::new(b.clone()));
    b.borrow_mut().push(Box::new(a.clone()));
}

// a and b form circular references. The objects they point to are not
// dropped automatically, despite both variables run out of scope.

assert_eq!(jrsonnet_gcmodule::count_thread_tracked(), 2);   // 2 values are tracked.
assert_eq!(jrsonnet_gcmodule::collect_thread_cycles(), 2);  // This will drop a and b.
assert_eq!(jrsonnet_gcmodule::count_thread_tracked(), 0);   // no values are tracked.

Multi-thread support

The main type Cc works fine in a single-thread environment.

There are also ThreadedObjectSpace and ThreadedCc for multi-thread usecases. Beware they take more memory, are slower, and a bit harder to use.

use jrsonnet_gcmodule::{ThreadedObjectSpace, ThreadedCc, Trace};
use std::sync::Mutex;

type List = ThreadedCc<Mutex<Vec<Box<dyn Trace + Send + Sync>>>>;
let space = ThreadedObjectSpace::default();
{
    let list1: List = space.create(Mutex::new(Default::default()));
    let list2: List = space.create(Mutex::new(Default::default()));
    let thread = std::thread::spawn(move || {
        list1.borrow().lock().unwrap().push(Box::new(list2.clone()));
        list2.borrow().lock().unwrap().push(Box::new(list1.clone()));
    });
    thread.join().unwrap();
}
assert_eq!(space.count_tracked(), 2);
assert_eq!(space.collect_cycles(), 2);
assert_eq!(space.count_tracked(), 0);

Defining new types

Cc<T> requires Trace implemented for T so the collector knows how values are referred. That can usually be done by #[derive(Trace)].

Acyclic types

If a type is acyclic (cannot form reference circles about Cc), Trace::is_type_tracked() will return false.

use jrsonnet_gcmodule::{Cc, Trace};

#[derive(Trace)]
struct Foo(String);

#[derive(Trace)]
struct Bar;

assert!(!Foo::is_type_tracked()); // Acyclic
assert!(!Bar::is_type_tracked()); // Acyclic

let foo = Cc::new(Foo("abc".to_string()));
let bar = Cc::new(Bar);
let foo_cloned = foo.clone(); // Share the same `"abc"` with `foo`.
assert_eq!(jrsonnet_gcmodule::count_thread_tracked(), 0); // The collector tracks nothing.
drop(foo); // The ref count of `"abc"` drops from 2 to 1.
drop(foo_cloned); // `"abc"` will be dropped here..

Container types

Whether a container type is acyclic or not depends on its fields. Usually, types without referring to trait objects or itself are considered acyclic.

use jrsonnet_gcmodule::{Cc, Trace};

#[derive(Trace)]
struct Foo<T1: Trace, T2: Trace>(T1, T2, u8);

// `a` is not tracked - types are acyclic.
let a = Cc::new(Foo(Foo(Cc::new(1), 2, 3), Cc::new("abc"), 10));
assert_eq!(jrsonnet_gcmodule::count_thread_tracked(), 0);

// `b` is tracked because it contains a trait object.
let b = Cc::new(Foo(Box::new(1) as Box<dyn Trace>, 2, 3));
assert_eq!(jrsonnet_gcmodule::count_thread_tracked(), 1);

The #[skip_trace] attribute can be used to skip tracking specified fields in a structure.

use jrsonnet_gcmodule::{Cc, Trace};

struct AlienStruct; // Does not implement Trace

#[derive(Trace)]
struct Foo {
    field: String,

    #[trace(skip)]
    alien: AlienStruct, // Field skipped in Trace implementation.
}

Weak references

Similar to std::rc::Rc, use Cc::downgrade to create weak references. Use Weak::upgrade to test if the value is still alive and to access the value. For example:

use jrsonnet_gcmodule::{Cc, Weak};

let value = Cc::new("foo");
let weak: Weak<_> = value.downgrade();
assert_eq!(*weak.upgrade().unwrap(), "foo");
drop(value);
assert!(weak.upgrade().is_none());  // Cannot upgrade after dropping value

Technical Details

Memory Layouts

Cc<T> uses different memory layouts depending on T.

Untracked types

If <T as Trace>::is_type_tracked() returns false, the layout is similar to Rc<T>:

Shared T                    Pointer
+-------------------+     .-- Cc<T>
| ref_count: usize  | <--<
| weak_count: usize |     '-- Cc<T>::clone()
|-------------------|
| T (shared data)   | <--- Cc<T>::deref()
+-------------------+

Tracked types

If <T as Trace>::is_type_tracked() returns true, the layout has an extra GcHeader that makes the value visible in a thread-local linked list:

Shared T with GcHeader
+-------------------+
| gc_prev: pointer  | ---> GcHeader in a linked list.
| gc_next: pointer  |
| vptr<T>: pointer  | ---> Pointer to the `&T as &dyn Trace` virtual table.
|-------------------|
| ref_count: usize  | <--- Cc<T>
| weak_count: usize |
| ----------------- |
| T (shared data)   | <--- Cc<T>::deref()
+-------------------+

Incorrect Trace implementation

While most public APIs provided by this library looks safe, incorrectly implementing the Trace trait has consequences.

This library should cause no undefined behaviors (UB) even with incorrect Trace implementation on debug build.

Below are some consequences of a wrong Trace implementation.

Memory leak

If Trace::trace does not visit all referred values, the collector might fail to detect cycles, and take no actions on cycles. That causes memory leak.

Note: there are other ways to cause memory leak unrelated to an incorrect Trace implementation. For example, forgetting to call collect functions can cause memory leak. When using the advanced ObjectSpace APIs, objects in one space referring to objects in a different space can cause memory leak.

Panic

If Trace::trace visits more values than it should (for example, visit indirectly referred values, or visit a directly referred value multiple times), the collector can detect such issues and panic the thread with the message:

bug: unexpected ref-count after dropping cycles
This usually indicates a buggy Trace or Drop implementation.

Undefined behavior (UB)

After the above panic (bug: unexpected ref-count after dropping cycles), dereferencing a garbage-collected Cc<T> will trigger panic! or UB depending on whether it’s a debug build or not.

On debug build, sanity checks are added at Cc::<T>::deref(). It will panic if T was garbage-collected:

bug: accessing a dropped CcBox detected

In other words, no UB on debug build.

On release build the dereference would access dropped values, which is an undefined behavior. Again, the UB can only happen if the Trace::trace is implemented wrong, and panic will happen before the UB.

Structs

Provides advanced explicit control about where to store Cc objects.

Low-level type for Cc<T>.

Low-level type for Weak<T>.

Wraps a borrowed reference to ThreadedCc.

A collection of tracked ThreadedCc objects that can be garbage collected.

Constants

Whether the debug feature is enabled.

Traits

Defines how the cycle collector should collect a type.

Functions

Collect cyclic garbage in the current thread created by Cc::new. Return the number of objects collected.

Count number of objects tracked by the collector in the current thread created by Cc::new. Return the number of objects tracked.

Acquire reference to thread-local global object space

Type Definitions

A single-threaded reference-counting pointer that integrates with cyclic garbage collection.

A multi-thread reference-counting pointer that integrates with cyclic garbage collection.

Callback function that serves as the parameter of Trace::trace.

Weak reference of Cc.

Derive Macros

Derive Trace implementation for a structure.