cycle_ptr 0.1.1

Smart pointers, with cycles
Documentation
# Cycle Ptr Design

Concepts:
- *object*: an object that the pointers point at; objects can be destroyed, by means of [Drop]
- *pointer*: a pointer, which shares ownership (or, in the case of *weak pointer*, does not have ownership)
- *reachability*: the ability for a function to reach an *object*
- *acquiring* (an object): the act of a *pointer* to point at an *object*
- *releasing* (an object): the act of a *pointer* no longer pointing at an *object*
- *expired*: state of an object that is about to be destroyed; *member pointers* on an expired object are no longer dereferencable

The cycle pointer library has these user-visible components:
- *strong pointer*: a *pointer* for use in global and function context, and on objects that don't participate in the graph; *objects* pointed at by this *pointer* are always *reachable*
- *weak pointer*: a *pointer* that does not have ownership
- *member pointer*: a *pointer* for use on objects that participate in the graph

The cycle library has these invisible components:
- *generation*: a group of objects which may contain *member pointers* to eachother
- *generation ID*: an identifier for a *generation*, this may be unique, but it also may not be unique; these identifiers can be ordered
- *control-block*: an struct used for bookkeeping information, and associated with exactly one *object*

The cycle library has these invisible utilities:
- *list*: a collection that will contain mutable references to objects

## Multi Threaded

The cycle pointer divides its components into two styles:
- thread-safe: pointers and objects which implement [Send]
- not-thread-safe: pointers and objects which aren't [Send]

Pointers in this library are not [Sync].

## Generation Invariants and Consequences

The library maintains the following invariants:
1. *member pointers* on objects in *generation* A can point at *generation* B, if the *generation ID* of A is less than the *generation ID* of B
   (this allows us to run garbage collections on a generation without needing to consider other generations at all)
2. *thread-safe* *member pointer* can point at [Send] objects, but not at non-[Send] objects
   (this is actually more a Rust thing, but we'll make use of it)
3. a *not-thread-safe* *member pointer* can point at [Send] objects and non-[Send] objects both
4. *generation* is divided in *thread-safe* and *not-thread-safe*, *thread-safe* generations can contain only [Send] objects, whereas *not-thread-safe* generations may contain only non-[Send] objects
5. a *thread-safe* *generation* has a *generation ID* that is always larger than the *generation ID* of all *not-thread-safe* *generation*
6. a *thread-safe* *generation* can have its garbage collector run on any thread
7. a *not-thread-safe* *generation* can have its garbage collector run only on the thread that it was created on
8. a *member pointer* can only be dereferenced, if the origin object is reachable
9. a *member pointer* will increase a reference count if the origin and destination object belong to different *generations*
10. if two *objects* are part of the same *generation*, they will always share a *generation* (*generations* can't be split into separate parts)

## Garbage Collector

The garbage collector will run on a single *generation*.  Multiple *generations* can perform a garbage collector cycle concurrently.  A *generation* will only run one garbage collector at a time.

The garbage collector will invalidate *member pointers* on any objects it deems *unreachable*, prior to running the [Drop] of that object.  This will prevent the [Drop] from turning an *unreachable* *object* back into a *reachable* *object*.  This is also important in order to break cycles: by breaking the link, we can [Drop] the *objects* in arbitrary order.

The underlying algorithm is a tri-colour algorithm:
- white: denotes an *object* that is not _directly_ *reachable* from a black *object*
- grey: denotes an *object* that is _directly_ *reachable*, or _directly_ *reachable* from a grey or black *object*
- black: denotes an *object* that is reachable, and does not point at any white *objects*

Changes to *strong pointers* and *member pointers* will update the colours, any may cause a garbage collection to start.

When the colour invariant is broken, a mark-sweep operation will be run to restore the invariant.

### Strong Pointer

#### Assignment

The *strong pointer*, when acquiring an *object*, will increment the reference count on that *object*.  If the *object* colour was white, it is changed to grey.

#### Release

The *strong pointer*, when releasing an *object*, will decrement the reference count on that *object*.

If the reference count reaches 0, the *object* may or may not be *reachable*.  So we can't simply colour the *object* white (that would break invariant on any black *objects*), nor can we colour it grey (the object may be *unreachable*).  Furthermore if the object is *unreachable*, the *object* may point at other *objects*, that are now no longer *reachable*, but are still coloured grey or black.  Since this means the colour invariant is broken, we start a mark-sweep to restore it.

If the reference count did not reach 0, there must be some other *pointer* ensuring the *object* is reachable, and thus no action is required.

### Weak Pointer

#### Assignment

The *weak pointer*, when acuiring an *object*, will only take ownership of the *control-block*.  It will not affect *object* *reachability*.

#### Release

The *weak pointer*, when releasing an *object*, will only release ownership of the *control-block*.  It will not affect *object* *reachability*.

### Member Pointer

A *member pointer* can either be cross-generational (the origin and destination *object* are of different *generations*) or intra-generational (the origin and destination *object* are part of the same generation).

#### Assignment

If the *member pointer* is cross-generational, it will increment the reference count on the *object*.

If the *member pointer* is intra-generational, reference count is not altered.

In both cases, if the colour of the *object* was white, it is changed to grey.

#### Release

If the *member pointer* is cross-generational, it functions exactly the same as a *strong pointer*.

If the *member pointer* is intra-generational, and the (pointed-to) *object* has a reference count of 0, we can no longer maintain the colour invariant.  And thus, if the *object* isn't white, we need a mark-sweep to restore the invariant.  (If the *object* is white, we already know it's unreachable (or that a mark-sweep is running to update its colour), and there's no need to confirm it a second time.)

### Mark-Sweep

The mark-sweep operation is used to restore the colour invariant, once it has been broken.  (It's safe to run when it hasn't been broken, but we want to avoid it, as it costs CPU time that could have gone to something we care about instead.)

The mark-sweep operation itself is an algorithm, but before we start it, we permit user-space some control over its execution.

#### Prelude

When we know a mark-sweep has to be run, we set a flag on the generation.  If the flag was not already set, we then submit a task object to a dispatcher.  The task will run the actual algorithm.

The flag indicates that a task has been scheduled, but not started, and thus allows us to reduce the number of algorithm invocations.

#### Task

The task will start by preventing other tasks from running (using the `task_exclusion`).  This is because the first action of the task, will be to clear the flag set in the prelude (and thus may cause other threads to enqueue another task on the same *generation*).  Tasks have the requirement to not be run concurrently for the same *generation*.

Next, the task will clear the flag that was set during the prelude.  This is to indicate that any objects becoming potentially-*unreachable* from this point on, may require their own task to be run.

Next, the task will change the colour of all *objects* on the *generation*:
- any object with a reference count of 0, will be coloured white
- any object with a reference count of non-zero, will be coloured grey
- there will be no black objects

Next, all grey objects are evaluated one-at-a-time:
- all their intra-generational *member pointers* have their *object* colour updated: white turns grey
- the *object* is coloured black

Note that, during this time, the pointers may also update colours of objects.

Once that is complete, the `weak_lockout` is activated (exclusive), which will temporarily prevent *weak pointers* from having their *object* be *acquired*.  After the lockout is activated, another pass is done over the objects, to find any grey objects.

By now, there should be only white and black objects.  Any *objects* that are still white, are marked as *expired*, after which the lockout is deactivated.

All *expired* objects are collected in a list, after which the task will no longer prevent other tasks from running (for example by releasing the mutex that was acquired at the start).

Only once other tasks are allowed to run, will the *expired* objects be destroyed.

Note: it's super important that *expired* objects are only destroyed after all objects have been marked *expired*.  In part because we don't want to block other tasks from running while waiting on (potentially-slow) destructors.

## Deref

*Strong pointers* and *member pointers* are both dereferenceable.  (*Weak pointers* are not dereferenceable.)

When a *strong pointer* is dereferenced, it always returns the *object* reference.

When a *member pointer* is dereferenced:
- if the origin is *expired*, it panics (in `try_deref`, we return an error instead)
- otherwise, it returns the *object* reference

## Member Pointer Assignment

When a *member pointer* is assigned to, the invariant (origin-*generation* before-or-same-identity-as destination-*generation*) must hold.

In order to maintain the invariant, we'll need to:
- confirm if the *generations* of origin and destination meet this constraint
- and if they do, ensure the *generations* don't change this constraint
- but if they don't, change them into meeting this constraint

1. Since *generations* are shared pointers, and can be altered, we need to grab a copy of the generation pointers (so we ensure the *generation* remains live.
2. Compare the addresses of both the origin and destination *generation*.  If both are the same: use the *member pointer* assignment, for the non-reference-counted case.  (Since we know the two used the same *generation* at some point, and *generations* can't be split, we know they'll forever share a generation.)  Then return.
3. Lock-shared the origin *generation*.  (This will prevent the origin *object* from being moved to a different generation.)  Once the *generation* is locked, confirm it is still the *generation* of the origin *object*.  If it is different, re-read the origin generation and re-try until success.
4. Re-check if the generations are the same.  (We have to re-check, because a thread could have combined the generations before we acquired the lock.)  If they are the same: unlock, use the *member pointer* assignment for the non-reference-counted case, then return.
5. Read the *generation ID* on the origin and destination *generations*.  If the `origin_id < destination_id`, run the acquire logic, using reference counters.  Then return.
6. Otherwise, run the generation-merge algorithm to combine the generations together.

Note that, because we need to coordinate between two generations, we'll require a read-lock on the generations.
That way, another thread can't merge them from under us.

## Struct Layouts

`RWLockAlike` could be [RWLock] for *thread-safe* case, or [RefCell] for *non-thread-safe* case.
`MutexAlike` could be [Mutex] for *thread-safe* case, or [RefCell] for *non-thread-safe* case.

### Generation Pointer

    struct GenerationPtr = Rc/Arc

    impl Deref<Generation> for GenerationPtr;
    impl Clone for GenerationPtr;
    impl Send+Sync for GenerationPtr; // (multi-threaded case only)

Possible implementations:

    Rc<Generation> // non-threaded case
    Arc<Generation> // threaded case

### Generation

    struct Generation {
        id               GenerationID // (never changes)
        task_exclusion   MutexAlike
        weak_lockout     RWLockAlike
        objects          RWLockAlike<List<ControlBlock, ControlBlockTag>>
    }

### GenerationId

    struct GenerationId {
        sequence  usize
    }

### ControlBlock

    struct ControlBlock {
        destructor  Box<dyn FnOnce() -> ()> // destruction function (do we need this?)
        generation  RWLockAlike<GenerationPtr> // can change when current and new generation objects are exclusively locked
        refcount    Atomic<usize, Colour, expiredBool>
    }