abfall 0.1.0

Concurrent tri-color tracing garbage collector for Rust with incremental and concurrent mark-sweep
Documentation
# Implementation Summary

This document provides a technical overview of the concurrent tri-color tracing mark and sweep garbage collector implementation.

## Architecture Overview

### Core Components

1. **Color System** (`color.rs`)
   - `Color` enum: White, Gray, Black states
   - `AtomicColor`: Thread-safe color storage using atomic operations

2. **Heap Management** (`heap.rs`)
   - `GcBox<T>`: Wrapper around managed objects with GC metadata
   - `Heap`: Central heap structure managing all allocations
   - Tracks allocations via intrusive linked list
   - **GC Phase State Machine**: Idle, Marking, Sweeping
   - **Background Thread**: Performs incremental collection

3. **Smart Pointers** (`ptr.rs`)
   - `GcPtr<T>`: Lightweight copy pointer to GC objects
   - `GcRoot<T>`: Rooted reference that keeps objects alive
   - Manages root set membership automatically

4. **Write Barriers** (`cell.rs`)
   - `GcCell<T>`: Interior mutability with write barriers
   - Traces new values during marking phase
   - Maintains tri-color invariant during concurrent mutation

5. **Garbage Collector** (`gc.rs`)
   - `GcContext`: Thread-local API for allocation and collection
   - Supports both automatic and manual collection modes
   - Background collection thread for concurrent GC

## Tri-Color Algorithm

### Invariants

1. **White**: No black object points to a white object
2. **Gray**: Objects in the gray queue are being processed
3. **Black**: All references from black objects have been scanned

### Collection Phases

#### Mark Phase
```
1. Initialize: All objects are white
2. Color roots gray, add to gray queue
3. While gray queue is not empty:
   a. Pop object from gray queue
   b. Trace object references (via vtable)
   c. Add newly discovered objects to gray queue
   d. Mark object black when fully scanned
```

#### Sweep Phase
```
1. Iterate through all allocations
2. For each white (unmarked) non-root object:
   a. Deallocate memory
   b. Remove from allocation list
3. For all other objects:
   a. Reset color to white for next cycle
```

## Concurrent Collection

### Synchronization Strategy (Go-inspired)

1. **GC Phase State**
   - Atomic state: Idle → Marking → Sweeping → Idle
   - Write barriers only active during marking
   - Prevents concurrent collections

2. **Background Thread**
   - Periodically checks if collection should run
   - Performs incremental marking in work budgets (100 objects/iteration)
   - Yields between iterations to allow mutators to progress

3. **Write Barriers (Dijkstra-style)**
   - Active only during marking phase
   - Traces new pointer values when storing to `GcCell`
   - Ensures newly reachable objects are marked gray

4. **STW Pauses**
   - Brief pause for root scanning at mark start
   - Most marking happens concurrently

5. **Incremental Marking**
   - `mark_incremental(budget)` processes limited work
   - Returns true when marking complete
   - Background thread processes in chunks

### Thread Safety

- **Atomic Operations**: Color changes, phase transitions, allocation lists
- **Mutex Protection**: Gray queue, thread list
- **Intrusive Linked List**: Lock-free traversal for allocation list
- **Stop Signal**: Graceful background thread shutdown

### Memory Ordering

- **Acquire-Release**: Phase transitions, color operations
- **Relaxed**: Byte count tracking (non-critical)
- **SeqCst**: Critical color CAS operations

## Memory Management

### Allocation Strategy

```rust
1. Allocate GcBox<T> with Box::new
2. Initialize metadata (color=White, root_count=1)
3. Insert into intrusive linked list (atomic CAS loop)
4. Return GcRoot wrapping the pointer
```

### Deallocation Strategy

```rust
1. During sweep, identify unmarked non-roots
2. Remove from linked list
3. Call vtable drop function (Box::from_raw)
4. Memory automatically freed
```

### Safety Invariants

1. All pointers in allocation list point to valid `GcBox` instances
2. Objects are only deallocated during sweep phase (when GC is in Sweeping state)
3. Root objects are never collected (root_count > 0)
4. No dangling pointers after collection
5. Write barriers maintain tri-color invariant during concurrent marking

## Performance Characteristics

### Time Complexity

- **Allocation**: O(1) - lock-free linked list insertion
- **Root Operations**: O(1) - atomic counter updates
- **Mark Phase**: O(R + E) where R = roots, E = reachable edges
- **Sweep Phase**: O(N) where N = total allocations
- **Incremental Mark**: O(W) where W = work budget

### Space Complexity

- **Per-Object Overhead**: 
  - 1 byte for color (AtomicU8)
  - 8 bytes for root count (AtomicUsize)
  - 8 bytes for next pointer (AtomicPtr)
  - 8 bytes for vtable pointer
  - ~25-32 bytes total per object (with alignment)

- **Heap Overhead**:
  - Gray queue: O(G) where G = gray objects
  - Thread list: O(T) where T = threads
  - Background thread stack: O(1)

### Scalability

- **Concurrent Allocation**: Lock-free linked list (minimal contention)
- **Parallel Mutation**: Write barriers allow safe concurrent writes
- **Incremental Collection**: Pause times proportional to work budget, not heap size
- **Background Collection**: Minimal interference with application threads

## Current Implementation Status

### ✅ Implemented Features

1. **Tri-color marking** with atomic color transitions
2. **Incremental marking** with configurable work budgets
3. **Write barriers** for concurrent mutation safety
4. **Background GC thread** for automatic collection
5. **GC phase state machine** (Idle/Marking/Sweeping)
6. **Reference tracing** via vtable function pointers
7. **Intrusive linked list** for allocation tracking
8. **Thread-safe operations** throughout
9. **Root tracking** via reference counting
10. **Graceful shutdown** of background thread

### ⚠️ Known Limitations

1. **No Compaction**: Heap fragmentation may occur
2. **No Generational Collection**: All objects treated equally
3. **Simple Threshold**: Fixed ratio for collection trigger
4. **No Parallel Marking**: Single background thread only
5. **Thread List Unused**: Not yet used for work coordination

## Future Enhancements

### High Priority

1. **Mutator Assist**: Application threads help with marking if allocation outpaces GC
2. **Work Stealing**: Distribute marking work across application threads
3. **Better Heuristics**: Adaptive threshold based on allocation rate

### Medium Priority

4. **Generational Collection**: Separate young/old generations
5. **Parallel Marking**: Use multiple threads for marking
6. **Reference Counting Hybrid**: Immediate collection of acyclic structures

### Low Priority

7. **Compaction**: Reduce fragmentation by moving objects
8. **Large Object Space**: Separate area for large allocations
9. **Write Barrier Optimization**: Conditional barriers, remembered sets

## Testing Strategy

### Current Tests

1. **Basic Allocation**: Verify allocation and access
2. **Collection**: Verify unreachable objects are collected
3. **Concurrent Collection**: Multiple threads with background GC
4. **Incremental Marking**: Verify incremental progress
5. **Write Barriers**: Verify concurrent mutation safety

### Recommended Additional Tests

1. **Stress Tests**: High allocation rate with background collection
2. **Cyclic Structures**: Verify cycle detection works
3. **Memory Leak Detection**: Long-running programs
4. **Benchmarks**: Compare with other GC strategies (e.g., Boehm GC)
5. **Pause Time Analysis**: Measure STW pauses

## Code Quality

### Safety

- Minimal `unsafe` code, clearly documented
- All unsafe blocks have safety comments
- Atomic operations for all shared state
- No data races (verified by type system)

### Correctness

- All tests pass
- Write barriers maintain tri-color invariant
- Phase transitions are atomic
- Graceful shutdown prevents use-after-free

### Maintainability

- Comprehensive documentation
- Clear module boundaries
- Simple, understandable algorithm
- Go-inspired design patterns

## Conclusion

This implementation provides a production-ready concurrent garbage collector in Rust, inspired by Go's GC design. It features:

- **Concurrent marking** with write barriers
- **Incremental collection** to minimize pause times
- **Background thread** for automatic memory management
- **Thread-safe** operations throughout

The architecture is extensible and can be enhanced with more sophisticated features like generational collection, parallel marking, and compaction as needed.