procref 0.1.0

# procref

Cross-platform process reference counting for shared service lifecycle management.

## Overview

`procref` provides kernel-level reference counting to manage shared services across multiple processes. When multiple processes need to share a single service (like a database server), `procref` handles:

- **Automatic startup**: The first client starts the service
- **Reference tracking**: The kernel tracks how many processes are using the service
- **Crash safety**: If a process crashes, the kernel automatically decrements the count
- **Automatic shutdown**: When the last client exits, the service is stopped

## Platform Support

| Platform | Mechanism | Auto-cleanup on crash |
|----------|-----------|----------------------|
| Linux | System V Semaphore + `SEM_UNDO` | ✅ Kernel auto-undo |
| macOS | Mach Port send rights | ✅ Kernel auto-release |
| Windows | Named Semaphore | ✅ Handle auto-close |

## Key Design Principle

**Trust the kernel, not files.**

Unlike file-based approaches that can leave stale state after crashes, `procref` uses kernel-managed primitives that are automatically cleaned up:

- Linux: The `SEM_UNDO` flag ensures the kernel reverses a process's semaphore operations when that process exits
- macOS: Mach port rights are reference-counted by the kernel
- Windows: Named semaphore handles are closed when the process terminates

## Usage

```rust
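use procref::{SharedService, ServiceInfo, Result};

// `start_database` and `check_database_connection` are application-defined
// helpers; they are not part of procref.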

#[tokio::main]
async fn main() -> Result<()> {
    // Create a shared service manager
    let service = SharedService::builder("my-database")
        // Called when first client connects (start the service)
        .on_first_acquire(|| async {
            let port = 5432;
            let pid = start_database(port).await?;
            Ok(ServiceInfo::new(pid, port))
        })
        // Called when last client disconnects (stop the service)
        .on_last_release(|info| async move {
            procref::process::stop(info.pid(), 5000);
            Ok(())
        })
        // Optional: Custom health check
        .on_health_check(|info| async move {
            check_database_connection(info.port()).await
        })
        // Optional: Recovery when health check fails
        .on_recover(|old_info| async move {
            procref::process::stop(old_info.pid(), 1000);
            let port = 5432;
            let pid = start_database(port).await?;
            Ok(ServiceInfo::new(pid, port))
        })
        .build()?;

    // Acquire a reference (starts service if first client)
    let handle = service.acquire().await?;

    println!("Service running on port {}", handle.port());

    // Use the service...

    // When handle is dropped, reference is released
    // If this is the last client, on_last_release is called
    Ok(())
}
```
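
The release-on-drop behaviour described in the comments above can be modelled in a few lines. This is a minimal sketch, not procref's implementation: `DemoHandle` and `demo_counts` are hypothetical names, and an in-process `AtomicUsize` stands in for the kernel-managed count.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the handle returned by `acquire()`.
/// An in-process atomic counter replaces the kernel-managed count.
pub struct DemoHandle {
    count: Arc<AtomicUsize>,
}

impl DemoHandle {
    /// Take a reference: increment the shared count.
    pub fn acquire(count: &Arc<AtomicUsize>) -> Self {
        count.fetch_add(1, Ordering::SeqCst);
        DemoHandle {
            count: Arc::clone(count),
        }
    }
}

impl Drop for DemoHandle {
    /// Release the reference when the handle goes out of scope.
    fn drop(&mut self) {
        self.count.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Returns the count observed with two handles alive,
/// after one is dropped, and after both are dropped.
pub fn demo_counts() -> (usize, usize, usize) {
    let count = Arc::new(AtomicUsize::new(0));
    let a = DemoHandle::acquire(&count);
    let both = {
        let _b = DemoHandle::acquire(&count);
        count.load(Ordering::SeqCst)
    }; // `_b` is dropped here, releasing its reference
    let one = count.load(Ordering::SeqCst);
    drop(a);
    let none = count.load(Ordering::SeqCst);
    (both, one, none)
}
```

Tying the release to `Drop` means a client cannot forget to release on a normal exit path; the kernel mechanisms listed under Platform Support cover the case where `Drop` never runs.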

## Lifecycle Callbacks

| Callback | When Called | Purpose |
|----------|-------------|---------|
| `on_first_acquire` | Count goes 0→1 | Start the service |
| `on_last_release` | Count goes 1→0 | Stop the service |
| `on_health_check` | Every acquire (except first) | Verify service is healthy |
| `on_recover` | Health check fails | Restart/recover the service |
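
To make the table concrete, here is a small synchronous model of when each callback would fire. It is a sketch under assumptions: `Lifecycle` is a hypothetical type, the count lives in a plain field rather than in the kernel, and callbacks are recorded as event names instead of being invoked as async closures.

```rust
/// Hypothetical, synchronous model of the callback table.
/// Each "callback" is recorded by name in `events`.
pub struct Lifecycle {
    count: usize,
    healthy: bool, // what the simulated health check reports
    pub events: Vec<&'static str>,
}

impl Lifecycle {
    pub fn new() -> Self {
        Lifecycle {
            count: 0,
            healthy: true,
            events: Vec::new(),
        }
    }

    /// Simulate the service becoming healthy or unhealthy.
    pub fn set_healthy(&mut self, healthy: bool) {
        self.healthy = healthy;
    }

    pub fn acquire(&mut self) {
        self.count += 1;
        if self.count == 1 {
            // Count went 0 -> 1: start the service.
            self.events.push("on_first_acquire");
        } else {
            // Every later acquire verifies the running service.
            self.events.push("on_health_check");
            if !self.healthy {
                self.events.push("on_recover");
                self.healthy = true; // assume recovery succeeds
            }
        }
    }

    pub fn release(&mut self) {
        if self.count == 0 {
            return; // nothing to release
        }
        self.count -= 1;
        if self.count == 0 {
            // Count went 1 -> 0: stop the service.
            self.events.push("on_last_release");
        }
    }
}
```

With three acquires (the third after the service turns unhealthy) followed by three releases, the recorded order is `on_first_acquire`, `on_health_check`, `on_health_check`, `on_recover`, `on_last_release`.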

## Process Utilities

```rust
use procref::process;

// `pid` is the ID of the target process, obtained elsewhere.
// Check if a process is alive
if process::is_alive(pid) {
    // Process exists
}

// Graceful shutdown (SIGTERM)
process::terminate(pid);

// Forceful shutdown (SIGKILL)
process::kill(pid);

// Graceful with timeout, then forceful
process::stop(pid, 5000); // timeout in milliseconds (5 seconds)
```
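
The "graceful with timeout, then forceful" pattern can be sketched independently of any platform API. The function below is illustrative only and is not how `process::stop` is implemented: `stop_with_timeout`, `StopOutcome`, and the three closures are hypothetical stand-ins for the terminate, liveness, and kill calls.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// How the target process ended up being stopped.
#[derive(Debug, PartialEq, Eq)]
pub enum StopOutcome {
    Graceful,
    Forced,
}

/// Hypothetical sketch: request a graceful shutdown, poll until the
/// deadline, and fall back to a forceful kill if the target is still alive.
pub fn stop_with_timeout(
    mut terminate: impl FnMut(),
    mut is_alive: impl FnMut() -> bool,
    mut kill: impl FnMut(),
    timeout: Duration,
    poll_interval: Duration,
) -> StopOutcome {
    // Step 1: ask politely (e.g. SIGTERM on Unix).
    terminate();

    // Step 2: poll for exit until the deadline passes.
    let deadline = Instant::now() + timeout;
    while Instant::now() < deadline {
        if !is_alive() {
            return StopOutcome::Graceful;
        }
        sleep(poll_interval);
    }

    // Step 3: one last check, then force the issue (e.g. SIGKILL).
    if !is_alive() {
        return StopOutcome::Graceful;
    }
    kill();
    StopOutcome::Forced
}
```

Passing the operations as closures keeps the sketch testable without spawning real processes; a real caller would plug in platform-specific signal or handle calls.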

## How It Works

### Reference Counting Flow

```
Process A starts:
  1. acquire() → count becomes 1 (was 0)
  2. on_first_acquire() called → service starts
  3. ServiceHandle returned

Process B starts:
  1. acquire() → count becomes 2
  2. Health check runs
  3. ServiceHandle returned

Process A exits (or crashes):
  1. Normal exit: handle is dropped, release() → count becomes 1
  2. Crash: kernel undoes the reference automatically → count becomes 1

Process B exits:
  1. release() → count becomes 0
  2. on_last_release() called → service stops
```

### Crash Safety

When a process crashes:

1. **Linux**: Kernel applies `SEM_UNDO` - reverses the `semop()` that incremented the count
2. **macOS**: Kernel releases all Mach port rights held by the process
3. **Windows**: Kernel closes all handles held by the process, releasing its reference to the named semaphore

No stale state. No cleanup needed. No race conditions.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                     SharedService                           │
│  - Lifecycle callbacks (on_first_acquire, on_last_release)  │
│  - Health check and recovery                                │
│  - Service info persistence                                 │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                     RefCounter trait                        │
│  - acquire() / release()                                    │
│  - count()                                                  │
│  - try_lock() / unlock() (for startup coordination)         │
└─────────────────────────────────────────────────────────────┘
                              │
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ LinuxRefCounter │ │ MacOSRefCounter │ │WindowsRefCounter│
│ (System V sem)  │ │ (Mach ports)    │ │ (Named sem)     │
└─────────────────┘ └─────────────────┘ └─────────────────┘
```
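
As a rough illustration of the middle layer, the trait below mirrors the method names in the diagram. The signatures are guesses, not procref's actual API, and `InMemoryRefCounter` and `acquire_and_maybe_start` are hypothetical: the former is an in-process backend with no crash safety, included only to show how code written against the trait stays backend-agnostic.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical shape of the `RefCounter` trait; signatures are assumed.
pub trait RefCounter {
    /// Increment the count and return the new value.
    fn acquire(&self) -> usize;
    /// Decrement the count (never below zero) and return the new value.
    fn release(&self) -> usize;
    /// Current number of references.
    fn count(&self) -> usize;
    /// Try to take the startup lock; returns `true` on success.
    fn try_lock(&self) -> bool;
    /// Release the startup lock.
    fn unlock(&self);
}

/// In-memory backend for illustration; a real backend would wrap a
/// kernel primitive so that the count survives client crashes.
#[derive(Default)]
pub struct InMemoryRefCounter {
    count: AtomicUsize,
    locked: AtomicBool,
}

impl RefCounter for InMemoryRefCounter {
    fn acquire(&self) -> usize {
        self.count.fetch_add(1, Ordering::SeqCst) + 1
    }

    fn release(&self) -> usize {
        // Saturating decrement: the closure always returns `Some`,
        // so `fetch_update` yields the previous value.
        let prev = self
            .count
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |c| {
                Some(c.saturating_sub(1))
            })
            .unwrap_or(0);
        prev.saturating_sub(1)
    }

    fn count(&self) -> usize {
        self.count.load(Ordering::SeqCst)
    }

    fn try_lock(&self) -> bool {
        self.locked
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::SeqCst);
    }
}

/// Backend-agnostic helper: the first client takes the startup lock
/// and runs `start`; later clients only bump the count.
pub fn acquire_and_maybe_start(counter: &dyn RefCounter, start: impl FnOnce()) -> usize {
    let n = counter.acquire();
    if n == 1 && counter.try_lock() {
        start();
        counter.unlock();
    }
    n
}
```

Keeping the platform differences behind one trait is what lets the `SharedService` layer stay identical across Linux, macOS, and Windows.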

## License

MIT