# ADR-005: Integrate sysvol as Built-in Service

## Status

Draft

## Context

### What is sysvol?

`sysvol` (system volume) provides persistent local storage for:
- **System state** - service state, configuration persistence
- **Cache** - kernel modules, firmware, software upgrades
- **Host-local data** - anything that needs to survive reboots

### Boot Sequence

```
udev → network → fuse/myfs mounts → udev trigger → wait → sysvol init
```

sysvol runs late in boot, after:
1. udev has detected hardware
2. Network is up
3. Remote mounts (fuse/myfs) provide modules/firmware
4. udev trigger completes with full device population

### Current State

Storage initialization is currently handled by an external binary spawned as a oneshot service.
The library (currently `mosstorage/`, to be renamed `sysvol/`) exists in the zinit workspace.

### Problem

1. External binary adds deployment complexity
2. Extra process spawn overhead during boot
3. Error handling across process boundary is limited
4. The library already exists in the workspace, so the binary is only a thin wrapper around it

## Decision

### Integrate sysvol as a "builtin" service

Replace the external binary with a direct library call from zinit-server, selected by a special `@builtin:` prefix in the service's `exec` field.

### **CRITICAL SAFETY GATE: PID1 Mode Only**

Builtin services that touch hardware **MUST only run in PID1 mode**. In standalone mode, zinit is just a process supervisor - we cannot risk formatting disks.

**Implementation**: When `exec = "@builtin:sysvol"` is encountered AND `pid1_mode == false`:
- Log warning: "builtin sysvol skipped (not in PID1 mode)"
- Transition service directly to `Running { pid: 0 }` (no-op success)
- Dependents proceed normally
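
The gate reduces to a single boolean decision. The sketch below shows that decision in isolation; `detect_pid1_mode`, `BuiltinAction`, and `gate_builtin` are hypothetical names introduced here for illustration, and deriving `pid1_mode` from the process ID is an assumption, since this ADR does not say how zinit sets it.

```rust
/// Hypothetical helper: one way `pid1_mode` could be derived at startup.
/// On Unix-like systems the init process has PID 1.
fn detect_pid1_mode() -> bool {
    std::process::id() == 1
}

/// What the supervisor does with a builtin service (illustrative names).
#[derive(Debug, PartialEq)]
enum BuiltinAction {
    /// PID1 mode: actually run the builtin.
    Run,
    /// Standalone mode: log a warning and report success without touching hardware.
    SkipAsSuccess,
}

/// The safety gate as a pure function of the mode flag.
fn gate_builtin(pid1_mode: bool) -> BuiltinAction {
    if pid1_mode {
        BuiltinAction::Run
    } else {
        BuiltinAction::SkipAsSuccess
    }
}

fn main() {
    let pid1_mode = detect_pid1_mode();
    println!("pid1_mode = {pid1_mode}, action = {:?}", gate_builtin(pid1_mode));
}
```

Keeping the decision in a pure function like this would make the gate easy to unit-test without booting a machine as PID 1.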

### Async with Event

1. After dependencies complete (udev-trigger), sysvol service becomes eligible
2. Run `sysvol::init()` in a tokio blocking task (non-blocking to supervisor)
3. Emit `BuiltinCompleted` event when done
4. Handle outcomes: Mounted, Initialized, NoDisk (all success), or Error (failure)
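
The steps above follow an offload-then-notify shape. The sketch below shows that shape using only the standard library: `std::thread` and `std::sync::mpsc` stand in for tokio's blocking task and the supervisor's event channel. The simplified `StorageState`, `Event`, `classify`, and `run_in_background` are assumptions made for illustration, not the real sysvol or zinit types.

```rust
use std::sync::mpsc;
use std::thread;

/// Simplified stand-in for the storage outcomes named above.
#[derive(Debug)]
enum StorageState {
    Mounted,
    Initialized,
    NoDisk,
}

/// Simplified stand-in for the supervisor event.
#[derive(Debug, PartialEq)]
enum Event {
    BuiltinCompleted { success: bool, error: Option<String> },
}

/// Every storage outcome counts as success; only an error is a failure.
fn classify(result: Result<StorageState, String>) -> Event {
    match result {
        Ok(StorageState::Mounted)
        | Ok(StorageState::Initialized)
        | Ok(StorageState::NoDisk) => Event::BuiltinCompleted { success: true, error: None },
        Err(e) => Event::BuiltinCompleted { success: false, error: Some(e) },
    }
}

/// Runs `init` off the caller's thread and reports completion over a channel.
fn run_in_background<F>(init: F) -> mpsc::Receiver<Event>
where
    F: FnOnce() -> Result<StorageState, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the send error: it only means the receiver is already gone.
        let _ = tx.send(classify(init()));
    });
    rx
}

fn main() {
    let ok = run_in_background(|| Ok(StorageState::NoDisk));
    let failed = run_in_background(|| Err("mkfs failed".to_string()));
    // The caller stays free to do other work until it chooses to receive.
    println!("{:?}", ok.recv().expect("worker dropped the sender"));
    println!("{:?}", failed.recv().expect("worker dropped the sender"));
    // Exercise the remaining variants directly.
    println!("{:?}", classify(Ok(StorageState::Mounted)));
    println!("{:?}", classify(Ok(StorageState::Initialized)));
}
```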

## Implementation

### 1. Rename crate

```
mosstorage/ → sysvol/
```

Update `Cargo.toml` workspace members.

### 2. Add dependency

**File**: `zinit-server/Cargo.toml`
```toml
sysvol = { path = "../sysvol" }
```

### 3. Add builtin detection and PID1 gate

**File**: `zinit-server/src/supervisor.rs`

```rust
const BUILTIN_PREFIX: &str = "@builtin:";

impl Supervisor {
    fn is_builtin_service(config: &ServiceConfig) -> Option<&str> {
        config.service.exec.strip_prefix(BUILTIN_PREFIX)
    }

    async fn run_builtin_service(&mut self, id: ServiceId, builtin: &str, config: &ServiceConfig) {
        let name = config.service.name.clone();

        // SAFETY GATE: Only run hardware-touching builtins in PID1 mode
        if !self.pid1_mode {
            tracing::warn!(
                service = %name,
                builtin = builtin,
                "builtin service skipped (not in PID1 mode)"
            );
            // No-op success - let dependents proceed
            let mut graph = self.graph.write().await;
            if let Some(service) = graph.get_mut(id) {
                service.state = ServiceState::Running { pid: 0 };
            }
            self.queue_reevaluate(graph.dependents(id)).await;
            return;
        }

        match builtin {
            "sysvol" => self.run_sysvol(id, &name).await,
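            _ => {
                tracing::error!(builtin = builtin, "unknown builtin service");
                // Report failure through the same event path as a real builtin,
                // so the BuiltinCompleted handler marks the service failed.
                let _ = self.event_tx.send(SupervisorEvent::BuiltinCompleted {
                    service_id: id,
                    success: false,
                    error: Some(format!("unknown builtin: {builtin}")),
                }).await;
            }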
        }
    }

    async fn run_sysvol(&mut self, id: ServiceId, name: &str) {
        // Set Starting state
        {
            let mut graph = self.graph.write().await;
            if let Some(service) = graph.get_mut(id) {
                service.record_started();
                service.state = ServiceState::Starting { pid: 0 };
            }
        }

        let event_tx = self.event_tx.clone();
        let service_name = name.to_string();

        tokio::spawn(async move {
            let result = tokio::task::spawn_blocking(sysvol::init).await;

            let (success, error_msg) = match result {
                Ok(Ok(state)) => {
                    match state {
                        sysvol::StorageState::Mounted { device, .. } => {
                            tracing::info!(device = %device.display(), "sysvol mounted");
                            (true, None)
                        }
                        sysvol::StorageState::Initialized { device, .. } => {
                            tracing::info!(device = %device.display(), "sysvol initialized");
                            (true, None)
                        }
                        sysvol::StorageState::NoDisk => {
                            tracing::info!("no disk found, running diskless");
                            (true, None)
                        }
                    }
                }
                Ok(Err(e)) => {
                    tracing::error!(service = %service_name, error = %e, "sysvol failed");
                    (false, Some(e.to_string()))
                }
                Err(e) => {
                    tracing::error!(service = %service_name, error = %e, "sysvol panicked");
                    (false, Some(e.to_string()))
                }
            };

            let _ = event_tx.send(SupervisorEvent::BuiltinCompleted {
                service_id: id,
                success,
                error: error_msg,
            }).await;
        });
    }
}
```
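
The prefix detection relies on `str::strip_prefix`, which returns the remainder of the string when the prefix matches and `None` otherwise. The standalone sketch below shows how a few `exec` values would be classified. `builtin_name` is a free-function stand-in for `is_builtin_service`, and the example `exec` strings other than `@builtin:sysvol` are made up for illustration.

```rust
const BUILTIN_PREFIX: &str = "@builtin:";

/// Returns the builtin name when `exec` carries the builtin prefix.
fn builtin_name(exec: &str) -> Option<&str> {
    exec.strip_prefix(BUILTIN_PREFIX)
}

fn main() {
    for exec in ["@builtin:sysvol", "/sbin/udevd --daemon", "@builtin:"] {
        match builtin_name(exec) {
            // The prefix alone yields an empty name rather than `None`.
            Some("") => println!("{exec:?}: builtin prefix with an empty name"),
            Some(name) => println!("{exec:?}: builtin {name:?}"),
            None => println!("{exec:?}: external command"),
        }
    }
}
```

Note the edge case: a bare `@builtin:` matches the prefix with an empty name, so in the dispatcher above it would fall through to the "unknown builtin service" arm and would not be spawned as an external command.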

### 4. Add event type and handler

**File**: `zinit-server/src/supervisor.rs`

```rust
pub enum SupervisorEvent {
    // ... existing variants

    /// A builtin service completed.
    BuiltinCompleted {
        service_id: ServiceId,
        success: bool,
        error: Option<String>,
    },
}
```
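
The handler side might look like the sketch below. It is a minimal, self-contained model and not the zinit implementation: the `HashMap` graph, the `u32` service ID, and the `Exited` and `Failed` states are all assumptions introduced for illustration, since this ADR names only `Starting` and `Running`.

```rust
use std::collections::HashMap;

type ServiceId = u32; // stand-in for the real ServiceId type

#[derive(Debug, Clone, PartialEq)]
enum ServiceState {
    Starting { pid: u32 },
    Exited,                    // hypothetical: oneshot finished successfully
    Failed { reason: String }, // hypothetical: oneshot failed
}

enum SupervisorEvent {
    BuiltinCompleted {
        service_id: ServiceId,
        success: bool,
        error: Option<String>,
    },
}

/// Applies a completion event to the graph.
/// Returns true when the service's dependents should be re-evaluated.
fn handle_event(graph: &mut HashMap<ServiceId, ServiceState>, event: SupervisorEvent) -> bool {
    match event {
        SupervisorEvent::BuiltinCompleted { service_id, success, error } => {
            let state = match graph.get_mut(&service_id) {
                Some(state) => state,
                // The service was removed while the builtin was running.
                None => return false,
            };
            if success {
                *state = ServiceState::Exited;
                true
            } else {
                *state = ServiceState::Failed {
                    reason: error.unwrap_or_else(|| "unknown error".to_string()),
                };
                false
            }
        }
    }
}

fn main() {
    let mut graph = HashMap::new();
    graph.insert(1, ServiceState::Starting { pid: 0 });
    let reevaluate = handle_event(
        &mut graph,
        SupervisorEvent::BuiltinCompleted { service_id: 1, success: true, error: None },
    );
    println!("state = {:?}, reevaluate dependents = {reevaluate}", graph[&1]);
}
```

How a failed `critical = true` service should be escalated is left out here; that policy belongs to the supervisor and is not specified in this ADR.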

### 5. Modify try_start_service

Detect builtin prefix and dispatch:

```rust
StartAction::SpawnProcess { config } => {
    if let Some(config) = config {
        if let Some(builtin) = Self::is_builtin_service(&config) {
            self.run_builtin_service(id, builtin, &config).await;
        } else {
            self.spawn_service(id, config).await;
        }
    }
}
```

### 6. Update service config

**File**: `etc/zinit/system/sysvol.toml`

```toml
[service]
name = "sysvol"
exec = "@builtin:sysvol"
oneshot = true
class = "system"
critical = true

[dependencies]
requires = ["udev-trigger"]
after = ["network"]
```

## Files to Modify

| File | Change |
|------|--------|
| `mosstorage/` | Rename to `sysvol/` |
| `Cargo.toml` | Update workspace member name |
| `zinit-server/Cargo.toml` | Add `sysvol = { path = "../sysvol" }` |
| `zinit-server/src/supervisor.rs` | Add builtin detection, PID1 gate, sysvol runner, event handler |
| `etc/zinit/system/mosstorage.toml` | Rename to `sysvol.toml`, update config |
| `etc/zinit/system/ready.toml` | Update dependency from mosstorage to sysvol |

## Consequences

### Positive
- No external binary to deploy
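- Slightly faster boot (one fewer process spawn)
- Better error handling and logging: errors surface in-process instead of crossing a process boundary
- **Safe**: the builtin is skipped unless zinit runs as PID 1, so it cannot format disks on a workstation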
- Clear naming: `sysvol` = system volume for state/cache

### Negative
- Couples zinit-server to sysvol library
- Adds complexity to supervisor (builtin concept)

### Mitigations
- Keep builtin system minimal and explicit
- Clear `@builtin:` prefix makes it obvious in configs
- PID1 gate ensures safety

## Future Work

- Expose storage state via RPC for services that need to know Mounted vs NoDisk
- Environment variable injection for child services
- Additional builtins if needed (keep minimal)