# 08 - Live Binary Updates
## Status
**Proposed** - To be implemented after core functionality is stable.
## Context
zinit consists of two processes:
- **zinit-pid1**: The init process (PID 1), minimal, spawns and monitors zinit-server
- **zinit-server**: The process supervisor, handles service lifecycle, dependency graph, socket activation
Both need to support in-place updates without losing track of running services or dropping connections.
### Constraints
- PID 1 cannot exit on VM/bare-metal (kernel panic)
- Running services must continue uninterrupted
- Open sockets (RPC, socket-activated) should not be dropped
- State consistency: no "lost" processes after update
## Decision
### zinit-pid1: Minimal state via argv
pid1 has almost no state. On self-update:
```rust
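use std::{convert::Infallible, ffi::CString, os::unix::ffi::OsStrExt, path::Path};
use nix::{errno::Errno, unistd::{execv, Pid}};

/// Returns only on failure. The caller must log the error and keep running
/// the old binary: a panic in PID 1 would bring the whole system down.
fn exec_new_pid1(new_binary: &Path, server_pid: Pid) -> nix::Result<Infallible> {
    // Build argv[0] from raw bytes so a non-UTF-8 path cannot panic.
    let path = CString::new(new_binary.as_os_str().as_bytes()).map_err(|_| Errno::EINVAL)?;
    // Pass server PID via argv
    let args = [
        path.clone(),
        CString::new("--adopt-server").unwrap(), // literal, no interior NUL
        CString::new(server_pid.to_string()).unwrap(), // digits only
    ];
    // exec replaces process image, keeps PID 1; never returns on success
    execv(&path, &args)
}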
```
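The new pid1 starts with `--adopt-server <pid>` and monitors that PID instead of spawning a fresh server. Because `execv()` keeps the process identity, the server remains a child of PID 1 and can still be reaped with `waitpid()`.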
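A minimal sketch of how the new pid1 might read the `--adopt-server <pid>` flag. The function name `parse_adopt_server` and the error strings are hypothetical; what pid1 does when the flag is malformed is a policy decision left to the caller.

```rust
/// Parse pid1's arguments (without argv[0]).
/// Ok(None)      -> no flag given, spawn a fresh server
/// Ok(Some(pid)) -> adopt the already-running server with this PID
/// Err(msg)      -> flag present but malformed; the caller decides what to do
fn parse_adopt_server<I, S>(args: I) -> Result<Option<i32>, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        if arg.as_ref() != "--adopt-server" {
            continue;
        }
        let value = args.next().ok_or("--adopt-server requires a PID")?;
        let pid: i32 = value
            .as_ref()
            .parse()
            .map_err(|_| format!("invalid PID: {}", value.as_ref()))?;
        // PID 1 is pid1 itself, and non-positive values are never a valid child.
        if pid <= 1 {
            return Err(format!("PID out of range: {pid}"));
        }
        return Ok(Some(pid));
    }
    Ok(None)
}
```

Returning a `Result` instead of panicking keeps the decision (adopt, spawn fresh, or log and retry) in pid1's main loop, where a bad argument cannot take down PID 1.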
**Trigger:** `SIGUSR2` to pid1
**Sequence:**
1. pid1 receives SIGUSR2
2. Verify new binary exists and is executable
3. `execv()` into new binary with `--adopt-server <server_pid>`
4. New pid1 resumes monitoring
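Step 2 of the sequence above could look roughly like the following, using only the standard library. The name `check_executable` is hypothetical, and this is a pre-flight check only: `execv()` can still fail afterwards (wrong architecture, missing interpreter), so the exec error path must be handled as well.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Check that `path` is a regular file with at least one execute bit set.
fn check_executable(path: &Path) -> io::Result<()> {
    // metadata() follows symlinks and fails if the file is missing.
    let meta = fs::metadata(path)?;
    if !meta.is_file() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "not a regular file"));
    }
    if meta.permissions().mode() & 0o111 == 0 {
        return Err(io::Error::new(io::ErrorKind::PermissionDenied, "no execute bit set"));
    }
    Ok(())
}
```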
### zinit-server: Serialize + FD passing
Server has significant state. On update:
```rust
use std::collections::HashMap;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct PersistentState {
    services: HashMap<String, ServiceSnapshot>,
    boot_time: u64,
    // FD numbers stored separately
}

#[derive(Serialize, Deserialize)]
struct ServiceSnapshot {
    name: String,
    state: ServiceState, // must also derive Serialize/Deserialize
    pid: Option<i32>,
    restart_count: u32,
    current_restart_delay_ms: u64,
    last_exit_code: Option<i32>,
}
```
**Trigger:** `SIGUSR1` to pid1 (which signals server to prepare, then restarts it)
**Sequence:**
1. pid1 receives SIGUSR1
2. pid1 sends `PrepareRestart` RPC to server
3. Server serializes state to `/run/zinit/state.json`
4. Server prepares FDs (clears CLOEXEC on sockets to keep)
5. Server encodes FD map to env: `ZINIT_FDS={"rpc":5,"svc_foo_stdout":7}`
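6. Server exits cleanly (a plain exit closes its FDs and discards its environment, so the FDs from steps 4-5 must first be handed to pid1, e.g. over a Unix socket with `SCM_RIGHTS`, or the server must `exec()` the new binary itself)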
7. pid1 spawns new server binary
8. New server detects `/run/zinit/state.json`, restores state
9. New server restores FDs from `ZINIT_FDS` env
10. New server verifies PIDs still exist, adjusts state if needed
11. Cleanup: remove state file
## Approaches Considered
### 1. Serialize to file/memfd → exec → reload
Dump state to JSON/bincode before exec, reload after.
```rust
// Before exec
let snapshot = build_snapshot(&state);
std::fs::write("/run/zinit/state.json", serde_json::to_string(&snapshot)?)?;

// After exec
fn try_restore_state() -> Option<SupervisorState> {
    let json = std::fs::read_to_string("/run/zinit/state.json").ok()?;
    // Remove the file right away so a stale snapshot is never replayed later.
    std::fs::remove_file("/run/zinit/state.json").ok();
    serde_json::from_str(&json).ok()
}
```
**Pros:** Simple, debuggable, works for any serializable state.
**Cons:** Race window between serialize and exec. Non-serializable state (FDs) lost.
### 2. FD passing (keep sockets open across exec)
File descriptors survive `exec()` unless marked `CLOEXEC`.
```rust
use std::{collections::HashMap, os::unix::io::RawFd};
use nix::fcntl::{fcntl, FcntlArg, FdFlag};

// Before exec: clear CLOEXEC on FDs to keep
fn prepare_fds_for_exec(fds: &[(&str, RawFd)]) -> nix::Result<()> {
    for (_name, fd) in fds {
        // F_GETFD returns the raw flag bits as a c_int; wrap them in FdFlag first.
        let mut flags = FdFlag::from_bits_truncate(fcntl(*fd, FcntlArg::F_GETFD)?);
        flags.remove(FdFlag::FD_CLOEXEC);
        fcntl(*fd, FcntlArg::F_SETFD(flags))?;
    }
    let fd_map: HashMap<&str, i32> = fds.iter().map(|(n, f)| (*n, *f)).collect();
    std::env::set_var("ZINIT_FDS", serde_json::to_string(&fd_map).unwrap());
    Ok(())
}

// After exec: restore FDs
fn restore_fds() -> Option<HashMap<String, RawFd>> {
    let fd_json = std::env::var("ZINIT_FDS").ok()?;
    serde_json::from_str(&fd_json).ok()
}
```
**Pros:** No reconnection needed, no data loss in pipes.
**Cons:** Only works for FDs, not arbitrary state.
### 3. Shared memory (mmap + memfd)
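Create a memfd-backed shared region. The memfd survives exec (the mapping itself does not), so the new binary re-maps the region from the inherited FD.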
```rust
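use std::{mem::size_of, os::unix::io::{IntoRawFd, RawFd}, sync::atomic::AtomicI32};

#[repr(C)]
struct SharedState {
    magic: u64,
    version: u64,
    server_pid: AtomicI32,
    services: [SharedServiceState; 256],
}

fn init_shared_state() -> Result<(*mut SharedState, RawFd), Box<dyn std::error::Error>> {
    let size = size_of::<SharedState>();
    let memfd = memfd::MemfdOptions::new()
        .close_on_exec(false) // keep the FD open across exec
        .create("zinit-shared-state")?;
    memfd.as_file().set_len(size as u64)?;
    let fd = memfd.into_file().into_raw_fd();
    // libc::mmap is called directly here because the nix wrapper's signature
    // differs between releases.
    // SAFETY: fd is a valid memfd of `size` bytes; a null hint lets the kernel pick the address.
    let ptr = unsafe {
        libc::mmap(std::ptr::null_mut(), size, libc::PROT_READ | libc::PROT_WRITE, libc::MAP_SHARED, fd, 0)
    };
    if ptr == libc::MAP_FAILED {
        return Err(std::io::Error::last_os_error().into());
    }
    std::env::set_var("ZINIT_SHM_FD", fd.to_string());
    Ok((ptr as *mut SharedState, fd))
}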
```
**Pros:** Zero serialization overhead, lock-free with atomics.
**Cons:** Fixed-size only (no Vec/String/HashMap), `repr(C)` constraints, version migration painful.
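To illustrate why version migration is the painful part, here is a small sketch of the header check a new binary would need before trusting the mapped region. The constants `SHARED_MAGIC` and `SHARED_VERSION` and the `HeaderCheck` enum are hypothetical; the `magic` and `version` fields are the ones from `SharedState` above.

```rust
/// Hypothetical layout constants; real values would be chosen by zinit.
const SHARED_MAGIC: u64 = 0x5A49_4E49_545F_5348; // "ZINIT_SH" in ASCII
const SHARED_VERSION: u64 = 1;

#[derive(Debug, PartialEq, Eq)]
enum HeaderCheck {
    /// Layout matches this binary; the region can be used as-is.
    Compatible,
    /// Magic does not match: region was never initialized or is not ours.
    Uninitialized,
    /// Written by a binary with a different struct layout.
    VersionMismatch { found: u64 },
}

/// Decide whether a mapped region can be reinterpreted as this binary's layout.
fn check_header(magic: u64, version: u64) -> HeaderCheck {
    if magic != SHARED_MAGIC {
        HeaderCheck::Uninitialized
    } else if version != SHARED_VERSION {
        HeaderCheck::VersionMismatch { found: version }
    } else {
        HeaderCheck::Compatible
    }
}
```

Every `VersionMismatch` needs either a hand-written migration between two `repr(C)` layouts or a fallback to discarding the region, which is the cost that the serialized approaches avoid.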
### 4. How systemd does it
Combines approaches:
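1. Serializes manager and unit state to a file descriptor (a memfd or temporary file) that is handed to the new binary via `--deserialize`
2. Keeps socket FDs open across the exec and records their numbers in the serialized state (`LISTEN_FDS`, starting at `SD_LISTEN_FDS_START` = fd 3, is the related protocol it uses to hand sockets to activated services)
3. `exec()` into new binary
4. New binary deserializes that state and re-adopts processes by PID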
Key insight: systemd doesn't preserve in-flight operations. Mid-restart services get re-evaluated from persisted state.
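Applied to zinit, re-evaluating in-flight operations could be sketched as a pure function from the persisted state plus a liveness check to a settled state. The `SnapshotState` variants below are hypothetical (the real `ServiceState` is defined elsewhere), and the mapping itself is an assumption, in particular the choice to treat an interrupted start with no live process as `Failed`.

```rust
/// Hypothetical stand-in for the persisted service state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SnapshotState {
    Stopped,
    Starting,
    Running,
    Stopping,
    Failed,
}

/// Collapse a possibly in-flight state into one the new server can act on.
fn settle(state: SnapshotState, pid_alive: bool) -> SnapshotState {
    use SnapshotState::*;
    match (state, pid_alive) {
        // An in-flight start either produced a live process or it did not.
        (Starting, true) | (Running, true) => Running,
        (Starting, false) | (Running, false) => Failed,
        // An in-flight stop: a live process still needs stopping, a dead one is done.
        (Stopping, true) => Stopping,
        (Stopping, false) => Stopped,
        // Settled states carry over unchanged.
        (Stopped, _) => Stopped,
        (Failed, _) => Failed,
    }
}
```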
## Implementation
### Phase 1: Server restart (no state preservation)
Simple version for development:
- pid1 receives SIGUSR1
- pid1 sends shutdown to server
- pid1 spawns new server
- Server reloads config, rediscovers running processes via `/proc`
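A rough sketch of the `/proc` rediscovery step, using only the standard library. The function name `scan_proc` is hypothetical. Note that `comm` is truncated by the kernel to 15 characters, so matching services by name this way is only a development-time heuristic.

```rust
use std::fs;
use std::io;

/// List `(pid, comm)` for every process visible in /proc.
fn scan_proc() -> io::Result<Vec<(i32, String)>> {
    let mut found = Vec::new();
    for entry in fs::read_dir("/proc")? {
        let entry = entry?;
        // Process directories are the ones whose name is a decimal PID.
        let Some(pid) = entry.file_name().to_str().and_then(|s| s.parse::<i32>().ok()) else {
            continue;
        };
        // A process may exit between read_dir and this read; skip it if so.
        if let Ok(comm) = fs::read_to_string(entry.path().join("comm")) {
            found.push((pid, comm.trim_end().to_string()));
        }
    }
    Ok(found)
}
```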
### Phase 2: Server restart with state
- Add serialization before shutdown
- Add restoration on startup
- Verify PID validity after restore
### Phase 3: FD preservation
- Track which FDs to preserve (RPC socket, log pipes)
- Clear CLOEXEC, encode to env
- Restore and verify after restart
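The "verify" part of this phase could be as simple as checking that every FD number named in the restored map is actually open in the new process. The sketch below does that through `/proc/self/fd`, which is Linux-specific; the helper names `fd_is_open` and `missing_fds` are hypothetical.

```rust
use std::fs;

/// True if `fd` is an open descriptor in the current process.
fn fd_is_open(fd: i32) -> bool {
    // Each open FD appears as a symlink under /proc/self/fd; lstat the link
    // itself so sockets and pipes are handled the same as regular files.
    fd >= 0 && fs::symlink_metadata(format!("/proc/self/fd/{fd}")).is_ok()
}

/// Return the names from the restored FD map whose descriptor is not open.
fn missing_fds<'a>(fd_map: &[(&'a str, i32)]) -> Vec<&'a str> {
    fd_map
        .iter()
        .filter(|(_, fd)| !fd_is_open(*fd))
        .map(|(name, _)| *name)
        .collect()
}
```

An open descriptor is not necessarily the right one, so a stricter check might also compare the socket type or bound address against what the service expects.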
### Phase 4: pid1 self-update
- SIGUSR2 triggers self-exec
- Pass server PID via argv
- New pid1 adopts server
## Detecting stale state
After restore, verify PIDs are still valid:
```rust
use nix::{errno::Errno, sys::signal::kill, unistd::Pid};

fn process_exists(pid: i32) -> bool {
    // kill with signal 0 checks existence without sending a signal.
    // Only ESRCH means "no such process"; EPERM means it exists but is not ours.
    !matches!(kill(Pid::from_raw(pid), None), Err(Errno::ESRCH))
}

fn validate_restored_state(state: &mut SupervisorState) {
    for (name, svc) in &mut state.services {
        if let Some(pid) = svc.pid {
            if !process_exists(pid) {
                eprintln!("Service {} pid {} gone, marking as failed", name, pid);
                svc.pid = None;
                svc.state = ServiceState::Failed;
            }
        }
    }
}
```
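An existence check alone cannot tell whether a PID has been reused by an unrelated process while the server was down. One possible guard, sketched below, is to record each service's start time in the snapshot and compare it after restore. This assumes a hypothetical extra field in `ServiceSnapshot` holding the `starttime` value (field 22 of `/proc/<pid>/stat`, in clock ticks since boot); the function names are illustrative.

```rust
use std::fs;

/// Extract `starttime` (field 22) from the contents of `/proc/<pid>/stat`.
fn parse_start_time(stat: &str) -> Option<u64> {
    // comm (field 2) is wrapped in parentheses and may itself contain spaces
    // or ')' characters, so split after the LAST ')'.
    let rest = &stat[stat.rfind(')')? + 1..];
    // `rest` begins at field 3 (state), so field 22 sits at index 22 - 3 = 19.
    rest.split_whitespace().nth(19)?.parse().ok()
}

/// Read the start time of a live process, or None if it is gone.
fn process_start_time(pid: i32) -> Option<u64> {
    parse_start_time(&fs::read_to_string(format!("/proc/{pid}/stat")).ok()?)
}

/// True only if `pid` is alive AND has the start time recorded in the snapshot.
fn is_same_process(pid: i32, recorded_start: u64) -> bool {
    process_start_time(pid) == Some(recorded_start)
}
```

A PID and start time pair is effectively unique within one boot, which is why `boot_time` in the snapshot is also worth checking before trusting any of the recorded PIDs.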
## Open questions
1. **Timeout for graceful server shutdown?** Suggest 30s, then SIGKILL.
2. **What if new binary crashes immediately?** pid1 could keep old binary as fallback, but adds complexity. Initial version: just restart new binary with backoff.
3. **Config changes during update?** New server re-reads config. Services added/removed get started/stopped as normal.
4. **Socket activation FDs?** These are the most important to preserve. Clients connected to RPC socket shouldn't notice the restart.
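For question 2, "restart with backoff" could be sketched as a capped exponential delay. The constants `INITIAL_DELAY` and `MAX_DELAY` and the function name are hypothetical placeholders, not values the design has settled on.

```rust
use std::time::Duration;

// Hypothetical tuning values.
const INITIAL_DELAY: Duration = Duration::from_millis(500);
const MAX_DELAY: Duration = Duration::from_secs(30);

/// Delay before the next restart attempt after `consecutive_failures` crashes.
fn next_restart_delay(consecutive_failures: u32) -> Duration {
    // Double the delay per failure; saturate so large counts cannot overflow.
    let factor = 1u32.checked_shl(consecutive_failures).unwrap_or(u32::MAX);
    INITIAL_DELAY.saturating_mul(factor).min(MAX_DELAY)
}
```

With these values the delays run 500 ms, 1 s, 2 s, 4 s and so on until they reach the 30 s cap, which keeps a crash-looping binary from spinning while still retrying indefinitely.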
## References
- systemd daemon-reexec: https://www.freedesktop.org/software/systemd/man/systemd.html
- memfd_create(2): https://man7.org/linux/man-pages/man2/memfd_create.2.html
- execve(2) and file descriptors: https://man7.org/linux/man-pages/man2/execve.2.html