forjar 1.2.1

Rust-native Infrastructure as Code — bare-metal first, BLAKE3 state, provenance tracing
<p align="center">
  <img src="docs/hero.svg" alt="forjar — Rust-native Infrastructure as Code" width="900" />
</p>

<p align="center">
  <a href="#quick-start">Quick Start</a> &middot;
  <a href="docs/book/src/README.md">Book</a> &middot;
  <a href="docs/specifications/forjar-spec.md">Specification</a> &middot;
  <a href="https://github.com/paiml/forjar-cookbook">Cookbook</a> &middot;
  <a href="#benchmarks">Benchmarks</a>
</p>

[![MSRV](https://img.shields.io/badge/MSRV-1.88.0-blue)](https://blog.rust-lang.org/)

---

Forjar is a single-binary IaC tool written in Rust. It manages bare-metal machines over SSH using YAML configs, BLAKE3 content-addressed state, and deterministic DAG execution. No cloud APIs, no runtime dependencies, no remote state backends.

```
forjar.yaml  →  parse  →  resolve DAG  →  plan  →  codegen  →  execute  →  BLAKE3 lock
```

## Why Forjar

| | Terraform | Ansible | **Forjar** |
|---|---|---|---|
| Runtime | Go + providers | Python + SSH | **Single Rust binary** |
| State | S3 / Consul / JSON | None | **Git (BLAKE3 YAML)** |
| Drift detection | API calls | None | **Local hash compare** |
| Bare metal | Weak | Strong | **First-class** |
| Dependencies | ~200 Go modules | ~50 Python pkgs | **17 crates** |
| Apply speed | Seconds–minutes | Minutes | **Milliseconds–seconds** |

## Quick Start

```bash
# Install from source
cargo install --path .

# Initialize a project
forjar init my-infra && cd my-infra

# Edit forjar.yaml (see Configuration below)

# Preview changes
forjar plan -f forjar.yaml

# Apply
forjar apply -f forjar.yaml

# Check for unauthorized changes
forjar drift --state-dir state

# View current state
forjar status --state-dir state
```

## Configuration

A `forjar.yaml` declares machines, resources, and policy:

```yaml
version: "1.0"
name: home-lab
description: "Sovereign AI stack provisioning"

params:
  data_dir: /mnt/data

machines:
  gpu-box:
    hostname: lambda
    addr: 192.168.50.100
    user: noah
    ssh_key: ~/.ssh/id_ed25519
    arch: x86_64
    roles: [gpu-compute]

resources:
  base-packages:
    type: package
    machine: gpu-box
    provider: apt
    packages: [curl, htop, git, tmux, ripgrep]

  data-dir:
    type: file
    machine: gpu-box
    state: directory
    path: "{{params.data_dir}}"
    owner: noah
    mode: "0755"
    depends_on: [base-packages]

  app-config:
    type: file
    machine: gpu-box
    path: /etc/app/config.yaml
    content: |
      data_dir: {{params.data_dir}}
      log_level: info
    owner: noah
    mode: "0644"
    depends_on: [data-dir]

policy:
  failure: stop_on_first
  tripwire: true
  lock_file: true
```

### Resource Types

| Type | States | Key Fields |
|------|--------|------------|
| `package` | present, absent | `provider` (apt/cargo/uv), `packages` |
| `file` | file, directory, symlink, absent | `path`, `content`, `owner`, `group`, `mode` |
| `service` | running, stopped, enabled, disabled | `name`, `enabled`, `restart_on` |
| `mount` | mounted, unmounted, absent | `source`, `path`, `fstype`, `options` |
| `user` | present, absent | `name`, `groups`, `shell`, `home`, `ssh_keys` |
| `docker` | running, stopped, absent | `image`, `ports`, `environment`, `volumes` |
| `cron` | present, absent | `name`, `schedule`, `command`, `user` |
| `network` | present, absent | `port`, `protocol`, `action`, `from_addr` |
| `pepita` | present, absent | `name`, `cgroups`, `overlayfs`, `netns`, `seccomp` |
| `model` | present, absent | `name`, `source`, `format`, `quantization`, `checksum`, `cache_dir` |
| `gpu` | present, absent | `driver_version`, `cuda_version`, `devices`, `persistence_mode`, `compute_mode` |

### Templates

Use `{{params.key}}` to reference global parameters in any string field. Templates are resolved before codegen.
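
As a rough illustration of what template resolution involves, the sketch below substitutes `{{params.<key>}}` placeholders in a string. It is not Forjar's actual resolver: the function name `resolve_params` and its error messages are hypothetical, and only the `params.` namespace is handled.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: replace every `{{params.<key>}}` in `input` with its value.
/// Returns an error for an unknown key or an unterminated placeholder.
fn resolve_params(input: &str, params: &BTreeMap<String, String>) -> Result<String, String> {
    const OPEN: &str = "{{params.";
    const CLOSE: &str = "}}";
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find(OPEN) {
        // Copy the literal text that precedes the placeholder.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        let end = after_open
            .find(CLOSE)
            .ok_or_else(|| format!("unterminated placeholder near: {}", &rest[start..]))?;
        let key = after_open[..end].trim();
        let value = params
            .get(key)
            .ok_or_else(|| format!("unknown parameter: {key}"))?;
        out.push_str(value);
        // Continue scanning after the closing braces.
        rest = &after_open[end + CLOSE.len()..];
    }
    out.push_str(rest);
    Ok(out)
}
```

With `data_dir` set to `/mnt/data`, the string `data_dir: {{params.data_dir}}` resolves to `data_dir: /mnt/data`. Failing on unknown keys, rather than leaving the placeholder in place, keeps a typo from reaching the generated script.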

### Recipes

Recipes are reusable, parameterized resource patterns (similar to Homebrew formulae):

```yaml
# recipes/dev-tools.yaml
name: dev-tools
version: "1.0"
inputs:
  user:
    type: string
    required: true
  shell:
    type: enum
    values: [bash, zsh, fish]
    default: zsh
resources:
  packages:
    type: package
    provider: apt
    packages: [build-essential, cmake, pkg-config]
  dotfiles:
    type: file
    state: directory
    path: "/home/{{inputs.user}}/.config"
    owner: "{{inputs.user}}"
    mode: "0755"
```

See the **[Forjar Cookbook](https://github.com/paiml/forjar-cookbook)** for 67 production-ready recipes covering packages, files, services, Docker, GPU, network, pepita sandboxing, multi-machine stacks, and content-addressed store reproducibility. The cookbook includes a [Reproducibility Series](https://github.com/paiml/forjar-cookbook/blob/master/docs/book/src/recipes/reproducibility.md) (recipes 63-67) demonstrating version pinning, sandboxed builds, SSH caching, CI gates, and profile rollback.

## Content-Addressed Store

Forjar includes a Nix-inspired content-addressed store for reproducible builds. Every build output lives at a deterministic path derived from its inputs:

```
/var/lib/forjar/store/<blake3-hash>/
├── meta.yaml          # Input manifest, provenance
└── content/           # Build output
```

### Store Commands

```bash
forjar pin                            # Pin all inputs to current versions
forjar pin --check                    # CI gate — fail if lock file is stale
forjar cache list                     # List local store entries
forjar cache push user@host:path      # Push to SSH binary cache
forjar cache verify                   # Re-hash all entries
forjar store gc --dry-run             # Preview garbage collection
forjar store diff <hash>              # Diff against upstream origin
forjar store-import apt nginx=1.24.0  # Import from any provider
forjar archive pack <hash>            # Pack into .far archive
forjar convert --reproducible         # Auto-convert recipe to store model
```

Supported import providers: `apt`, `cargo`, `uv`, `nix`, `docker`, `tofu`, `terraform`, `apr`.

### 4-Level Purity Model

| Level | Name | Requirement |
|-------|------|-------------|
| 0 | Pure | Version + store + sandbox |
| 1 | Pinned | Version + store (no sandbox) |
| 2 | Constrained | Provider-scoped, floating version |
| 3 | Impure | Unconstrained |

See the [architecture docs](docs/book/src/05-architecture.md) for details on the store model, sandbox lifecycle, substitution protocol, and derivation executor.

## How It Works

1. **Parse** — Read `forjar.yaml`, validate schema and references
2. **Resolve** — Expand templates, build dependency DAG (Kahn's toposort, alphabetical tie-break)
3. **Plan** — Diff desired state against BLAKE3 lock file (hash comparison, no API calls)
4. **Codegen** — Generate shell scripts per resource type
5. **Execute** — Run scripts locally or via SSH (stdin pipe, not argument passing). Files larger than 1 MB use copia delta sync, which transfers only the changed blocks
6. **State** — Atomic lock file write (temp + rename), append to JSONL event log
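
The Resolve step names Kahn's algorithm with an alphabetical tie-break. A minimal sketch of that technique follows, using only the standard library. The function name `topo_sort` and the input shape (a map from resource id to the ids it depends on) are assumptions for illustration, not Forjar's internal API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Kahn's algorithm with alphabetical tie-breaking (illustrative sketch).
/// `deps` maps each resource to the resources it depends on.
/// Returns `Err` with the names that could not be ordered if a cycle exists.
fn topo_sort(deps: &BTreeMap<&str, Vec<&str>>) -> Result<Vec<String>, Vec<String>> {
    // in_degree[n] = number of dependencies of n that have not run yet.
    let mut in_degree: BTreeMap<&str, usize> = BTreeMap::new();
    // dependents[d] = resources that must run after d.
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for (&node, node_deps) in deps {
        in_degree.entry(node).or_insert(0);
        for &dep in node_deps {
            in_degree.entry(dep).or_insert(0);
            *in_degree.entry(node).or_insert(0) += 1;
            dependents.entry(dep).or_default().push(node);
        }
    }
    // A BTreeSet keeps ready nodes sorted, so the smallest name is always taken first.
    let mut ready: BTreeSet<&str> = in_degree
        .iter()
        .filter(|(_, degree)| **degree == 0)
        .map(|(&name, _)| name)
        .collect();
    let mut order = Vec::with_capacity(in_degree.len());
    while let Some(node) = ready.pop_first() {
        order.push(node.to_string());
        if let Some(nexts) = dependents.get(node) {
            for &next in nexts {
                if let Some(degree) = in_degree.get_mut(next) {
                    *degree -= 1;
                    if *degree == 0 {
                        ready.insert(next);
                    }
                }
            }
        }
    }
    if order.len() == in_degree.len() {
        Ok(order)
    } else {
        // Nodes with a remaining in-degree are on, or downstream of, a cycle.
        Err(in_degree
            .iter()
            .filter(|(_, degree)| **degree > 0)
            .map(|(name, _)| name.to_string())
            .collect())
    }
}
```

For the configuration shown earlier, this yields `base-packages`, `data-dir`, `app-config`. Because the ready set is ordered, two resources with no dependency between them always run in name order, which is what makes the plan reproducible across runs.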

### Failure Policy (Jidoka)

On the first failure, execution stops immediately, so a failed resource cannot cause cascading damage. State for resources that already converged is preserved in the lock file. Re-run to continue from where it stopped.
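
The stop-on-first-failure behaviour can be sketched as a loop that returns as soon as one step fails. The names `apply_stop_on_first` and `StepResult` are hypothetical, and the real executor also writes the lock file, which this sketch leaves out.

```rust
/// Outcome of applying one resource; `Err` carries a failure message.
type StepResult = Result<(), String>;

/// Apply `steps` in order and stop at the first failure (illustrative sketch).
/// Returns the ids that converged plus the failing id and message, if any.
fn apply_stop_on_first<F>(steps: &[&str], mut run: F) -> (Vec<String>, Option<(String, String)>)
where
    F: FnMut(&str) -> StepResult,
{
    let mut converged = Vec::new();
    for &id in steps {
        match run(id) {
            Ok(()) => converged.push(id.to_string()),
            // Stop immediately: later resources are never attempted.
            Err(msg) => return (converged, Some((id.to_string(), msg))),
        }
    }
    (converged, None)
}
```

If the second of three steps fails, the first is reported as converged and the third is never run.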

### Transport

- **Local**: `bash` via stdin pipe (for `127.0.0.1` / `localhost`)
- **SSH**: `ssh -o BatchMode=yes` with stdin pipe (no argument length limits)
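
To show what "stdin pipe" means in practice, here is a minimal sketch that feeds a script to an interpreter over stdin instead of passing it as a command-line argument. The helper `run_via_stdin` is hypothetical and is not taken from Forjar's source. It captures stdout only and writes the whole script before reading any output, which is fine for short scripts but not a robust design for large ones.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Run `script` by piping it to the program's stdin (illustrative sketch).
/// Returns whether the process exited successfully, plus its stdout.
fn run_via_stdin(program: &str, args: &[&str], script: &str) -> std::io::Result<(bool, String)> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // Write the script, then drop the handle so the child sees end-of-file.
    {
        let mut stdin = child.stdin.take().expect("stdin was configured as piped");
        stdin.write_all(script.as_bytes())?;
    }
    let output = child.wait_with_output()?;
    Ok((
        output.status.success(),
        String::from_utf8_lossy(&output.stdout).into_owned(),
    ))
}
```

For the local transport this might be called as `run_via_stdin("bash", &[], script)`. An SSH call could pass `ssh` as the program with `&["-o", "BatchMode=yes", "user@host", "bash"]`, where the host and the trailing `bash` are placeholders. Since the script never appears in `argv`, its size is not bounded by the operating system's argument length limit.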

## Benchmarks

```bash
cargo bench
```

<!-- BENCH-TABLE-START -->

**Core Operations**

| Operation | Input | Mean | 95% CI |
|---|---|---|---|
| BLAKE3 hash | 64 B string | 27 ns | +/- 0.5 ns |
| BLAKE3 hash | 1 KB string | 92 ns | +/- 1.2 ns |
| BLAKE3 hash | 1 MB file | 172 us | +/- 0.4 us |
| YAML parse | 500 B config | 20.7 us | +/- 0.2 us |
| Topo sort | 100 nodes | 34.6 us | +/- 0.4 us |
| Copia signature | 1 MB file | 294 us | +/- 0.3 us |
| Copia signature | 4 MB file | 1.19 ms | +/- 0.01 ms |
| Copia delta | 4 MB, 2% change | 1.18 ms | +/- 0.01 ms |
| Copia patch gen | 1 MB, 10% change | 60 us | +/- 0.3 us |

**Store Operations** (`cargo bench --bench store_bench`)

| Operation | Input | Target |
|---|---|---|
| Store path hash | 3 inputs | < 1 us |
| Purity classify | 4 levels | < 1 us |
| Closure hash | 3/10/50 nodes | < 10 us |
| Repro score | 1/5/20 resources | < 100 us |
| FAR encode | 1KB/1MB/10MB | < 100 ms |
| FAR decode | 64KB manifest | < 10 ms |
| Lockfile staleness | 10/100/1K pins | < 1 ms |
| Sandbox validate | 4 presets | < 1 us |
| Derivation closure | 5-input DAG | < 10 us |
| Script purify | small/med/large | < 10 ms |

<!-- BENCH-TABLE-END -->

Measured with Criterion.rs (100 samples, 3 s warm-up). Run `make bench-update` to refresh the table.

## Falsifiable Claims

<details>
<summary>10 testable claims with linked tests (click to expand)</summary>

### C1: Deterministic hashing
BLAKE3 of identical inputs always produces identical outputs.
Tests: `test_fj014_hash_file_deterministic`, `test_fj014_hash_string`

### C2: Deterministic DAG order
Same dependency graph always produces the same execution order.
Tests: `test_fj003_topo_sort_deterministic`, `test_fj003_alphabetical_tiebreak`

### C3: Idempotent apply
Second apply on unchanged config produces zero changes.
Tests: `test_fj012_idempotent_apply`, `test_fj004_plan_all_unchanged`

### C4: Cycle detection
Circular dependencies are rejected at parse time.
Tests: `test_fj003_cycle_detection`

### C5: Content-addressed state
Lock hashes are derived from desired state, not timestamps.
Tests: `test_fj004_hash_deterministic`, `test_fj004_plan_all_unchanged`

### C6: Atomic state persistence
Lock writes use temp file + rename. No corruption on crash.
Tests: `test_fj013_atomic_write`, `test_fj013_save_and_load`
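
A minimal sketch of the temp file + rename pattern, using only the standard library. The function name `atomic_write` and the `.tmp` suffix are assumptions for illustration. The atomicity relies on `rename` replacing the target in one step when both paths are on the same filesystem, as on POSIX systems.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write `contents` to `target` via a sibling temp file and a rename (illustrative sketch).
fn atomic_write(target: &Path, contents: &[u8]) -> io::Result<()> {
    // Keep the temp file next to the target so both live on the same filesystem.
    let mut tmp_name = target.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);
    {
        let mut file = File::create(&tmp)?;
        file.write_all(contents)?;
        // Flush data to disk before the rename makes it visible under the final name.
        file.sync_all()?;
    }
    fs::rename(&tmp, target)
}
```

A reader that opens the target sees either the old contents or the new contents, never a partially written file. A crash before the rename leaves only a stray `.tmp` file behind.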

### C7: Recipe input validation
Invalid typed inputs are rejected before expansion.
Tests: `test_fj019_validate_inputs_type_mismatch`, `test_fj019_validate_inputs_enum_invalid`
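
The sketch below shows one way to check recipe inputs before expansion, covering the two input kinds from the `dev-tools` recipe example (a required string and an enum with a default). `InputSpec` and `validate_input` are hypothetical names, and values arrive as strings here, so YAML-level type mismatches are outside what this sketch checks.

```rust
/// Declared shape of a recipe input (only the kinds used in the recipe example).
enum InputSpec {
    Str { required: bool },
    Enum { values: Vec<String>, default: Option<String> },
}

/// Validate one supplied value against its spec and return the effective value.
fn validate_input(
    name: &str,
    spec: &InputSpec,
    supplied: Option<&str>,
) -> Result<Option<String>, String> {
    match (spec, supplied) {
        (InputSpec::Str { .. }, Some(v)) => Ok(Some(v.to_string())),
        (InputSpec::Str { required: true }, None) => Err(format!("input `{name}` is required")),
        (InputSpec::Str { required: false }, None) => Ok(None),
        (InputSpec::Enum { values, .. }, Some(v)) => {
            if values.iter().any(|allowed| allowed.as_str() == v) {
                Ok(Some(v.to_string()))
            } else {
                Err(format!("input `{name}`: `{v}` is not one of {values:?}"))
            }
        }
        // No value supplied: fall back to the declared default, if any.
        (InputSpec::Enum { default, .. }, None) => Ok(default.clone()),
    }
}
```

For the `shell` input, omitting the value yields `zsh`, `fish` is accepted, and `tcsh` is rejected before any resource is expanded.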

### C8: Heredoc injection safety
Single-quoted heredoc prevents shell expansion in file content.
Tests: `test_fj007_heredoc_safe`
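
A minimal sketch of the technique: quoting the heredoc delimiter (`<<'EOF'`) tells a POSIX shell to treat the body literally, so `$VAR` and `$(cmd)` in file content are written as-is. The function `heredoc_write` and the `FORJAR_EOF` delimiter are hypothetical. The delimiter collision check and path quoting are additions for this sketch, not documented Forjar behaviour.

```rust
/// Build a shell snippet that writes `content` to `path` through a
/// single-quoted heredoc (illustrative sketch).
fn heredoc_write(path: &str, content: &str) -> String {
    // Extend the delimiter until no content line equals it; otherwise the
    // heredoc would terminate early and the rest would run as shell code.
    let mut delim = String::from("FORJAR_EOF");
    while content.lines().any(|line| line == delim) {
        delim.push('_');
    }
    // Single-quote the path as well; embedded single quotes become '\''.
    let quoted_path = format!("'{}'", path.replace('\'', r"'\''"));
    // The closing delimiter must start on its own line.
    let newline = if content.is_empty() || content.ends_with('\n') { "" } else { "\n" };
    format!("cat > {quoted_path} <<'{delim}'\n{content}{newline}{delim}\n")
}
```

Content such as `home: $HOME` reaches the file unchanged. Note that a heredoc body always ends with a newline, so content without a trailing newline gains one in the written file.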

### C9: Minimal dependencies
Fewer than 20 direct crate dependencies (currently 17 runtime + 1 build). Single binary output.
Verify: `cargo metadata --no-deps --format-version 1 | jq '[.packages[0].dependencies[] | select(.kind == null)] | length'`

### C10: Jidoka failure isolation
First failure stops execution. Previously converged state is preserved.
Tests: `test_fj012_apply_local_file`

</details>

## Testing

```bash
cargo test                    # 6295+ unit tests
cargo test -- --nocapture     # with output
cargo test planner            # specific module
cargo bench                   # Criterion benchmarks
cargo clippy -- -D warnings   # lint
```

## License

MIT OR Apache-2.0