# nornir
<p align="center">
<img src=".nornir/assets/nornir.webp" alt="Urðr, Verðandi, Skuld — the three Norns" width="720" />
</p>
> *The three Norns spin the thread of every fate — what has been, what is, what shall be.*
| Urðr (what has been) | Verðandi (what is) | Skuld (what shall be) |
|---|---|---|
| the warehouse: all history | guard locks + the step running now | the plan DAG, gates, the release ahead |
---
**nornir** is a project-management tool with time-travel: your **vegvísir**
(Old Norse: *wayfinder*) for a multi-crate Rust workspace. It manages the DAG of
your *plans* and the DAG of your *build dependencies*, and remembers every
version of both. It is the crossover that **Maven + Jira** should have been.
## What it does
- **Release & document.** Build, test, and bench-gate every crate, then
`cargo publish` in dependency order — and (re)generate the README/CHANGELOG and
a static website (PDF/HTML via typst) describing exactly what shipped.
- **Plan.** Connect real artifacts (source files, built binaries) and the inputs
that drive them — a prompt or a written requirement (a *krav*, in any language)
— into a Plan-DAG; `topo_ready` hands the next ready node to a human or an agent.
- **Remember & time-travel.** Historize every fact on Iceberg, keyed by git hash
— the plan DAG, the full-text index, the dependency graph, bench/release lineage
— then restore or query the workspace as of any past release.
- **Explore.** Full-text (Tantivy BM25) **and semantic/vector search** side by
side, plus a `syn`/DWARF knowledge map (symbols, call graph) and dependency
blast-radius across the workspace. Code is embedded with `jina-v2-base-code`
(768-dim) and the vectors are **materialized in the warehouse**, so semantic
search at any past git SHA is a warehouse read — never a re-embed or a git
walk. Loading a repo's vector index takes milliseconds (≈7 ms for
~120 vectors, ≈15 ms for ~430); embedding runs on CPU (pure-Rust `tract`) or
GPU (`ort` + CUDA, ~110× faster than `tract` on CPU, as measured under
Status) behind one `Embedder` interface.
- **Diagrams from data.** Because the whole workspace lives in the warehouse, any
graph in it can be drawn as static SVG via typst (no Mermaid, no JS): the
dependency graph below is generated, and the `urdr-threads` visualizer weaves
release timelines and cross-repo dependency edges into time-travel threads.
- **Guard & revert.** Lock release-critical files (`chmod -w`) against chaotic AI
agents, models, or teammates, and detect tampering against a recorded manifest.
Because every version is keyed by git SHA in the warehouse, time-travel to find
the culprit and roll back to a known-good release — the revert itself rides on git.
- **Accelerate airgapped AI on large codebases.** Every index *and* the Plan-DAG
are served over MCP, so a local, offline agent works a huge workspace by query:
no full-repo scans, and no code ever leaves the machine.
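The hand-off described under **Plan** can be pictured as a readiness check over the plan DAG. The following is a minimal sketch, not nornir's implementation: the `PlanDag` type, its `add` method, and the signature of `topo_ready` are assumptions made for illustration. Only the name `topo_ready` comes from this README.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical plan DAG: each node maps to the set of nodes it depends on.
pub struct PlanDag {
    deps: BTreeMap<String, BTreeSet<String>>,
}

impl PlanDag {
    pub fn new() -> Self {
        PlanDag { deps: BTreeMap::new() }
    }

    /// Register `node`, which may only start once every entry in `needs` is done.
    pub fn add(&mut self, node: &str, needs: &[&str]) {
        // Prerequisites become nodes too, even if they have no dependencies.
        for n in needs {
            self.deps.entry(n.to_string()).or_default();
        }
        self.deps
            .entry(node.to_string())
            .or_default()
            .extend(needs.iter().map(|n| n.to_string()));
    }

    /// Nodes that are not done yet and whose prerequisites are all done.
    /// The result is sorted by name because BTreeMap iterates in key order.
    pub fn topo_ready(&self, done: &BTreeSet<String>) -> Vec<String> {
        self.deps
            .iter()
            .filter(|(node, needs)| !done.contains(*node) && needs.is_subset(done))
            .map(|(node, _)| node.clone())
            .collect()
    }
}
```

In this sketch a caller (human or agent) marks a node as done by inserting its name into `done` and asks again; nodes with unfinished prerequisites never show up in the result.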
## Design
- **100% Rust.** git via `gix`, Apache Iceberg via `iceberg-rust` over
[**skade-katalog**][icr] — a single-file redb catalog.
- **One warehouse, append-only.** Every fact is an Iceberg row keyed by git SHA:
plan-DAG events, dependency edges, bench results, test outcomes, release
lineage, full-text index snapshots, and code facts — symbols / call-edges from
`syn`, and **DWARF** symbols read out of built binaries via `gimli`.
- **One library, four binaries.** `nornir` (CLI) · `nornir-mcp` (MCP/stdio, feature
`mcp`) · `nornir-server` (gRPC/tonic, feature `server`) · `urdr-threads` (egui
time-travel visualizer, feature `viz`). Build the CLI with `cargo build`; add
`--features mcp,server,viz` as needed. The CLI runs **embedded** (opens the
warehouse in-process, the default) **or as a client** of a running
`nornir-server` — mutually exclusive per warehouse, since one process owns the
write lock.
[icr]: https://codeberg.org/nordisk/nornir-catalog
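To make the "append-only, keyed by git SHA" idea concrete, here is a toy in-memory model. It is a sketch under stated assumptions: the `Fact`, `Row`, and `Warehouse` types are hypothetical, and none of this uses or reflects the Iceberg or `iceberg-rust` API. It only shows why reading at an old SHA stays stable when rows are never overwritten.

```rust
/// Two illustrative fact kinds; the real warehouse records many more.
#[derive(Debug, Clone, PartialEq)]
pub enum Fact {
    DepEdge { from: String, to: String },
    BenchResult { name: String, nanos: u64 },
}

/// A row is a fact plus the git SHA it was recorded at.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub sha: String,
    pub fact: Fact,
}

/// Toy stand-in for the warehouse: rows are only ever appended.
#[derive(Default)]
pub struct Warehouse {
    rows: Vec<Row>,
}

impl Warehouse {
    /// Record a fact at `sha`. Existing rows are never modified or removed.
    pub fn append(&mut self, sha: &str, fact: Fact) {
        self.rows.push(Row { sha: sha.to_string(), fact });
    }

    /// Time-travel read: every fact recorded at `sha`, in insertion order.
    pub fn at(&self, sha: &str) -> Vec<&Fact> {
        self.rows
            .iter()
            .filter(|row| row.sha == sha)
            .map(|row| &row.fact)
            .collect()
    }
}
```

Because `append` is the only mutation, `at("<old sha>")` returns the same facts no matter how many later releases have been recorded.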
## Documentation
The full reference — every subsystem chapter — is rendered by `nornir docs book`
into a single manual under [`docs/`](docs/) and historized in the warehouse:
- 📖 [`docs/book.pdf`](docs/book.pdf) — the complete manual (PDF)
- 📝 [`docs/book.md`](docs/book.md) — the same, as Markdown
It covers the design overview, docs generation, the Iceberg warehouse, the
bench framework, vector/semantic search, and the release & history flow (how a
release captures a complete SHA-keyed snapshot of the workspace, and how to read
any past release back). The rendered docs get their own
full-text index — `nornir docs search <repo> "<query>"` — kept separate from
the code index and time-travelable at any past git SHA.
## Example
nornir manages **owned** repos — the **holger** artifact server and the **znippy**
compression library — and maps their full build-dependency DAG down through the
**external** crates they don't own, such as **Apache Arrow**. Inside a repo, cargo
owns the workspace (build order and layout); nornir works at the **repo level** —
it finds each repo's root `Cargo.toml` and only orders *across* repos. The
`nornir.toml` lists the owned repos and their crates.io publish order; the
dependency graph is derived from `cargo metadata`, internal crates kept separate
from external ones:
```toml
[repo.holger] # owned — artifact server
remote = "https://codeberg.org/nordisk/holger"
[repo.znippy] # owned — compression library
remote = "https://codeberg.org/nordisk/znippy"
# crates.io upload order: phases run one after another, in dependency order.
# Published versions are immutable, so a release is sequential, not atomic.
publish_order = [
  ["znippy-common"], # phase 1
  ["znippy-compress", "znippy-decompress"], # phase 2, uploaded in parallel
]
```
```sh
nornir release run # test → bench-gate → publish the owned crates, in dep order
nornir docs render holger # regenerate README + dependency diagram (owned + Apache Arrow, …)
```
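The phased `publish_order` above follows the shape of the dependency graph: a crate can go up as soon as everything it depends on has been published. As an illustration only, the sketch below layers a dependency map into such phases. The `publish_phases` function is hypothetical, and whether nornir derives the order this way or reads it purely from `nornir.toml` is not stated here.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group crates into publish phases: a crate lands in the first phase in which
/// all of its in-workspace dependencies have already been published.
/// Returns `None` on a dependency cycle or a dependency missing from the map.
pub fn publish_phases(deps: &BTreeMap<&str, Vec<&str>>) -> Option<Vec<Vec<String>>> {
    let mut published: BTreeSet<&str> = BTreeSet::new();
    let mut phases: Vec<Vec<String>> = Vec::new();

    while published.len() < deps.len() {
        // Everything not yet published whose dependencies are all published.
        let phase: Vec<&str> = deps
            .iter()
            .filter(|(name, needs)| {
                !published.contains(*name) && needs.iter().all(|d| published.contains(d))
            })
            .map(|(name, _)| *name)
            .collect();

        if phase.is_empty() {
            return None; // no progress possible
        }
        published.extend(phase.iter().copied());
        phases.push(phase.iter().map(|s| s.to_string()).collect());
    }
    Some(phases)
}
```

Fed the three znippy crates, with both `znippy-compress` and `znippy-decompress` depending on `znippy-common`, this yields the same two phases as the hand-written list above.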
## Status
Working today: the warehouse, the plan DAG (`funnel_*` MCP tools), guard, config,
bench history + regression gate, dependency-graph oracle, the knowledge map
(`syn`), the Tantivy index + time-machine restore, **semantic/vector search**
(materialized embeddings + git-SHA time-travel), docs generation, and the
CLI / MCP / gRPC surfaces.
### Vector search — measured
Full vectorization of real repos, then time-travel "load a vector blob"
(reconstruct the in-memory index for a git SHA straight from the warehouse).
Threadripper 3975WX (AVX2) + RTX 4090, f32 model:

| Operation | ~120 vectors | ~430 vectors |
|---|---|---|
| **load vector blob @SHA** (warehouse read) | **6.9 ms** | **14.7 ms** |
| semantic query (embed + search) | 7.9 ms | 15.0 ms |
| re-index a never-pushed 1-line change | only the touched chunk re-embeds | only the touched chunk re-embeds |
Embedding throughput (same f32 model, one `Embedder` interface, two backends):

| Backend | | | Speedup vs. `tract` CPU |
|---|---|---|---|
| `tract` CPU (pure Rust) | ~1900 ms | 147 s | 1× |
| `ort` CPU (ONNX Runtime) | ~337 ms | 41 s | ~5.6× |
| **`ort` CUDA (RTX 4090)** | **13 ms** | **1.6 s** | **~110×** |
Loading a blob is GPU-independent (it's a columnar read); the GPU only
accelerates embedding. Unchanged chunks are never re-embedded — a one-line edit
at a fresh SHA re-embeds a single chunk and still loads in the same ~7/15 ms.
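One common way to get "unchanged chunks are never re-embedded" is to cache vectors by a hash of the chunk's content. The sketch below shows that technique in isolation. It is an assumption about the general approach, not a description of nornir's code: the `Embedder` trait shape, `VectorCache`, and the toy `CountingEmbedder` are all hypothetical, and a production system would likely use a stronger content hash than `DefaultHasher`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for an embedding backend (assumed shape, not nornir's trait).
pub trait Embedder {
    fn embed(&mut self, chunk: &str) -> Vec<f32>;
}

/// Toy backend that counts how often it is called.
#[derive(Default)]
pub struct CountingEmbedder {
    pub calls: usize,
}

impl Embedder for CountingEmbedder {
    fn embed(&mut self, chunk: &str) -> Vec<f32> {
        self.calls += 1;
        vec![chunk.len() as f32] // placeholder "embedding"
    }
}

/// Content hash used as the cache key for a chunk.
fn chunk_key(chunk: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    chunk.hash(&mut hasher);
    hasher.finish()
}

/// Vectors already materialized, keyed by chunk content hash.
#[derive(Default)]
pub struct VectorCache {
    by_hash: HashMap<u64, Vec<f32>>,
}

impl VectorCache {
    /// Return one vector per chunk, calling the embedder only for chunks
    /// whose content has not been seen before.
    pub fn index<E: Embedder>(&mut self, chunks: &[&str], embedder: &mut E) -> Vec<Vec<f32>> {
        let mut vectors = Vec::with_capacity(chunks.len());
        for chunk in chunks {
            let key = chunk_key(chunk);
            if !self.by_hash.contains_key(&key) {
                self.by_hash.insert(key, embedder.embed(chunk));
            }
            vectors.push(self.by_hash[&key].clone());
        }
        vectors
    }
}
```

Indexing the same chunks a second time with one chunk edited triggers exactly one additional `embed` call; every other vector is served from the cache.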
