# Design Diagrams

Diagrams produced during the design process. They use the NotarAI name and the `.notarai/` directory convention throughout.

---

## 1. The Problem: Pre-LLM vs Current LLM Era

### 1a. Pre-LLM: Code Is the Spec

```mermaid
flowchart LR
    Dev["Developer<br/>(intent in head)"]
    Code["Source Code<br/>authoritative spec"]
    Docs["Docs<br/>second-class, often stale"]

    Dev -->|writes| Code
    Code -.->|describes| Docs
```

### 1b. Current LLM Era: The Three-Body Problem

```mermaid
flowchart TD
    Intent["User Intent<br/>natural language prompt"]
    LLM["LLM"]
    Code["Source Code"]
    Docs["Documentation"]

    Intent --> LLM
    Intent -.->|"edits directly"| Code
    Intent -.->|"edits directly"| Docs
    LLM -->|generates| Code
    LLM -->|generates| Docs
    Code <-..->|"drift / desync"| Docs
```

---

## 2. NotarAI: Spec State File as Single Source of Truth

```mermaid
flowchart TD
    Intent["User Intent<br/>natural language"]
    Spec["NotarAI Spec<br/>structured intent representation<br/>canonical source of truth"]
    LLM["LLM (sync engine)"]
    Code["Source Code"]
    Docs["Documentation"]

    Intent -->|updates| Spec
    Spec -->|read by| LLM
    LLM -->|derives| Code
    LLM -->|derives| Docs
    Code -.->|reconcile back| Spec
    Docs -.->|reconcile back| Spec
    Code <-.->|"always in sync via spec"| Docs
```

---

## 3. Spec File Anatomy

### 3a. Required Core

```yaml
# .notarai/auth.spec.yaml
schema_version: '0.6'

intent: |
  Users can sign up, log in, and
  reset passwords. Sessions expire
  after 30 min of inactivity.

behaviors:
  - name: 'signup'
    given: 'valid email + password'
    then: 'account created, welcome email sent'
  - name: 'session_timeout'
    given: '30 min inactivity'
    then: 'session invalidated'

artifacts:
  code:
    - path: 'src/auth/**'
  docs:
    - path: 'docs/auth.md'
```
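
As a rough illustration of what checking the required core could involve, here is a minimal Python sketch. It assumes the spec has already been parsed into a dict, and that the four top-level keys in the example above are the required ones. The function name `validate_core` and its rules are hypothetical, not the actual validator's behavior.

```python
# Hypothetical sketch: field names mirror the example spec above, but these
# checks are assumptions for illustration, not NotarAI's actual schema rules.
REQUIRED_FIELDS = ("schema_version", "intent", "behaviors", "artifacts")
BEHAVIOR_KEYS = ("name", "given", "then")


def validate_core(spec: dict) -> list[str]:
    """Return human-readable problems; an empty list means the core is complete."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in spec
    ]
    for index, behavior in enumerate(spec.get("behaviors", [])):
        for key in BEHAVIOR_KEYS:
            if not behavior.get(key):
                problems.append(f"behaviors[{index}] is missing '{key}'")
    return problems


# The auth spec from above, already parsed into a dict (YAML loading omitted).
auth_spec = {
    "schema_version": "0.6",
    "intent": "Users can sign up, log in, and reset passwords.",
    "behaviors": [
        {"name": "signup", "given": "valid email + password",
         "then": "account created, welcome email sent"},
        {"name": "session_timeout", "given": "30 min inactivity",
         "then": "session invalidated"},
    ],
    "artifacts": {
        "code": [{"path": "src/auth/**"}],
        "docs": [{"path": "docs/auth.md"}],
    },
}

print(validate_core(auth_spec))  # []
```

Returning a list of problems rather than raising on the first one lets a caller report everything wrong with a spec in a single pass.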

### 3b. Optional Extensions

```yaml
# Power users add precision as needed

constraints:
  - 'passwords >= 12 chars'
  - 'rate limit: 5 login attempts / min'

invariants:
  - 'no plaintext passwords in DB'
  - 'all endpoints require HTTPS'

decisions:
  - date: '2025-03-12'
    choice: 'JWT over session cookies'
    rationale: 'stateless scaling'

open_questions:
  - 'Should we support OAuth2 providers?'
  - 'MFA timeline?'
```

> **Design note:** The `behaviors` field uses Given/Then language (BDD-adjacent) but stays in natural language -- not formal Gherkin. It is structured enough to diff and validate, yet informal enough for non-engineers to author.
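
To make "structured enough to diff" concrete, the sketch below compares the `behaviors` lists of two spec versions by name. It is an illustration under assumptions: `diff_behaviors` is a hypothetical helper, and the `oauth_login` entry is invented for the example.

```python
def diff_behaviors(old: list[dict], new: list[dict]) -> dict[str, list[str]]:
    """Compare two behaviors lists, keyed by each behavior's name."""
    old_by_name = {b["name"]: b for b in old}
    new_by_name = {b["name"]: b for b in new}
    shared = old_by_name.keys() & new_by_name.keys()
    return {
        "added": sorted(new_by_name.keys() - old_by_name.keys()),
        "removed": sorted(old_by_name.keys() - new_by_name.keys()),
        "changed": sorted(n for n in shared if old_by_name[n] != new_by_name[n]),
    }


before = [
    {"name": "signup", "given": "valid email + password",
     "then": "account created, welcome email sent"},
    {"name": "session_timeout", "given": "30 min inactivity",
     "then": "session invalidated"},
]
after = [
    {"name": "signup", "given": "valid email + password",
     "then": "account created, welcome email sent"},
    # Timeout changed from 30 to 60 minutes.
    {"name": "session_timeout", "given": "60 min inactivity",
     "then": "session invalidated"},
    # Invented behavior, for illustration only.
    {"name": "oauth_login", "given": "valid OAuth token",
     "then": "session created"},
]

print(diff_behaviors(before, after))
# {'added': ['oauth_login'], 'removed': [], 'changed': ['session_timeout']}
```

Because each behavior carries a stable `name`, a change to its `given` or `then` text shows up as a modification rather than as an unrelated add and remove.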

---

## 4. Reconciliation Lifecycle

### 4a. Scenario A: Human Edits Code

```mermaid
flowchart LR
    A1["Human edits code<br/>adds OAuth endpoint"]
    A2["LLM detects drift<br/>code != spec behaviors"]
    A3["LLM proposes spec update<br/>+ add behavior: oauth_login<br/>+ update docs/auth.md"]
    A4["Human approves<br/>or adjusts and approves"]

    A1 -->|trigger| A2
    A2 -->|reconcile| A3
    A3 -->|resolve| A4
```

### 4b. Scenario B: Human Edits Spec

```mermaid
flowchart LR
    B1["Human edits spec<br/>changes session to 60 min"]
    B2["LLM updates code to match"]
    B3["LLM updates docs to match"]
    B4["Human reviews<br/>code + docs diff<br/>as a single PR"]

    B1 -->|direct| B2
    B1 -->|direct| B3
    B2 --> B4
    B3 --> B4
```

### 4c. Scenario C: Conflict Detected

```mermaid
flowchart LR
    C1["Conflict detected<br/>code says X, spec says Y<br/>docs say Z"]
    C2["LLM presents options<br/>code says X, but spec<br/>says Y -- which is right?"]
    C3["Human decides intent<br/>LLM propagates decision<br/>across spec + code + docs"]
    C4["All three aligned<br/>conflict resolved"]

    C1 -->|detect| C2
    C2 -->|reconcile| C3
    C3 -->|resolve| C4
```

---

## 5. Post-Push Reconciliation in Practice

```mermaid
flowchart LR
    S1["Dev + LLM<br/>write code freely<br/>no spec friction"]
    S2["git push<br/>or open PR"]
    S3["CI hook: LLM reviews<br/>diff vs affected specs<br/>proposes spec updates<br/>proposes doc updates"]
    S4["Adds to PR<br/>spec diff + docs diff<br/>alongside code diff"]
    S5["Single review<br/>code + spec + docs<br/>all land together or not"]

    S1 --> S2 --> S3 --> S4 --> S5
```

> The `artifacts` field in the spec tells the CI hook which specs are affected by which file paths -- so it only reconciles what changed.
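
One way the path-to-spec lookup might work is sketched below: match each changed file against every spec's `artifacts` globs. The helper names are hypothetical, and using Python's `fnmatch` for glob matching is an assumption; the real CI hook may use different glob semantics.

```python
from fnmatch import fnmatchcase


def artifact_patterns(spec: dict) -> list[str]:
    """Collect every glob listed under the spec's artifacts (code, docs, ...)."""
    return [
        entry["path"]
        for entries in spec.get("artifacts", {}).values()
        for entry in entries
    ]


def affected_specs(changed_files: list[str], specs: dict[str, dict]) -> list[str]:
    """Return the spec files whose artifact globs match at least one changed path."""
    return sorted(
        spec_path
        for spec_path, spec in specs.items()
        if any(
            fnmatchcase(changed, pattern)
            for changed in changed_files
            for pattern in artifact_patterns(spec)
        )
    )


# Parsed specs keyed by file path; the billing entry is invented for illustration.
specs = {
    ".notarai/auth.spec.yaml": {
        "artifacts": {
            "code": [{"path": "src/auth/**"}],
            "docs": [{"path": "docs/auth.md"}],
        }
    },
    ".notarai/billing.spec.yaml": {
        "artifacts": {"code": [{"path": "src/billing/**"}]}
    },
}

print(affected_specs(["src/auth/session/timeout.py"], specs))
# ['.notarai/auth.spec.yaml']
```

Note that in `fnmatch` a `*` also matches `/`, so `src/auth/**` covers nested directories here. A stricter glob implementation would treat `*` and `**` differently.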

---

## 6. Spec Composition -- The Import Model

### 6a. Directory Structure

```
project/
+-- .notarai/
|   +-- system.spec.yaml          # top-level system spec
|   +-- auth.spec.yaml            # auth service (Tier 1)
|   +-- billing.spec.yaml         # billing service (Tier 1)
|   +-- api.spec.yaml             # API layer (Tier 1)
|   +-- utils.spec.yaml           # shared utilities (Tier 2)
|   +-- redis-cache.spec.yaml     # sidecar process (Tier 2)
|   +-- _shared/
|       +-- security.spec.yaml    # cross-cutting
|       +-- logging.spec.yaml     # cross-cutting
+-- src/
+-- docs/
```

### 6b. Composition Relationships

```mermaid
flowchart TD
    System["system.spec.yaml<br/>top-level intent + invariants"]

    Auth[".notarai/auth.spec.yaml"]
    Billing[".notarai/billing.spec.yaml"]
    API[".notarai/api.spec.yaml"]

    Security["_shared/security.spec.yaml<br/>applies to: all subsystems"]
    Logging["_shared/logging.spec.yaml<br/>applies to: all subsystems"]

    System -->|"$ref"| Auth
    System -->|"$ref"| Billing
    System -->|"$ref"| API

    Security -.->|applies| Auth
    Security -.->|applies| Billing
    Security -.->|applies| API
    Logging -.->|applies| Auth
    Logging -.->|applies| Billing
    Logging -.->|applies| API
```

> When the LLM checks `auth.spec.yaml`, it also loads `security.spec.yaml` and validates that auth code satisfies **both** specs' invariants. Cross-cutting concerns are defined once and enforced everywhere.
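
The "define once, enforce everywhere" idea can be sketched as merging a subsystem's own invariants with those of every shared spec that applies to it. This is a simplified illustration: `effective_invariants` is a hypothetical helper, and the invariant strings placed in the shared specs are invented.

```python
def effective_invariants(spec: dict, shared_specs: list[dict]) -> list[str]:
    """Merge a spec's invariants with those of its shared specs, dropping duplicates."""
    combined: list[str] = []
    for source in [spec, *shared_specs]:
        for invariant in source.get("invariants", []):
            if invariant not in combined:
                combined.append(invariant)
    return combined


auth = {"invariants": ["no plaintext passwords in DB"]}
# Contents of the shared specs are assumptions for this example.
security = {"invariants": ["all endpoints require HTTPS",
                           "no plaintext passwords in DB"]}
logging_spec = {"invariants": ["no secrets in log output"]}

for invariant in effective_invariants(auth, [security, logging_spec]):
    print(invariant)
# no plaintext passwords in DB
# all endpoints require HTTPS
# no secrets in log output
```

The subsystem's own invariants come first, followed by shared ones in the order the shared specs are listed, so the resulting checklist is deterministic.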

---

## 7. Coverage Model -- Three Tiers

```mermaid
flowchart LR
    subgraph T1["Tier 1: Full Spec"]
        T1a["Business logic services"]
        T1b["API endpoints"]
        T1c["Data models / schemas"]
        T1d["Anything user-facing"]
    end

    subgraph T2["Tier 2: Registered"]
        T2a["Utility libraries"]
        T2b["Shared helpers / constants"]
        T2c["Config files"]
        T2d["Sidecar processes"]
    end

    subgraph T3["Tier 3: Excluded"]
        T3a["Generated code / build output"]
        T3b["Vendored dependencies"]
        T3c["IDE / editor configs"]
        T3d["node_modules, .git, etc."]
    end
```

**Coverage equation:** `Tier 1 + Tier 2 + Tier 3 = entire repo`

Any file that falls outside all three tiers is **unspecced**, which is reported as a lint warning rather than a blocking error.
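
A coverage check along these lines could be sketched as follows. The way tiers are declared here (a dict of glob lists) is an assumption made for the example, as are the `classify` and `lint` names and the sample paths.

```python
from fnmatch import fnmatchcase

# Hypothetical tier declarations; the real configuration format may differ.
TIERS = {
    "tier1": ["src/auth/**", "src/billing/**"],   # full spec
    "tier2": ["src/utils/**"],                    # registered
    "tier3": ["node_modules/**", "dist/**"],      # excluded
}


def classify(path: str, tiers: dict[str, list[str]]) -> str:
    """Return the first tier whose globs match the path, else 'unspecced'."""
    for tier in ("tier1", "tier2", "tier3"):
        if any(fnmatchcase(path, pattern) for pattern in tiers.get(tier, [])):
            return tier
    return "unspecced"


def lint(paths: list[str], tiers: dict[str, list[str]]) -> list[str]:
    """Produce one warning per unspecced path; warnings never fail the run."""
    return [
        f"warning: {path} is unspecced"
        for path in paths
        if classify(path, tiers) == "unspecced"
    ]


repo_files = ["src/auth/login.py", "node_modules/x/index.js", "scripts/deploy.sh"]
print(lint(repo_files, TIERS))
# ['warning: scripts/deploy.sh is unspecced']
```

Checking the tiers in a fixed order means a path matched by more than one tier is assigned to the highest-coverage tier that claims it.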

---

## 8. Bootstrap Flow for Existing Codebases

```mermaid
flowchart LR
    S1["1. Ingest<br/>code + docs +<br/>commit history +<br/>README / ADRs"]
    S2["2. LLM interviews<br/>What's the goal?<br/>Any undocumented rules?"]
    S3["3. Draft spec<br/>required fields only<br/>intent + behaviors +<br/>artifact mappings"]
    S4["4. Human review<br/>correct, enrich,<br/>add constraints /<br/>open questions"]
    S5["5. Activate<br/>sync engine<br/>watches for drift<br/>from this point on"]

    S1 --> S2 --> S3 --> S4 --> S5
```

> Bootstrap starts minimal and accrues precision over time -- the spec is a living document.