harn-hostlib 0.7.38

Opt-in code-intelligence and deterministic-tool host builtins for the Harn VM
Documentation
# harn-hostlib

Opt-in host builtins for the Harn VM that provide:

1. **Code intelligence** — tree-sitter–backed parsing, deterministic
   trigram/word indexing, and project-wide repo scanning. Ports the Swift
   `Sources/ASTEngine/`, `Sources/BurinCodeIndex/`, and
   `Sources/BurinCore/Scanner/` surface from
   [`burin-labs/burin-code`]https://github.com/burin-labs/burin-code.
2. **Deterministic tools** — content search (`grep-searcher` + `ignore`),
   file I/O, directory listing, file outline, git inspection (`gix`), file
   watching (`notify`), and process lifecycle (`run_command`, `run_test`,
   `run_build_command`, `inspect_test_results`, `manage_packages`). Ports
   the Swift `CoreToolExecutor` surface so calls no longer have to bounce
   Harn → Swift → Harn.

## Status

[#563](https://github.com/burin-labs/harn/issues/563) introduced the
scaffold (every method routed through `HostlibError::Unimplemented`).
[#567](https://github.com/burin-labs/harn/issues/567) lights up the
deterministic-tool surface: `search`, `read_file`, `write_file`,
`delete_file`, `list_directory`, `get_file_outline`, and `git`.

| Issue | Module | What lands | Status |
|-------|--------|-----------|--------|
| B1 (#563) | scaffold       | crate + schemas + registration plumbing                                                   | ✅ shipped |
| B2    | `ast/`             | `parse_file`, `symbols`, `outline`                                                        | unimplemented |
| B3    | `code_index/`      | `query`, `rebuild`, `stats`, `imports_for`, `importers_of`                                | unimplemented |
| B4    | `scanner/`         | `scan_project`, `scan_incremental`                                                        | unimplemented |
| C1    | `fs_watch/`        | `subscribe`, `unsubscribe`                                                                | unimplemented |
| #567  | `tools/` (read & search) | `search`, `read_file`, `list_directory`, `get_file_outline`, `git`                 | ✅ shipped (this issue) |
| #567  | `tools/` (mutating)      | `write_file`, `delete_file`                                                        | ✅ shipped (this issue) |
| C2    | `tools/` (process)       | `run_command`, `run_test`, `run_build_command`, `inspect_test_results`, `manage_packages` | unimplemented |

## Why a separate crate?

`harn-vm` powers Harn pipelines that have nothing to do with editing host
code. Pulling tree-sitter grammars, ripgrep, and `notify` into the VM
crate would balloon its compile time and binary size for every embedder
that doesn't index host source. `harn-hostlib` is **opt-in**: nothing
inside `harn-vm` knows the crate exists. Embedders that want the surface
ask for it.

Conversely, the work that *does* belong in `harn-vm` — orchestration,
transcript lifecycle, replay/eval, mutation session audit metadata —
stays there. See
[`AGENTS.md`](../../CLAUDE.md#trust-boundary) for the canonical trust
boundary.

## Per-session opt-in for deterministic tools

The deterministic-tool surface (`tools/{search, read_file, write_file,
delete_file, list_directory, get_file_outline, git}`) is **gated**.
`install_default` registers the contract for every method, but the
handlers refuse to run until the pipeline opts in by calling

```text
hostlib_enable("tools:deterministic")
```

(a builtin registered alongside the rest of the `tools/` surface). This
matches the safety story called out in
[#567](https://github.com/burin-labs/harn/issues/567): a Harn script that
hasn't asked for filesystem / git / search access cannot get it even
though the contract is wired in. The opt-in is per-thread, so each VM
gets an independent enable set.

Embedders that want to enable the surface from Rust without going through
the builtin can use [`tools::permissions::enable_for_test`] (test-only)
or call `tools::permissions::enable("tools:deterministic")` directly.

## How embedders consume it

The `harn-cli` ACP server wires hostlib in by default:

```rust
let mut vm = harn_vm::Vm::new();
let _registry = harn_hostlib::install_default(&mut vm);
```

`install_default` registers every shipped capability and returns a
`HostlibRegistry` that can be introspected (e.g. for
`burin-code`'s schema-drift tests) without mutating the VM further.

Pick-and-choose embedders that only want a subset of modules can build a
custom registry:

```rust
let mut registry = harn_hostlib::HostlibRegistry::new()
    .with(harn_hostlib::tools::ToolsCapability::default())
    .with(harn_hostlib::ast::AstCapability::default());
registry.register_into_vm(&mut vm);
```

The cargo feature `hostlib` on `harn-cli` is **default-on**. Embedders
can disable it with `--no-default-features` for a slimmer build that
omits the tree-sitter/notify/gix dependency tree entirely.

## How `burin-code` consumes it

`burin-code` pulls hostlib in transitively via the harn release pinned in
its `.harn-version` manifest. After this scaffold lands, the parent epic
ships:

1. A harn release bumping the version in this repo (per
   [`scripts/release_ship.sh`]../../scripts/release_ship.sh).
2. A burin-code PR bumping `.harn-version` to that release.
3. burin-code progressively retires its Swift-side `BurinCore`
   counterparts as each implementation issue lands here.

The schemas under `schemas/<module>/<method>.{request,response}.json` are
the **source of truth** for burin-code's schema-drift tests. They ship
with the published crate (see the `include` field in `Cargo.toml`) and
are also mirrored at compile time via `include_str!` into
[`schemas.rs`](src/schemas.rs) so embedders can fetch them
programmatically without locating the on-disk schema directory.

## Directory layout

```text
crates/harn-hostlib/
├── Cargo.toml
├── README.md                  # this file
├── schemas/                   # JSON Schema 2020-12 contracts
│   ├── ast/
│   ├── code_index/
│   ├── scanner/
│   ├── fs_watch/
│   └── tools/
├── src/
│   ├── lib.rs                 # public surface + install_default
│   ├── error.rs               # HostlibError → VmError translation
│   ├── registry.rs            # HostlibCapability + HostlibRegistry
│   ├── schemas.rs             # const SCHEMAS catalog (include_str!)
│   ├── ast/
│   ├── code_index/
│   ├── scanner/
│   ├── fs_watch/
│   └── tools/
└── tests/
    └── registration.rs        # registration + schema parity tests
```

## Adding a new method

1. Add a `register_unimplemented(...)` entry in the relevant module's
   `register_builtins`.
2. Drop `<method>.request.json` and `<method>.response.json` into
   `schemas/<module>/`.
3. Append two `include_str!` entries to `SCHEMAS` in `src/schemas.rs`.
4. Add the method name to the `assert_eq!` list in `tests/registration.rs`.

The integration tests catch any drift between the four locations.