# gradatum-engine

> On-device inference adapter: provides `Chat`, `Embedder`, and `Reranker` trait implementations backed by candle or llama.cpp. Optional via `engine-local` feature gate.

**Status:** Alpha. The published `v0.0.2` is a placeholder. Phase 2.0c-bis (Auth Path 2) went live on 2026-05-07 (git tag `v0.1.0-alpha.5`). Source code remains private until the public `v1.0` release, per criterion D5. See [gradatum.org](https://gradatum.org).

**Part of [`gradatum`](https://crates.io/crates/gradatum)**, a memory backbone for AI agents.

## Public API (Phase 1+)

### Feature flags

| Feature | Description | Default |
|---|---|---|
| `engine-local` | Enables local inference backends (candle / llama.cpp) | disabled |
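Because the feature is off by default, a consumer has to opt in explicitly. A minimal sketch of the manifest entry, assuming a release that actually ships the `engine-local` feature (the version shown is only the current placeholder):

```toml
[dependencies]
# Assumption: replace the placeholder version with a release that exposes the feature.
gradatum-engine = { version = "0.0.2", features = ["engine-local"] }
```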

### Implementations (feature = `engine-local`)

```rust
/// Chat implementation backed by a local llama.cpp model.
pub struct LocalChat { /* private fields */ }

/// Embedder implementation backed by a local candle model (bge-small-en-v1.5).
pub struct LocalEmbedder { /* private fields */ }

/// Reranker implementation backed by a local cross-encoder model.
pub struct LocalReranker { /* private fields */ }
```
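The feature gate could be wired up roughly as follows. This is a sketch only: the `Embedder` trait signature, the unit-struct `LocalEmbedder`, and the `local_embedder` helper are hypothetical stand-ins, since the real API is not published yet, and the stub body does no actual inference.

```rust
/// Hypothetical stand-in for the real `Embedder` trait (signature assumed).
pub trait Embedder {
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Compiled only when the `engine-local` feature is enabled.
#[cfg(feature = "engine-local")]
pub struct LocalEmbedder;

#[cfg(feature = "engine-local")]
impl Embedder for LocalEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        // Stub: a real backend would run a model here.
        vec![text.len() as f32]
    }
}

/// Hypothetical helper: returns a local embedder when the feature is on.
#[cfg(feature = "engine-local")]
pub fn local_embedder() -> Option<Box<dyn Embedder>> {
    Some(Box::new(LocalEmbedder))
}

/// Fallback when the feature is off: no local backend is compiled in.
#[cfg(not(feature = "engine-local"))]
pub fn local_embedder() -> Option<Box<dyn Embedder>> {
    None
}
```

With this shape, callers compile in both configurations and can fall back to a remote backend when `local_embedder()` returns `None`.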

## Anti-cycle invariant

- `gradatum-engine` MAY depend on `gradatum-chat` and `gradatum-embed`.
- `gradatum-chat` and `gradatum-embed` MUST NOT depend on `gradatum-engine`.
- Composition happens at the binary level (`gradatum-server`, `gradatum-worker`).
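The dependency direction can be illustrated in a single file, with modules standing in for crates. Everything here is an assumption made for illustration: the trait signatures, the stub bodies, and the `Services`, `compose`, and `demo` names are hypothetical and not the real API.

```rust
/// Stands in for `gradatum-chat`: defines the trait, knows nothing about engines.
pub mod chat {
    pub trait Chat {
        fn complete(&self, prompt: &str) -> String;
    }
}

/// Stands in for `gradatum-embed`: likewise engine-agnostic.
pub mod embed {
    pub trait Embedder {
        fn embed(&self, text: &str) -> Vec<f32>;
    }
}

/// Stands in for `gradatum-engine`: depends on the trait modules, never the reverse.
pub mod engine {
    use super::chat::Chat;
    use super::embed::Embedder;

    pub struct LocalChat;
    pub struct LocalEmbedder;

    impl Chat for LocalChat {
        fn complete(&self, prompt: &str) -> String {
            // Stub reply; a real backend would call a local model.
            format!("stub reply to: {}", prompt)
        }
    }

    impl Embedder for LocalEmbedder {
        fn embed(&self, text: &str) -> Vec<f32> {
            // Stub vector: one scaled value per input byte.
            text.bytes().map(|b| f32::from(b) / 255.0).collect()
        }
    }
}

/// Stands in for a binary such as `gradatum-server`: the only place
/// where traits and concrete engine types meet.
pub mod server {
    use super::chat::Chat;
    use super::embed::Embedder;
    use super::engine::{LocalChat, LocalEmbedder};

    pub struct Services {
        pub chat: Box<dyn Chat>,
        pub embedder: Box<dyn Embedder>,
    }

    pub fn compose() -> Services {
        Services {
            chat: Box::new(LocalChat),
            embedder: Box::new(LocalEmbedder),
        }
    }

    /// Exercises the composed services once.
    pub fn demo() -> (String, usize) {
        let services = compose();
        (
            services.chat.complete("hi"),
            services.embedder.embed("abc").len(),
        )
    }
}
```

Only `server` names the concrete types, so swapping `engine` for a different backend would leave `chat` and `embed` untouched.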

## Documentation

- Project: <https://gradatum.org>
- Source: private until v1.0
- Roadmap: Phase 1+ (candle on CPU benchmarked at 17 ms per embedding on Ryzen AI)
- License: Apache-2.0