gradatum-engine 0.0.2

On-device inference adapter: provides Chat, Embedder and Reranker trait implementations backed by a shared local compute stack (candle / llama.cpp); optional via feature gate engine-local

gradatum-engine

On-device inference adapter: provides Chat, Embedder, and Reranker trait implementations backed by candle or llama.cpp. Optional, enabled through the engine-local feature gate (disabled by default).

Status : Alpha, placeholder v0.0.2. Phase 2.0c-bis Auth Path 2 went live on 2026-05-07 (git tag v0.1.0-alpha.5). The source code stays private until the v1.0 public release, per the D5 criterion. See gradatum.org.

Part of gradatum — Memory backbone for AI agents.

Public API (Phase 1+)

Feature flags

| Feature | Description | Default |
|---|---|---|
| engine-local | Enables local inference backends (candle / llama.cpp) | disabled |
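As a rough illustration of what a disabled-by-default feature gate means in practice, the sketch below uses Rust conditional compilation keyed on the engine-local feature. The function name compiled_backends is hypothetical and is not part of the crate's published API. How the crate gates its backends internally is an assumption, since the source is private.

```rust
/// Hypothetical helper: lists the local backends compiled into this build.
/// Only the feature name `engine-local` comes from the README; the function
/// itself is an illustration, not the crate's API.
#[allow(unused_mut)]
pub fn compiled_backends() -> Vec<&'static str> {
    let mut backends = Vec::new();

    // This block is compiled only when the crate is built with
    // `--features engine-local`; otherwise the list stays empty.
    #[cfg(feature = "engine-local")]
    {
        backends.push("candle");
        backends.push("llama.cpp");
    }

    backends
}
```

With the feature left at its default (disabled), the function returns an empty list. A downstream crate would opt in through the usual Cargo features mechanism.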

Implementations (feature = engine-local)

/// Chat implementation backed by local llama.cpp model.
pub struct LocalChat { ... }

/// Embedder implementation backed by local candle model (bge-small-en-v1.5).
pub struct LocalEmbedder { ... }

/// Reranker implementation backed by local cross-encoder model.
pub struct LocalReranker { ... }
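The struct bodies above are elided, and the trait signatures are not shown in this README. To show the general shape of "a struct implementing a shared trait", here is a minimal sketch. The trait methods, ToyEmbedder and ToyReranker are all hypothetical stand-ins. The toy embedder hashes bytes into a vector instead of running a model, and the toy reranker scores by embedding similarity, which is not how a real cross-encoder works.

```rust
// Hypothetical trait shapes. The real Chat / Embedder / Reranker signatures
// are not published in this README, so these are assumptions.
pub trait Embedder {
    fn embed(&self, text: &str) -> Vec<f32>;
}

pub trait Reranker {
    /// Returns (document index, score) pairs, best match first.
    fn rerank(&self, query: &str, docs: &[&str]) -> Vec<(usize, f32)>;
}

/// Toy stand-in for a local embedder: folds bytes into `dim` buckets
/// and L2-normalizes the result. No model is involved.
pub struct ToyEmbedder {
    pub dim: usize,
}

impl Embedder for ToyEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        let mut v = vec![0.0f32; self.dim];
        for (i, b) in text.bytes().enumerate() {
            v[i % self.dim] += b as f32;
        }
        let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            for x in &mut v {
                *x /= norm;
            }
        }
        v
    }
}

/// Toy stand-in for a reranker: scores each document by the dot product
/// of its normalized embedding with the query embedding.
pub struct ToyReranker<E: Embedder> {
    pub embedder: E,
}

impl<E: Embedder> Reranker for ToyReranker<E> {
    fn rerank(&self, query: &str, docs: &[&str]) -> Vec<(usize, f32)> {
        let q = self.embedder.embed(query);
        let mut scored: Vec<(usize, f32)> = docs
            .iter()
            .enumerate()
            .map(|(i, d)| {
                let e = self.embedder.embed(d);
                let score = q.iter().zip(&e).map(|(a, b)| a * b).sum::<f32>();
                (i, score)
            })
            .collect();
        // Sort by descending score.
        scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
        scored
    }
}
```

Callers written against the traits would not need to change if the toy types were swapped for model-backed ones, which is the point of exposing implementations behind shared traits.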

Anti-cycle invariant

gradatum-engine MAY depend on gradatum-chat and gradatum-embed. gradatum-chat and gradatum-embed MUST NOT depend on gradatum-engine. Composition happens at binary level (gradatum-server, gradatum-worker).
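The dependency direction described above can be sketched in a single file, with one module standing in for each crate. Everything inside the modules (the trait method, StubLocalEmbedder, build_embedder) is hypothetical. Only the direction of the `use` statements is meant to mirror the invariant.

```rust
mod embed {
    // Stand-in for gradatum-embed: defines the trait and imports nothing
    // from the engine, so no cycle can form.
    pub trait Embedder {
        fn embed(&self, text: &str) -> Vec<f32>;
    }
}

mod engine {
    // Stand-in for gradatum-engine: depends on `embed`, never the reverse.
    use super::embed::Embedder;

    pub struct StubLocalEmbedder;

    impl Embedder for StubLocalEmbedder {
        fn embed(&self, text: &str) -> Vec<f32> {
            // Placeholder output; a real backend would run a model here.
            vec![text.len() as f32]
        }
    }
}

mod server {
    // Stand-in for a binary crate such as gradatum-server: the only place
    // that sees both the trait crate and the engine crate.
    use super::embed::Embedder;
    use super::engine::StubLocalEmbedder;

    pub fn build_embedder() -> Box<dyn Embedder> {
        Box::new(StubLocalEmbedder)
    }
}
```

Because the binary hands out a `Box<dyn Embedder>`, code that consumes embeddings depends only on the trait crate and does not need to know whether the engine was compiled in.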

Documentation

  • Project : https://gradatum.org
  • Source : private until v1.0
  • Roadmap : Phase 1+ (candle on CPU benchmarked at 17 ms per embedding on Ryzen AI)
  • License : Apache-2.0