llama-runtime 0.1.0

Execution runtime for llama.rs — oxidizedMLX integration and backend selection

llama-runtime

Runtime execution and verification helpers for llama.rs.

This crate includes:

  • A MockEngine demonstrating the narrow-waist LlamaEngine trait
  • A Phase-1 verification harness for LLAMA-006 that compares the logits from full_forward(prompt) against the logits from prefill(prompt[:-1]) followed by decode(last_token).
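
The LLAMA-006 check described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the `Engine` trait, its method signatures, `ToyEngine`, `max_abs_diff`, and `check_prefill_decode` are all hypothetical stand-ins, and the real `LlamaEngine` trait and `MockEngine` may look quite different. The idea is only that running the whole prompt in one pass should give the same final-position logits as prefilling all but the last token and then decoding that token.

```rust
/// Hypothetical stand-in for the crate's engine trait; the real
/// `LlamaEngine` signatures may differ.
trait Engine {
    /// Process the whole prompt at once and return logits for the last position.
    fn full_forward(&mut self, prompt: &[u32]) -> Vec<f32>;
    /// Process a prefix and keep internal state (for example a KV cache).
    fn prefill(&mut self, prefix: &[u32]);
    /// Process one more token on top of the stored state and return its logits.
    fn decode(&mut self, token: u32) -> Vec<f32>;
}

/// Toy deterministic engine (an assumption for illustration only):
/// its "cache" is a running hash of the tokens seen so far.
struct ToyEngine {
    state: u64,
    vocab: usize,
}

impl ToyEngine {
    fn new(vocab: usize) -> Self {
        Self { state: 0, vocab }
    }

    /// Fold one token into the running state.
    fn absorb(&mut self, token: u32) {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(token as u64 + 1);
    }

    /// Derive pseudo-logits in [0, 1) from the current state.
    fn logits(&self) -> Vec<f32> {
        (0..self.vocab)
            .map(|i| {
                let h = self
                    .state
                    .wrapping_add(i as u64)
                    .wrapping_mul(0x9E3779B97F4A7C15);
                (h >> 40) as f32 / (1u64 << 24) as f32
            })
            .collect()
    }
}

impl Engine for ToyEngine {
    fn full_forward(&mut self, prompt: &[u32]) -> Vec<f32> {
        self.state = 0;
        for &t in prompt {
            self.absorb(t);
        }
        self.logits()
    }

    fn prefill(&mut self, prefix: &[u32]) {
        self.state = 0;
        for &t in prefix {
            self.absorb(t);
        }
    }

    fn decode(&mut self, token: u32) -> Vec<f32> {
        self.absorb(token);
        self.logits()
    }
}

/// Largest element-wise absolute difference between two logit vectors.
fn max_abs_diff(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}

/// Compare full_forward(prompt) with prefill(prompt[:-1]) + decode(last_token).
/// Returns the observed max difference if it is within `tol`.
fn check_prefill_decode<E: Engine>(
    engine: &mut E,
    prompt: &[u32],
    tol: f32,
) -> Result<f32, String> {
    let (last, prefix) = prompt
        .split_last()
        .ok_or_else(|| "prompt must be non-empty".to_string())?;
    let full = engine.full_forward(prompt);
    engine.prefill(prefix);
    let stepped = engine.decode(*last);
    if full.len() != stepped.len() {
        return Err(format!("length mismatch: {} vs {}", full.len(), stepped.len()));
    }
    let diff = max_abs_diff(&full, &stepped);
    if diff <= tol {
        Ok(diff)
    } else {
        Err(format!("max |delta logit| = {diff} exceeds tolerance {tol}"))
    }
}
```

Calling `check_prefill_decode(&mut ToyEngine::new(8), &[1, 2, 3, 4], 1e-6)` exercises both paths on the toy engine. With a real backend the two paths usually differ by small floating-point error, so the tolerance would need to be chosen per backend and dtype.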