llama-rs 0.16.0

A high-performance Rust implementation of llama.cpp: an LLM inference engine with full GGUF support.