# AutoAgents llama.cpp Backend
A local LLM inference backend for AutoAgents, built on the `llama-cpp-2` Rust bindings for llama.cpp.
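
Loading a GGUF model through `llama-cpp-2` looks roughly like the sketch below. The module paths and signatures follow recent releases of the crate and may differ in the version this backend pins, so treat them as assumptions rather than this project's exact code.

```rust
// Sketch only: loosely modeled on llama-cpp-2's examples; verify against the
// crate version in Cargo.toml before relying on these paths and signatures.
use llama_cpp_2::context::params::LlamaContextParams;
use llama_cpp_2::llama_backend::LlamaBackend;
use llama_cpp_2::model::params::LlamaModelParams;
use llama_cpp_2::model::LlamaModel;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // One-time llama.cpp runtime initialization.
    let backend = LlamaBackend::init()?;

    // Load a local GGUF file; model params control e.g. GPU layer offloading.
    let model = LlamaModel::load_from_file(
        &backend,
        "models/model.gguf",
        &LlamaModelParams::default(),
    )?;

    // A context holds the KV cache and drives decoding.
    let ctx = model.new_context(&backend, LlamaContextParams::default())?;
    println!("context size: {}", ctx.n_ctx());
    Ok(())
}
```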
## Features
- GGUF Model Support: Load local GGUF models via llama.cpp
- Sampling Controls: Temperature, top-k, top-p, penalties (see the sketch after this list)
- Structured Output: JSON schema hints with optional grammar enforcement
- Streaming: Token streaming for chat responses
- Production Ready: Robust error handling and configuration
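
One common way the sampling knobs compose is temperature scaling followed by a top-k cut and then a top-p (nucleus) cut over the renormalized candidates. The self-contained sketch below illustrates that pipeline; the `sample` function and its signature are illustrative assumptions, not this crate's API (the real backend defers to llama.cpp's sampler chain).

```rust
/// Apply temperature, keep the top-k candidates, then keep the smallest
/// prefix whose cumulative probability reaches top_p, and sample from it.
/// `rand01` is a uniform draw in [0, 1) supplied by the caller.
fn sample(logits: &[f32], temperature: f32, top_k: usize, top_p: f32, rand01: f32) -> usize {
    // Scale logits by temperature (lower = sharper distribution).
    let scaled: Vec<f32> = logits.iter().map(|l| l / temperature.max(1e-6)).collect();

    // Sort candidate token ids by descending logit and keep the top k.
    let mut ids: Vec<usize> = (0..scaled.len()).collect();
    ids.sort_by(|&a, &b| scaled[b].partial_cmp(&scaled[a]).unwrap());
    ids.truncate(top_k.max(1));

    // Softmax over the surviving candidates.
    let max = scaled[ids[0]];
    let exps: Vec<f32> = ids.iter().map(|&i| (scaled[i] - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    let probs: Vec<f32> = exps.iter().map(|e| e / sum).collect();

    // Top-p (nucleus) cutoff: smallest prefix with cumulative mass >= top_p.
    let mut kept = probs.len();
    let mut cum = 0.0_f32;
    for (n, p) in probs.iter().enumerate() {
        cum += *p;
        if cum >= top_p {
            kept = n + 1;
            break;
        }
    }

    // Renormalize the kept prefix and draw one token.
    let mass: f32 = probs[..kept].iter().sum();
    let mut r = rand01 * mass;
    for n in 0..kept {
        r -= probs[n];
        if r <= 0.0 {
            return ids[n];
        }
    }
    ids[kept - 1]
}

fn main() {
    let logits = [2.0_f32, 1.0, 0.5, -1.0];
    let tok = sample(&logits, 0.8, 3, 0.9, 0.42);
    println!("sampled token id: {tok}");
}
```

Passing the uniform draw (`rand01`) as a parameter keeps the sketch dependency-free and deterministic; a real implementation would use a seeded RNG and apply the repetition penalties before the softmax.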