AutoAgents llama.cpp Backend
Local LLM inference backend for AutoAgents using llama-cpp-2 bindings.
Features
- GGUF Model Support: Load local GGUF models via llama.cpp
- Sampling Controls: Temperature, top-k, top-p, penalties
- Structured Output: JSON schema hints with optional grammar enforcement
- Streaming: Token streaming for chat responses
- Production Ready: Robust error handling and configuration
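The sampling controls listed above are typically gathered into a configuration via a builder. As an illustrative, self-contained sketch only (the field names, defaults, and builder methods here are assumptions, not the crate's actual `LlamaCppConfigBuilder` API):

```rust
// Hypothetical sketch of a sampling-configuration builder; the real
// LlamaCppConfigBuilder may differ. Defaults mirror common llama.cpp
// sampling defaults, but treat them as illustrative.
#[derive(Debug, Clone, PartialEq)]
struct SamplingConfig {
    temperature: f32,
    top_k: i32,
    top_p: f32,
    repeat_penalty: f32,
}

#[derive(Default)]
struct SamplingConfigBuilder {
    temperature: Option<f32>,
    top_k: Option<i32>,
    top_p: Option<f32>,
    repeat_penalty: Option<f32>,
}

impl SamplingConfigBuilder {
    fn temperature(mut self, t: f32) -> Self { self.temperature = Some(t); self }
    fn top_k(mut self, k: i32) -> Self { self.top_k = Some(k); self }
    fn top_p(mut self, p: f32) -> Self { self.top_p = Some(p); self }
    fn repeat_penalty(mut self, r: f32) -> Self { self.repeat_penalty = Some(r); self }

    fn build(self) -> SamplingConfig {
        // Any field left unset falls back to an illustrative default.
        SamplingConfig {
            temperature: self.temperature.unwrap_or(0.8),
            top_k: self.top_k.unwrap_or(40),
            top_p: self.top_p.unwrap_or(0.95),
            repeat_penalty: self.repeat_penalty.unwrap_or(1.1),
        }
    }
}

fn main() {
    let cfg = SamplingConfigBuilder::default()
        .temperature(0.2)
        .top_k(50)
        .build();
    assert_eq!(cfg.top_k, 50);
    assert!((cfg.top_p - 0.95).abs() < 1e-6); // unset field took the default
    println!("temperature = {}", cfg.temperature);
}
```

The consuming-builder pattern shown (each setter takes and returns `self`) is the idiomatic Rust way to expose many optional knobs with sensible fallbacks, and matches what a `LlamaCppConfigBuilder`-style type would offer.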
Re-exports
pub use builder::LlamaCppProviderBuilder;
pub use config::LlamaCppConfig;
pub use config::LlamaCppConfigBuilder;
pub use config::LlamaCppReasoningFormat;
pub use config::LlamaCppSplitMode;
pub use error::LlamaCppProviderError;
pub use models::ModelSource;
pub use provider::LlamaCppProvider;
Modules
- builder
- config
- Configuration structures for llama.cpp provider.
- conversion
- Type conversions between AutoAgents types and llama.cpp types.
- error
- Error handling and conversions for llama.cpp backend.
- huggingface
- HuggingFace GGUF resolver using hf-hub cache.
- models
- Model source definitions for llama.cpp backend.
- provider
- LlamaCppProvider implementation with LLMProvider traits.
Enums
- LlamaSplitMode
- A rusty wrapper around llama_split_mode.