llama-gguf 0.14.0

A high-performance Rust implementation of llama.cpp: an LLM inference engine with full GGUF support.

There is currently very little structured metadata to build this page from. Check the main library docs, readme, or Cargo.toml in case the author documented the features there.

This version has 15 feature flags, 6 of them enabled by default.
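
Pulling in the crate with its default feature set needs only a plain dependency entry. A minimal sketch, assuming the crate is published on crates.io under the name llama-gguf:

```toml
# Cargo.toml — pulls in llama-gguf with its six default features
# (cli, client, cpu, huggingface, onnx, server).
[dependencies]
llama-gguf = "0.14.0"
```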

default — enables the six features below:

- cli
- client
- cpu — does not enable additional features
- huggingface
- onnx
- server
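
Since cpu is the only default feature noted as enabling nothing further, a CPU-only build can drop the rest. A sketch under the assumption that the crate's features are additive and cpu alone provides the inference backend:

```toml
# Cargo.toml — opt out of the default features and keep only the CPU backend.
[dependencies]
llama-gguf = { version = "0.14.0", default-features = false, features = ["cpu"] }
```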

Optional features:

- cuda
- distributed
- dx12
- hailo
- metal
- rag
- rag-sqlite
- vulkan
- vulkan-shaders — does not enable additional features
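
Optional features are enabled by listing them in the dependency entry on top of the defaults. A sketch selecting the CUDA backend, assuming cuda composes with the default feature set:

```toml
# Cargo.toml — default features plus the CUDA GPU backend.
[dependencies]
llama-gguf = { version = "0.14.0", features = ["cuda"] }
```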