llama-gguf 0.2.0

A high-performance Rust implementation of llama.cpp - LLM inference engine with full GGUF support

Very little structured metadata is currently available for this page. Check the main library docs, the README, or Cargo.toml in case the author documented the features there.

This version has 6 feature flags, 1 of them enabled by default.

default

cpu (default)

This feature flag does not enable additional features.

cuda

metal

rag

server

vulkan
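
The flags listed here are ordinary Cargo features, so a dependent crate selects them in its own Cargo.toml. The fragment below is a minimal sketch, not taken from the crate's documentation. The crate name, version, and flag names come from this page. Which flags can be combined, and whether the crate builds without cpu, are assumptions this page does not confirm.

```toml
[dependencies]
# Default build: only the `cpu` feature is enabled.
llama-gguf = "0.2.0"

# Alternative (use instead of the line above): keep the default and opt in to more flags.
# llama-gguf = { version = "0.2.0", features = ["cuda", "server"] }

# Alternative: drop the default `cpu` flag and choose a backend explicitly.
# Assumption: the crate compiles without `cpu`; this page does not say so.
# llama-gguf = { version = "0.2.0", default-features = false, features = ["vulkan"] }
```

Only one of the three entries should be active at a time, since a dependency key may appear once per table. The same selection can be made from the command line with cargo add llama-gguf --features cuda.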