llama-rs 0.17.0

A high-performance Rust implementation of llama.cpp - LLM inference engine with full GGUF support
llama-rs

Very little structured metadata is currently available to build this page from. Check the main library docs, the README, or Cargo.toml in case the author documented the features there.

This version has 17 feature flags, 6 of them enabled by default.

default

cli (default)

client (default)

cpu (default)

The cpu feature flag does not enable additional features.

huggingface (default)

onnx (default)

server (default)

council

council-e2e

cuda

distributed

dx12

hailo

metal

rag

rag-sqlite

vulkan

vulkan-shaders

The vulkan-shaders feature flag does not enable additional features.
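
As a minimal sketch of how these flags might be selected, a Cargo.toml dependency entry can turn off the six defaults and opt back in to only the flags that are needed. The crate name and version are taken from this page. The particular combination shown (cpu plus server) is an illustrative assumption: this page does not say whether those two flags are sufficient on their own or whether any flag depends on another.

```toml
[dependencies]
# Disable the default set (cli, client, cpu, huggingface, onnx, server)
# and re-enable only the flags listed under `features`.
# Assumption: "cpu" and "server" form a workable combination; this page does not confirm it.
llama-rs = { version = "0.17.0", default-features = false, features = ["cpu", "server"] }
```

The same selection can be made from the command line with cargo add llama-rs@0.17.0 --no-default-features --features cpu,server. Non-default flags such as cuda, metal, or vulkan would be added to the same features list, subject to whatever platform requirements the crate's own documentation states.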