docs.rs failed to build llama-cpp-sys-4-0.2.29
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build:
llama-cpp-sys-4-0.2.43
llama-cpp-sys-4
Raw bindgen-generated bindings to llama.cpp,
plus the C/C++ build logic that compiles the library.
llama.cpp version: b8533 · Crate version: 0.2.13
Unless you need access to a symbol not yet exposed by llama-cpp-4,
use that crate instead — it provides a safe API over these raw bindings.
What's included
llama_*functions and types fromllama.hggml_*functions and types fromggml/include/ggml.hLLAMA_*constantscommon_tokenizeandcommon_token_to_piecefromcommon/common.h- The entire llama.cpp static library (or shared, with
dynamic-link)
Feature flags
| Feature | Description |
|---|---|
openmp |
OpenMP multi-threading (default on; auto-detected on ARM platforms) |
cuda |
NVIDIA GPU (requires CUDA toolkit) |
metal |
Apple GPU (macOS/iOS only) |
vulkan |
Vulkan GPU backend |
native |
-march=native — tune for the build machine's CPU |
rpc |
Remote compute backend |
dynamic-link |
Link against a pre-installed shared libllama instead of building from source |
Building
The crate compiles llama.cpp from the vendored submodule at build time using
cc + cmake-style flags. No external llama.cpp installation is required.
# CPU only (default)
# Metal (macOS)
# CUDA
# OpenMPI (distributed inference)
Build dependencies
clang— required bybindgento parse the C++ headers- A C++17 compiler (GCC 9+, Clang 10+, MSVC 2019+)
cmakeis not required — the build is driven entirely bybuild.rs
Regenerating bindings
Bindings are regenerated automatically whenever build.rs or wrapper.h
changes. The allowlist covers llama_*, ggml_*, LLAMA_*, and the two
common_* functions.
# Force a full rebuild including binding regeneration
Notable API changes (b4689 → b8249)
These are the upstream llama.cpp breaks handled in this crate:
| Removed / renamed | Replacement |
|---|---|
llama_kv_cache_* functions |
llama_memory_* via llama_get_memory(ctx) |
llama_set_adapter_lora + llama_rm_adapter_lora |
llama_set_adapters_lora (batch API) |
context_params.flash_attn: bool |
context_params.flash_attn_type: llama_flash_attn_type |
llama-sampling.h |
llama-sampler.h |
| C++11 build flag | C++17 required by new common.h (std::string_view) |
Bindgen configuration
Key decisions in build.rs:
derive_partialeq(true)withno_partialeq(...)overrides for structs containing function-pointer fields (avoids theunpredictable_function_pointer_comparisonslint).opaque_type("std::.*")— C++ STL types are opaque pointers.- OpenMP auto-detection — reads
GGML_OPENMP_ENABLEDfrom the CMake cache rather than relying solely on theopenmpfeature flag, because some ARM toolchains enable OpenMP unconditionally.