slm_ikllama_sys 0.1.1

ik_llama.cpp rust sys bindings

slm_ikllama_sys

Raw Rust FFI bindings for ik_llama.cpp — a performance-oriented fork of llama.cpp with improved CPU/GPU inference, new quantization types, first-class DeepSeek/MLA support, and fused MoE operations.

The ik_llama.cpp source tree is included as a git submodule and compiled from source during cargo build via a CMake-based build.rs.

What this crate provides

  • bindings.rs — generated by bindgen at build time from ik_llama.cpp/include/llama.h and ik_llama.cpp/ggml/include/ggml.h; exposes the full llama_* and ggml_* C API.
  • ik_llama_cpp_wrapper — a thin C++ static library (wrapper.cpp) compiled with cc that gives Rust a stable link point into the ik_llama.cpp headers.
  • Shared libraries (libllama.so, libggml.so, …) built by CMake and hard-linked into the Cargo target directory so they are found at runtime.
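The last step above, making the CMake-built shared libraries visible from the Cargo target directory, might look roughly like the following. This is a minimal sketch and not the crate's actual build.rs: the helper name link_or_copy is hypothetical, and the fallback to a plain copy when hard-linking fails (for example across filesystems) is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hard-link `src` into `dst_dir`, falling back to a copy if linking fails.
/// Returns the path of the file created in `dst_dir`.
/// (Hypothetical helper; the real build script may differ.)
fn link_or_copy(src: &Path, dst_dir: &Path) -> io::Result<PathBuf> {
    let file_name = src
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "source has no file name"))?;
    let dst = dst_dir.join(file_name);

    // `fs::hard_link` fails if the destination already exists,
    // so remove any artifact left over from a previous build.
    if dst.exists() {
        fs::remove_file(&dst)?;
    }

    // Prefer a hard link (no extra disk space); copy when that is not possible.
    if fs::hard_link(src, &dst).is_err() {
        fs::copy(src, &dst)?;
    }
    Ok(dst)
}
```

A build script could call this once per library produced by CMake, passing the library path and the profile directory under target/.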

This crate is not intended for direct use — consume it through slm_ikllama, which implements the slm_inference trait layer on top of these bindings.

Build

The build script compiles ik_llama.cpp with CMake. A few environment variables control the build:

| Variable          | Default | Description                                        |
|-------------------|---------|----------------------------------------------------|
| LLAMA_LIB_PROFILE | Release | CMake build profile                                |
| LLAMA_STATIC_CRT  | 0       | Link against static CRT (Windows MSVC)             |
| BUILD_DEBUG       | unset   | Print verbose build diagnostics                    |
| CMAKE_VERBOSE     | unset   | Make CMake very verbose                            |
| CMAKE_*           |         | Any CMAKE_* env var is forwarded to CMake as-is    |
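To illustrate how a build script could interpret these variables, here is a small sketch using only the standard library. The struct BuildConfig and both functions are hypothetical names, not taken from this crate. Treating LLAMA_STATIC_CRT=1 as "enabled" and forwarding CMAKE_VERBOSE along with the other CMAKE_* variables are assumptions.

```rust
use std::env;

/// Build settings derived from the documented environment variables.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
struct BuildConfig {
    /// LLAMA_LIB_PROFILE; defaults to "Release".
    profile: String,
    /// LLAMA_STATIC_CRT; assumed to be enabled by the value "1".
    static_crt: bool,
    /// Every CMAKE_* variable, kept unchanged and sorted by name.
    cmake_vars: Vec<(String, String)>,
}

/// Build a config from any sequence of (name, value) pairs.
fn config_from_vars<I>(vars: I) -> BuildConfig
where
    I: IntoIterator<Item = (String, String)>,
{
    let mut cfg = BuildConfig {
        profile: "Release".to_string(),
        static_crt: false,
        cmake_vars: Vec::new(),
    };
    for (name, value) in vars {
        if name == "LLAMA_LIB_PROFILE" {
            cfg.profile = value;
        } else if name == "LLAMA_STATIC_CRT" {
            cfg.static_crt = value == "1";
        } else if name.starts_with("CMAKE_") {
            // Forwarded as-is; this sketch does not special-case CMAKE_VERBOSE.
            cfg.cmake_vars.push((name, value));
        }
    }
    cfg.cmake_vars.sort();
    cfg
}

/// Read the config from the process environment.
/// Note: `env::vars()` panics if a variable is not valid Unicode.
fn config_from_env() -> BuildConfig {
    config_from_vars(env::vars())
}
```

With this shape, running the build as LLAMA_LIB_PROFILE=Debug cargo build would yield profile == "Debug", and each collected CMAKE_* pair could then be handed to CMake as a -D definition.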

Features

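| Feature | Default | Description                                                                                   |
|---------|---------|-----------------------------------------------------------------------------------------------|
| cuda    | yes     | Enable NVIDIA CUDA via GGML_CUDA; links cudart, cublas, cublasLt                              |
| native  | yes     | Compile for the host CPU and GPU architecture (GGML_NATIVE=ON, CMAKE_CUDA_ARCHITECTURES=native) |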

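When native is off, the build does not detect the host CPU and instead targets a fixed x86-64 feature set (AVX2, AVX512-BF16/VBMI/VNNI), so the resulting binaries need a CPU that supports those extensions.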
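The mapping from Cargo features to CMake options can be sketched as follows. Cargo exposes every enabled feature to build scripts as a CARGO_FEATURE_<NAME> environment variable; everything else here is illustrative. The function names are hypothetical, the 86;89;120 fallback is the architecture list given under Platform notes, and passing GGML_NATIVE=OFF explicitly when native is disabled is an assumption.

```rust
/// CMake cache entries implied by the `cuda` and `native` features.
/// (Hypothetical helper; the real build script may set more options.)
fn cmake_defines(cuda: bool, native: bool) -> Vec<(&'static str, &'static str)> {
    let mut defs = Vec::new();

    // Assumption: the option is passed explicitly in both cases.
    defs.push(("GGML_NATIVE", if native { "ON" } else { "OFF" }));

    if cuda {
        defs.push(("GGML_CUDA", "ON"));
        // With `native`, let CMake detect the installed GPU;
        // otherwise use the fixed architecture list from the platform notes.
        let archs = if native { "native" } else { "86;89;120" };
        defs.push(("CMAKE_CUDA_ARCHITECTURES", archs));
    }
    defs
}

/// True if Cargo enabled the given feature for this build-script run.
/// Cargo upper-cases the name and replaces '-' with '_'.
fn feature_enabled(name: &str) -> bool {
    let key = format!("CARGO_FEATURE_{}", name.to_uppercase().replace('-', "_"));
    std::env::var_os(key).is_some()
}
```

Inside build.rs this would be combined as cmake_defines(feature_enabled("cuda"), feature_enabled("native")). Since both features are on by default, a CPU-only build needs --no-default-features.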

Platform notes

  • Linux — links stdc++ dynamically; shared .so files are placed in the Cargo target directory.
  • macOS — links Foundation, Metal, MetalKit, Accelerate frameworks; .dylib files are placed in the target directory.
  • Windows MSVC — MSVC include paths are discovered via the cc crate; .dll files are copied to the target and deps directories.
  • CUDA architectures — with native, the GPU arch is auto-detected; without it, 86;89;120 (RTX 30/40/50xx) are targeted.
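As a rough illustration of the per-platform linking described above, the sketch below turns a target OS name into the cargo:rustc-link-lib lines a build script would print. The function name is hypothetical and the list covers only the libraries and frameworks named in these notes; in a real build script the OS string would come from the CARGO_CFG_TARGET_OS variable that Cargo sets.

```rust
/// Link directives for the given target OS, in the form a build script
/// prints to stdout. (Hypothetical helper, limited to the items listed
/// in the platform notes.)
fn link_directives(target_os: &str) -> Vec<String> {
    let mut out = Vec::new();
    match target_os {
        "linux" => {
            // The C++ standard library is linked dynamically on Linux.
            out.push("cargo:rustc-link-lib=dylib=stdc++".to_string());
        }
        "macos" => {
            // Apple frameworks use the `framework=` link kind.
            for fw in &["Foundation", "Metal", "MetalKit", "Accelerate"] {
                out.push(format!("cargo:rustc-link-lib=framework={}", fw));
            }
        }
        // Windows MSVC and other targets: nothing is added in this sketch.
        _ => {}
    }
    out
}
```

A build script would print each returned line with println!, for example after reading the OS via std::env::var("CARGO_CFG_TARGET_OS").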