ruvllm 2.0.1

LLM serving runtime with Ruvector integration: paged attention, KV cache, and SONA learning

Feature flags

There is currently very little structured metadata available to build this page from. Check the main library docs, the README, or the crate's Cargo.toml to see whether the author documented the feature flags there.

This version has 30 feature flags, 9 of them enabled by default.
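
Feature flags are selected in a downstream crate's Cargo.toml. A minimal sketch, assuming only the crate name and version shown on this page; everything else is standard Cargo syntax:

```toml
[dependencies]
# Pulls in ruvllm together with the 9 default features listed below.
ruvllm = "2.0.1"
```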

default
async-runtime (default)
candle (default)
tokio (default)
tokio-stream (default)
candle-core (default)
candle-nn (default)
candle-transformers (default)
hf-hub (default)
tokenizers (default)
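
To opt out of the default set and re-enable only part of it, a sketch; which combinations of these features actually work together is not documented here, so the selection below is illustrative only:

```toml
[dependencies]
ruvllm = { version = "2.0.1", default-features = false, features = [
    "candle",     # model backend (the candle-* flags are listed separately)
    "tokio",      # async runtime support
    "tokenizers", # tokenizer support
] }
```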

accelerate (enables no additional features)
attention
coreml
cuda
gguf-mmap
gnn
graph
hybrid-ane
inference-cuda
inference-metal
inference-metal-native
metal
metal-compute
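
Several of the flags above (accelerate, coreml, cuda, metal, metal-compute, and the inference-* variants) appear to select hardware backends. That mapping is inferred from the flag names alone; this page does not describe them, so verify against the crate's Cargo.toml before relying on it. A sketch:

```toml
[dependencies]
# NVIDIA GPUs, inferred from the flag name:
ruvllm = { version = "2.0.1", features = ["cuda"] }

# On Apple hardware, "metal" looks like the counterpart:
# ruvllm = { version = "2.0.1", features = ["metal"] }
```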

minimal
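
What minimal bundles is not documented on this page; if it is a trimmed-down build, the usual Cargo pattern would be the following sketch:

```toml
[dependencies]
ruvllm = { version = "2.0.1", default-features = false, features = ["minimal"] }
# equivalently, from the command line:
# cargo add ruvllm --no-default-features --features minimal
```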

mmap
parallel
rlm-core (enables no additional features)
rlm-full
rlm-wasm
ruvector-full
wasm (enables no additional features)
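
The wasm and rlm-wasm flags suggest WebAssembly support. The page gives no details, and the target triple below is the standard Rust one rather than anything ruvllm documents, so treat this as an assumption:

```toml
[dependencies]
ruvllm = { version = "2.0.1", default-features = false, features = ["wasm"] }
# then build with: cargo build --target wasm32-unknown-unknown
```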