shimmy 2.0.1

Lightweight Ollama-compatible inference server with native SafeTensors support. No Python dependencies, cross-platform WebGPU acceleration via Airframe.

Little structured metadata is currently available for this page. Check the main library docs, the README, or Cargo.toml in case the author documented the features there.

This version has 12 feature flags, 2 of them enabled by default.

default

airframe (default)

huggingface (default)

This feature flag does not enable additional features.

apple

coverage

fast

full

gpu

llama

This feature flag does not enable additional features.

llama-cuda

This feature flag does not enable additional features.

llama-opencl

This feature flag does not enable additional features.

llama-vulkan

This feature flag does not enable additional features.

mlx

This feature flag does not enable additional features.
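
The flags listed above are selected through Cargo's standard feature mechanism. The following is a minimal sketch, assuming shimmy is pulled in as a library dependency of another crate. The choice of `llama` is purely illustrative; this page does not say what each flag pulls in or confirm that any particular combination builds.

```toml
[dependencies.shimmy]
version = "2.0.1"
# Turn off the two default flags (airframe, huggingface).
default-features = false
# Opt in only to the flags you want from the list above.
features = ["llama"]
```

Leaving out `default-features = false` keeps airframe and huggingface enabled alongside anything named in `features`.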