# candelabra

candelabra is a small Rust crate for desktop applications that want to run quantized GGUF models (LLaMA, Qwen, Phi, Gemma, etc.) with `candle-core`, `candle-transformers`, and `hf-hub`.
It focuses on the pieces GUI apps usually need:

- Hugging Face downloads that respect the local `hf-hub` cache
- tokenizer loading helpers
- automatic Metal or CUDA fallback to CPU
- reusable loaded model state
- token streaming with cancellation support
## Current Scope

candelabra natively supports quantized GGUF checkpoints with dynamic architecture detection. Supported architectures include:

- `llama` / `mistral` / `mixtral` / `gemma` / `gemma2`
- `phi3`
- `qwen2` (Qwen 2, Qwen 2.5, QwQ)
- `qwen3` / `qwen3moe`
- `gemma3`
- `glm4`

That means the crate is a good fit if you want a lightweight Rust API for local desktop inference on models such as Qwen 2.5 or SmolLM GGUF variants. It abstracts the `candle_transformers::models` paths behind a single unified `Model` type.
## Installation

Add the crate to your `Cargo.toml`:

```toml
[dependencies]
candelabra = "0.1"
```
## Quick Start

```rust
use candelabra::download_model;
use candelabra::Model;
```
## Main API

- `download_model()` downloads a model file through the local Hugging Face cache.
- `download_model_with_progress()` and `download_model_with_channel()` emit progress updates suitable for UI progress bars.
- `load_tokenizer_from_repo()` downloads and loads `tokenizer.json`.
- `Model::load()` loads a quantized GGUF model onto the best available device, dynamically instantiating the correct candle architecture based on metadata.
- `run_inference()` streams generated tokens through a callback.
- `run_inference_with_channel()` streams generated tokens over a Tokio channel.
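The callback-driven streaming that `run_inference()` provides can be sketched in plain Rust. Everything below (`fake_decode_stream`, the hard-coded token list) is an illustration of the cancellation pattern only, not the crate's actual API or signatures:

```rust
// Callback-based token streaming: the callback receives each generated
// token and returns `false` to cancel generation early.
// `fake_decode_stream` stands in for the model's real decode loop.
fn fake_decode_stream<F>(tokens: &[&str], mut on_token: F)
where
    F: FnMut(&str) -> bool,
{
    for t in tokens {
        if !on_token(t) {
            return; // caller requested cancellation
        }
    }
}

fn main() {
    let mut out = String::new();
    fake_decode_stream(&["Hello", ",", " world"], |tok| {
        out.push_str(tok);
        tok != "," // cancel once we see the comma
    });
    // The " world" token is never delivered after cancellation.
    println!("{out}");
}
```

The same shape maps onto `run_inference_with_channel()`: instead of a closure, each token is sent over a channel, and dropping the receiver plays the role of returning `false`.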
## Platform Notes

- On macOS, the crate prefers Metal and falls back to CPU.
- On non-macOS platforms, the crate prefers CUDA and falls back to CPU.
- The public `device_used` string is intended to be easy to surface directly in desktop UIs.
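The selection order above can be sketched as ordinary Rust. `pick_device` is a hypothetical helper showing the policy, not the crate's implementation, which probes the actual hardware rather than taking a boolean:

```rust
// Prefer Metal on macOS and CUDA elsewhere; fall back to CPU when the
// accelerator is unavailable. Returns a `device_used`-style string that
// a desktop UI could display directly.
fn pick_device(accelerator_available: bool) -> &'static str {
    if accelerator_available {
        if cfg!(target_os = "macos") {
            "metal"
        } else {
            "cuda"
        }
    } else {
        "cpu"
    }
}

fn main() {
    let device_used = pick_device(false);
    println!("running on: {device_used}"); // prints "running on: cpu"
}
```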
## License

Licensed under either of these, at your option:

- Apache License, Version 2.0
- MIT license