# callm

## About
Callm lets you run Generative AI models (such as Large Language Models) directly on your own hardware, fully offline. Under the hood, callm relies heavily on the [candle](https://github.com/huggingface/candle) crate and is 100% pure Rust.
## Supported models
| Model | Safetensors | GGUF (quantized) |
|---|---|---|
| Llama | ✅ | ✅ |
| Mistral | ✅ | ✅ |
| Phi3 | ✅ | ❌ |
| Qwen2 | ✅ | ❌ |
> [!NOTE]
> Callm is still in an early stage of development and is NOT production-ready yet.
## Installation
Add `callm` to your dependencies:

```sh
cargo add callm
```
## Usage
Callm uses the builder pattern to create inference pipelines.
```rust
use callm::pipelines::PipelineText;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build a text-generation pipeline from a local model directory
    let mut pipeline = PipelineText::builder()
        .with_location("/path/to/model")
        .build()?;

    // Run inference on a prompt
    let text_completion = pipeline.run("Tell me about Rust language")?;
    println!("{text_completion}");

    Ok(())
}
```