# callm

Run generative AI models directly on your hardware.

## About

Callm lets you run generative AI models (such as Large Language Models) directly on your hardware, fully offline. Under the hood, callm relies heavily on the candle crate and is written in 100% pure Rust.

## Supported models

| Model   | Safetensors | GGUF (quantized) |
|---------|-------------|------------------|
| Llama   |             |                  |
| Mistral |             |                  |
| Phi3    |             |                  |
| Qwen2   |             |                  |

> [!NOTE]
> Callm is still at an early stage of development and is NOT production-ready yet.
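
Quantized GGUF models ship as a single file rather than a safetensors directory. This README does not spell out how quantized models are selected, so the sketch below assumes `with_location()` also accepts a path to a `.gguf` file:

```rust
use callm::pipelines::PipelineText;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // ASSUMPTION: with_location() accepts a path to a single quantized
    // GGUF file as well as a safetensors model directory.
    let mut pipeline = PipelineText::builder()
        .with_location("/path/to/model.gguf")
        .build()?;

    let completion = pipeline.run("Summarize the GGUF format in one sentence.")?;
    println!("{completion}");

    Ok(())
}
```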

## Installation

Add callm to your dependencies:

```sh
$ cargo add callm
```
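
Alternatively, declare the dependency manually in `Cargo.toml`; the version below corresponds to the 0.1.0 release documented here:

```toml
[dependencies]
callm = "0.1.0"
```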

## Usage

Callm uses the builder pattern to create inference pipelines.

```rust
use callm::pipelines::PipelineText;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build a text-generation pipeline from a local model directory
    let mut pipeline = PipelineText::builder()
        .with_location("/path/to/model")
        .build()?;

    // Run text completion on a prompt
    let text_completion = pipeline.run("Tell me a joke about x86 instruction set")?;
    println!("{}", text_completion);

    Ok(())
}
```
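
Since `build()?` presumably loads the model weights, it pays to construct the pipeline once and reuse it (note that `run()` takes the pipeline mutably). A minimal sketch using only the builder API shown above:

```rust
use callm::pipelines::PipelineText;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the model once...
    let mut pipeline = PipelineText::builder()
        .with_location("/path/to/model")
        .build()?;

    // ...then reuse the same pipeline for several completions.
    for prompt in ["What is RISC-V?", "Explain SIMD in one sentence."] {
        let completion = pipeline.run(prompt)?;
        println!("{prompt}\n{completion}\n");
    }

    Ok(())
}
```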