Crate chat_mistralrs

Local-inference provider for chat-rs, built on mistral.rs.

Model weights are loaded in-process; there is no HTTP server and no daemon. On first use, model files are downloaded into the standard Hugging Face cache (~/.cache/huggingface/). If HF_TOKEN is set in the environment it is used for the download; a token is only required for gated repos.

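use chat_mistralrs::MistralRsBuilder;

// Run inside an async function that returns a Result, since the
// build step is awaited and its error is propagated with `?`.
let client = MistralRsBuilder::new()
    .with_model("Qwen/Qwen2.5-3B-Instruct-GGUF")
    .with_gguf_file("qwen2.5-3b-instruct-q4_k_m.gguf")
    .build()
    .await?;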
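
The cache and token behaviour described above can be checked before the first download. The following is a minimal sketch using only the standard library; it is not this crate's own resolution logic. The helper names `hf_cache_root` and `hf_token` are hypothetical, and the `HF_HOME` override is an assumption based on the usual Hugging Face convention rather than anything this crate documents.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical helper: pick the Hugging Face cache root.
/// Assumption: a non-empty `HF_HOME` overrides the default
/// `~/.cache/huggingface` location.
fn hf_cache_root(hf_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    match hf_home {
        Some(dir) if !dir.is_empty() => Some(PathBuf::from(dir)),
        _ => home
            .filter(|h| !h.is_empty())
            .map(|h| PathBuf::from(h).join(".cache").join("huggingface")),
    }
}

/// Hypothetical helper: treat an unset or blank token as absent.
fn hf_token(raw: Option<&str>) -> Option<&str> {
    raw.map(str::trim).filter(|t| !t.is_empty())
}

fn main() {
    // Read the environment once and hand plain values to the helpers.
    let hf_home = env::var("HF_HOME").ok();
    let home = env::var("HOME").ok();
    let token = env::var("HF_TOKEN").ok();

    match hf_cache_root(hf_home.as_deref(), home.as_deref()) {
        Some(root) => println!("cache root: {}", root.display()),
        None => println!("cache root: could not be determined"),
    }

    // Report only whether a token exists; never print the token itself.
    println!("HF_TOKEN present: {}", hf_token(token.as_deref()).is_some());
}
```

Keeping the helpers free of direct environment access makes them easy to test, and a missing token is reported rather than treated as an error because public repos do not need one.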

See providers/AGENTS.md for the overall provider architecture.

Re-exports

pub use builder::DeviceChoice;
pub use builder::MistralRsBuilder;
pub use builder::WithModel;
pub use builder::WithoutModel;
pub use client::MistralRsClient;

Modules

api
builder
client

Enums

IsqType
Re-exported for ergonomic use in builder calls. In-situ quantization type specifying the format to apply to model weights.