spn-native 0.2.0

Native model inference and storage for the SuperNovae ecosystem.

Overview

spn-native provides local GGUF model inference using mistral.rs as the backend. This enables running large language models locally without requiring external services like Ollama.

Features

  • Local GGUF Inference: Run quantized models (Q4, Q5, Q8) directly on your hardware
  • Hugging Face Integration: Download models from the Hugging Face Hub
  • Streaming Support: Stream responses token-by-token
  • Cross-Platform: Works on macOS, Linux, and Windows
  • Feature-Gated: Compile only what you need

Installation

[dependencies]
spn-native = "0.2"

# Or, with inference support (requires Rust 1.85+):
# spn-native = { version = "0.2", features = ["inference"] }

Usage

Basic Inference

use spn_native::{NativeRuntime, LoadConfig, ChatOptions};

// Create runtime
let mut runtime = NativeRuntime::new();

// Load a GGUF model
let config = LoadConfig::default();
runtime.load("path/to/model.gguf", config).await?;

// Run inference
let response = runtime.infer("Hello, world!", ChatOptions::default()).await?;
println!("{}", response.message.content);

Streaming

use futures::StreamExt; // for .next() on the token stream
use spn_native::NativeRuntime;

let mut runtime = NativeRuntime::new();
runtime.load("model.gguf", Default::default()).await?;

let mut stream = runtime.infer_stream("Tell me a story", Default::default()).await?;
while let Some(chunk) = stream.next().await {
    print!("{}", chunk?.delta);
}
}

Hugging Face Download

use spn_native::HuggingFaceStorage;

let storage = HuggingFaceStorage::new()?;
let path = storage.download("Qwen/Qwen3-8B-GGUF", "qwen3-8b-q4_k_m.gguf").await?;
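Putting the download and inference steps together, a minimal end-to-end sketch might look like the following. The async runtime and error type are supplied by the caller; `tokio` and `anyhow` here are illustrative assumptions, not requirements of this crate:

```rust
use spn_native::{ChatOptions, HuggingFaceStorage, LoadConfig, NativeRuntime};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Fetch the GGUF file from the Hugging Face Hub.
    let storage = HuggingFaceStorage::new()?;
    let path = storage
        .download("Qwen/Qwen3-8B-GGUF", "qwen3-8b-q4_k_m.gguf")
        .await?;

    // Load the downloaded model and run a single prompt.
    let mut runtime = NativeRuntime::new();
    runtime.load(path, LoadConfig::default()).await?;

    let response = runtime.infer("Hello!", ChatOptions::default()).await?;
    println!("{}", response.message.content);
    Ok(())
}
```

This sketch requires building with both the `inference` and `huggingface` features enabled.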

Feature Flags

Feature      Description                           Default
-------      -----------                           -------
inference    Enable GGUF inference via mistral.rs  No
huggingface  Enable HF Hub downloads               No
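Both flags are off by default, so a build that downloads models and runs them locally needs to opt in to each. A dependency line enabling both (version per the header above) would be:

```toml
[dependencies]
spn-native = { version = "0.2", features = ["inference", "huggingface"] }
```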

Requirements

  • Rust 1.85+ (when inference feature enabled)
  • Metal (macOS) or CUDA (Linux/Windows) for GPU acceleration

Model Compatibility

Supports all GGUF models compatible with mistral.rs:

  • Qwen 3
  • Llama 3.x
  • Mistral / Mixtral
  • Phi-3
  • And more...

License

AGPL-3.0-or-later

Related Crates