
Crate spn_native


Native model inference and storage for the SuperNovae ecosystem.

This crate provides GGUF model downloads from the HuggingFace Hub, platform-specific RAM detection, a default local storage path, and (behind a feature flag) local LLM inference via mistral.rs:

§Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│  spn-native                                                                 │
│  ├── HuggingFaceStorage     Download GGUF models from HuggingFace Hub       │
│  ├── detect_available_ram_gb()  Platform-specific RAM detection             │
│  ├── default_model_dir()        Default storage path (~/.spn/models)        │
│  └── NativeRuntime (inference)  mistral.rs inference integration            │
└─────────────────────────────────────────────────────────────────────────────┘

§Features

  • progress: Enable terminal progress bars for downloads
  • inference: Enable local LLM inference via mistral.rs
  • native: Alias for inference
  • full: All features
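Assuming the crate is published under the name shown in the diagram above (the version number below is an illustrative placeholder, not taken from this page), enabling a subset of features in Cargo.toml might look like:

```toml
# Hypothetical dependency entry: download support with progress bars
# plus local inference. Feature names are taken from the list above.
[dependencies]
spn-native = { version = "0.1", features = ["progress", "inference"] }
```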

§Example: Download

use spn_native::{HuggingFaceStorage, default_model_dir, detect_available_ram_gb};
use spn_core::{find_model, auto_select_quantization, DownloadRequest};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Detect RAM and select quantization
    let ram_gb = detect_available_ram_gb();
    let model = find_model("qwen3:8b").unwrap();
    let quant = auto_select_quantization(model, ram_gb);

    // Create storage and download
    let storage = HuggingFaceStorage::new(default_model_dir());
    let request = DownloadRequest::curated(model).with_quantization(quant);

    let result = storage.download(&request, |progress| {
        println!("{}: {:.1}%", progress.status, progress.percent());
    }).await?;

    println!("Downloaded to: {:?}", result.path);
    Ok(())
}

§Example: Inference (requires inference feature)

use spn_native::inference::{NativeRuntime, InferenceBackend};
use spn_core::{LoadConfig, ChatOptions};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut runtime = NativeRuntime::new();

    // Load a downloaded model
    runtime.load("~/.spn/models/qwen3-8b-q4_k_m.gguf".into(), LoadConfig::default()).await?;

    // Run inference
    let response = runtime.infer("What is 2+2?", ChatOptions::default()).await?;
    println!("{}", response.message.content);

    Ok(())
}

§Re-exports

pub use inference::DynInferenceBackend;
pub use inference::InferenceBackend;
pub use inference::NativeRuntime;

§Modules

  • inference: Native LLM inference module.

§Structs

  • ChatOptions: Options for chat completion.
  • ChatResponse: Response from a chat completion.
  • DownloadRequest: Request to download a model.
  • DownloadResult: Result of a model download.
  • HuggingFaceStorage: Storage backend for HuggingFace Hub models.
  • KnownModel: A curated model in the registry.
  • LoadConfig: Configuration for loading a model.
  • ModelInfo: Information about an installed model.
  • PullProgress: Progress information during model pull/download.

§Enums

  • BackendError: Error types for backend operations.
  • ModelArchitecture: Architectures supported by mistral.rs v0.7.0.
  • ModelType: Model capability type.
  • NativeError: Errors that can occur in spn-native operations.
  • Quantization: Quantization levels for GGUF models.
  • ResolvedModel: Result of model resolution.

§Traits

  • ModelStorage: Model storage backend (sync version).

§Functions

  • auto_select_quantization: Auto-select quantization based on available RAM.
  • default_model_dir: Default model storage directory.
  • detect_available_ram_gb: Detect available system RAM in gigabytes.
  • extract_quantization: Extract quantization type from a filename.
  • find_model: Find a curated model by ID.
  • resolve_model: Resolve a model ID to a KnownModel or HuggingFace passthrough.
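The RAM-based selection that auto_select_quantization performs can be illustrated with a self-contained sketch. The enum variants, bytes-per-weight figures, and headroom factor below are illustrative assumptions, not the crate's documented policy:

```rust
// Illustrative sketch only: the real `auto_select_quantization` lives in
// `spn_core` and its exact thresholds are not documented on this page.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Quant {
    Q4KM, // ~4.5 bits per weight, smallest footprint
    Q5KM, // ~5.5 bits per weight
    Q8_0, // ~8.5 bits per weight, highest fidelity
}

/// Pick the highest-fidelity quantization whose estimated in-memory
/// footprint (with 20% headroom) fits in the available RAM.
fn select_quant(model_params_b: f64, available_ram_gb: f64) -> Quant {
    // Rough GGUF size estimate: parameters (in billions) times bytes per
    // weight, padded by 20% for KV cache and runtime overhead.
    let fits = |bytes_per_weight: f64| model_params_b * bytes_per_weight * 1.2 <= available_ram_gb;
    if fits(1.0625) {
        Quant::Q8_0 // 8.5 bits = 1.0625 bytes per weight
    } else if fits(0.6875) {
        Quant::Q5KM // 5.5 bits = 0.6875 bytes per weight
    } else {
        Quant::Q4KM // fallback: smallest quantization
    }
}

fn main() {
    // An 8B-parameter model on a 16 GB machine: Q8_0 needs ~10.2 GB, fits.
    assert_eq!(select_quant(8.0, 16.0), Quant::Q8_0);
    // With only 8 GB free, Q8_0 no longer fits but Q5_K_M (~6.6 GB) does.
    assert_eq!(select_quant(8.0, 8.0), Quant::Q5KM);
    // On 4 GB, fall back to the smallest quantization.
    assert_eq!(select_quant(8.0, 4.0), Quant::Q4KM);
    println!("quantization selection sketch ok");
}
```

In the download example above, the value returned by the real function is then fed to DownloadRequest::curated(model).with_quantization(quant).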

§Type Aliases

  • ProgressCallback: Type alias for download progress callbacks.
  • Result: Result type alias for spn-native operations.