Module models

Structs

BenchTuneConfig
BenchTuneMetrics
BenchTuneParam
BenchTuneParamValue
BenchTuneResult
DiscoveredModel: A discovered model file.
DownloadState: Download progress information.
GPUBuffer: GPU device buffer reported by llama-server during model loading.
GgufMetadata: Parsed GGUF metadata for a model, cached to avoid re-parsing the file.
LoadProgress: Progress information during model loading, parsed from llama-server log output.
ModelSettings: Settings for loading a model via llama.cpp server.
Samplers: Sampler order string (semicolon-separated). Common types: penalties, dry, top_n_sigma, top_k, typ_p, top_p, min_p, xtc, temperature.
SearchResult: A model found via HuggingFace search.
ServerMetrics: Metrics reported by the llama.cpp server.
WsMetrics: WebSocket-friendly metrics snapshot (serializable, no internal state).
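
The Samplers entry above describes a semicolon-separated order string. As a minimal sketch of how such a string could be split and rebuilt, the helpers below are hypothetical (`parse_sampler_order` and `join_sampler_order` are not part of this module), and tolerating whitespace and empty entries is an assumption rather than documented behavior:

```rust
/// Split a semicolon-separated sampler order string into its entries.
/// Hypothetical helper, not part of the documented `models` module.
/// Assumption: surrounding whitespace and empty entries are ignored.
fn parse_sampler_order(order: &str) -> Vec<&str> {
    order
        .split(';')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .collect()
}

/// Join sampler names back into the semicolon-separated form.
/// Hypothetical helper, the inverse of `parse_sampler_order`.
fn join_sampler_order(names: &[&str]) -> String {
    names.join(";")
}
```

For example, `parse_sampler_order("top_k;top_p;temperature")` yields the three names in order, and passing them to `join_sampler_order` reproduces the original string.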

Constants

BENCHMARK_PROMPT: Default benchmark prompt used when starting a tuning session.

Functions

clean_host: Ensure a host string is valid for URL construction and CLI arguments. Handles empty strings (defaults to 127.0.0.1), strips display suffixes, and wraps IPv6 addresses in brackets.
estimate_vram_mib: Estimate VRAM usage (in MiB) for a model with the given settings.
format_host: Format a host string for display (e.g. an empty string or “127.0.0.1” becomes “localhost (127.0.0.1)”).
strip_gguf: Strip the .gguf extension from a model name.
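
The descriptions of `strip_gguf` and `clean_host` above suggest simple string normalization. The sketch below is one possible reading, not the crate's implementation: the `_sketch` names, the signatures, the shape of a display suffix (`" (…)"`), and the case-sensitive extension match are all assumptions.

```rust
/// Strip a trailing `.gguf` extension, if present.
/// Assumption: the match is case-sensitive and only the final suffix is removed.
fn strip_gguf_sketch(name: &str) -> &str {
    name.strip_suffix(".gguf").unwrap_or(name)
}

/// Normalize a host string for URL construction and CLI arguments.
/// Assumptions: the input carries no port, and a display suffix has the
/// form " (…)", as in "localhost (127.0.0.1)".
fn clean_host_sketch(host: &str) -> String {
    // Keep only the text before an assumed display suffix.
    let bare = host.split(" (").next().unwrap_or("").trim();

    // Empty input falls back to the loopback address named in the docs.
    if bare.is_empty() {
        return "127.0.0.1".to_string();
    }

    // A bare IPv6 literal contains ':' and must be bracketed for URLs.
    if bare.contains(':') && !bare.starts_with('[') {
        return format!("[{}]", bare);
    }

    bare.to_string()
}
```

Under these assumptions, `clean_host_sketch("::1")` gives `[::1]`, an empty string gives `127.0.0.1`, and `strip_gguf_sketch("model.gguf")` gives `model`. A host that already includes a port would need separate handling, since the colon check would treat it as IPv6.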