Module models

BenchTuneConfig
BenchTuneMetrics
BenchTuneParam
BenchTuneParamValue
BenchTuneResult
DiscoveredModel: A discovered model file.
DownloadState: Download progress information.
GPUBuffer: GPU device buffer reported by llama-server during model loading.
GgufMetadata: Parsed GGUF metadata for a model, cached to avoid re-parsing the file.
LoadProgress: Progress information during model loading, parsed from llama-server log output.
ModelSettings: Settings for loading a model via llama.cpp server.
Samplers: Sampler order string (semicolon-separated). Common types: penalties, dry, top_n_sigma, top_k, typ_p, top_p, min_p, xtc, temperature.
SearchResult: A model found via HuggingFace search.
ServerMetrics: Metrics reported by the llama.cpp server.
WsMetrics: WebSocket-friendly metrics snapshot (serializable, no internal state).

Backend: Backend used to run the llama.cpp server.
BenchTuneMode
BenchTuneProgress: Progress status for benchmark tuning.
BenchTuneStatus
CacheQuantType: KV cache quantization type.
CacheType: Main KV cache data type.
DownloadStatus
GpuLayersMode: How to handle GPU layer offloading.
Mirostat: Mirostat version.
ModelState: The state of a model in the manager.
NumMode: NUMA optimization mode.
RopeScaling: RoPE frequency scaling method.
SearchSort: Sort order for search results.
ServerMode: Server mode: normal (single model) or router (multiple models).
SplitMode: Split mode for multi-GPU.

BENCHMARK_PROMPT: Default benchmark prompt used when starting a tuning session.

clean_host: Ensure host string is valid for URL construction and CLI arguments. Handles empty strings (defaults to 127.0.0.1), strips display suffixes, and wraps IPv6 addresses in brackets.
estimate_vram_mib: Estimate VRAM usage (in MiB) for a model with the given settings.
format_host: Format a host string for display (e.g. “” or “127.0.0.1” -> “localhost (127.0.0.1)”).
strip_gguf: Strip the .gguf extension from a model name.

CacheTypeK
CacheTypeV
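
The host and name helpers listed above (clean_host, format_host, strip_gguf) are described only by their one-line summaries, so the following is a minimal sketch of the described behavior, not the crate's actual code. Every function name ending in _sketch is hypothetical. Two points are assumptions: that a "display suffix" means a trailing " (...)" group like the one format_host produces, and that hosts other than “” and “127.0.0.1” are displayed unchanged.

```rust
use std::net::Ipv6Addr;

/// Hypothetical stand-in for `clean_host`; the real function may differ.
fn clean_host_sketch(host: &str) -> String {
    // Assumption: a display suffix is a trailing " (...)" group,
    // e.g. the " (127.0.0.1)" in "localhost (127.0.0.1)".
    let base = match host.find(" (") {
        Some(idx) => &host[..idx],
        None => host,
    };
    let base = base.trim();

    // Empty input falls back to the loopback address, as the summary states.
    if base.is_empty() {
        return "127.0.0.1".to_string();
    }

    // A bare IPv6 literal is wrapped in brackets so it can be used in a URL.
    // An already bracketed value does not parse as Ipv6Addr and is kept as-is.
    if base.parse::<Ipv6Addr>().is_ok() {
        return format!("[{base}]");
    }

    base.to_string()
}

/// Hypothetical stand-in for `format_host`.
fn format_host_sketch(host: &str) -> String {
    match host {
        // The two cases named in the summary.
        "" | "127.0.0.1" => "localhost (127.0.0.1)".to_string(),
        // Assumption: any other host is shown unchanged.
        other => other.to_string(),
    }
}

/// Hypothetical stand-in for `strip_gguf`.
/// Assumption: the match is case-sensitive and only a trailing ".gguf" is removed.
fn strip_gguf_sketch(name: &str) -> &str {
    name.strip_suffix(".gguf").unwrap_or(name)
}
```

Under these assumptions, calling clean_host_sketch on the output of format_host_sketch yields a value that is safe to place in a URL, and strip_gguf_sketch leaves names without the extension untouched.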
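
The Samplers entry describes a semicolon-separated sampler order string. As a rough illustration of that format only, the sketch below joins and splits such a string using plain string operations. The helper names and the COMMON_SAMPLERS constant are hypothetical and are not part of the Samplers type; the list of names is copied from the description above, and tolerating surrounding whitespace and empty segments is an assumption.

```rust
/// Sampler names taken from the `Samplers` description (not an exhaustive list).
const COMMON_SAMPLERS: [&str; 9] = [
    "penalties", "dry", "top_n_sigma", "top_k", "typ_p",
    "top_p", "min_p", "xtc", "temperature",
];

/// Hypothetical helper: build an order string from sampler names.
fn join_sampler_order(names: &[&str]) -> String {
    names.join(";")
}

/// Hypothetical helper: split an order string back into names.
/// Assumption: whitespace around names and empty segments are ignored.
fn parse_sampler_order(order: &str) -> Vec<&str> {
    order
        .split(';')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .collect()
}

/// Hypothetical helper: report names that are not in the common list.
fn unknown_samplers(order: &str) -> Vec<&str> {
    parse_sampler_order(order)
        .into_iter()
        .filter(|name| !COMMON_SAMPLERS.contains(name))
        .collect()
}
```

A check like unknown_samplers can flag typos in a user-supplied order before the string is handed to the server, though a name missing from the common list is not necessarily invalid.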