Module gguf_runtime

GGUF Runtime Adapter

Wraps the existing llama-server binary from llama.cpp to serve GGUF models. The adapter spawns a llama-server process and proxies requests to it over HTTP.
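The spawn-and-proxy pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the actual `GGUFRuntime` implementation: `ServerConfig`, `build_server_command`, `build_post_request`, and `forward` are hypothetical names, and the example assumes llama-server is started with its `-m`, `--host`, and `--port` options and reached over plain HTTP/1.1.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::process::{Command, Stdio};

/// Hypothetical settings for the child process; these are NOT the real
/// fields of `GGUFRuntime`, only an assumption for this sketch.
pub struct ServerConfig {
    pub binary: String,     // e.g. "llama-server"
    pub model_path: String, // path to a .gguf file
    pub host: String,
    pub port: u16,
}

/// Build (but do not start) the llama-server command line.
pub fn build_server_command(cfg: &ServerConfig) -> Command {
    let mut cmd = Command::new(&cfg.binary);
    cmd.arg("-m")
        .arg(&cfg.model_path)
        .arg("--host")
        .arg(&cfg.host)
        .arg("--port")
        .arg(cfg.port.to_string())
        // Discard the child's output so it does not interleave with ours.
        .stdout(Stdio::null())
        .stderr(Stdio::null());
    cmd
}

/// Format a raw HTTP/1.1 POST request carrying a JSON body.
pub fn build_post_request(host: &str, port: u16, path: &str, body: &str) -> String {
    format!(
        "POST {} HTTP/1.1\r\nHost: {}:{}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        path,
        host,
        port,
        body.len(),
        body
    )
}

/// Forward one request to the running server and return the raw HTTP
/// response (status line, headers, and body) as text.
pub fn forward(cfg: &ServerConfig, path: &str, body: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect((cfg.host.as_str(), cfg.port))?;
    stream.write_all(build_post_request(&cfg.host, cfg.port, path, body).as_bytes())?;
    let mut response = String::new();
    // "Connection: close" makes the server close the socket when done,
    // so reading to end-of-stream terminates.
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

Calling `.spawn()` on the returned `Command` starts the process and yields a `std::process::Child`, which the owner should `kill()` and `wait()` on during shutdown. A real adapter would likely also wait until the server reports ready before calling `forward` with a path such as `/completion`, and would use an HTTP client crate rather than hand-written requests.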

Structs

GGUFRuntime