Skip to main content

Crate forgellm_runtime

Crate forgellm_runtime 

Source
Expand description

Forge Runtime — minimal inference runtime.

Provides KV cache management, token sampling, tokenizer integration, and an OpenAI-compatible API server.

Functions§

version
Placeholder for the runtime. Will be implemented in PR 7.