Crate catgrad_llm

LLM-specific code, such as tokenization and KV-cache logic, that (currently) has to live outside the model graph.

Modules

models
run: A stripped-down version of ModelRunner from the catgrad examples, intended for serving
serve: Abstract interfaces for serving LLMs
utils