Crate catgrad_llm

LLM-specific code, such as tokenization and KV-cache logic, that (currently) has to live outside the model graph.

Modules

models
run
A stripped-down version of ModelRunner from catgrad examples, intended for serving
serve
Abstract interfaces for serving LLMs
utils