Constrained decoding (structured outputs) for Large Language Models.
This crate enforces arbitrary context-free grammars on LLM output, enabling structured generation with negligible overhead (~50μs per token for a 128k tokenizer). Context-free grammars can be provided with a Lark-like syntax, with specialised support for JSON.
§Key types
- `ParserFactory` — compiles grammars and holds shared tokenizer state.
- `TokenParser` — built via `ParserFactory::create_parser()`; drives a single generation session.
- `Constraint` — main entry point wrapping a `TokenParser` and exposing the sampling-loop API.
§Usage pattern
1. Call `Constraint::compute_mask()` to obtain the set of allowed tokens. This may take >1 ms and is best run on a background thread.
2. Sample a token from the LLM using the mask.
3. Pass the sampled token to `Constraint::commit_token()` (very fast).
4. Repeat until a stop result is returned.
See the sample_parser crate for a complete usage example.
§Re-exports
§Modules
- `api`
- `earley` — This is the primary interface for llguidance, the one on which the others (FFI and LLInterpreter) are built. While not the cleanest of these interfaces, it is the most inclusive.
- `ffi`
- `output`
- `panic_utils`
- `substring`
§Macros
§Structs
- `CommitResult`
- `Constraint` — High-level entry point for constrained decoding.
- `GrammarBuilder`
- `Instant` — A measurement of a monotonically nondecreasing clock. Opaque and useful only with `Duration`.
- `JsonCompileOptions`
- `Logger`
- `Matcher` — This is meant to be used in server-side scenarios. The `Constraint` interface is more for usage in Python Guidance.
- `NodeRef`
- `ParserFactory` — Compiles grammars and holds shared tokenizer state.
- `StopController`
- `TokenParser` — Token-level parser that drives a single constrained-generation session.
§Functions
- `json_merge`
- `regex_to_lark` — Make sure the given regex can be used inside `/…/` in Lark syntax. Also, if `use_ascii.contains('d')`, replace `\d` with `[0-9]` and `\D` with `[^0-9]`. Similarly for `\w`/`\W` (`[0-9a-zA-Z_]`) and `\s`/`\S` (`[ \t\n\r\f\v]`). For standard Unicode Python 3 or Rust regex crate semantics, use `use_ascii = ""`. For JavaScript or JSON Schema semantics, use `use_ascii = "dw"`. For Python 2, or byte patterns in Python 3, use `use_ascii = "dws"`. More flags may be added in the future.
- `token_bytes_from_tokenizer_json` — Parse an HF `tokenizer.json` file and return the bytes for every token.