llmtask
Engine-agnostic structured-output abstraction for LLMs — Task trait, Grammar enum (JSON Schema, Lark, Regex), and the canonical ImageAnalysis data type. Decouples a prompt + grammar + parser from any specific inference backend.
§Overview
llmtask is the engine-agnostic side of every “structured output from an LLM” pipeline:
- Task: a trait carrying the four things every constrained-decoding call needs, namely a prompt, a borrowed schema (type Value), a grammar wrapper (the Grammar enum), and a typed parser (type Output, type ParseError). Engines accept any T: Task<Value = ...>, so a Task written once runs against any engine in the ecosystem (lfm, qwen, …) without translation.
- Grammar: an enum over the constrained-decoding surfaces real engines accept, namely JSON Schema (Grammar::JsonSchema, behind the json feature), Lark (Grammar::Lark), and Regex (Grammar::Regex(RegexGrammar), behind the regex feature). The Regex wrapper holds both the source pattern and a default-options compiled regex, guaranteeing that the engine grammar and local validation describe the same language. Engines pattern-match and return UnsupportedGrammar when they don’t speak a given variant; the caller can then route to a different backend.
- ImageAnalysis: the canonical single-image VLM output shape (scene category, description, subjects/objects/actions/mood/lighting lists, shot-type label, search tags). It lets multiple VLM engines (lfm, qwen) produce values of the same type, so downstream consumers can compare and merge results without per-engine adapters.
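To make the T: Task<Value = ...> bound concrete, here is a minimal, self-contained sketch. The Grammar enum, the Task trait, YesNoTask, and run below are local stand-ins written for illustration only; they mirror the method names used in this crate's example but are not the crate's actual definitions (the real Grammar is #[non_exhaustive] and feature-gated).

```rust
use std::fmt;

// Hypothetical stand-in for llmtask::Grammar (illustration only).
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Grammar {
    JsonSchema(String),
    Lark(String),
}

// Hypothetical stand-in for llmtask::Task: prompt + schema + grammar + parser.
trait Task {
    type Output;
    type Value;
    type ParseError: fmt::Debug;

    fn prompt(&self) -> &str;
    fn schema(&self) -> &Self::Value;
    fn grammar(&self) -> Grammar;
    fn parse(&self, raw: &str) -> Result<Self::Output, Self::ParseError>;
}

// A toy task whose schema is a Lark grammar held as a String.
struct YesNoTask {
    schema: String,
}

impl Task for YesNoTask {
    type Output = bool;
    type Value = String;
    type ParseError = String;

    fn prompt(&self) -> &str {
        "Answer yes or no."
    }

    fn schema(&self) -> &String {
        &self.schema
    }

    fn grammar(&self) -> Grammar {
        Grammar::Lark(self.schema.clone())
    }

    fn parse(&self, raw: &str) -> Result<bool, String> {
        match raw.trim() {
            "yes" => Ok(true),
            "no" => Ok(false),
            other => Err(format!("unexpected output {other:?}")),
        }
    }
}

// A toy engine entry point. The `Value = String` bound gives the engine
// typed access to the schema. `fake_model_output` stands in for whatever
// a real backend would decode under the grammar constraint.
fn run<T: Task<Value = String>>(
    task: &T,
    fake_model_output: &str,
) -> Result<T::Output, T::ParseError> {
    let schema: &String = task.schema();
    // A real engine would hand the prompt and grammar to its decoder here.
    let _request = format!("{}\n[grammar: {:?}]\n[schema: {}]", task.prompt(), task.grammar(), schema);
    task.parse(fake_model_output)
}
```

Because run is generic over the task and fixes only the Value type, any task with a String schema can be passed to it without the engine knowing the task's Output or ParseError in advance.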
§Why an engine-agnostic Task layer?
Without a shared trait, every “structured output” prompt re-implements the same plumbing per engine: prompt string, schema construction, parser, error type, validation predicate. The plumbing is engine-independent — the same ImageAnalysis JSON Schema works against lfm’s llguidance backend and qwen’s mistralrs backend; only the constraint API differs.
llmtask separates the engine-independent plumbing from the engine-specific constraint API:
                          ┌──────────────────────────┐
  YourTask: impl Task ──▶ │  llmtask::Task contract  │ ──▶ any engine that
                          │  prompt + Grammar        │     takes &impl Task
                          │  parse → Output          │
                          └──────────────────────────┘

A Task written today against a JSON Schema runs through lfm (llguidance) and qwen (mistralrs) unchanged. A Task returning a Lark grammar runs through lfm natively; qwen rejects it cleanly via Error::UnsupportedGrammar so the caller can dispatch elsewhere.
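The reject-and-dispatch flow described above can be sketched as follows. Everything in this block is a hypothetical stand-in: the Grammar enum, the kind() strings, the UnsupportedGrammar fields, the Engine trait, and the two toy engines are assumptions made for illustration and do not reproduce the real lfm, qwen, or llmtask APIs.

```rust
// Hypothetical stand-in for llmtask::Grammar (illustration only).
#[derive(Clone, Debug, PartialEq)]
enum Grammar {
    JsonSchema(String),
    Lark(String),
}

impl Grammar {
    // Assumed label per variant; the real kind() may return a different type.
    fn kind(&self) -> &'static str {
        match self {
            Grammar::JsonSchema(_) => "json_schema",
            Grammar::Lark(_) => "lark",
        }
    }
}

// Hypothetical error shape: the rejected kind plus what the engine accepts.
#[derive(Debug, PartialEq)]
struct UnsupportedGrammar {
    kind: &'static str,
    supported: &'static [&'static str],
}

// Hypothetical engine interface, reduced to the grammar-dispatch step.
trait Engine {
    fn name(&self) -> &'static str;
    fn run(&self, grammar: &Grammar) -> Result<String, UnsupportedGrammar>;
}

struct JsonOnlyEngine;
struct JsonAndLarkEngine;

impl Engine for JsonOnlyEngine {
    fn name(&self) -> &'static str {
        "json-only"
    }

    fn run(&self, grammar: &Grammar) -> Result<String, UnsupportedGrammar> {
        match grammar {
            Grammar::JsonSchema(_) => Ok(format!("{} handled {}", self.name(), grammar.kind())),
            // Catch-all arm: reject every variant this engine does not speak.
            other => Err(UnsupportedGrammar {
                kind: other.kind(),
                supported: &["json_schema"],
            }),
        }
    }
}

impl Engine for JsonAndLarkEngine {
    fn name(&self) -> &'static str {
        "json+lark"
    }

    fn run(&self, grammar: &Grammar) -> Result<String, UnsupportedGrammar> {
        match grammar {
            Grammar::JsonSchema(_) | Grammar::Lark(_) => {
                Ok(format!("{} handled {}", self.name(), grammar.kind()))
            }
        }
    }
}

// Caller-side routing: try engines in order and collect rejections,
// returning the first successful result.
fn route(engines: &[&dyn Engine], grammar: &Grammar) -> Result<String, Vec<UnsupportedGrammar>> {
    let mut rejections = Vec::new();
    for engine in engines {
        match engine.run(grammar) {
            Ok(out) => return Ok(out),
            Err(rejected) => rejections.push(rejected),
        }
    }
    Err(rejections)
}
```

Returning the collected rejections lets the caller report which grammar kinds each engine would have accepted when no engine can serve the task.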
§Features
- Engine-agnostic Task trait with associated Output, Value, and ParseError types. Engines bound to a specific schema kind get typed access (fn run<T: Task<Value = serde_json::Value>>).
- Three grammar surfaces in a single #[non_exhaustive] enum: JSON Schema (default), Lark, and pre-compiled regex::Regex. Engines pattern-match and route.
- UnsupportedGrammar error carrying the rejected variant’s kind() and the engine’s supported list, so callers can route to a different engine when one variant isn’t accepted.
- Optional json feature (default-on): Grammar::JsonSchema(serde_json::Value) plus the JsonParseError convenience type. Drop it via default-features = false, features = ["alloc"] (or "regex" / "serde", both of which imply alloc) to get a Lark-or-Regex-only build with no serde_json dep. NOTE: alloc is required to reach any public API; default-features = false alone exposes nothing.
- Optional regex feature: pre-compiled regex::Regex in the variant (validation enforced by the type), plus as_regex() / as_regex_pattern() helpers.
- Optional serde feature: Serialize / Deserialize on ImageAnalysis for downstream wire formats.
- Canonical ImageAnalysis: nine-field single-image VLM output shape with builder-style API (with_* / set_*), shared across the findit-studio engines.
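One caveat on the as_regex() helper listed above: an unanchored is_match check accepts substrings, whereas a constrained-decoding engine emits only full matches. The sketch below illustrates the difference using only the standard library. find_date is a hand-rolled stand-in for a regex find on the fixed pattern [0-9]{4}-[0-9]{2}-[0-9]{2}; all three function names are hypothetical and are not part of llmtask or the regex crate.

```rust
/// Hand-rolled stand-in for a regex `find` on the fixed pattern
/// `[0-9]{4}-[0-9]{2}-[0-9]{2}`: returns the byte span of the leftmost match.
fn find_date(s: &str) -> Option<(usize, usize)> {
    const LEN: usize = 10; // length of "YYYY-MM-DD"
    let bytes = s.as_bytes();
    if bytes.len() < LEN {
        return None;
    }
    (0..=bytes.len() - LEN).find_map(|start| {
        let window = &bytes[start..start + LEN];
        let ok = window.iter().enumerate().all(|(i, b)| match i {
            4 | 7 => *b == b'-',       // separator positions
            _ => b.is_ascii_digit(),   // digit positions
        });
        ok.then_some((start, start + LEN))
    })
}

/// Unanchored check: true if the pattern occurs anywhere in the input.
fn is_substring_match(s: &str) -> bool {
    find_date(s).is_some()
}

/// Full-match check: the leftmost match must span the entire input.
fn is_full_match(s: &str) -> bool {
    find_date(s) == Some((0, s.len()))
}
```

With these definitions, "abc2026-05-09xyz" passes the substring check but fails the full-match check, while "2026-05-09" passes both. Validating parsed output with the full-match form keeps local validation aligned with what the engine could actually have produced.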
§Example
// Marked `ignore` so doctests don't pull serde_derive under
// default features (`std + json` only). The example needs
// `--features serde` to compile (the `serde::Deserialize`
// derive); enable that on the dev-deps in any consumer who
// wants to lift this verbatim.
use std::sync::OnceLock;
use llmtask::{Grammar, ImageAnalysis, JsonParseError, Task};
use serde_json::{Value, json};
/// A minimal Task: "summarize what's in this image" as a JSON object
/// with a single `summary` string field.
struct SummaryTask;
impl Task for SummaryTask {
    type Output = String;
    type Value = Value;
    type ParseError = JsonParseError;

    fn prompt(&self) -> &str {
        "Reply with JSON: {\"summary\": \"<one sentence>\"}"
    }

    fn schema(&self) -> &Value {
        static SCHEMA: OnceLock<Value> = OnceLock::new();
        SCHEMA.get_or_init(|| json!({
            "type": "object",
            "properties": { "summary": { "type": "string" } },
            "required": ["summary"],
        }))
    }

    fn grammar(&self) -> Grammar {
        Grammar::JsonSchema(self.schema().clone())
    }

    fn parse(&self, raw: &str) -> Result<String, JsonParseError> {
        #[derive(serde::Deserialize)]
        struct R { summary: String }
        let r: R = serde_json::from_str(raw.trim())?;
        Ok(r.summary)
    }
}
// `&SummaryTask` now satisfies any engine taking `&impl Task<Value = Value>`:
// lfm::Engine::run(&SummaryTask, &images, &opts)
// qwen::Engine::run(&SummaryTask, images).await
// And the canonical multi-field VLM output type lives right here:
let _empty = ImageAnalysis::new();

A regex-only Task (no JSON dep at all):
// Marked `ignore` so doctests don't try to compile this under
// default features (the `regex` feature is opt-in).
use llmtask::{Grammar, Task};
use smol_str::SmolStr;
struct TimestampTask {
    // Source pattern as the canonical Value — what engines like
    // llguidance receive (anchor-implicit / full-match semantics).
    pattern: SmolStr,
    // Cached `Grammar` so `parse` can call `is_regex_full_match`
    // without rebuilding the grammar each call.
    grammar: Grammar,
}

#[derive(Debug)]
struct StringErr(String);

impl std::fmt::Display for StringErr {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(&self.0)
    }
}

impl std::error::Error for StringErr {}

impl Task for TimestampTask {
    type Output = String;
    type Value = SmolStr;
    type ParseError = StringErr;

    fn prompt(&self) -> &str { "Output a date in YYYY-MM-DD format." }
    fn schema(&self) -> &SmolStr { &self.pattern }
    fn grammar(&self) -> Grammar { self.grammar.clone() }

    fn parse(&self, raw: &str) -> Result<String, StringErr> {
        let trimmed = raw.trim();
        // `is_regex_full_match` is the engine-parity validator:
        // `find()` + span equality, so it agrees with llguidance's
        // anchor-implicit grammar. Bare `as_regex().is_match(...)` is
        // unanchored substring matching and would accept e.g.
        // `"abc2026-05-09xyz"` for `[0-9]{4}-[0-9]{2}-[0-9]{2}` — not
        // what the engine produced, so don't use it for validation.
        if self.grammar.is_regex_full_match(trimmed) != Some(true) {
            return Err(StringErr(format!("output {trimmed:?} does not match")));
        }
        Ok(trimmed.to_string())
    }
}
§Installation
[dependencies]
# The `llmtask = ...` lines below are alternatives; keep exactly one.
# Default: JSON Schema support on, Lark always available, regex off, serde off.
llmtask = "0.1"
# Lark-only build (no serde_json, no regex):
# `alloc` is required — without it the public API is empty.
llmtask = { version = "0.1", default-features = false, features = ["alloc"] }
# Regex-only build (no serde_json; `regex` already implies `alloc`):
llmtask = { version = "0.1", default-features = false, features = ["regex"] }
# Everything:
llmtask = { version = "0.1", features = ["json", "regex", "serde"] }

| Feature | Default | What it adds |
|---|---|---|
| json | yes | Grammar::JsonSchema(serde_json::Value) variant + JsonParseError + the serde_json dep |
| regex | no | Grammar::Regex(RegexGrammar) variant + validating Grammar::regex constructor + regex dep |
| serde | no | Serialize / Deserialize on ImageAnalysis |
§MSRV
Rust 1.95.
§License
llmtask is under the terms of both the MIT license and the
Apache License (Version 2.0).
See LICENSE-APACHE, LICENSE-MIT for details.
Copyright (c) 2026 FinDIT Studio authors.
Re-exports§
- pub use grammar::Grammar; (std or alloc)
- pub use grammar::UnsupportedGrammar; (std or alloc)
- pub use image_analysis::ImageAnalysis; (std or alloc)
- pub use task::Task; (std or alloc)
- pub use task::JsonParseError; (json and (std or alloc))
Modules§
- grammar (std or alloc): Grammar, the engine-agnostic constrained-decoding grammar.
- image_analysis (std or alloc): ImageAnalysis, the canonical single-image VLM output type, shared across qwen and lfm engines. Each engine’s ImageAnalysisTask constructs values of this type; downstream consumers can pass &ImageAnalysis references between engine outputs without conversion.
- task (std or alloc): Task trait and parse-error types, the cross-engine abstraction.