Crate prosaic_core

General-purpose natural language generation from structured data.

Takes structured events and produces natural-sounding prose, not just grammatically correct output. The engine tracks discourse state across calls, so multiple renders flow together like human-written prose — using pronouns, varying phrasing, matching verbosity to impact, and structuring multi-paragraph narratives.

English, Spanish, and German grammars ship out of the box via the prosaic-grammar-en, -es, and -de sibling crates. Add more languages by implementing the Language trait.

§Quick start

use prosaic_core::{Context, Engine, Session, Strictness, Value, Variation};
use prosaic_grammar_en::English;

let mut engine = Engine::new(English::new())
    .strictness(Strictness::Strict)
    .variation(Variation::Fixed);

engine.register_template(
    "entity.renamed",
    "{old_name|refer} was renamed to {new_name}",
).unwrap();

let mut ctx = Context::new();
ctx.insert("entity_type", Value::String("class".into()));
ctx.insert("old_name", Value::String("Foo".into()));
ctx.insert("new_name", Value::String("Foobar".into()));

let mut session = Session::new();
let sentence = engine.render(&mut session, "entity.renamed", &ctx).unwrap();
assert_eq!(sentence, "The class Foo was renamed to Foobar.");
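
The `{old_name|refer}` pipe above depends on discourse state carried in the `Session`. As a rough illustration of that idea only, the following std-only sketch tracks how often an entity has been mentioned and shortens the reference on later mentions. `ToySession` and its `refer` method are hypothetical stand-ins, not the crate's API, and the cut-offs (full form, then short form, then pronoun) are assumptions chosen for the example.

```rust
use std::collections::HashMap;

/// Toy discourse tracker (hypothetical, not part of prosaic_core):
/// counts how often each entity name has been mentioned so far.
#[derive(Default)]
struct ToySession {
    mentions: HashMap<String, u32>,
}

impl ToySession {
    /// Hypothetical stand-in for a `refer`-style pipe.
    /// 1st mention: full form, 2nd mention: short form, later: pronoun.
    fn refer(&mut self, entity_type: &str, name: &str) -> String {
        let count = self.mentions.entry(name.to_string()).or_insert(0);
        *count += 1;
        match *count {
            1 => format!("the {entity_type} {name}"),
            2 => name.to_string(),
            _ => "it".to_string(),
        }
    }
}
```

Calling `refer("class", "Foo")` three times on one `ToySession` yields "the class Foo", then "Foo", then "it". A real implementation would also have to account for competing entities mentioned in between, which this sketch ignores.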

§Type-aware template validation

Context types that derive IntoContext also get a HasProsaicSchema impl for free. Pair it with the context: argument of prosaic_template! to validate slot types at compile time:

use prosaic_core::{Engine, Session};
use prosaic_derive::{IntoContext, prosaic_template};
use prosaic_grammar_en::English;

#[derive(IntoContext)]
struct RenameCtx {
    old_name: String,
    new_name: String,
    consumer_count: i64,
}

// Compile error if `consumer_count` were declared as `String`:
let tpl = prosaic_template! {
    template: "{old_name} → {new_name} ({consumer_count|pluralize:consumer})",
    slots: [old_name, new_name, consumer_count],
    context: RenameCtx,
};

let mut engine = Engine::new(English::new());
engine.register_template("rename", tpl).unwrap();

For templates loaded dynamically (JSON manifests, on-disk sources), use Engine::register_template_with_schema to get the same check at registration time.
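
To make the compile-time check more concrete, here is a minimal std-only sketch of how a `const fn` schema lookup can back a compile-time assertion. It is an assumption about the general technique, not the code the macro emits: `SlotType`, `RENAME_SCHEMA`, `str_eq`, `lookup`, and `is_number` are all hypothetical names invented for the example.

```rust
/// Hypothetical slot types for the sketch (not the crate's `ValueType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SlotType {
    Text,
    Number,
}

/// Byte-wise string comparison, because `==` on `str` is not usable in `const fn`.
const fn str_eq(a: &str, b: &str) -> bool {
    let (a, b) = (a.as_bytes(), b.as_bytes());
    if a.len() != b.len() {
        return false;
    }
    let mut i = 0;
    while i < a.len() {
        if a[i] != b[i] {
            return false;
        }
        i += 1;
    }
    true
}

/// Find a slot's declared type in a schema slice, usable at compile time.
const fn lookup(schema: &[(&str, SlotType)], name: &str) -> Option<SlotType> {
    let mut i = 0;
    while i < schema.len() {
        if str_eq(schema[i].0, name) {
            return Some(schema[i].1);
        }
        i += 1;
    }
    None
}

const fn is_number(t: Option<SlotType>) -> bool {
    matches!(t, Some(SlotType::Number))
}

/// Hypothetical schema mirroring the `RenameCtx` fields above.
const RENAME_SCHEMA: &[(&str, SlotType)] = &[
    ("old_name", SlotType::Text),
    ("new_name", SlotType::Text),
    ("consumer_count", SlotType::Number),
];

// Evaluated by the compiler: the build fails if the slot is not numeric.
const _: () = assert!(is_number(lookup(RENAME_SCHEMA, "consumer_count")));
```

Changing the `consumer_count` entry to `SlotType::Text` turns the last line into a compile error, which is the behaviour the comment in the example above describes.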

§Feature flags

  • std (default): std::error::Error on ProsaicError, SystemTime::now() fallbacks. Disable for no_std + alloc targets.
  • time (default): {ts|relative} and {ts|since_last} pipes.
  • polish (default): sentence-length budgeting and smart quotes.
  • reg (default): referring expression generation (Dale-Reiter + graph-based).
  • serde (off): Serialize/Deserialize on public types.
  • parallel (off): DocumentPlan::render_parallel via rayon.
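
The reg feature names the Dale-Reiter approach to referring expression generation. For readers unfamiliar with it, the sketch below shows the textbook Incremental Algorithm in std-only Rust: walk a preferred attribute order and keep each attribute of the target that rules out at least one remaining distractor. The `Entity` alias, `entity` helper, and `incremental_description` function are hypothetical and do not reflect the signature of the crate's `distinguishing_attributes`.

```rust
use std::collections::HashMap;

/// Hypothetical entity representation: attribute name -> value.
type Entity = HashMap<&'static str, &'static str>;

/// Convenience constructor for the sketch.
fn entity(pairs: &[(&'static str, &'static str)]) -> Entity {
    pairs.iter().copied().collect()
}

/// Textbook Incremental Algorithm: returns the attributes that single out
/// `target`, or `None` if some distractor cannot be ruled out.
fn incremental_description(
    target: &Entity,
    distractors: &[Entity],
    preferred: &[&'static str],
) -> Option<Vec<(&'static str, &'static str)>> {
    let mut remaining: Vec<&Entity> = distractors.iter().collect();
    let mut description = Vec::new();
    for &attr in preferred {
        if remaining.is_empty() {
            break;
        }
        if let Some(&value) = target.get(attr) {
            let before = remaining.len();
            // Keep only distractors that share the target's value.
            remaining.retain(|d| d.get(attr) == Some(&value));
            // The attribute is useful only if it eliminated something.
            if remaining.len() < before {
                description.push((attr, value));
            }
        }
    }
    if remaining.is_empty() {
        Some(description)
    } else {
        None
    }
}
```

With a public class as target and two distractors (a public function and a private class), the preferred order `["type", "visibility", "module"]` selects `type` and `visibility` and never needs `module`.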

§no_std support

Disable the std feature to compile under no_std + alloc:

prosaic-core = { version = "0.2", default-features = false }

Without the std feature:

  • Variation::Random falls back to Variation::Fixed (variant 0).
  • {ts|relative} and {ts|since_last} require engine.reference_time().
  • ProsaicError does not implement std::error::Error.
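
The second point above follows from the lack of a system clock under no_std: a relative phrase can only be computed against a reference time the caller supplies. The following std-only sketch illustrates that dependency. `describe_relative` and `unit_phrase` are hypothetical helpers, the wording they produce is not the crate's, and how the crate reports a missing reference time is not stated here.

```rust
/// Hypothetical helper: describe `ts` relative to an explicit `reference`,
/// both given as seconds since some shared epoch. Fails when no reference
/// time is available, as would be the case without a system clock.
fn describe_relative(ts: i64, reference: Option<i64>) -> Result<String, &'static str> {
    let now = reference.ok_or("no reference time set")?;
    // Positive difference means `ts` lies in the past.
    let diff = now.saturating_sub(ts);
    let phrase = match diff {
        0 => "just now".to_string(),
        d if d > 0 => format!("{} ago", unit_phrase(d)),
        d => format!("in {}", unit_phrase(d.saturating_neg())),
    };
    Ok(phrase)
}

/// Render a positive number of seconds in the largest fitting unit.
fn unit_phrase(secs: i64) -> String {
    let (n, unit) = if secs < 60 {
        (secs, "second")
    } else if secs < 3_600 {
        (secs / 60, "minute")
    } else if secs < 86_400 {
        (secs / 3_600, "hour")
    } else {
        (secs / 86_400, "day")
    };
    if n == 1 {
        format!("1 {unit}")
    } else {
        format!("{n} {unit}s")
    }
}
```

Passing `None` as the reference returns an error instead of guessing the current time, which keeps the function deterministic on targets without a clock.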

Re-exports§

pub use agreement::AgreementFeatures;
pub use agreement::AgreementPerson;
pub use agreement::Animacy;
pub use agreement::Case;
pub use agreement::Definiteness;
pub use agreement::Gender;
pub use agreement::Number as GrammaticalNumber;
pub use rst::RstRelation;

Modules§

agreement
Grammatical agreement features for multilingual rendering.
rst
Rhetorical Structure Theory relations for discourse labeling.

Macros§

assert_faithful
Assert that a rendered output is faithful to its context and template.
ctx
Build a Context from key/value pairs. Values may be any type that implements IntoValue: &str, String, integer types, bool, Vec<&str>, [&str; N], or an explicit Value::*.

Structs§

AntonymRegistry
Registry of phrase-level antonym substitutions used for positive framings of negative statements.
Cf
A forward-looking center: an entity realized in an utterance with its grammatical-role-based salience rank (lower = more prominent).
Clause
A subordinate clause attached to a sentence.
ConnectiveFamilySaturation
Detects when one connective family (continuation, similarity, contrast) emits more than its document-scope budget. Per the existing engine trailing-window cap, each family caps at the size of its pool inside any FAMILY_WINDOW span; this diagnoser aggregates across the whole document and fires when the cumulative count exceeds the max_per_family budget.
ConnectivePreferences
Per-RST-relation connective preferences.
Context
Holds the key-value pairs passed to a template for rendering.
Diagnostic
A failure pattern detected over a rendered document. Carries a severity for the scorer and a hint about what constraints would address it.
DiscourseState
Tracks discourse state across multiple render calls for natural output.
DocumentPlan
A planned document — a structured narrative with paragraph breaks.
DocumentScopeRhythm
Detects when sentence-length variance across the whole document drops below min_stdev words. The per-decision rhythm scorer can land each individual sentence inside a healthy local window while the aggregate flattens to a monotone cadence — this catches the latter.
Engine
The core NLG engine. Holds a language implementation, template registry, and immutable configuration. All per-render mutable state lives in Session, which callers pass into render methods.
EntityDescriptor
A described entity. Attributes are intentionally ordered so the default preference ordering respects registration order.
EntityRegistry
Registry of descriptors for entities known to the engine.
EntityValue
Fluent builder for entity-typed context values with agreement features.
FaithfulnessScore
A faithfulness score for a rendered hypothesis against its source.
HedgingCalibration
Hedging calibration — shifts the deterministic confidence-to-hedge mapping.
LengthDistribution
Sentence-length distribution target.
ListStyleFatigue
Detects when the same ListStyle dominates the document’s list-style emissions. Fires when one style accounts for ≥ threshold of the most recent window emissions, with a minimum-emissions gate.
Paragraph
A paragraph in a document plan — a group of related events rendered together.
ParagraphOpenerMonotony
Detects when the same connective opens ≥ threshold of the document’s paragraphs. Default threshold is 3 with a minimum-paragraphs gate of 4 — fires when at least 3 of 4+ paragraphs share an opener, never on short documents where the pattern is statistically meaningless.
Pipe
A pipe transform applied to a slot value.
PolarityDrift
A single polarity token whose count differs between source and hypothesis.
ProfileDistributionDrift
Active only when a StyleProfile is provided. Detects when any of the profile’s target distributions (length, list-style, connective frequency) diverges from observed by more than delta.
RefineConfig
Configuration for the retrospective refine pass on DocumentPlan::render.
RefineOutcome
Outcome of one refine pass: the final flattened text, the iterations that ran, and the final composite score. Returned by crate::DocumentPlan::render_refined.
RefineWeights
Composite-scorer weights. Documented defaults are produced via RefineWeights::default and serve as the v1 hand-tuned baseline. A future offline tuner (see docs/plans/refine-scorer-tuner.md) will emit alternative weight sets for projects that have a reason to deviate.
RenderExplanation
Per-render diagnostics — everything the engine decided along the way to produce the final output. Returned by Engine::render_explained. Useful for template-author debugging (“why did variant B win?”) and for vocab-module linting.
RenderIter
Iterator returned by Engine::render_iter. Wraps the batch rendering logic so callers can consume the output sentence-by-sentence without waiting for the full batch to complete.
RenderedDocument
A structured view of a rendered document.
RenderedParagraph
RenderedSentence
RstRelationImbalance
Detects when one RST relation accounts for more than max_share of the document’s inter-sentence connectives. Default max_share is 0.6 (60%); minimum-emissions gate prevents short-document false positives.
SalienceThresholds
Thresholds for automatic salience derivation from context.
Sentence
Builder for constructing sentences programmatically.
Session
Mutable state for a render sequence. See module docs.
StyleProfile
A declarative voice configuration for the engine.
StyleProfileBuilder
Builder for StyleProfile.
SubgraphDescription
Output of the graph-based REG algorithm.
Subject
A subject in a sentence, optionally with an entity type prefix.
SynonymRegistry
Registry of synonym groups. Each group is an ordered list; ties in recency are broken by registration order (first-registered wins).
Template
A parsed template ready for rendering.
UsedConnective
UsedListStyle
VariantScore
Diagnostic output from Engine::score_variants: one entry per variant that would be considered for the given key and context, with the choose-best score the engine would assign and a flag marking the variant that render() would currently emit.
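
Several of the structs above are document-level detectors. As a rough sketch of the kind of check they describe, the std-only code below reimplements two of them from their summaries: an opener-monotony test with a minimum-paragraphs gate, and a population standard deviation over sentence lengths in words. `monotonous_opener` and `length_stdev` are hypothetical functions; the actual detectors implement the `Diagnoser` trait, whose methods are not shown on this page.

```rust
/// Returns the opener shared by at least `threshold` paragraphs, or `None`.
/// Documents with fewer than `min_paragraphs` paragraphs never fire.
fn monotonous_opener<'a>(
    openers: &[&'a str],
    threshold: usize,
    min_paragraphs: usize,
) -> Option<&'a str> {
    if openers.len() < min_paragraphs {
        return None;
    }
    // Quadratic scan; fine for the handful of paragraphs in a document.
    openers.iter().copied().find(|candidate| {
        openers.iter().filter(|o| *o == candidate).count() >= threshold
    })
}

/// Population standard deviation of sentence lengths, measured in words.
/// A value below some `min_stdev` would indicate a monotone cadence.
fn length_stdev(sentences: &[&str]) -> f64 {
    let lens: Vec<f64> = sentences
        .iter()
        .map(|s| s.split_whitespace().count() as f64)
        .collect();
    if lens.is_empty() {
        return 0.0;
    }
    let n = lens.len() as f64;
    let mean = lens.iter().sum::<f64>() / n;
    let variance = lens.iter().map(|l| (l - mean).powi(2)).sum::<f64>() / n;
    variance.sqrt()
}
```

With the documented defaults (threshold 3, gate 4), three "However" openers among four paragraphs fire, while the same three openers in a three-paragraph document do not.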

Enums§

Aspect
Grammatical aspect — whether the action is simple, completed, or ongoing.
BareSegment
A segment from a bare-slot-only template decomposition.
Conjunction
Conjunction used when joining lists.
GroupingStrategy
How events should be grouped into paragraphs.
HedgeMode
Which form the hedge should take.
ListStyle
List formatting style.
ListStyleBias
List-style cycle tiebreaker.
Mood
Grammatical mood — indicative (factual) vs conditional (“would …”).
Person
Grammatical person for conjugation.
PipeArg
An argument passed to a pipe transform.
PluralCategory
CLDR plural categories. A language’s Language::plural_category implementation maps an integer count into one of these six buckets. The subset a language actually uses depends on its grammar:
PronounDensity
Pronoun density dial — adjusts the threshold at which {name|refer} switches from full form to short form to pronoun.
ProsaicError
QuantifyMode
How the quantifier should be framed.
ReferenceForm
How an entity should be referred to based on discourse context.
RefineConstraint
An adversarial constraint applied to one refinement iteration. Constraints are additive within an iteration but never persist across iterations — each new iteration re-derives them from the latest diagnosis.
RegAlgorithm
Selects the REG (Referring Expression Generation) algorithm used by the {name|refer} pipe when rendering the Full form of a reference.
RhetoricalCategory
Rhetorical classification of an event based on its template key.
Salience
Salience level of an event — how much detail/emphasis it deserves.
SalienceBias
Salience-threshold bias.
Strictness
Controls how missing slots are handled during rendering.
StyleProfileError
Validation error returned by StyleProfile::validate and StyleProfileBuilder::build.
Tense
Verb tense for conjugation.
Transition
Centering Theory transition class between consecutive utterances.
Value
A value that can be inserted into a rendering context.
ValueType
A linguistic type that a template slot or pipe can carry.
Variation
Controls how template alternatives are selected.
VerbForm
A fully-specified verb form. Convenience enum covering the most common tense × aspect × mood combinations in English. Use with Language::verb_phrase or template {…|verb:<form>} pipes.
Verbosity
Verbosity dial — biases salience-tier preference at variant selection.
Voice
Voice controls whether the verb is rendered in active or passive form.
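
The PluralCategory entry above notes that each language uses only a subset of the six CLDR buckets. The sketch below illustrates this for integer counts with two rule sets: English, which uses one and other, and Polish, which uses one, few, and many. `ToyPluralCategory` and both functions are hypothetical; Polish is chosen purely as an illustration and is not among the grammars this crate ships.

```rust
/// Hypothetical mirror of the six CLDR plural categories.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyPluralCategory {
    Zero,
    One,
    Two,
    Few,
    Many,
    Other,
}

/// English cardinal rule for integers: 1 is `One`, everything else `Other`.
fn english_category(n: u64) -> ToyPluralCategory {
    if n == 1 {
        ToyPluralCategory::One
    } else {
        ToyPluralCategory::Other
    }
}

/// Polish cardinal rule for integers: 1 is `One`; counts ending in 2..=4
/// (but not 12..=14) are `Few`; all remaining integers are `Many`.
fn polish_category(n: u64) -> ToyPluralCategory {
    let (last_digit, last_two) = (n % 10, n % 100);
    if n == 1 {
        ToyPluralCategory::One
    } else if (2..=4).contains(&last_digit) && !(12..=14).contains(&last_two) {
        ToyPluralCategory::Few
    } else {
        ToyPluralCategory::Many
    }
}
```

A pluralizing pipe can then pick the noun form by category rather than by comparing the count with 1, which is what makes the same template work across languages.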

Traits§

Diagnoser
Pluggable detector. Built-in diagnosers ship in prosaic_core::refine_diagnosers; external diagnosers register via RefineConfig::with_diagnoser.
HasProsaicSchema
Compile-time schema for a context type.
IntoContext
Convenience trait for converting types into Context.
IntoValue
Convert a host value into a Value. Used by the ctx! macro to accept strings, integers, and slices without requiring the caller to wrap each in a Value::* constructor.
Language
Trait abstracting over a natural language’s grammar rules.

Functions§

default_classifier
Default template-key classifier used by GroupingStrategy::ByAction.
distinguishing_attributes
Dale & Reiter’s Incremental Algorithm.
distinguishing_subgraph
Krahmer et al. 2003 graph-based greedy REG algorithm.
em_dash_nested_parentheticals
Promote comma-bounded parentheticals whose inner clause itself contains commas to em-dash-bounded parentheticals. Improves readability by disambiguating which comma closes the outer clause.
english_proportion
Produce the English proportion phrase for matching / total, optionally suffixed with a pluralized form of noun.
english_verb_phrase
Default English-style verb phrase composition. Provided as a free function so custom Language impls can delegate to it selectively (fn verb_phrase(…) { english_verb_phrase(self, …) }).
entity
Create an EntityValue for a named entity with default (unknown) agreement features. Chain builder methods to set gender, number, etc.
format_relative
Format a positive-is-past difference in seconds as a natural English phrase. Positive values mean the target is in the past ("yesterday"), negative values mean the future ("tomorrow"), zero means right now.
hedge
Map a 0..=100 score into a hedge word for the given mode. Values outside the range are clamped.
insert_not
Fallback negation: split the phrase after its first auxiliary and insert “not”. If the first word isn’t a recognizable auxiliary, prepend “not ”, which is ungrammatical on its own but better than silently losing the negation (callers should register an antonym for this case).
named
Create a plain subject without an entity type.
pipe_spec
Look up a pipe by name. const fn so it is usable inside const _: () = { ... } assertion blocks emitted by the prosaic_template! macro.
quantify
Produce a natural-language quantifier for count using the given mode. The language is used to spell out small numbers (“two”, “three”, …) via its number_to_words method.
schema_lookup
Look up a slot’s declared ValueType in a HasProsaicSchema-style schema slice. const fn so it is usable inside compile-time assertion blocks emitted by the prosaic_template! macro.
score_document
Compute the composite score for document under weights and profile. Returns a value in [0.0, sum_of_weights]. Higher is better.
score_faithfulness
Score the faithfulness of a rendered output against the Context that produced it and the template_literals that anchored it.
smart_quotes
Convert straight quotes into curly (typographic) quotes using a simple open/close alternation. Honours nested and adjacent punctuation; the state machine is naive but robust enough for standard prose.
split_long
Try to split sentence so each piece fits within max_chars.
subject
Create a subject with an entity type prefix. For example, subject("class", "Foo") renders as “the class Foo”.
types_compatible
Returns true when a value of type actual can satisfy a slot or pipe that expects type expected. ValueType::Any is compatible with every concrete type in either direction; concrete types are compatible only with themselves.
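
The compatibility rule stated for types_compatible is small enough to restate in code. The sketch below is a std-only paraphrase of that one sentence, using a hypothetical three-variant `ToyType` in place of the crate's `ValueType`, whose actual variants are not listed on this page.

```rust
/// Hypothetical stand-in for `ValueType`, reduced to three variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyType {
    Any,
    Text,
    Number,
}

/// `Any` matches everything in either direction; concrete types match
/// only themselves.
fn toy_types_compatible(actual: ToyType, expected: ToyType) -> bool {
    actual == ToyType::Any || expected == ToyType::Any || actual == expected
}
```

Under this rule a `Number` value can fill an `Any` slot and an `Any` value can fill a `Number` slot, but `Text` never satisfies `Number`.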