The constrained-rewrite prompt for the LLM formatter (consumed by the Candle
façade in T7). system is hard restraint that holds at every level; the
per-level rule only widens which edits are permitted. Restraint is the
wording, so it lives here in the pure core, not in the inference façade.
Remove a self-correction: when a backtrack trigger appears AS A WHOLE PHRASE,
drop the words immediately preceding it (the spec’s >3-word-reduction guard:
only fire when at least 3 words precede the trigger, so we don’t nuke a short
true clause). Word-bounded so it never deletes content words it matched inside.
Apply spoken formatting commands deterministically. Padding the input with
spaces lets a command at the phrase start or end match too (the replacements
are space-delimited). Note: back-to-back identical commands (“new line new
line”) collapse to one — an accepted Plan-1 edge case.
Lowercase the first letter of text when it CONTINUES the previous block —
the previous block didn’t end a sentence (no terminal .!?) AND the first word
is an allow-listed continuation word. Whisper cases each segment as a fresh
sentence; this undoes the spurious mid-sentence capital when a sentence spans a
pause. Conservative by construction (only the allow-list; never a proper noun).
Format a pass-2 Whisper revise. Whisper already cased + punctuated, so this does
NOT re-capitalize sentence starts or force terminal punctuation (that re-creates
the per-segment mid-sentence capital). It applies only the spoken-command and
scratch that backtrack features and the continuation de-capitalizer.
The moat: accept a rewrite only if it preserves every content word from the
input, in order, adding/removing nothing but allowed fillers. Guards harm
(a substituted/dropped meaning word) rather than edit volume.
Group a long monologue into readable paragraphs (the High level). Existing blank
lines (a spoken “new paragraph”) are preserved as hard breaks; within each run, a
new paragraph starts at a sentence that opens with a discourse marker once the
current paragraph already holds MIN_SENTENCES_PER_PARA sentences, or unconditionally
once it reaches MAX_SENTENCES_PER_PARA — so breaks fall at thought shifts, never
after every sentence, and no paragraph runs on forever.
Build the per-level rewrite prompt for T7’s Candle façade. The Light rule keeps
filler removal to LEADING disfluencies only — mid-sentence you know/i mean
removal is deliberately NOT requested, because the content-word guard would
accept such drops (see the FILLERS note). T7 must preserve this restriction.
Apply whole-entry shaping for a level (called at session end on the joined clean
text). High paragraphizes; lower levels pass through (per-phrase Light already ran).
Remove Whisper’s non-speech tags. [...] spans go only when a known tag or an
all-caps event shape ([BLANK_AUDIO]); (...) spans go only when every inner
word is a sound word. Runs in the pre-layer (before the content-word guard), so
nothing it removes ever reaches the guard as a content-word change.