Expand description
Streaming filter that splits assistant text into a hidden
<thought> channel and visible content.
A host typically prompts the model to begin every turn with exactly
one <thought>...</thought> block (private scratch space — the typed
record this filter extracts and routes off the visible stream),
optionally followed by a short <narrate>...</narrate> sentence
(user-visible diary text), and then the tool call(s). Emitting
<thought> first preserves the audit record even if generation is
cut short before any user-visible token streams. This filter sits
in the OpenRouter SSE path and routes content tokens to the right
place as they arrive:
- Text outside any thinking tag flows through as visible text.
- Text inside a recognized hidden tag (
<thought>,<thinking>,<think>,<reasoning>,<reflection>) is buffered separately and surfaced viaThinkingTagStreamFilter::take_completed_thoughtwhen the closing tag arrives.
The filter is delta-aware: a tag may be split across SSE chunk
boundaries (<thi then nking> then hidden then </thinking>).
Ambiguous prefixes (anything starting with < that could be a
thinking tag) are buffered until they can be confirmed or rejected.
Why a fresh copy in clark-agent rather than reusing
clark_core::runtime_core::json_extract::ThinkingTagStreamFilter:
clark-agent is the lean loop crate (no redis, no chrono-tz, no
sentry); pulling in clark-core would bloat its compile graph by an
order of magnitude. The two implementations are kept narrow enough
that drift is cheap to spot in review.
Structs§
- Thinking
TagStream Filter - Streaming-aware filter that suppresses content inside thinking XML tags as deltas arrive token-by-token.
Functions§
- strip_
thinking_ tags - Remove XML-like thinking blocks from a complete string. Used as a
safety net by
crate::types::AssistantContent::plain_text’s callers when serializing assistant text back to the wire on the next request — defends against blocks that slipped past the streaming filter (provider buffering, malformed nesting, etc.).