# Harn provider capability matrix source fragments.
#
# The files under capability_sources/ are the source of truth for Harn's
# built-in provider/model capability rules. `harn providers build-capabilities`
# concatenates these fragments into llm/capabilities.toml, which is compiled
# into the VM with include_str!.
#
# One `[[provider.<name>]]` array entry per rule; first match wins per
# (provider, model). Place more specific `model_match` patterns before
# wildcards. `version_min = [major, minor]` narrows the match to a model
# ID whose `(major, minor)` version (parsed from the Anthropic / OpenAI
# naming schemes) is greater than or equal to the given tuple. A rule that
# sets `version_min` is skipped when no version can be parsed from the model ID.
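#
# Illustrative sketch of rule ordering. The model patterns and field values
# below are hypothetical, not the shipped rules; they only show a specific
# pattern placed ahead of the wildcard so that first-match-wins picks it:
#
#   [[provider.anthropic]]
#   model_match = "claude-opus-*"     # specific pattern first
#   version_min = [4, 5]              # hypothetical lower bound
#   defer_loading = true
#   tool_search = ["bm25", "regex"]
#
#   [[provider.anthropic]]
#   model_match = "*"                 # wildcard fallback last
#   native_tools = true
#   defer_loading = false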
#
# `[provider_family]` declares the sibling providers that inherit rules
# from a canonical family when they have no rule of their own (OpenRouter
# et al. speak the same Responses API and forward `tool_search` /
# `defer_loading` unchanged — they fall through to `[[provider.openai]]`
# by default).
#
# Users override or extend this table per-project via
# `[[capabilities.provider.<name>]]` entries in `harn.toml`. Project
# overrides are checked before the built-in rules for the same provider
# name and are authoritative on overlap.
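#
# A minimal sketch of a project override in `harn.toml`, assuming the same
# per-rule fields apply there. The model pattern and values are hypothetical:
#
#   [[capabilities.provider.openai]]
#   model_match = "my-finetune-*"     # hypothetical model ID pattern
#   native_tools = false
#   preferred_tool_format = "text"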
#
# Supported per-rule fields:
#   model_match : glob pattern matched against the lowercased model ID.
#   version_min : [major, minor] lower bound, provider-aware parse.
#   native_tools : whether the model accepts native tool-call wire shape.
#   message_wire_format:
#       shared helper wire format: openai, anthropic, gemini, or ollama.
#   native_tool_wire_format:
#       native tool definition shape: openai or anthropic.
#   defer_loading : whether `defer_loading: true` is honored on tool defs.
#   tool_search : list of native tool-search variants, preferred first.
#       Anthropic = ["bm25", "regex"];
#       OpenAI = ["hosted", "client"].
#   responses_api : whether Harn has a native provider path for OpenAI
#       Responses semantics on this route.
#   hosted_tools : provider-hosted tools Harn can pass through without
#       local execution.
#   remote_mcp : whether provider-hosted remote MCP connectors are
#       available.
#   conversation_state:
#       whether previous_response_id-style provider state is
#       available.
#   compaction : whether provider-side truncation/compaction controls are
#       available.
#   background_mode : whether provider-side background jobs are available.
#   tool_approval_policy:
#       approval policy story for provider-executed tools.
#   max_tools : cap on tool-definition count the provider will accept.
#       Used by harn-lint to warn about oversized registries.
#   prompt_caching : whether provider-side prompt caching is available.
#   cache_breakpoint_style:
#       explicit cache_control strategy: none, top_level, or last_block.
#   vision : whether Harn can send visual input blocks on this route.
#   audio_supported : whether Harn can send audio input blocks on this route.
#   pdf_supported : whether Harn can send PDF/document input blocks on this route.
#   video_supported : whether Harn can send video input blocks on this route.
#   files_api_supported:
#       whether file_id references from std/files::upload are accepted.
#   file_upload_wire_format:
#       file-upload API family for std/files.upload: anthropic or gemini.
#   structured_output: structured-output transport: native, tool_use, format_kw, none.
#   prefers_xml_scaffolding:
#       prompt sections should use XML tags (`<task>`, `<examples>`).
#   prefers_markdown_scaffolding:
#       prompt sections should use Markdown headings (`## Task`).
#   structured_output_mode:
#       preferred logical output shape: native_json, delimited, xml_tagged, none.
#   supports_assistant_prefill:
#       whether assistant-role prefill turns are accepted.
#   prefers_role_developer:
#       whether durable instructions should use `developer` role.
#   prefers_xml_tools:
#       whether text-rendered tool specs should use XML wrappers.
#   thinking_block_style:
#       preferred transcript thinking style: none, thinking_blocks,
#       reasoning_summary, inline.
#   thinking_modes : supported script-facing modes: enabled, adaptive, effort.
#   interleaved_thinking_supported:
#       whether `thinking` can opt Anthropic Messages API
#       requests into the interleaved-thinking beta header.
#   anthropic_beta_features:
#       unconditional Anthropic beta feature names to request
#       for this route.
#   vision_supported: whether image content blocks are accepted.
#   image_url_input_supported:
#       whether image content blocks may reference remote URLs.
#   preserve_thinking: whether prior <think> blocks should be carried forward.
#   server_parser : server-side response parser that transforms model output.
#   honors_chat_template_kwargs: whether chat_template_kwargs are honored.
#   requires_completion_tokens: whether to send max_completion_tokens instead of max_tokens.
#   reasoning_effort_supported: whether reasoning_effort is accepted.
#   reasoning_effort_levels:
#       accepted reasoning_effort values when the provider
#       accepts only a subset of Harn's neutral enum.
#   reasoning_none_supported: whether reasoning_effort="none" is accepted.
#   max_thinking_budget:
#       max thinkingBudget tokens for high/xhigh reasoning when
#       the provider takes an explicit token budget (native
#       Gemini API thinkingConfig). Differs per model
#       (2.5 Flash 24576, 2.5 Pro 32768).
#   reasoning_disable_supported:
#       whether `reasoning: {enabled:false}` is accepted when
#       the provider uses an enabled/disabled reasoning switch.
#   reasoning_required_for_tools:
#       whether the model calls tools *inside* its reasoning
#       channel, so disabling reasoning breaks tool calling
#       (the gpt-oss / Harmony quirk — opposite of Qwen3).
#       When true, reasoning_policy never resolves the auto
#       reasoning level to "off" for tool tasks (agent/code/
#       verify); it floors to the lowest supported effort.
#   reasoning_text_promotable:
#       whether a reasoning-only clean stop may be promoted into
#       visible text when the provider omits content.
#   reasoning_wire_format:
#       OpenAI-compatible non-standard reasoning transport:
#       openrouter, enabled, or minimax.
#   recommended_endpoint: preferred endpoint family for this route.
#   text_tool_wire_format_supported: whether Harn text tool calls survive.
#   preferred_tool_format:
#       default tool mode for this route: native or text.
#   tool_mode_parity:
#       empirical native/text interchangeability status:
#       interchangeable, native_unreliable, text_unreliable,
#       native_only, text_only, unknown.
#   tool_mode_parity_notes:
#       short explanation for known non-interchangeable modes.
#   thinking_disable_directive:
#       in-prompt directive (e.g. "/no_think" for Qwen3 chat
#       templates) that disables the model's thinking mode.
#       When set, Harn auto-prepends this to the system message
#       whenever the resolved `thinking` config is `Disabled`,
#       so script authors don't need to know provider-specific
#       prompt directives. Idempotent — never injected twice.
#   provider_route_denylist:
#       [openrouter only] DENYLIST of upstream sub-providers to
#       exclude for this route. Materialized into the request
#       body's `provider.ignore`. Use when a SPECIFIC upstream is
#       positively known to mis-serve a route while others are
#       fine (e.g. Ambient billing reasoning tokens then finishing
#       with empty tool_calls for qwen3.6). Prefer the allowlist
#       (openrouter_provider_order) when the bad upstreams are
#       intermittent/hard to enumerate.
#   openrouter_provider_order:
#       [openrouter only] ALLOWLIST of upstream sub-providers this
#       route is PINNED to, in preference order. Materialized into
#       `provider.order` + `allow_fallbacks:false`, so OpenRouter
#       only ever routes to these known-clean upstreams. Use this
#       for routes on OpenRouter's sub-provider lottery where the
#       bad upstreams are intermittent (e.g. openai/gpt-oss-*,
#       pinned to ["Cerebras","Groq"]). When both a pin and a
#       denylist are set the pin wins (a closed allowlist already
#       excludes everything else).
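#
# Sketch of a pinned OpenRouter route built from the fields above. This is
# illustrative only; the shipped row may carry additional fields:
#
#   [[provider.openrouter]]
#   model_match = "openai/gpt-oss-*"
#   reasoning_required_for_tools = true
#   openrouter_provider_order = ["Cerebras", "Groq"]
#
# Under the materialization described above, the request body would be
# expected to carry roughly this fragment:
#
#   "provider": { "order": ["Cerebras", "Groq"], "allow_fallbacks": false }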
#
# OPINIONATED PROVIDER/MODEL/CONFIG POLICY (enforced by the footgun gate in
# crate::llm::capability_audit, wired into `providers build-capabilities
# --check` / `make check-provider-capabilities`). Harn refuses to ship a matrix
# that declares a known footgun, so harness authors can't reach these states:
#
# * FORBIDDEN: reasoning_required_for_tools = true together with an
#   auto_reasoning_overrides that forces a tool task (agent/code/verify) to
#   "off". The two are contradictory — a model that calls tools inside its
#   reasoning channel emits 0 tool_calls (billed-noncommittal) when reasoning
#   is off. (The opposite Qwen quirk — reasoning-OFF-for-tools WITHOUT the
#   required-for-tools flag — is legitimate and allowed.)
#
# * FORBIDDEN: an `openrouter` route with reasoning_required_for_tools = true
#   (a Harmony-style tool route on the sub-provider lottery) that declares no
#   openrouter_provider_order pin. Some OpenRouter upstreams mis-serialize the
#   Harmony tool call even with reasoning ON, so such a route MUST pin a
#   closed allowlist of known-clean upstreams.
#
# * BLESSED (live-probed 2026-06-13, openai/gpt-oss-120b, reasoning effort
#   low): Cerebras and Groq serve Harmony tool calls cleanly (order-pinned
#   requests gave 0 billed-noncommittal); Together was flaky (1/3); the
#   free OpenRouter lottery fans out across ~17 upstreams. Hence the
#   openai/gpt-oss-* OpenRouter row is pinned to ["Cerebras","Groq"].