# Validation Rules
This document specifies all validation rules enforced by capns implementations. All implementations (Rust, JavaScript, server-side) MUST enforce these rules identically. No fallbacks, no exceptions.
## Overview
Validation is organized into categories:
1. **Cap URN Rules** - Rules for cap URN format (CU1-CU2)
2. **Cap Definition Rules** - Rules for capability definitions (RULE1-RULE12)
3. **Media Spec Rules** - Rules for media specifications (MS1-MS3)
4. **Cross-Validation Rules** - Rules for reference integrity (XV1-XV4)
For base Tagged URN rules (case handling, tag ordering, wildcards, quoting, character restrictions, etc.), see the [Tagged URN RULES.md](https://github.com/filegrind/tagged-urn-rs/blob/main/docs/RULES.md).
---
## Cap URN Rules
Cap URNs extend Tagged URNs with capability-specific requirements.
### CU1: Required Direction Specifiers
Cap URNs **must** include `in` and `out` tags that specify input/output media types:
```
cap:in="media:void";op=generate;out="media:object"
cap:in="media:binary";op=extract;out="media:object";target=metadata
```
- `in` - The input media type (what the cap accepts)
- `out` - The output media type (what the cap produces)
- Values must be valid Media URNs (starting with `media:`) or wildcard `*`
- Caps without `in` and `out` are invalid
**Error**: `Cap URN requires 'in' tag` / `Cap URN requires 'out' tag`
### CU2: Valid Media URN References
Direction specifier values (`in` and `out`) must be:
- A valid Media URN: `media:<type>;v=<version>[;profile=<profile>]`
- Or a wildcard: `*`
Invalid direction specifier values cause parsing to fail.
**Error**: `Invalid 'in' media URN: <value>. Must start with 'media:' or be '*'`
### URL Length Constraint
The URL `https://capns.org/{cap_urn}` must be valid, imposing practical length limits (~2000 characters).
---
## Matching Semantics
Cap matching follows Tagged URN matching semantics. See [MATCHING.md](./MATCHING.md) for full details.
### Per-Tag Value Semantics
| (missing) | No constraint | OK | OK | OK |
| `K=?` | No constraint (explicit) | OK | OK | OK |
| `K=!` | Must-not-have | OK | NO | NO |
| `K=*` | Must-have, any value | NO | OK | OK |
| `K=v` | Must-have, exact value | NO | OK | NO |
### Direction Specifier Matching
Direction specs (`in`/`out`) follow standard tag matching:
```
# Exact match
Cap: cap:in="media:binary";op=extract;out="media:object"
Request: cap:in="media:binary";op=extract;out="media:object"
Result: MATCH
# Direction mismatch
Cap: cap:in="media:binary";op=extract;out="media:object"
Request: cap:in="media:text";op=extract;out="media:object"
Result: NO MATCH (input types differ)
```
### Must-Have-Any Direction Specs
`*` in direction specs means "must have any value":
```
Cap: cap:in=*;op=convert;out=*
Request: cap:in="media:binary";op=convert;out="media:text"
Result: MATCH (request has in/out, cap accepts any values)
Cap: cap:in=*;op=convert;out=*
Request: cap:op=convert
Result: NO MATCH (cap requires in/out presence, request lacks them)
```
### Graded Specificity
Specificity uses graded scoring:
- Exact value (K=v): 3 points
- Must-have-any (K=*): 2 points
- Must-not-have (K=!): 1 point
- Unspecified (K=?) or missing: 0 points
Examples:
- `cap:in="media:binary";op=extract;out="media:object"` → 3+3+3 = 9
- `cap:in=*;op=extract;out=*` → 2+3+2 = 7
---
## Cap-Specific Error Codes
In addition to Tagged URN error codes:
| 10 | MissingInSpec | Cap URN missing required `in` tag |
| 11 | MissingOutSpec | Cap URN missing required `out` tag |
| 12 | InvalidMediaUrn | Direction spec value is not a valid Media URN |
---
## Cap Definition Rules (RULE1-RULE12)
These rules validate the `args` array in capability definitions.
### RULE1: No Duplicate Media URNs
**Description**: No two arguments in a capability may have the same `media_urn`.
**Rationale**: Each argument is uniquely identified by its `media_urn`. Duplicates cause ambiguity.
**Error**: `RULE1: Duplicate media_urn '<urn>'`
### RULE2: Sources Must Not Be Empty
**Description**: Every argument MUST have a non-empty `sources` array.
**Rationale**: An argument without sources cannot receive input.
**Error**: `RULE2: Argument '<media_urn>' has empty sources`
### RULE3: Identical Stdin Media URNs
**Description**: If multiple arguments have `stdin` sources, all stdin sources MUST specify identical `media_urn` values.
**Rationale**: There is only one stdin stream. Multiple args reading from stdin must expect the same media type.
**Error**: `RULE3: Multiple args have different stdin media_urns: '<urn1>' vs '<urn2>'`
### RULE4: No Duplicate Source Types Per Argument
**Description**: No argument may specify the same source type (stdin, position, cli_flag) more than once.
**Rationale**: Each source type represents a single input channel per argument.
**Error**: `RULE4: Argument '<media_urn>' has duplicate source type '<type>'`
### RULE5: No Duplicate Positions
**Description**: No two arguments may have the same positional index.
**Rationale**: Positional arguments must be unambiguous.
**Error**: `RULE5: Duplicate position <n> in argument '<media_urn>'`
### RULE6: Sequential Positions
**Description**: Positions must be sequential starting from 0 with no gaps.
**Rationale**: Ensures predictable positional argument ordering.
**Error**: `RULE6: Position gap - expected <n> but found <m>`
### RULE7: No Position and CLI Flag Combination
**Description**: No argument may have both a `position` source and a `cli_flag` source.
**Rationale**: An argument is either positional or named, not both. This prevents ambiguity in argument parsing.
**Error**: `RULE7: Argument '<media_urn>' has both position and cli_flag sources`
### RULE8: No Unknown Source Keys
**Description**: Source objects may only contain recognized keys: `stdin`, `position`, or `cli_flag`.
**Rationale**: Prevents typos and invalid configurations.
**Enforcement**: Validated by JSON Schema and deserialization.
**Error**: `RULE8: Argument '<media_urn>' has source with unknown keys`
### RULE9: No Duplicate CLI Flags
**Description**: No two arguments may have the same `cli_flag` value.
**Rationale**: CLI flags must uniquely identify arguments.
**Error**: `RULE9: Duplicate cli_flag '<flag>' in argument '<media_urn>'`
### RULE10: Reserved CLI Flags
**Description**: Certain CLI flags are reserved and cannot be used: `manifest`, `--help`, `--version`, `-v`, `-h`
**Rationale**: Reserved for system use.
**Error**: `RULE10: Argument '<media_urn>' uses reserved cli_flag '<flag>'`
### RULE11: CLI Flag Verbatim Usage
**Description**: CLI flags are used exactly as specified (no automatic prefixing).
**Enforcement**: By design - implementations use the flag string verbatim.
### RULE12: Media URN as Key
**Description**: Arguments are identified by `media_urn`, not a separate `name` field.
**Enforcement**: By schema - no `name` field allowed in argument definitions.
---
## Media Spec Rules
### MS1: Title Required
**Description**: Every media spec MUST have a `title` field.
**Rationale**: Titles provide human-readable identification for display and documentation.
**Error**: `Media spec '<urn>' has no title`
### MS2: Valid URN Format
**Description**: Media URNs MUST start with `media:` prefix.
**Rationale**: Distinguishes media specs from other URN types.
**Error**: `Invalid media URN: expected 'media:' prefix`
### MS3: Media Type Required
**Description**: Every media spec MUST have a `media_type` field (MIME type).
**Rationale**: Specifies the content type for proper handling.
**Error**: `Media spec '<urn>' has no media_type`
---
## Cross-Validation Rules
### XV1: No Duplicate Cap URNs
**Description**: No two capability definitions may have the same canonical Cap URN.
**Rationale**: Cap URNs uniquely identify capabilities.
**Error**: `Duplicate cap URN: <urn>`
### XV2: No Duplicate Media URNs (Global)
**Description**: No two standalone media spec definitions may have the same URN.
**Rationale**: Media URNs uniquely identify media specs in the global registry.
**Error**: `Duplicate media URN: <urn>`
### XV3: Media URN Resolution Required
**Description**: All media URNs referenced in capabilities MUST resolve to a defined media spec.
**Resolution Order**:
1. Capability's local `media_specs` table (inline definitions)
2. Standalone media spec files (global registry)
**No Fallbacks**: If a media URN cannot be resolved, validation FAILS. No auto-generation of specs.
**Checked Locations**:
- `urn.tags.in` (unless wildcard `*`)
- `urn.tags.out` (unless wildcard `*`)
- Every `args[].media_urn`
- `output.media_urn`
- Every `args[].sources[].stdin` (media URN in stdin source)
**Error**: `Unresolved media URN '<urn>' referenced in <location>`
### XV4: Inline Media Spec Title Required
**Description**: Media specs defined inline in capability `media_specs` tables MUST also have a `title` field.
**Rationale**: Same as MS1 - all media specs need titles regardless of definition location.
**Error**: `Inline media spec '<urn>' in <file> has no title`
### XV5: No Redefinition of Registry Media Specs
**Description**: Inline media specs in a capability's `media_specs` table MUST NOT redefine media specs that already exist in the global media registry.
**Rationale**: The global registry is the canonical source for standard media specs. Redefining them inline creates confusion, inconsistency, and potential for conflicting definitions. If a media spec exists in the registry, capabilities should reference it, not redefine it.
**Enforcement Behavior**:
- **With network access**: Strictly enforce. If any inline media URN matches a URN in the global registry, validation FAILS.
- **Without network access**: Check against cached/bundled specs only. If a conflict is found with cached specs, validation FAILS. If no conflict is found with cached specs but online registry is unreachable, log a warning and allow the operation to proceed (graceful degradation).
**Error**: `XV5: Inline media spec '<urn>' redefines existing registry spec`
**Warning (offline mode)**: `XV5: Could not verify inline spec '<urn>' against online registry (offline mode)`
---
## Implementation Requirements
### Fail Hard
All validation errors MUST cause immediate failure with a clear error message. No fallbacks, no silent recovery, no default values for required fields.
### Consistent Behavior
All implementations (Rust capns, JavaScript capns-js, server functions) MUST enforce identical rules with identical error messages.
### Order of Validation
1. Structural validation (JSON Schema)
2. Cap URN validation (CU1, CU2)
3. Cap definition rules (RULE1-RULE12)
4. Media spec rules (MS1-MS3)
5. Cross-validation (XV1-XV4)
---
## Error Code Summary
| 10-12 | Cap URN Errors |
| RULE1-RULE12 | Cap Args Validation |
| MS1-MS3 | Media Spec Validation |
| CU1-CU2 | Cap URN Validation |
| XV1-XV4 | Cross-Validation |
---
## Implementation Notes
- All implementations must validate presence of `in` and `out` tags
- All implementations must validate Media URN format for direction specs
- All implementations must include direction specs in matching logic
- Direction specs use quoted values to preserve Media URN case and special characters
---
## Changelog
- 2026-01-29: Added XV5 (No Redefinition of Registry Media Specs) validation
- Inline media specs must not redefine existing registry specs
- Strict enforcement with network, graceful degradation without
- 2024-01-29: Added comprehensive validation rules documentation
- Added RULE1-RULE12 for cap args validation
- Added MS1-MS3 for media spec validation
- Added XV1-XV4 for cross-validation
- Added RULE7 enforcement for position/cli_flag mutual exclusivity
- Added MS1 (title required) validation
- Added XV3 strict media URN resolution (no fallbacks)
- Added XV4 inline media spec title requirement