# Unilang Framework Specification
**Version:** 2.0.0
**Status:** Final
---
### 0. Introduction & Core Concepts
**Design Focus: `Strategic Context`**
This document is the single source of truth for the `unilang` framework. It defines the language, its components, and the responsibilities of its constituent crates.
#### 0.1. Scope: A Multi-Crate Framework
The Unilang specification governs a suite of related crates that work together to provide the full framework functionality. This document is the canonical specification for all of them. The primary crates are:
* **`unilang`**: The core framework crate that orchestrates parsing, semantic analysis, execution, and modality management.
* **`unilang_instruction_parser`**: A dedicated, low-level crate responsible for the lexical and syntactic analysis of the `unilang` command language (implements Section 2 of this spec).
* **`unilang_meta`**: A companion crate providing procedural macros to simplify compile-time command definition (implements parts of Section 3.4).
#### 0.2. Goals of `unilang`
`unilang` provides a unified way to define command-line utility interfaces once, automatically enabling consistent interaction across multiple modalities such as CLI, GUI, TUI, and Web APIs. The core goals are:
1. **Consistency:** A single way to define commands and their arguments, regardless of how they are presented or invoked.
2. **Discoverability:** Easy ways for users and systems to find available commands and understand their usage.
3. **Flexibility:** Support for various methods of command definition (compile-time, run-time, declarative, procedural).
4. **Extensibility:** Provide structures that enable an integrator to build an extensible system with compile-time `Extension Module`s and run-time command registration.
5. **Efficiency:** Support for efficient parsing and command dispatch. The architecture **must** support near-instantaneous lookup for large sets (100,000+) of statically defined commands by performing maximum work at compile time.
6. **Interoperability:** Standardized representation for commands, enabling integration with other tools or web services, including auto-generation of WEB endpoints.
7. **Robustness:** Clear error handling and validation mechanisms.
8. **Security:** Provide a framework for defining and enforcing secure command execution.
#### 0.3. System Actors
* **`Integrator (Developer)`**: The primary human actor who uses the `unilang` framework to build a `utility1` application. They define commands, write routines, and configure the system.
* **`End User`**: A human actor who interacts with the compiled `utility1` application through one of its exposed `Modalities` (e.g., CLI, GUI).
* **`Operating System`**: A system actor that provides the execution environment, including the CLI shell, file system, and environment variables that `utility1` consumes for configuration.
* **`External Service`**: Any external system (e.g., a database, a web API, another process) that a command `Routine` might interact with.
#### 0.4. Key Terminology (Ubiquitous Language)
* **`unilang`**: This specification and the core framework crate.
* **`utility1`**: A generic placeholder for the primary application that implements and interprets `unilang`.
* **`Command Lexicon`**: The complete set of all commands available to `utility1` at any given moment.
* **`Command Registry`**: The runtime data structure that implements the `Command Lexicon`.
* **`Command Manifest`**: An external file (e.g., in YAML or JSON format) that declares `CommandDefinition`s for runtime loading.
* **`Command`**: A specific action that can be invoked, identified by its `FullName`.
* **`FullName`**: The complete, unique, dot-separated path identifying a command (e.g., `.files.copy`).
* **`Namespace`**: A logical grouping for commands and other namespaces.
* **`CommandDefinition` / `ArgumentDefinition`**: The canonical metadata for a command or argument.
* **`Routine`**: The executable code (handler function) associated with a command. Its signature is `fn(VerifiedCommand, ExecutionContext) -> Result<OutputData, ErrorData>`.
* **`Modality`**: A specific way of interacting with `utility1` (e.g., CLI, GUI).
* **`parser::GenericInstruction`**: The output of the `unilang_instruction_parser`.
* **`VerifiedCommand`**: A command that has passed semantic analysis and is ready for execution.
* **`ExecutionContext`**: An object providing routines with access to global settings and services.
* **`OutputData` / `ErrorData`**: Standardized structures for returning success or failure results.
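The `Routine` signature above can be illustrated with a minimal sketch. The struct fields shown here (`full_name`, `arguments`, `verbose`, `payload`, `code`, `message`) and the helper `run_echo` are simplified assumptions made for illustration; they are not the framework's actual type definitions.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the framework types named above.
#[allow(dead_code)]
#[derive(Debug)]
struct VerifiedCommand {
    full_name: String,
    arguments: HashMap<String, String>,
}

#[allow(dead_code)]
#[derive(Debug, Default)]
struct ExecutionContext {
    verbose: bool,
}

#[derive(Debug, PartialEq)]
struct OutputData {
    payload: String,
}

#[derive(Debug, PartialEq)]
struct ErrorData {
    code: String,
    message: String,
}

// The routine signature as given in the terminology list.
type Routine = fn(VerifiedCommand, ExecutionContext) -> Result<OutputData, ErrorData>;

// An example routine for an illustrative `.string.echo` command.
fn echo(cmd: VerifiedCommand, _ctx: ExecutionContext) -> Result<OutputData, ErrorData> {
    match cmd.arguments.get("input-string") {
        Some(text) => Ok(OutputData { payload: text.clone() }),
        None => Err(ErrorData {
            code: "UNILANG_ARGUMENT_MISSING".to_string(),
            message: format!("{} requires `input-string`", cmd.full_name),
        }),
    }
}

// Hypothetical helper: builds a command by hand and invokes the routine.
fn run_echo(input: Option<&str>) -> Result<OutputData, ErrorData> {
    let mut arguments = HashMap::new();
    if let Some(text) = input {
        arguments.insert("input-string".to_string(), text.to_string());
    }
    let routine: Routine = echo;
    routine(
        VerifiedCommand { full_name: ".string.echo".to_string(), arguments },
        ExecutionContext::default(),
    )
}

fn main() {
    println!("{:?}", run_echo(Some("Hello")));
    println!("{:?}", run_echo(None));
}
```

Storing routines as plain function pointers keeps them usable in static tables, which fits the compile-time registration goal stated in Section 0.2.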
---
### 1. Architectural Mandates & Design Principles
This section outlines the non-negotiable architectural rules and mandatory dependencies for the `unilang` ecosystem. Adherence to these principles is required to ensure consistency, maintainability, and correctness across the framework.
#### 1.1. Parser Implementation (`unilang_instruction_parser`)
* **Mandate:** The `unilang_instruction_parser` crate **must not** implement low-level string tokenization (splitting) logic from scratch. It **must** use the `strs_tools` crate as its core tokenization engine.
* **Rationale:** This enforces a clean separation of concerns. `strs_tools` is a dedicated, specialized tool for string manipulation. By relying on it, `unilang_instruction_parser` can focus on its primary responsibility: syntactic analysis of the token stream, not the raw tokenization itself.
##### Overview of `strs_tools`
`strs_tools` is a utility library for advanced string splitting and tokenization. Its core philosophy is to provide a highly configurable, non-allocating iterator over a string, giving the consumer fine-grained control over how the string is divided.
* **Key Principle:** The library intentionally does **not** interpret escape sequences (e.g., `\"`). It provides raw string slices, leaving the responsibility of unescaping to the consumer (`unilang_instruction_parser`).
* **Usage Flow:** The typical workflow involves using a fluent builder pattern:
    1. Call `strs_tools::string::split::split()` to get a builder (`SplitOptionsFormer`).
    2. Configure it with methods like `.delimeter()`, `.quoting(true)`, etc.
    3. Call `.perform()` to get a `SplitIterator`.
    4. Iterate over the `Split` items, which contain the string slice and metadata about the token.
* **Recommended Components:**
    * **`strs_tools::string::split::split()`**: The main entry point function that returns the builder.
    * **`SplitOptionsFormer`**: The builder for setting options. Key methods include:
        * `.delimeter( &[" ", "::", ";;"] )`: To define what separates tokens.
        * `.quoting( true )`: To make the tokenizer treat quoted sections as single tokens.
        * `.preserving_empty( false )`: To ignore empty segments resulting from consecutive delimiters.
    * **`SplitIterator`**: The iterator produced by the builder.
    * **`Split`**: The struct yielded by the iterator, containing the `string` slice, its `typ` (`Delimiter` or `Delimited`), and its `start`/`end` byte positions in the original source.
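Because unescaping is left to the consumer, the parser needs a step like the one sketched below. This is a standalone illustration using only the standard library, not `strs_tools` code. It assumes the raw token still carries its surrounding quotes and that the supported escapes are `\"`, `\'`, `\\`, `\n`, and `\t`; the function name `unescape_quoted` and that escape set are assumptions, since this specification does not define them.

```rust
/// Strips the surrounding quotes from a raw quoted token and resolves escapes.
/// Assumed escape set: \" \' \\ \n \t. Any other escape is reported as an error.
fn unescape_quoted(raw: &str) -> Result<String, String> {
    let mut chars = raw.chars();
    // The first character decides which quote closes the token.
    let quote = match chars.next() {
        Some(c @ ('"' | '\'')) => c,
        _ => return Err("token does not start with a quote".to_string()),
    };
    let mut out = String::new();
    while let Some(c) = chars.next() {
        match c {
            '\\' => match chars.next() {
                Some('n') => out.push('\n'),
                Some('t') => out.push('\t'),
                Some(e @ ('"' | '\'' | '\\')) => out.push(e),
                Some(other) => return Err(format!("unknown escape: \\{}", other)),
                None => return Err("dangling backslash".to_string()),
            },
            // An unescaped matching quote must be the last character.
            c if c == quote => {
                return if chars.next().is_none() {
                    Ok(out)
                } else {
                    Err("text after closing quote".to_string())
                };
            }
            other => out.push(other),
        }
    }
    Err("missing closing quote".to_string())
}

fn main() {
    println!("{:?}", unescape_quoted(r#""say \"hi\"""#));
    println!("{:?}", unescape_quoted("\"open"));
}
```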
#### 1.2. Macro Implementation (`unilang_meta`)
* **Mandate:** The `unilang_meta` crate **must** prefer using the `macro_tools` crate as its primary dependency for all procedural macro development. Direct dependencies on `syn`, `quote`, or `proc-macro2` should be avoided.
* **Rationale:** `macro_tools` not only re-exports these three essential crates but also provides a rich set of higher-level abstractions and utilities. Using it simplifies parsing, reduces boilerplate code, improves error handling, and leads to more readable and maintainable procedural macros.
> ❌ **Bad** (`Cargo.toml` with direct dependencies)
> ```toml
> [dependencies]
> syn = { version = "2.0", features = ["full"] }
> quote = "1.0"
> proc-macro2 = "1.0"
> ```
> ✅ **Good** (`Cargo.toml` with `macro_tools`)
> ```toml
> [dependencies]
> macro_tools = "0.57"
> ```
##### Recommended `macro_tools` Components
To effectively implement `unilang_meta`, the following components from `macro_tools` are recommended:
* **Core Re-exports (`syn`, `quote`, `proc-macro2`):** Use the versions re-exported by `macro_tools` for guaranteed compatibility.
* **Diagnostics (`diag` module):** Essential for providing clear, professional-grade error messages to the `Integrator`.
    * **`syn_err!( span, "message" )`**: The primary tool for creating `syn::Error` instances with proper location information.
    * **`return_syn_err!(...)`**: A convenient macro to exit a parsing function with an error.
* **Attribute Parsing (`attr` and `attr_prop` modules):** The main task of `unilang_meta` is to parse attributes like `#[unilang::command(...)]`. These modules provide reusable components for this purpose.
    * **`AttributeComponent`**: A trait for defining a parsable attribute (e.g., `unilang::command`).
    * **`AttributePropertyComponent`**: A trait for defining a property within an attribute (e.g., `name = "..."`).
    * **`AttributePropertySyn` / `AttributePropertyBoolean`**: Reusable structs for parsing properties that are `syn` types (like `LitStr`) or booleans.
* **Item & Struct Parsing (`struct_like`, `item_struct` modules):** Needed to analyze the Rust code (struct or function) to which the macro is attached.
    * **`StructLike`**: A powerful enum that can represent a `struct`, `enum`, or `unit` struct, simplifying the analysis logic.
* **Generics Handling (`generic_params` module):** If commands can be generic, this module is indispensable.
    * **`GenericsRef`**: A wrapper that provides convenient methods for splitting generics into parts needed for `impl` blocks and type definitions.
* **General Utilities:**
    * **`punctuated`**: Helpers for working with `syn::punctuated::Punctuated` collections.
    * **`ident`**: Utilities for creating and manipulating identifiers, including handling of Rust keywords.
#### 1.3. Framework Parsing (`unilang`)
* **Mandate:** The `unilang` core framework **must** delegate all command expression parsing to the `unilang_instruction_parser` crate. It **must not** contain any of its own CLI string parsing logic.
* **Rationale:** This enforces the architectural separation between syntactic analysis (the responsibility of `unilang_instruction_parser`) and semantic analysis (the responsibility of `unilang`). This modularity makes the system easier to test, maintain, and reason about.
---
### 2. Language Syntax & Processing (CLI)
**Design Focus: `Public Contract`**
**Primary Implementor: `unilang_instruction_parser` crate**
This section defines the public contract for the CLI modality's syntax. The `unilang_instruction_parser` crate is the reference implementation for this section.
#### 2.1. Unified Processing Pipeline
The interpretation of a `unilang` CLI string by `utility1` **must** proceed through the following conceptual phases:
1. **Phase 1: Syntactic Analysis (String to `GenericInstruction`)**
    * **Responsibility:** `unilang_instruction_parser` crate.
    * **Process:** The parser consumes the input and, based on the `unilang` grammar (Appendix A.2), identifies command paths, positional arguments, named arguments (`key::value`), and operators (`;;`, `?`).
    * **Output:** A `Vec<parser::GenericInstruction>`. This phase has no knowledge of command definitions; it is purely syntactic.
2. **Phase 2: Semantic Analysis (`GenericInstruction` to `VerifiedCommand`)**
    * **Responsibility:** `unilang` crate.
    * **Process:** Each `GenericInstruction` is validated against the `CommandRegistry`. The command name is resolved, arguments are bound to their definitions, types are checked, and validation rules are applied.
    * **Output:** A `Vec<VerifiedCommand>`.
3. **Phase 3: Execution**
    * **Responsibility:** `unilang` crate's Interpreter.
    * **Process:** The interpreter invokes the `Routine` for each `VerifiedCommand`, passing it the validated arguments and execution context.
    * **Output:** A `Result<OutputData, ErrorData>` for each command, which is then handled by the active `Modality`.
#### 2.2. Naming Conventions
To ensure consistency across all `unilang`-based utilities, the following naming conventions **must** be followed:
* **Command & Namespace Segments:** Must consist of lowercase alphanumeric characters (`a-z`, `0-9`) and underscores (`_`). Dots (`.`) are used exclusively as separators. Example: `.system.info`, `.file_utils.read_all`.
* **Argument Names & Aliases:** Must consist of lowercase alphanumeric characters and may use `kebab-case` for readability. Example: `input-file`, `force`, `user-name`.
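The naming conventions above can be expressed as simple predicates. The sketch below is illustrative only: the function names are hypothetical, and it assumes that argument names are lowercase alphanumeric words joined by single hyphens (no leading, trailing, or doubled hyphens), which the text implies but does not state outright.

```rust
/// A command or namespace segment: lowercase ASCII letters, digits, underscores.
fn is_valid_segment(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}

/// A dot-separated path such as `.system.info`; the leading dot is optional.
fn is_valid_full_name(name: &str) -> bool {
    let path = name.strip_prefix('.').unwrap_or(name);
    // An empty segment (for example from a trailing dot) makes the name invalid.
    !path.is_empty() && path.split('.').all(is_valid_segment)
}

/// An argument name or alias: lowercase alphanumeric words in kebab-case.
/// Assumption: hyphens may only appear between non-empty words.
fn is_valid_argument_name(s: &str) -> bool {
    !s.is_empty()
        && s.split('-').all(|word| {
            !word.is_empty()
                && word
                    .chars()
                    .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
        })
}

fn main() {
    println!("{}", is_valid_full_name(".file_utils.read_all"));
    println!("{}", is_valid_argument_name("input-file"));
}
```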
#### 2.3. Command Expression
A `command_expression` can be one of the following:
* **Full Invocation:** `[namespace_path.]command_name [argument_value...] [named_argument...]`
* **Help Request:** `[namespace_path.][command_name] ?` or `[namespace_path.]?`
#### 2.4. Parsing Rules and Precedence
To eliminate ambiguity, the parser **must** adhere to the following rules in order.
* **Rule 0: Whitespace Separation**
    * Whitespace characters (spaces, tabs) serve only to separate tokens. Multiple consecutive whitespace characters are treated as a single separator. Whitespace is not part of a token's value unless it is inside a quoted string.
* **Rule 1: Command Path Identification**
    * The **Command Path** is the initial sequence of tokens that identifies the command to be executed.
    * A command path consists of one or more **segments**.
    * Segments **must** be separated by a dot (`.`). Whitespace around the dot is ignored.
    * A segment **must** be a valid identifier according to the `Naming Conventions` (Section 2.2).
    * The command path is the longest possible sequence of dot-separated identifiers at the beginning of an expression.
* **Rule 2: End of Command Path & Transition to Arguments**
    * The command path definitively ends, and argument parsing begins, upon encountering the **first token** that is not a valid, dot-separated identifier segment.
    * This transition is triggered by:
        * A named argument separator (`::`).
        * A quoted string (`"..."` or `'...'`).
        * The help operator (`?`).
        * Any other token that does not conform to the identifier naming convention.
    * **Example:** In `utility1 .files.copy --force`, the command path is `.files.copy`. The token `--force` is not a valid segment, so it becomes the first positional argument.
* **Rule 3: Dot (`.`) Operator Rules**
    * **Leading Dot:** A single leading dot at the beginning of a command path (e.g., `.files.copy`) is permitted and has no semantic meaning. It is consumed by the parser and does not form part of the command path's segments.
    * **Trailing Dot:** A trailing dot after the final command segment (e.g., `.files.copy.`) is a **syntax error**.
* **Rule 4: Help Operator (`?`)**
    * The `?` operator marks the entire instruction for help generation.
    * It **must** be the final token in a command expression.
    * It **may** be preceded by arguments. If it is, this implies a request for contextual help. The `unilang` framework (not the parser) is responsible for interpreting this context.
    * **Valid:** `.files.copy ?`
    * **Valid:** `.files.copy from::/src ?`
    * **Invalid:** `.files.copy ? from::/src`
* **Rule 5: Argument Types**
    * **Positional Arguments:** Any token that follows the command path and is not a named argument is a positional argument.
    * **Named Arguments:** Any pair of tokens matching the `name::value` syntax is a named argument. The `value` can be a single token or a quoted string.
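The rules above can be illustrated with a deliberately simplified sketch. It assumes the input has already been split on whitespace, that the command path arrives as a single token, and that each named argument arrives as one `name::value` token; quoting, `;;` sequences, and namespace-level help are not handled. The `Instruction` struct and `parse_tokens` function are hypothetical and are not the API of `unilang_instruction_parser`.

```rust
#[derive(Debug, PartialEq, Default)]
struct Instruction {
    path: Vec<String>,
    positional: Vec<String>,
    named: Vec<(String, String)>,
    help: bool,
}

fn is_segment(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}

fn parse_tokens(tokens: &[&str]) -> Result<Instruction, String> {
    let mut out = Instruction::default();
    let mut rest = tokens;

    // Rule 4: `?` marks a help request and must be the final token.
    if let Some(pos) = tokens.iter().position(|t| *t == "?") {
        if pos != tokens.len() - 1 {
            return Err("`?` must be the final token".to_string());
        }
        out.help = true;
        rest = &tokens[..pos];
    }

    // Rules 1-3: the first token is the command path if it is a sequence of
    // dot-separated identifiers; a single leading dot is consumed.
    if let Some(&first) = rest.first() {
        if !first.contains("::") {
            let body = first.strip_prefix('.').unwrap_or(first);
            if let Some(trimmed) = body.strip_suffix('.') {
                if !trimmed.is_empty() && trimmed.split('.').all(is_segment) {
                    return Err("trailing dot after command path".to_string());
                }
            }
            if !body.is_empty() && body.split('.').all(is_segment) {
                out.path = body.split('.').map(str::to_string).collect();
                rest = &rest[1..];
            }
        }
    }

    // Rule 5: remaining tokens are named (`name::value`) or positional.
    for &token in rest {
        match token.split_once("::") {
            Some((name, value)) if !name.is_empty() => {
                out.named.push((name.to_string(), value.to_string()))
            }
            _ => out.positional.push(token.to_string()),
        }
    }
    Ok(out)
}

fn main() {
    println!("{:?}", parse_tokens(&[".files.copy", "from::/src", "?"]));
    println!("{:?}", parse_tokens(&[".files.copy", "?", "from::/src"]));
}
```

Note that `--force` ends up as a positional argument here, matching the example in Rule 2.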
---
### 3. Core Definitions
**Design Focus: `Public Contract`**
**Primary Implementor: `unilang` crate**
This section defines the core data structures that represent commands, arguments, and namespaces. These structures form the primary API surface for an `Integrator`.
#### 3.1. `NamespaceDefinition` Anatomy
A namespace is a first-class entity to improve discoverability and help generation.
| Field | Type | Mandatory | Description |
| :--- | :--- | :--- | :--- |
| `name` | `String` | Yes | The unique, dot-separated `FullName` of the namespace (e.g., `.files`, `.system.internal`). |
| `hint` | `String` | No | A human-readable explanation of the namespace's purpose. |
#### 3.2. `CommandDefinition` Anatomy
| Field | Type | Mandatory | Description |
| :--- | :--- | :--- | :--- |
| `name` | `String` | Yes | The final segment of the command's name (e.g., `copy`). The full path is derived from its registered namespace. |
| `namespace` | `String` | Yes | The `FullName` of the parent namespace this command belongs to (e.g., `.files`). |
| `hint` | `String` | No | A human-readable explanation of the command's purpose. |
| `arguments` | `Vec<ArgumentDefinition>` | No | A list of arguments the command accepts. |
| `routine` | `Routine` | Yes (for static) | A direct reference to the executable code (e.g., a function pointer). |
| `routine_link` | `String` | No | For commands loaded from a `Command Manifest`, this is a string that links to a pre-compiled, registered routine. |
| `permissions` | `Vec<String>` | No | A list of permission identifiers required for execution. |
| `status` | `Enum` | No (Default: `Stable`) | Lifecycle state: `Experimental`, `Stable`, `Deprecated`. |
| `deprecation_message` | `String` | No | If `status` is `Deprecated`, explains the reason and suggests alternatives. |
| `http_method_hint` | `String` | No | A suggested HTTP method (`GET`, `POST`, etc.) for the Web API modality. |
| `idempotent` | `bool` | No (Default: `false`) | If `true`, the command can be safely executed multiple times. |
| `examples` | `Vec<String>` | No | Illustrative usage examples for help text. |
| `version` | `String` | No | The SemVer version of the individual command (e.g., "1.0.2"). |
| `tags` | `Vec<String>` | No | Keywords for grouping or filtering commands (e.g., "filesystem", "networking"). |
#### 3.3. `ArgumentDefinition` Anatomy
| Field | Type | Mandatory | Description |
| :--- | :--- | :--- | :--- |
| `name` | `String` | Yes | The unique (within the command), case-sensitive identifier (e.g., `src`). |
| `hint` | `String` | No | A human-readable description of the argument's purpose. |
| `kind` | `Kind` | Yes | The data type of the argument's value. |
| `optional` | `bool` | No (Default: `false`) | If `true`, the argument may be omitted. |
| `default_value` | `Option<String>` | No | A string representation of the value to use if an optional argument is not provided. It will be parsed on-demand. |
| `is_default_arg` | `bool` | No (Default: `false`) | If `true`, its value can be provided positionally in the CLI. |
| `multiple` | `bool` | No (Default: `false`) | If `true`, the argument can be specified multiple times. |
| `sensitive` | `bool` | No (Default: `false`) | If `true`, the value must be protected (masked in UIs, redacted in logs). |
| `validation_rules` | `Vec<String>` | No | Custom validation logic (e.g., `"min:0"`, `"regex:^.+$"`). |
| `aliases` | `Vec<String>` | No | A list of alternative short names (e.g., `s` for `source`). |
| `tags` | `Vec<String>` | No | Keywords for UI grouping (e.g., "Basic", "Advanced"). |
| `interactive` | `bool` | No (Default: `false`) | If `true`, modalities may prompt for input if the value is missing. |
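As a rough illustration, the field tables above could map onto Rust structs as follows. This is a sketch under stated assumptions, not the crate's actual definitions: `CommandDefinition` is trimmed to a handful of fields, the `Kind` enum lists only the two kinds that appear in Appendix A.1, and the constructor `ArgumentDefinition::new`, the method `full_name`, and the helper `echo_definition` are hypothetical.

```rust
// Only the kinds used in Appendix A.1; the full set is not defined in this document.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Kind { String, Integer }

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Default)]
enum Status { Experimental, #[default] Stable, Deprecated }

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct ArgumentDefinition {
    name: String,
    hint: String,
    kind: Kind,
    optional: bool,
    default_value: Option<String>,
    is_default_arg: bool,
    multiple: bool,
    sensitive: bool,
    validation_rules: Vec<String>,
    aliases: Vec<String>,
    tags: Vec<String>,
    interactive: bool,
}

impl ArgumentDefinition {
    /// Hypothetical constructor applying the defaults listed in the table.
    fn new(name: &str, kind: Kind) -> Self {
        Self {
            name: name.to_string(),
            hint: String::new(),
            kind,
            optional: false,
            default_value: None,
            is_default_arg: false,
            multiple: false,
            sensitive: false,
            validation_rules: Vec::new(),
            aliases: Vec::new(),
            tags: Vec::new(),
            interactive: false,
        }
    }
}

/// A trimmed subset of the `CommandDefinition` fields.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct CommandDefinition {
    name: String,
    namespace: String,
    hint: String,
    arguments: Vec<ArgumentDefinition>,
    status: Status,
}

impl CommandDefinition {
    /// Derives the full path from the parent namespace and the final segment.
    fn full_name(&self) -> String {
        format!("{}.{}", self.namespace, self.name)
    }
}

/// Mirrors the `.string.echo` example from Appendix A.1.
fn echo_definition() -> CommandDefinition {
    let mut input = ArgumentDefinition::new("input-string", Kind::String);
    input.is_default_arg = true;
    input.aliases = vec!["i".to_string(), "input".to_string()];
    let mut times = ArgumentDefinition::new("times", Kind::Integer);
    times.optional = true;
    times.default_value = Some("1".to_string());
    times.validation_rules = vec!["min:1".to_string()];
    CommandDefinition {
        name: "echo".to_string(),
        namespace: ".string".to_string(),
        hint: "Prints the input string to the output.".to_string(),
        arguments: vec![input, times],
        status: Status::default(),
    }
}

fn main() {
    let cmd = echo_definition();
    println!("{} has {} arguments", cmd.full_name(), cmd.arguments.len());
}
```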
#### 3.4. Methods of Command Specification
Commands can be specified in three ways. The compile-time declarative method is primarily implemented by the `unilang_meta` crate.
1. **Compile-Time Declarative (via `unilang_meta`):** Using procedural macros on Rust functions or structs to generate `CommandDefinition`s at compile time.
2. **Run-Time Procedural:** Using a builder API within `utility1` to construct and register commands dynamically.
3. **External Definition:** Loading `CommandDefinition`s from external files (e.g., YAML, JSON) at compile-time or run-time.
#### 3.5. The Command Registry
**Design Focus: `Internal Design`**
**Primary Implementor: `unilang` crate**
The `CommandRegistry` is the runtime data structure that stores the entire `Command Lexicon`. To meet the high-performance requirement for static commands while allowing for dynamic extension, it **must** be implemented using a **Hybrid Model**.
* **Static Registry:**
    * **Implementation:** A **Perfect Hash Function (PHF)** data structure.
    * **Content:** Contains all commands, namespaces, and routines that are known at compile-time.
    * **Generation:** The PHF **must** be generated by `utility1`'s build process (e.g., in `build.rs`) from all compile-time command definitions. This ensures that the cost of building the lookup table is paid during compilation, not at application startup.
* **Dynamic Registry:**
    * **Implementation:** A standard `HashMap`.
    * **Content:** Contains commands and namespaces that are added at runtime (e.g., from a `Command Manifest`).
* **Lookup Precedence:** When resolving a command `FullName`, the `CommandRegistry` **must** first query the static PHF. If the command is not found, it must then query the dynamic `HashMap`.
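The lookup precedence can be sketched with the standard library alone. Note the substitution: a sorted static slice with binary search stands in for the PHF here, purely so the example has no external dependencies; it is not a perfect hash function. The simplified `Routine` type, the sample routines, and the `register`/`resolve` method names are assumptions made for illustration.

```rust
use std::collections::HashMap;

// Simplified routine type for the sketch; the real signature takes a
// `VerifiedCommand` and an `ExecutionContext`.
type Routine = fn() -> &'static str;

fn static_copy() -> &'static str { "static copy" }
fn static_info() -> &'static str { "static info" }
fn dynamic_copy() -> &'static str { "dynamic copy" }
fn dynamic_greet() -> &'static str { "dynamic greet" }

// Stand-in for the compile-time PHF: entries are sorted by name so that
// binary search works. A real build would generate a perfect hash table.
static STATIC_COMMANDS: &[(&str, Routine)] = &[
    (".files.copy", static_copy),
    (".system.info", static_info),
];

struct CommandRegistry {
    dynamic: HashMap<String, Routine>,
}

impl CommandRegistry {
    fn new() -> Self {
        Self { dynamic: HashMap::new() }
    }

    /// Registers a command at runtime (e.g., loaded from a manifest).
    fn register(&mut self, full_name: &str, routine: Routine) {
        self.dynamic.insert(full_name.to_string(), routine);
    }

    /// Queries the static table first, then falls back to the dynamic map.
    fn resolve(&self, full_name: &str) -> Option<Routine> {
        STATIC_COMMANDS
            .binary_search_by(|(name, _)| name.cmp(&full_name))
            .ok()
            .map(|index| STATIC_COMMANDS[index].1)
            .or_else(|| self.dynamic.get(full_name).copied())
    }
}

fn main() {
    let mut registry = CommandRegistry::new();
    registry.register(".files.copy", dynamic_copy);
    registry.register(".user.greet", dynamic_greet);
    for name in [".files.copy", ".user.greet", ".missing"] {
        println!("{} -> {:?}", name, registry.resolve(name).map(|r| r()));
    }
}
```

Because the static table is consulted first, a runtime registration under an existing static name is shadowed rather than overriding the compiled-in command.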
---
### 4. Global Arguments & Configuration
**Design Focus: `Public Contract`**
**Primary Implementor: `unilang` crate**
This section defines how an `Integrator` configures `utility1` and how an `End User` can override that configuration.
#### 4.1. `GlobalArgumentDefinition` Anatomy
The `Integrator` **must** define their global arguments using this structure, which can then be registered with `utility1`.
| Field | Type | Mandatory | Description |
| :--- | :--- | :--- | :--- |
| `name` | `String` | Yes | The unique name of the global argument (e.g., `output-format`). |
| `hint` | `String` | No | A human-readable description. |
| `kind` | `Kind` | Yes | The data type of the argument's value. |
| `env_var` | `String` | No | The name of an environment variable that can set this value. |
#### 4.2. Configuration Precedence
Configuration values **must** be resolved from the following sources, listed from lowest to highest precedence (a later source overrides an earlier one):
1. Default built-in values.
2. System-wide configuration file (e.g., `/etc/utility1/config.toml`).
3. User-specific configuration file (e.g., `~/.config/utility1/config.toml`).
4. Project-specific configuration file (e.g., `./.utility1.toml`).
5. Environment variables (as defined in `GlobalArgumentDefinition.env_var`).
6. CLI Global Arguments provided at invocation.
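The "last one wins" rule amounts to folding the sources in order, letting each later layer overwrite keys set by earlier ones. The sketch below assumes every source has already been loaded into a flat string map; the `Layer` alias, the helper names, and the sample keys are illustrative and not part of the framework.

```rust
use std::collections::HashMap;

// One configuration source, already loaded into key/value form.
type Layer = HashMap<String, String>;

/// Merges layers ordered from lowest to highest precedence.
fn resolve_config(layers: &[Layer]) -> Layer {
    let mut resolved = Layer::new();
    for layer in layers {
        for (key, value) in layer {
            // A later layer overwrites whatever an earlier layer set.
            resolved.insert(key.clone(), value.clone());
        }
    }
    resolved
}

/// Convenience constructor for the example.
fn layer(pairs: &[(&str, &str)]) -> Layer {
    pairs
        .iter()
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}

fn main() {
    let layers = [
        layer(&[("output-format", "text"), ("color", "auto")]), // built-in defaults
        layer(&[("output-format", "json")]),                    // user config file
        layer(&[("color", "never")]),                           // CLI global argument
    ];
    let config = resolve_config(&layers);
    println!("{} {}", config["output-format"], config["color"]);
}
```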
---
### 5. Architectural Diagrams
**Design Focus: `Strategic Context`**
These diagrams provide a high-level, visual overview of the system's architecture and flow.
#### 5.1. System Context Diagram
This C4 diagram shows the `unilang` framework in the context of its users and the systems it interacts with.
```mermaid
graph TD
    subgraph "System Context for a 'utility1' Application"
        A["Integrator (Developer)"] -- "Defines Commands & Routines using" --> B{unilang Framework};
        B -- Builds into --> C[utility1 Application];
        D[End User] -- "Interacts via Modality (CLI, GUI, etc.)" --> C;
        C -- Executes Routines that may call --> E["External Service e.g., Database, API"];
        C -- Interacts with --> F["Operating System e.g., Filesystem, Env Vars"];
    end
    style B fill:#1168bd,stroke:#fff,stroke-width:2px,color:#fff
    style C fill:#22a6f2,stroke:#fff,stroke-width:2px,color:#fff
```
#### 5.2. High-Level Architecture Diagram
This diagram shows the internal components of the `unilang` ecosystem and their relationships.
```mermaid
graph TD
    subgraph "unilang Ecosystem"
        A[unilang_meta] -- Generates Definitions at Compile Time --> B("build.rs / Static Initializers");
        B -- Populates --> C{"Static Registry (PHF)"};
        D[unilang_instruction_parser] -- Produces GenericInstruction --> E[unilang Crate];
        subgraph E
            direction LR
            F[Semantic Analyzer] --> G[Interpreter];
            G -- Uses --> H[Hybrid Command Registry];
        end
        H -- Contains --> C;
        H -- Contains --> I{"Dynamic Registry (HashMap)"};
        J["Command Manifest (YAML/JSON)"] -- Loaded at Runtime by --> E;
        E -- Populates --> I;
    end
```
#### 5.3. Sequence Diagram: Unified Processing Pipeline
This diagram illustrates the flow of data and control during a typical CLI command execution.
```mermaid
sequenceDiagram
participant User
participant CLI
participant Parser as unilang_instruction_parser
participant SemanticAnalyzer as unilang::SemanticAnalyzer
participant Interpreter as unilang::Interpreter
participant Routine
User->>CLI: Enters "utility1 .files.copy src::a.txt"
CLI->>Parser: parse_single_str("...")
activate Parser
Parser-->>CLI: Returns Vec<GenericInstruction>
deactivate Parser
CLI->>SemanticAnalyzer: analyze(instructions)
activate SemanticAnalyzer
SemanticAnalyzer-->>CLI: Returns Vec<VerifiedCommand>
deactivate SemanticAnalyzer
CLI->>Interpreter: run(verified_commands)
activate Interpreter
Interpreter->>Routine: execute(command, context)
activate Routine
Routine-->>Interpreter: Returns Result<OutputData, ErrorData>
deactivate Routine
Interpreter-->>CLI: Returns final Result
deactivate Interpreter
CLI->>User: Displays formatted output or error
```
---
### 6. Interaction Modalities
**Design Focus: `Public Contract`**
**Primary Implementor: `unilang` crate (provides the framework)**
`unilang` definitions are designed to drive various interaction modalities.
* **6.1. CLI (Command Line Interface):** The primary modality, defined in Section 2.
* **6.2. TUI (Textual User Interface):** An interactive terminal interface built from command definitions.
* **6.3. GUI (Graphical User Interface):** A graphical interface with forms and widgets generated from command definitions.
* **6.4. WEB Endpoints:**
    * **Goal:** Automatically generate a web API from `unilang` command specifications.
    * **Mapping:** A command `.namespace.command` maps to an HTTP path like `/api/v1/namespace/command`.
    * **Serialization:** Arguments are passed as URL query parameters (`GET`) or a JSON body (`POST`/`PUT`). `OutputData` and `ErrorData` are returned as JSON.
    * **Discoverability:** An endpoint (e.g., `/openapi.json`) **must** be available to generate an OpenAPI v3+ specification. The content of this specification is derived directly from the `CommandDefinition`, `ArgumentDefinition`, and `NamespaceDefinition` metadata.
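The path mapping described above is a straightforward string transformation. A minimal sketch follows, assuming the `/api/v1` prefix from the example is fixed; the function name `http_path` is hypothetical.

```rust
/// Maps a command `FullName` such as `.namespace.command` to an HTTP path.
/// Returns `None` for an empty name or one with empty segments.
fn http_path(full_name: &str) -> Option<String> {
    let path = full_name.strip_prefix('.').unwrap_or(full_name);
    if path.is_empty() || path.split('.').any(|segment| segment.is_empty()) {
        return None;
    }
    // Each dot separator becomes a path separator.
    Some(format!("/api/v1/{}", path.replace('.', "/")))
}

fn main() {
    println!("{:?}", http_path(".namespace.command"));
    println!("{:?}", http_path(".files.copy."));
}
```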
---
### 7. Cross-Cutting Concerns
**Design Focus: `Public Contract`**
**Primary Implementor: `unilang` crate**
This section defines framework-wide contracts for handling common concerns like errors and security.
#### 7.1. Error Handling (`ErrorData`)
Routines that fail **must** return an `ErrorData` object. The `code` field should use a standard identifier where possible.
* **Standard Codes:** `UNILANG_COMMAND_NOT_FOUND`, `UNILANG_ARGUMENT_INVALID`, `UNILANG_ARGUMENT_MISSING`, `UNILANG_TYPE_MISMATCH`, `UNILANG_VALIDATION_RULE_FAILED`, `UNILANG_PERMISSION_DENIED`, `UNILANG_EXECUTION_ERROR`, `UNILANG_IO_ERROR`, `UNILANG_INTERNAL_ERROR`.
* **New Code for External Failures:** `UNILANG_EXTERNAL_DEPENDENCY_ERROR` - To be used when a routine fails due to an error from an external service (e.g., network timeout, API error response).
```json
{
"code": "ErrorCodeIdentifier",
"message": "Human-readable error message.",
"details": {
"argument_name": "src",
"location_in_input": { "source_type": "single_string", "start_offset": 15, "end_offset": 20 }
},
"origin_command": ".files.copy"
}
```
#### 7.2. Standard Output (`OutputData`)
Successful routines **must** return an `OutputData` object.
```json
{
"payload": "Any",
"metadata": { "count": 10, "warnings": [] },
"output_type_hint": "application/json"
}
```
#### 7.3. Security
* **Permissions:** The `permissions` field on a `CommandDefinition` declares the rights needed for execution. The `utility1` `Interpreter` is responsible for checking these.
* **Sensitive Data:** Arguments marked `sensitive: true` **must** be masked in UIs and redacted from logs.
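The redaction requirement for sensitive arguments might look like this in a logging path. The `Arg` struct, the `log_line` function, and the `***` mask are assumptions made for illustration; the specification does not prescribe a particular mask or log format.

```rust
// Hypothetical view of a bound argument at logging time.
struct Arg<'a> {
    name: &'a str,
    value: &'a str,
    sensitive: bool,
}

/// Renders a command invocation for logs, masking sensitive values.
fn log_line(command: &str, args: &[Arg]) -> String {
    let rendered: Vec<String> = args
        .iter()
        .map(|arg| {
            // The real value never reaches the log if the argument is sensitive.
            let shown = if arg.sensitive { "***" } else { arg.value };
            format!("{}::{}", arg.name, shown)
        })
        .collect();
    format!("{} {}", command, rendered.join(" "))
}

fn main() {
    let args = [
        Arg { name: "user", value: "alice", sensitive: false },
        Arg { name: "password", value: "hunter2", sensitive: true },
    ];
    println!("{}", log_line(".db.connect", &args));
}
```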
#### 7.4. Extensibility Model
* **Compile-Time `Extension Module`s:** Rust crates that can provide a suite of components to `utility1`. An extension module **should** include a manifest file (e.g., `unilang-module.toml`) to declare the components it provides. These components are compiled into the **Static Registry (PHF)**.
* **Run-Time `Command Manifest`s:** `utility1` **must** provide a mechanism to load `CommandDefinition`s from external `Command Manifest` files (e.g., YAML or JSON) at runtime. These commands are registered into the **Dynamic Registry (HashMap)**. The `routine_link` field in their definitions is used to associate them with pre-compiled functions.
---
### 8. Project Management
**Design Focus: `Strategic Context`**
This section contains meta-information about the project itself.
#### 8.1. Success Metrics
* **Performance:** For a `utility1` application with 100,000 statically compiled commands, the p99 latency for resolving a command `FullName` in the `CommandRegistry` **must** be less than 1 millisecond on commodity hardware.
* **Adoption:** The framework is considered successful if it is used to build at least three distinct `utility1` applications with different modalities.
#### 8.2. Out of Scope
The `unilang` framework is responsible for the command interface, not the business logic itself. The following are explicitly out of scope:
* **Transactional Guarantees:** The framework does not provide built-in transactional logic for command sequences. If a command in a `;;` sequence fails, the framework will not automatically roll back the effects of previous commands.
* **Inter-Command State Management:** The framework does not provide a mechanism for one command to pass complex state to the next, other than through external means (e.g., environment variables, files) managed by the `Integrator`.
* **Business Logic Implementation:** The framework provides the `Routine` execution shell, but the logic inside the routine is entirely the `Integrator`'s responsibility.
#### 8.3. Open Questions
This section tracks critical design decisions that are not yet finalized.
1. **Runtime Routine Linking:** What is the precise mechanism for resolving a `routine_link` string from a `Command Manifest` to a callable function pointer at runtime? Options include a name-based registry populated at startup or dynamic library loading (e.g., via `libloading`). This needs to be defined.
2. **Custom Type Registration:** What is the API and process for an `Integrator` to define a new custom `Kind` and register its associated parsing and validation logic with the framework?
---
### 9. Interpreter / Execution Engine
**Design Focus: `Internal Design`**
**Primary Implementor: `unilang` crate**
The Interpreter is the internal `unilang` component responsible for orchestrating command execution. Its existence and function are critical, but its specific implementation details are not part of the public API.
1. **Routine Invocation:** For each `VerifiedCommand`, the Interpreter retrieves the linked `Routine` from the `CommandRegistry`.
2. **Context Preparation:** It prepares and passes the `VerifiedCommand` object and the `ExecutionContext` object to the `Routine`.
3. **Result Handling:** It receives the `Result<OutputData, ErrorData>` from the `Routine` and passes it to the active `Modality` for presentation.
4. **Sequential Execution:** It executes commands from a `;;` sequence in order, respecting the `on_error` global argument policy.
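Sequential execution under an error policy can be sketched as a loop that records each outcome and decides whether to continue. This document does not define the values of the `on_error` policy, so the `OnError::Stop` and `OnError::Continue` variants below are assumptions, as are the simplified `Outcome` type and the sample routines.

```rust
// Hypothetical policy values; the specification only names `on_error`.
#[derive(Clone, Copy, PartialEq)]
enum OnError {
    Stop,
    Continue,
}

// Simplified stand-in for `Result<OutputData, ErrorData>`.
type Outcome = Result<String, String>;

/// Runs commands in order and collects every outcome that was produced.
fn run_sequence(commands: &[fn() -> Outcome], policy: OnError) -> Vec<Outcome> {
    let mut results = Vec::new();
    for command in commands {
        let outcome = command();
        let failed = outcome.is_err();
        results.push(outcome);
        // Under `Stop`, the first failure ends the sequence.
        if failed && policy == OnError::Stop {
            break;
        }
    }
    results
}

fn ok_a() -> Outcome { Ok("a".to_string()) }
fn fail_b() -> Outcome { Err("UNILANG_EXECUTION_ERROR".to_string()) }
fn ok_c() -> Outcome { Ok("c".to_string()) }

fn main() {
    let commands: [fn() -> Outcome; 3] = [ok_a, fail_b, ok_c];
    println!("{:?}", run_sequence(&commands, OnError::Stop));
    println!("{:?}", run_sequence(&commands, OnError::Continue));
}
```

Consistent with Section 8.2, nothing here rolls back the effects of commands that already ran before a failure.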
---
### 10. Crate-Specific Responsibilities
**Design Focus: `Strategic Context`**
This section clarifies the role of each crate in implementing this specification.
#### 10.1. `unilang` (Core Framework)
* **Role:** The central orchestrator.
* **Responsibilities:**
    * **Mandate:** Must use `unilang_instruction_parser` for all syntactic analysis.
    * Implements the **Hybrid `CommandRegistry`** (PHF for static, HashMap for dynamic).
    * Provides the build-time logic for generating the PHF from compile-time definitions.
    * Implements the `SemanticAnalyzer` (Phase 2) and `Interpreter` (Phase 3).
    * Defines all core data structures (`CommandDefinition`, `ArgumentDefinition`, etc.).
    * Implements the Configuration Management system.
#### 10.2. `unilang_instruction_parser` (Parser)
* **Role:** The dedicated lexical and syntactic analyzer.
* **Responsibilities:**
    * **Mandate:** Must use the `strs_tools` crate for tokenization.
    * Provides the reference implementation for **Section 2: Language Syntax & Processing**.
    * Parses a raw string or slice of strings into a `Vec<parser::GenericInstruction>`.
    * **It has no knowledge of command definitions, types, or semantics.**
#### 10.3. `unilang_meta` (Macros)
* **Role:** A developer-experience enhancement for compile-time definitions.
* **Responsibilities:**
    * **Mandate:** Must use the `macro_tools` crate for procedural macro implementation.
    * Provides procedural macros (e.g., `#[unilang::command]`) that generate `CommandDefinition` structures.
    * These generated definitions are the primary input for the **PHF generation** step in `utility1`'s build process.
---
### 11. Appendices
#### Appendix A: Formal Grammar & Definitions
##### A.1. Example `unilang` Command Library (YAML)
```yaml
# commands.yaml - Example Unilang Command Definitions
commands:
  - name: echo
    namespace: .string
    hint: Prints the input string to the output.
    status: Stable
    version: "1.0.0"
    idempotent: true
    arguments:
      - name: input-string
        kind: String
        is_default_arg: true
        optional: false
        hint: The string to be echoed.
        aliases: [ "i", "input" ]
      - name: times
        kind: Integer
        optional: true
        default_value: "1"
        validation_rules: [ "min:1" ]
    examples:
      - "utility1 .string.echo \"Hello, Unilang!\""
```
##### A.2. BNF or Formal Grammar for CLI Syntax (Simplified & Revised)
This grammar reflects the strict parsing rules defined in Section 2.5.
```bnf
<invocation> ::= <utility_name> <global_args_opt> <command_sequence>
<command_sequence> ::= <command_expression> <command_separator_opt>
<command_separator_opt> ::= ";;" <command_sequence> | ""
<command_expression> ::= <command_path> <arguments_and_help_opt>
| <arguments_and_help>
<command_path> ::= <dot_opt> <segment> <path_tail_opt>
<path_tail_opt> ::= "." <segment> <path_tail_opt> | ""
<segment> ::= <IDENTIFIER>
<dot_opt> ::= "." | ""
<arguments_and_help_opt> ::= <arguments_and_help> | ""
<arguments_and_help> ::= <argument_list> <help_operator_opt> | <help_operator>
<argument_list> ::= <argument> <argument_list_opt>
<argument_list_opt> ::= <argument_list> | ""
<argument> ::= <named_arg> | <positional_arg>
<positional_arg> ::= <value>
<named_arg> ::= <IDENTIFIER> "::" <value>
<value> ::= <IDENTIFIER> | <QUOTED_STRING>
<help_operator_opt> ::= <help_operator> | ""
<help_operator> ::= "?"
```
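To make the grammar concrete, here is a rough sketch of a parser over tokens that have already been split into words (for example by the shell). It is not the `unilang_instruction_parser` implementation. The `Instruction` struct and `parse` function are hypothetical, and the sketch makes two simplifying assumptions the grammar does not: a command path must start with `.` (the grammar makes the leading dot optional), and any token containing `::` is a named argument, split at the first `::`.

```rust
/// Hypothetical, simplified result type; the real `GenericInstruction` may differ.
#[derive(Debug, Default, PartialEq)]
pub struct Instruction {
    pub path: Vec<String>,
    pub positional: Vec<String>,
    pub named: Vec<(String, String)>,
    pub help: bool,
}

/// Parses pre-split tokens into one instruction per command expression.
pub fn parse(tokens: &[&str]) -> Result<Vec<Instruction>, String> {
    let mut instructions = Vec::new();
    // <command_sequence>: command expressions separated by ";;".
    for expr in tokens.split(|t| *t == ";;") {
        if expr.is_empty() {
            return Err("empty command expression".to_string());
        }
        let mut ins = Instruction::default();
        let mut rest = expr;
        // <command_path>: assumed here to be a leading token starting with ".".
        if rest[0].starts_with('.') {
            let segments: Vec<&str> = rest[0][1..].split('.').collect();
            if segments.iter().any(|s| s.is_empty()) {
                return Err(format!("invalid command path: {}", rest[0]));
            }
            ins.path = segments.into_iter().map(str::to_string).collect();
            rest = &rest[1..];
        }
        for (i, token) in rest.iter().enumerate() {
            if *token == "?" {
                // <help_operator> is only valid as the last token.
                if i + 1 != rest.len() {
                    return Err("`?` must be the last token".to_string());
                }
                ins.help = true;
            } else if let Some((name, value)) = token.split_once("::") {
                // <named_arg>: split at the first "::" only.
                ins.named.push((name.to_string(), value.to_string()));
            } else {
                // <positional_arg>
                ins.positional.push(token.to_string());
            }
        }
        instructions.push(ins);
    }
    Ok(instructions)
}
```

Because the sketch works on pre-split tokens, it cannot tell a quoted positional value that contains `::` from a named argument; a parser working on the raw string would track quoting to resolve that case.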
#### Appendix B: Command Syntax Cookbook
This appendix provides a comprehensive set of practical examples for the `unilang` CLI syntax.
##### B.1. Basic Commands
* **Command in Root Namespace:**
```sh
utility1 .ping
```
* **Command in a Nested Namespace:**
```sh
utility1 .network.diagnostics.ping
```
##### B.2. Positional vs. Named Arguments
* **Using a Positional (Default) Argument:**
* Assumes `.log` defines its `message` argument with `is_default_arg: true`.
```sh
utility1 .log "This is a log message"
```
* **Using Named Arguments (Standard):**
```sh
utility1 .files.copy from::/path/to/source.txt to::/path/to/destination.txt
```
* **Using Aliases for Named Arguments:**
* Assumes `from` has an alias `f` and `to` has an alias `t`.
```sh
utility1 .files.copy f::/path/to/source.txt t::/path/to/destination.txt
```
##### B.3. Quoting and Escaping
* **Value with Spaces:** Quotes are required.
```sh
utility1 .files.create path::"/home/user/My Documents/report.txt"
```
* **Value Containing the Key-Value Separator (`::`):** Quotes are required.
```sh
utility1 .log message::"DEPRECATED::This function will be removed."
```
* **Value Containing Commas for a Non-List Argument:** Quotes are required.
```sh
utility1 .set.property name::"greeting" value::"Hello, world"
```
##### B.4. Handling Multiple Values and Collections
* **Argument with `multiple: true`:** The argument name is repeated.
* Assumes `.service.start` defines `instance` with `multiple: true`.
```sh
utility1 .service.start instance::api instance::worker instance::db
```
* **Argument of `Kind: List<String>`:** Values are comma-separated.
* Assumes `.posts.create` defines `tags` as `List<String>`.
```sh
utility1 .posts.create title::"New Post" tags::dev,rust,unilang
```
* **Argument of `Kind: Map<String,String>`:** Entries are comma-separated, key/value pairs use `=`.
* Assumes `.network.request` defines `headers` as `Map<String,String>`.
```sh
utility1 .network.request url::https://api.example.com headers::Content-Type=application/json,Auth-Token=xyz
```
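The three collection forms above could be turned into values roughly as follows. This is a sketch under stated assumptions: the helper names `collect_multiple`, `parse_list` and `parse_map` are hypothetical, map entries are split at the first `=`, and escaping of `,` or `=` inside values is not handled.

```rust
use std::collections::BTreeMap;

/// Collects every value of a repeated named argument (`multiple: true`).
pub fn collect_multiple<'a>(named: &'a [(&'a str, &'a str)], name: &str) -> Vec<&'a str> {
    named
        .iter()
        .filter(|(n, _)| *n == name)
        .map(|(_, v)| *v)
        .collect()
}

/// Splits a `List<String>` value on commas.
pub fn parse_list(raw: &str) -> Vec<String> {
    raw.split(',').map(str::to_string).collect()
}

/// Splits a `Map<String,String>` value into entries on commas,
/// and each entry into key and value at the first `=`.
pub fn parse_map(raw: &str) -> Result<BTreeMap<String, String>, String> {
    let mut map = BTreeMap::new();
    for entry in raw.split(',') {
        let (key, value) = entry
            .split_once('=')
            .ok_or_else(|| format!("missing `=` in map entry: {}", entry))?;
        map.insert(key.to_string(), value.to_string());
    }
    Ok(map)
}
```

For the examples above, `parse_list("dev,rust,unilang")` yields three items, and `parse_map` on the `headers` value yields two entries keyed by `Content-Type` and `Auth-Token`.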
##### B.5. Command Sequences and Help
* **Command Sequence:** Multiple commands are executed in order.
```sh
utility1 .archive.create name::backup.zip ;; .cloud.upload file::backup.zip
```
* **Help for a Specific Command:**
```sh
utility1 .archive.create ?
```
* **Listing Contents of a Namespace:**
```sh
utility1 .archive ?
```