hedl-cli
Complete HEDL toolkit—validation, formatting, linting, inspection, conversion, and batch processing with parallel execution.
You need to validate HEDL files, convert between formats, analyze structure, or process hundreds of files in parallel. hedl-cli provides 21 commands covering the entire HEDL workflow: core operations (validate, format, lint, inspect, stats), bidirectional conversion to 6 formats (JSON, YAML, XML, CSV, Parquet, TOON), batch processing with automatic parallelization, and shell completion generation.
This is the official command-line interface for the HEDL ecosystem. Whether you're validating configuration files, converting database exports, analyzing token efficiency, or processing directories of HEDL documents—hedl-cli provides the tools you need.
What's Implemented
Complete command-line toolkit with 21 commands across 4 categories:
- Core Commands (5): Validate, format, lint, inspect, stats
- Format Conversion (12): Bidirectional conversion for JSON, YAML, XML, CSV, Parquet, TOON
- Batch Processing (3): Parallel validation, formatting, linting with progress tracking
- Utilities (1): Shell completion generation for 5 shells
- High-Performance Architecture: Parallel processing, colored output, structured errors
- Security Features: File size limits (1 GB default), input validation, safe error propagation
- Flexible Output: File or stdout, JSON or text, pretty or compact formatting
Installation
# From source (assumes the crate is published to crates.io)
cargo install hedl-cli

# Or build locally from a repository checkout
cargo build --release
Binary location: target/release/hedl
Core Commands
validate - Syntax and Structure Validation
Validate HEDL files with optional strict reference checking:
# Basic validation
hedl validate config.hedl

# Strict mode (all references must resolve)
hedl validate config.hedl --strict
Output:
✓ config.hedl
Version: 1.0
Structs: 5
Aliases: 2
Nests: 3
Options:
- --strict - Require all entity references to resolve to defined entities
Exit Codes: 0 (valid), 1 (parse errors or validation failures)
format - Canonical Formatting
Normalize HEDL files to canonical form with optional optimizations:
# Format to stdout
hedl format config.hedl

# Format to file
hedl format config.hedl -o formatted.hedl

# Check if already canonical (no changes)
hedl format config.hedl --check

# Disable ditto optimization (keep repeated values explicit)
hedl format config.hedl --ditto=false

# Add count hints to all matrix lists
hedl format config.hedl --with-counts
Options:
- -o, --output <FILE> - Write to file instead of stdout
- --check - Only check if canonical; don't write output
- --ditto - Enable ditto operator optimization for repeated values (default: enabled)
- --with-counts - Recursively add count hints to matrix lists
Exit Codes: 0 (success or already canonical), 1 (parse error or check failed)
lint - Best Practices Checking
Check HEDL files against best practices with configurable severity:
# Text output with colors
hedl lint config.hedl

# JSON output for programmatic processing
hedl lint config.hedl --format json

# Treat warnings as errors
hedl lint config.hedl --warn-error
Output (text format with colors):
Warning [unused-alias]: Alias 'old_api' is defined but never used
at line 15
Suggestion [add-count-hints]: Matrix list 'users' is missing count hint
at line 42
Found 1 warning, 1 suggestion
Output (JSON format):
Options:
- -f, --format <text|json> - Output format (default: text with colors)
- -W, --warn-error - Treat warnings as errors
Exit Codes: 0 (no issues), 1 (has errors or warnings when --warn-error enabled)
inspect - Structure Visualization
Display HEDL file structure as an interactive tree:
# Basic tree view
hedl inspect data.hedl

# Verbose mode (show field values and row data)
hedl inspect data.hedl --verbose
Output (tree format):
Document (1.0)
├─ Schemas (3)
│ ├─ User [id, name, email, created_at]
│ ├─ Post [id, author, title, content]
│ └─ Comment [id, post, author, text]
├─ Aliases (2)
│ ├─ $api_url = "https://api.example.com"
│ └─ $version = "2.1.0"
├─ Nests (1)
│ └─ Post > Comment
└─ Data
├─ users: @User (125 entities)
├─ posts: @Post (48 entities)
└─ comments: @Comment (312 entities)
Verbose Output (shows actual data):
└─ users: @User (3 entities)
├─ alice [Alice Smith, alice@example.com, 2024-01-15]
├─ bob [Bob Jones, bob@example.com, 2024-02-20]
└─ carol [Carol White, carol@example.com, 2024-03-10]
Options:
- -v, --verbose - Show detailed field values and row data
stats - Format Comparison Analysis
Compare HEDL file size and token counts vs JSON, YAML, XML:
# Byte counts only
hedl stats data.hedl

# Include LLM token estimates
hedl stats data.hedl --tokens
Output:
Format Comparison for 'data.hedl':
File Sizes:
HEDL: 2,458 bytes
JSON (compact): 3,841 bytes (+56.3%)
JSON (pretty): 5,219 bytes (+112.3%)
YAML: 4,105 bytes (+67.0%)
XML: 6,732 bytes (+173.9%)
Token Estimates (LLM ~4 chars/token):
HEDL: 615 tokens
JSON (compact): 960 tokens (+56.1%)
JSON (pretty): 1,305 tokens (+112.2%)
YAML: 1,026 tokens (+66.8%)
XML: 1,683 tokens (+173.7%)
Conclusion: HEDL saves 345 tokens (36%) vs JSON compact, 690 tokens (53%) vs JSON pretty
Options:
- --tokens - Include LLM token count estimates (~4 chars/token heuristic)
Performance: All format conversions run in parallel using Rayon for maximum throughput.
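The ~4 chars/token heuristic behind these estimates can be sketched in a few lines. This is an illustrative approximation, not the crate's actual implementation; `estimate_tokens` and `savings` are hypothetical helpers:

```python
def estimate_tokens(text: str) -> int:
    """Rough LLM token estimate: ~4 characters per token, rounded half up."""
    return int(len(text) / 4 + 0.5)

def savings(hedl_text: str, other_text: str) -> float:
    """Percentage of tokens saved by HEDL relative to another format."""
    hedl = estimate_tokens(hedl_text)
    other = estimate_tokens(other_text)
    return 100.0 * (other - hedl) / other

# 2,458 bytes of HEDL vs 3,841 bytes of compact JSON (the file sizes in the
# example output above) give an estimate of about 36% token savings.
print(savings("x" * 2458, "x" * 3841))
```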
Format Conversion Commands
Bidirectional conversion between HEDL and 6 popular formats.
JSON Conversion
# HEDL → JSON (compact)
hedl to-json data.hedl

# HEDL → JSON (pretty-printed)
hedl to-json data.hedl --pretty

# HEDL → JSON (with metadata)
hedl to-json data.hedl --metadata

# JSON → HEDL
hedl from-json data.json
to-json Options:
- -o, --output <FILE> - Write to file
- --pretty - Pretty-print with indentation
- --metadata - Include HEDL version and schema information
YAML Conversion
# HEDL → YAML
hedl to-yaml data.hedl

# YAML → HEDL
hedl from-yaml data.yaml
XML Conversion
# HEDL → XML (compact)
hedl to-xml data.hedl

# HEDL → XML (pretty-printed)
hedl to-xml data.hedl --pretty

# XML → HEDL
hedl from-xml data.xml
to-xml Options:
- --pretty - Pretty-print with indentation
CSV Conversion
# HEDL → CSV (includes headers by default)
hedl to-csv data.hedl

# CSV → HEDL (specify entity type name)
hedl from-csv users.csv --type-name User
to-csv Options:
- --headers - Include column headers (default: true)
from-csv Options:
- --type-name <NAME> - Entity type name (default: "Row")
Parquet Conversion
# HEDL → Parquet (columnar format)
hedl to-parquet data.hedl -o data.parquet

# Parquet → HEDL
hedl from-parquet data.parquet
Note: for to-parquet, --output is required (unlike the other conversion commands) because Parquet is a binary columnar format and cannot be written to a terminal.
TOON Conversion
# HEDL → TOON
hedl to-toon data.hedl

# TOON → HEDL
hedl from-toon data.toon
TOON (Token-Oriented Object Notation) is optimized for LLM efficiency, but accuracy testing shows that HEDL achieves higher comprehension (+3.4 points on average) with 10% fewer tokens.
Batch Processing Commands
Process multiple files in parallel with automatic parallelization and progress tracking.
batch-validate - Parallel Validation
# Validate all .hedl files in a directory
hedl batch-validate "data/**/*.hedl"

# Strict mode for all files
hedl batch-validate "data/**/*.hedl" --strict

# Verbose progress tracking
hedl batch-validate "data/**/*.hedl" --verbose

# Force parallel processing
hedl batch-validate "data/**/*.hedl" --parallel

# Use streaming mode for large files (constant memory)
hedl batch-validate "data/**/*.hedl" --streaming

# Automatically use streaming for files > 100 MB
hedl batch-validate "data/**/*.hedl" --auto-streaming

# Limit processing to 5000 files
hedl batch-validate "data/**/*.hedl" --max-files 5000
Options:
- --strict - Enforce reference resolution for all files
- -v, --verbose - Detailed progress output
- -p, --parallel - Force parallel processing (default: auto-detect based on file count)
- --streaming - Use streaming mode for memory-efficient processing (constant memory, ideal for files >100 MB)
- --auto-streaming - Automatically use streaming for large files (>100 MB) and standard mode for smaller files
- --max-files <N> - Maximum number of files to process (default: 10,000; set to 0 for unlimited)
Output:
Validating 127 files...
Progress: [========================================] 127/127 (100%)
Completed in 2.3s (55 files/sec)
Results:
Valid: 125 files
Failed: 2 files
- data/broken.hedl: Parse error at line 42: unexpected token
- schemas/old.hedl: Unresolved reference @User:nonexistent
Performance: Automatic parallelization when file count ≥ 10 (configurable), ~3-5x speedup on multi-core systems.
batch-format - Parallel Formatting
# Format all files in-place
hedl batch-format "**/*.hedl"

# Format to output directory
hedl batch-format "**/*.hedl" --output-dir formatted/

# Format with ditto optimization
hedl batch-format "**/*.hedl" --ditto

# Add count hints to all files
hedl batch-format "**/*.hedl" --with-counts

# Limit processing to 5000 files
hedl batch-format "**/*.hedl" --max-files 5000
Options:
- --output-dir <DIR> - Write formatted files to directory (preserves relative paths)
- --ditto - Enable ditto optimization (default: true)
- --with-counts - Add count hints to matrix lists
- -v, --verbose - Detailed progress output
- -p, --parallel - Force parallel processing
- --max-files <N> - Maximum number of files to process (default: 10,000; set to 0 for unlimited)
Output:
Formatting 89 files...
Progress: [========================================] 89/89 (100%)
Completed in 1.8s (49 files/sec)
Results:
Formatted: 87 files
Unchanged: 0 files (already canonical)
Failed: 2 files
- data/corrupt.hedl: Parse error at line 15
batch-lint - Parallel Linting
# Lint all files with aggregated results
hedl batch-lint "**/*.hedl"

# Treat warnings as errors
hedl batch-lint "**/*.hedl" --warn-error

# Verbose per-file results
hedl batch-lint "**/*.hedl" --verbose

# Limit processing to 5000 files
hedl batch-lint "**/*.hedl" --max-files 5000
Options:
- --warn-error - Treat warnings as errors
- -v, --verbose - Show issues for each file
- -p, --parallel - Force parallel processing
- --max-files <N> - Maximum number of files to process (default: 10,000; set to 0 for unlimited)
Output:
Linting 64 files...
Progress: [========================================] 64/64 (100%)
Completed in 1.1s (58 files/sec)
Aggregated Results:
Errors: 3 across 2 files
Warnings: 12 across 8 files
Suggestions: 25 across 19 files
Top Issues:
- unused-alias (8 occurrences)
- add-count-hints (7 occurrences)
- unresolved-reference (3 occurrences)
Failed Files:
- schemas/old.hedl: 2 errors, 3 warnings
- configs/broken.hedl: 1 error
Shell Completion
Generate shell completion scripts for interactive usage:
# Generate for current shell
hedl completions bash

# Supported shells
hedl completions <bash|zsh|fish|powershell|elvish>
Installation (bash example):
# Add to ~/.bashrc
eval "$(hedl completions bash)"
After installation, tab completion works for all commands, subcommands, and options.
Security Features
File Size Limits
Prevents memory exhaustion from malicious or unexpected large files:
# Default: 1 GB limit
hedl validate huge.hedl
# Error: File size (1.2 GB) exceeds limit (1 GB)
# Configure the limit via an environment variable (e.g., raise it to 2 GB)
Default Limit: 1,073,741,824 bytes (1 GB)
Input Validation
- Type Names (CSV conversion): Alphanumeric characters and underscores only
- Path Safety: All file operations validated before processing
- Error Boundaries: Continues batch processing on individual file errors
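The type-name rule for CSV conversion can be expressed as a simple check. This is an illustrative sketch, not the crate's code; `is_valid_type_name` is a hypothetical helper:

```python
import re

# CSV entity type names: alphanumeric characters and underscores only.
_TYPE_NAME = re.compile(r"^[A-Za-z0-9_]+$")

def is_valid_type_name(name: str) -> bool:
    """Return True if name is a legal entity type name for from-csv."""
    return bool(_TYPE_NAME.match(name))

print(is_valid_type_name("User"))       # accepted
print(is_valid_type_name("my_type_2"))  # accepted
print(is_valid_type_name("bad-name!"))  # rejected: '-' and '!' not allowed
```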
Error Context
All errors include file paths and detailed context:
Error: Failed to parse 'data/broken.hedl'
Parse error at line 42, column 15:
unexpected token ']', expected field name
Context:
40 | users: @User[id, name]
41 | | alice, Alice Smith
42 | | bob, Bob Jones]
| ^ here
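A caret context like the one above can be rendered with a small helper. This is an illustrative sketch of the display format, not the crate's implementation; `render_context` is a hypothetical function:

```python
def render_context(lines, error_line, error_col, window=2):
    """Render numbered source lines around an error with a caret marker.

    lines is a list of source lines; error_line and error_col are 1-based.
    """
    out = []
    start = max(1, error_line - window)
    for n in range(start, error_line + 1):
        out.append(f"{n:>4} | {lines[n - 1]}")
    # Caret under the offending column, aligned past the "NNNN | " gutter.
    out.append(" " * 4 + " | " + " " * (error_col - 1) + "^ here")
    return "\n".join(out)

src = [
    "users: @User[id, name]",
    "| alice, Alice Smith",
    "| bob, Bob Jones]",
]
print(render_context(src, error_line=3, error_col=17))
```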
Architecture Features
Parallel Processing
BatchProcessor System:
- Configurable parallelization threshold (default: 10 files)
- Automatic thread pool sizing
- Progress tracking with atomic counters (lock-free)
- Error resilience (collects all failures, continues processing)
Performance: ~3-5x speedup on multi-core systems for batch operations.
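The batch strategy described above (parallelize only past a file-count threshold, collect per-file failures instead of aborting, then aggregate) can be sketched in Python. This is an illustrative model, not the crate's Rayon-based implementation; `process_batch` and `PARALLEL_THRESHOLD` are hypothetical names:

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_THRESHOLD = 10  # below this, sequential processing is cheaper

def process_batch(paths, process_one):
    """Run process_one over paths, collecting failures instead of aborting."""
    results, failures = [], []

    def safe(path):
        try:
            return path, process_one(path), None
        except Exception as exc:  # error resilience: record and continue
            return path, None, exc

    if len(paths) >= PARALLEL_THRESHOLD:
        with ThreadPoolExecutor() as pool:  # auto-sized worker pool
            outcomes = list(pool.map(safe, paths))
    else:
        outcomes = [safe(p) for p in paths]

    for path, value, exc in outcomes:
        (failures if exc else results).append((path, exc or value))
    return results, failures
```

The same shape appears in the batch-validate output above: every file is attempted, and the failures are listed at the end rather than stopping the run.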
Count Hints System
Recursively adds count hints to matrix lists and nested children:
# Before formatting with --with-counts
users: @User[id, name]
| alice, Alice
| bob, Bob
# After formatting
users[2]: @User[id, name]
| alice, Alice
| bob, Bob
Behavior: Overwrites existing hints with actual counts from parsed document.
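The count-hint idea can be shown on a single matrix list header. This is a minimal text-level sketch; the real formatter works on the parsed document, and `add_count_hint` is a hypothetical helper:

```python
import re

def add_count_hint(block: str) -> str:
    """Set the header's [N] hint to the actual number of data rows."""
    header, *rows = block.splitlines()
    count = sum(1 for r in rows if r.lstrip().startswith("|"))
    # Overwrite any existing hint with the actual count, e.g. users -> users[2]
    name, sep, rest = header.partition(":")
    name = re.sub(r"\[\d+\]$", "", name.strip())
    return "\n".join([f"{name}[{count}]{sep}{rest}"] + rows)

block = "users: @User[id, name]\n| alice, Alice\n| bob, Bob"
print(add_count_hint(block))  # header becomes users[2]: @User[id, name]
```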
Output Handling
- Colored Console: Uses the colored crate for syntax highlighting and progress
- Flexible Destinations: File path or stdout (respects --output/-o)
- Format Options: JSON, text, compact, pretty-printed
- Error Separation: Errors always go to stderr; output goes to stdout
Error Types
Comprehensive error handling with 19 error variants:
- Io - File I/O errors with path context
- FileTooLarge - Size limit exceeded (configurable)
- IoTimeout - I/O operation timeout
- Parse - HEDL syntax errors with line/column
- Canonicalization - Canonicalization failures
- JsonConversion - JSON conversion errors
- JsonFormat - JSON serialization/deserialization errors
- YamlConversion - YAML conversion errors
- XmlConversion - XML conversion errors
- CsvConversion - CSV conversion errors
- ParquetConversion - Parquet conversion errors
- LintErrors - Linting errors found
- NotCanonical - File is not in canonical form
- InvalidInput - Input validation failures (type names, paths)
- ThreadPoolError - Parallel processing thread pool creation failure
- GlobPattern - Invalid glob pattern syntax
- NoFilesMatched - No files matched the provided patterns
- DirectoryTraversal - Directory traversal failures
- ResourceExhaustion - System resource exhaustion (file handles, memory)
All errors implement std::error::Error, Display, and Clone for detailed messages and parallel error handling.
Use Cases
Configuration Management: Validate and lint HEDL configuration files in CI/CD pipelines, format for canonical diffs, convert to JSON/YAML for runtime.
Data Pipeline Integration: Convert CSV exports to HEDL for structured processing, validate schemas, transform data, export to Parquet for analytics.
Schema Development: Write HEDL schemas with instant validation feedback, lint for best practices, inspect structure, compare token efficiency vs JSON.
Batch Processing: Process directories of HEDL files in parallel (validation, formatting, linting), aggregate results, identify issues across large codebases.
LLM Context Optimization: Analyze token counts with stats command, convert JSON to HEDL for 40-60% token savings, validate compressed output.
Database Export/Import: Export databases to CSV, convert to HEDL with type inference, validate structure, transform with matrix operations, import to Neo4j via Cypher.
What This Crate Doesn't Do
Interactive Editing: Not a REPL or interactive editor—use hedl-lsp with your favorite editor (VS Code, Neovim, Emacs) for interactive development.
Language Server: LSP functionality is in hedl-lsp crate—this CLI focuses on batch operations and one-off conversions.
MCP Server: Model Context Protocol server is in hedl-mcp crate—this CLI is for human-driven workflows and automation scripts.
Data Transformation: Provides format conversion and validation, not arbitrary data transformations—use HEDL's matrix query capabilities or convert to SQL/Cypher for complex transformations.
Performance Characteristics
Command Performance:
- validate: O(n) parsing, ~100-200 MB/s throughput
- format: O(n) parse + canonicalization, ~50-100 MB/s
- lint: O(n) parse + validation rules, ~80-150 MB/s
- stats: Parallel format conversions, ~50-100 MB/s per format
Batch Processing: ~3-5x speedup with parallel execution on multi-core systems.
Memory: O(document_size) per file—loads entire document for parsing. For streaming large files (>100 MB), use hedl-stream crate directly.
Detailed performance benchmarks are available in the HEDL repository benchmark suite.
Dependencies
- hedl-core 1.2 - HEDL parsing and data model
- hedl-c14n 1.2 - Canonicalization
- hedl-lint 1.2 - Best practices linting
- hedl-json, hedl-yaml, hedl-xml, hedl-csv, hedl-parquet, hedl-toon 1.2 - Format conversion
- clap 4.4 - CLI argument parsing
- clap_complete - Shell completion generation
- colored - Terminal coloring
- rayon - Parallel processing
- serde_json - JSON output formatting
- thiserror - Error type definitions
License
Apache-2.0