hedl-cli
Complete HEDL toolkit—validation, formatting, linting, inspection, conversion, and batch processing with parallel execution.
You need to validate HEDL files, convert between formats, analyze structure, or process hundreds of files in parallel. hedl-cli provides 21 commands covering the entire HEDL workflow: core operations (validate, format, lint, inspect, stats), bidirectional conversion to 6 formats (JSON, YAML, XML, CSV, Parquet, TOON), batch processing with automatic parallelization, and shell completion generation.
This is the official command-line interface for the HEDL ecosystem. Whether you're validating configuration files, converting database exports, analyzing token efficiency, or processing directories of HEDL documents—hedl-cli provides the tools you need.
What's Implemented
Complete command-line toolkit with 21 commands across 4 categories:
- Core Commands (5): Validate, format, lint, inspect, stats
- Format Conversion (12): Bidirectional conversion for JSON, YAML, XML, CSV, Parquet, TOON
- Batch Processing (3): Parallel validation, formatting, linting with progress tracking
- Utilities (1): Shell completion generation for 5 shells
- High-Performance Architecture: Parallel processing, colored output, structured errors
- Security Features: File size limits (1 GB default), input validation, safe error propagation
- Flexible Output: File or stdout, JSON or text, pretty or compact formatting
Installation
# From source (assumes the crate is published to crates.io)
cargo install hedl-cli

# Or build locally from a repository checkout
cargo build --release
Binary location: target/release/hedl
Core Commands
validate - Syntax and Structure Validation
Validate HEDL files with optional strict reference checking:
# Basic validation
hedl validate config.hedl

# Strict mode (all references must resolve)
hedl validate config.hedl --strict
Output:
✓ config.hedl
Version: 1.0
Structs: 5
Aliases: 2
Nests: 3
Options:
- --strict - Require all entity references to resolve to defined entities
Exit Codes: 0 (valid), 1 (parse errors or validation failures)
format - Canonical Formatting
Normalize HEDL files to canonical form with optional optimizations:
# Format to stdout
hedl format config.hedl

# Format to file
hedl format config.hedl -o formatted.hedl

# Check if already canonical (no changes)
hedl format config.hedl --check

# Disable ditto optimization (keep repeated values explicit)
hedl format config.hedl --ditto=false

# Add count hints to all matrix lists
hedl format config.hedl --with-counts
Options:
- -o, --output <FILE> - Write to file instead of stdout
- --check - Only check if canonical; don't write output
- --ditto - Enable ditto operator optimization for repeated values (default: enabled)
- --with-counts - Recursively add count hints to matrix lists
Exit Codes: 0 (success or already canonical), 1 (parse error or check failed)
lint - Best Practices Checking
Check HEDL files against best practices with configurable severity:
# Text output with colors
hedl lint config.hedl

# JSON output for programmatic processing
hedl lint config.hedl --format json

# Treat warnings as errors
hedl lint config.hedl --warn-error
Output (text format with colors):
Warning [unused-alias]: Alias 'old_api' is defined but never used
at line 15
Suggestion [add-count-hints]: Matrix list 'users' is missing count hint
at line 42
Found 1 warning, 1 suggestion
Output (JSON format):
Options:
- -f, --format <text|json> - Output format (default: text with colors)
- -W, --warn-error - Treat warnings as errors
Exit Codes: 0 (no issues), 1 (has errors or warnings when --warn-error enabled)
inspect - Structure Visualization
Display HEDL file structure as an interactive tree:
# Basic tree view
hedl inspect data.hedl

# Verbose mode (show field values and row data)
hedl inspect data.hedl --verbose
Output (tree format):
Document (1.0)
├─ Schemas (3)
│ ├─ User [id, name, email, created_at]
│ ├─ Post [id, author, title, content]
│ └─ Comment [id, post, author, text]
├─ Aliases (2)
│ ├─ $api_url = "https://api.example.com"
│ └─ $version = "2.1.0"
├─ Nests (1)
│ └─ Post > Comment
└─ Data
├─ users: @User (125 entities)
├─ posts: @Post (48 entities)
└─ comments: @Comment (312 entities)
Verbose Output (shows actual data):
└─ users: @User (3 entities)
├─ alice [Alice Smith, alice@example.com, 2024-01-15]
├─ bob [Bob Jones, bob@example.com, 2024-02-20]
└─ carol [Carol White, carol@example.com, 2024-03-10]
Options:
- -v, --verbose - Show detailed field values and row data
stats - Format Comparison Analysis
Compare HEDL file size and token counts vs JSON, YAML, XML:
# Byte counts only
hedl stats data.hedl

# Include LLM token estimates
hedl stats data.hedl --tokens
Output:
Format Comparison for 'data.hedl':
File Sizes:
HEDL: 2,458 bytes
JSON (compact): 3,841 bytes (+56.3%)
JSON (pretty): 5,219 bytes (+112.3%)
YAML: 4,105 bytes (+67.0%)
XML: 6,732 bytes (+173.9%)
Token Estimates (LLM ~4 chars/token):
HEDL: 615 tokens
JSON (compact): 960 tokens (+56.1%)
JSON (pretty): 1,305 tokens (+112.2%)
YAML: 1,026 tokens (+66.8%)
XML: 1,683 tokens (+173.7%)
Conclusion: HEDL saves 345 tokens (36%) vs JSON compact, 690 tokens (53%) vs JSON pretty
Options:
- --tokens - Include LLM token count estimates (~4 chars/token heuristic)
Performance: All format conversions run in parallel using Rayon for maximum throughput.
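The ~4 chars/token heuristic behind these estimates can be sketched in a few lines. This is an illustrative approximation, not the crate's actual implementation; `estimate_tokens` and `savings` are hypothetical helpers:

```python
def estimate_tokens(text: str) -> int:
    """Rough LLM token estimate: ~4 characters per token, rounded half up."""
    return int(len(text) / 4 + 0.5)

def savings(hedl_text: str, other_text: str) -> float:
    """Percentage of tokens saved by HEDL relative to another format."""
    hedl = estimate_tokens(hedl_text)
    other = estimate_tokens(other_text)
    return 100.0 * (other - hedl) / other

# 2,458 bytes of HEDL vs 3,841 bytes of compact JSON (the file sizes in the
# example output above) give an estimate of about 36% token savings.
print(savings("x" * 2458, "x" * 3841))
```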
Format Conversion Commands
Bidirectional conversion between HEDL and 6 popular formats.
JSON Conversion
# HEDL → JSON (compact)
hedl to-json data.hedl

# HEDL → JSON (pretty-printed)
hedl to-json data.hedl --pretty

# HEDL → JSON (with metadata)
hedl to-json data.hedl --metadata

# JSON → HEDL
hedl from-json data.json
to-json Options:
- -o, --output <FILE> - Write to file
- --pretty - Pretty-print with indentation
- --metadata - Include HEDL version and schema information
YAML Conversion
# HEDL → YAML
hedl to-yaml data.hedl

# YAML → HEDL
hedl from-yaml data.yaml
XML Conversion
# HEDL → XML (compact)
hedl to-xml data.hedl

# HEDL → XML (pretty-printed)
hedl to-xml data.hedl --pretty

# XML → HEDL
hedl from-xml data.xml
to-xml Options:
- --pretty - Pretty-print with indentation
CSV Conversion
# HEDL → CSV (includes headers by default)
hedl to-csv data.hedl

# CSV → HEDL (specify entity type name)
hedl from-csv users.csv --type-name User
to-csv Options:
- --headers - Include column headers (default: true)
from-csv Options:
- --type-name <NAME> - Entity type name (default: "Row")
Parquet Conversion
# HEDL → Parquet (columnar format)
hedl to-parquet data.hedl -o data.parquet

# Parquet → HEDL
hedl from-parquet data.parquet
Note: for to-parquet, --output is required (unlike the other conversion commands) because Parquet is a binary columnar format and cannot be written to a terminal.
TOON Conversion
# HEDL → TOON
hedl to-toon data.hedl

# TOON → HEDL
hedl from-toon data.toon
TOON (Token-Oriented Object Notation) is optimized for LLM efficiency, but accuracy testing shows that HEDL achieves higher comprehension (+3.4 points on average) with 10% fewer tokens.
Batch Processing Commands
Process multiple files in parallel with automatic parallelization and progress tracking.
batch-validate - Parallel Validation
# Validate all .hedl files in a directory
hedl batch-validate "data/**/*.hedl"

# Strict mode for all files
hedl batch-validate "data/**/*.hedl" --strict

# Verbose progress tracking
hedl batch-validate "data/**/*.hedl" --verbose

# Force parallel processing
hedl batch-validate "data/**/*.hedl" --parallel

# Use streaming mode for large files (constant memory)
hedl batch-validate "data/**/*.hedl" --streaming

# Automatically use streaming for files > 100 MB
hedl batch-validate "data/**/*.hedl" --auto-streaming

# Limit processing to 5000 files
hedl batch-validate "data/**/*.hedl" --max-files 5000
Options:
- --strict - Enforce reference resolution for all files
- -v, --verbose - Detailed progress output
- -p, --parallel - Force parallel processing (default: auto-detect based on file count)
- --streaming - Use streaming mode for memory-efficient processing (constant memory, ideal for files >100 MB)
- --auto-streaming - Automatically use streaming for large files (>100 MB) and standard mode for smaller files
- --max-files <N> - Maximum number of files to process (default: 10,000; set to 0 for unlimited)
Output:
Validating 127 files...
Progress: [========================================] 127/127 (100%)
Completed in 2.3s (55 files/sec)
Results:
Valid: 125 files
Failed: 2 files
- data/broken.hedl: Parse error at line 42: unexpected token
- schemas/old.hedl: Unresolved reference @User:nonexistent
Performance: Automatic parallelization when file count ≥ 10 (configurable), ~3-5x speedup on multi-core systems.
batch-format - Parallel Formatting
# Format all files in-place
hedl batch-format "**/*.hedl"

# Format to output directory
hedl batch-format "**/*.hedl" --output-dir formatted/

# Format with ditto optimization
hedl batch-format "**/*.hedl" --ditto

# Add count hints to all files
hedl batch-format "**/*.hedl" --with-counts

# Limit processing to 5000 files
hedl batch-format "**/*.hedl" --max-files 5000
Options:
- --output-dir <DIR> - Write formatted files to directory (preserves relative paths)
- --ditto - Enable ditto optimization (default: true)
- --with-counts - Add count hints to matrix lists
- -v, --verbose - Detailed progress output
- -p, --parallel - Force parallel processing
- --max-files <N> - Maximum number of files to process (default: 10,000; set to 0 for unlimited)
Output:
Formatting 89 files...
Progress: [========================================] 89/89 (100%)
Completed in 1.8s (49 files/sec)
Results:
Formatted: 87 files
Unchanged: 0 files (already canonical)
Failed: 2 files
- data/corrupt.hedl: Parse error at line 15
batch-lint - Parallel Linting
# Lint all files with aggregated results
hedl batch-lint "**/*.hedl"

# Treat warnings as errors
hedl batch-lint "**/*.hedl" --warn-error

# Verbose per-file results
hedl batch-lint "**/*.hedl" --verbose

# Limit processing to 5000 files
hedl batch-lint "**/*.hedl" --max-files 5000
Options:
- --warn-error - Treat warnings as errors
- -v, --verbose - Show issues for each file
- -p, --parallel - Force parallel processing
- --max-files <N> - Maximum number of files to process (default: 10,000; set to 0 for unlimited)
Output:
Linting 64 files...
Progress: [========================================] 64/64 (100%)
Completed in 1.1s (58 files/sec)
Aggregated Results:
Errors: 3 across 2 files
Warnings: 12 across 8 files
Suggestions: 25 across 19 files
Top Issues:
- unused-alias (8 occurrences)
- add-count-hints (7 occurrences)
- unresolved-reference (3 occurrences)
Failed Files:
- schemas/old.hedl: 2 errors, 3 warnings
- configs/broken.hedl: 1 error
Shell Completion
Generate shell completion scripts for interactive usage:
# Generate for current shell
hedl completions bash

# Supported shells
hedl completions <bash|zsh|fish|powershell|elvish>
Installation (bash example):
# Add to ~/.bashrc
eval "$(hedl completions bash)"
After installation, tab completion works for all commands, subcommands, and options.
Security Features
File Size Limits
Prevents memory exhaustion from malicious or unexpected large files:
# Default: 1 GB limit
hedl validate huge.hedl
# Error: File size (1.2 GB) exceeds limit (1 GB)
# Configure the limit via an environment variable (e.g., raise it to 2 GB)
Default Limit: 1,073,741,824 bytes (1 GB)
Input Validation
- Type Names (CSV conversion): Alphanumeric characters and underscores only
- Path Safety: All file operations validated before processing
- Error Boundaries: Continues batch processing on individual file errors
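The type-name rule for CSV conversion can be expressed as a simple check. This is an illustrative sketch, not the crate's code; `is_valid_type_name` is a hypothetical helper:

```python
import re

# CSV entity type names: alphanumeric characters and underscores only.
_TYPE_NAME = re.compile(r"^[A-Za-z0-9_]+$")

def is_valid_type_name(name: str) -> bool:
    """Return True if name is a legal entity type name for from-csv."""
    return bool(_TYPE_NAME.match(name))

print(is_valid_type_name("User"))       # accepted
print(is_valid_type_name("my_type_2"))  # accepted
print(is_valid_type_name("bad-name!"))  # rejected: '-' and '!' not allowed
```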
Error Context
All errors include file paths and detailed context:
Error: Failed to parse 'data/broken.hedl'
Parse error at line 42, column 15:
unexpected token ']', expected field name
Context:
40 | users: @User[id, name]
41 | | alice, Alice Smith
42 | | bob, Bob Jones]
| ^ here
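A caret context like the one above can be rendered with a small helper. This is an illustrative sketch of the display format, not the crate's implementation; `render_context` is a hypothetical function:

```python
def render_context(lines, error_line, error_col, window=2):
    """Render numbered source lines around an error with a caret marker.

    lines is a list of source lines; error_line and error_col are 1-based.
    """
    out = []
    start = max(1, error_line - window)
    for n in range(start, error_line + 1):
        out.append(f"{n:>4} | {lines[n - 1]}")
    # Caret under the offending column, aligned past the "NNNN | " gutter.
    out.append(" " * 4 + " | " + " " * (error_col - 1) + "^ here")
    return "\n".join(out)

src = [
    "users: @User[id, name]",
    "| alice, Alice Smith",
    "| bob, Bob Jones]",
]
print(render_context(src, error_line=3, error_col=17))
```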
Architecture Features
Parallel Processing
BatchProcessor System:
- Configurable parallelization threshold (default: 10 files)
- Automatic thread pool sizing
- Progress tracking with atomic counters (lock-free)
- Error resilience (collects all failures, continues processing)
Performance: ~3-5x speedup on multi-core systems for batch operations.
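The batch strategy described above (parallelize only past a file-count threshold, collect per-file failures instead of aborting, then aggregate) can be sketched in Python. This is an illustrative model, not the crate's Rayon-based implementation; `process_batch` and `PARALLEL_THRESHOLD` are hypothetical names:

```python
from concurrent.futures import ThreadPoolExecutor

PARALLEL_THRESHOLD = 10  # below this, sequential processing is cheaper

def process_batch(paths, process_one):
    """Run process_one over paths, collecting failures instead of aborting."""
    results, failures = [], []

    def safe(path):
        try:
            return path, process_one(path), None
        except Exception as exc:  # error resilience: record and continue
            return path, None, exc

    if len(paths) >= PARALLEL_THRESHOLD:
        with ThreadPoolExecutor() as pool:  # auto-sized worker pool
            outcomes = list(pool.map(safe, paths))
    else:
        outcomes = [safe(p) for p in paths]

    for path, value, exc in outcomes:
        (failures if exc else results).append((path, exc or value))
    return results, failures
```

The same shape appears in the batch-validate output above: every file is attempted, and the failures are listed at the end rather than stopping the run.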
Count Hints System
Recursively adds count hints to matrix lists and nested children:
# Before formatting with --with-counts
users: @User[id, name]
| alice, Alice
| bob, Bob
# After formatting
users[2]: @User[id, name]
| alice, Alice
| bob, Bob
Behavior: Overwrites existing hints with actual counts from parsed document.
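The count-hint idea can be shown on a single matrix list header. This is a minimal text-level sketch; the real formatter works on the parsed document, and `add_count_hint` is a hypothetical helper:

```python
import re

def add_count_hint(block: str) -> str:
    """Set the header's [N] hint to the actual number of data rows."""
    header, *rows = block.splitlines()
    count = sum(1 for r in rows if r.lstrip().startswith("|"))
    # Overwrite any existing hint with the actual count, e.g. users -> users[2]
    name, sep, rest = header.partition(":")
    name = re.sub(r"\[\d+\]$", "", name.strip())
    return "\n".join([f"{name}[{count}]{sep}{rest}"] + rows)

block = "users: @User[id, name]\n| alice, Alice\n| bob, Bob"
print(add_count_hint(block))  # header becomes users[2]: @User[id, name]
```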
Output Handling
- Colored Console: Uses the colored crate for syntax highlighting and progress
- Flexible Destinations: File path or stdout (respects --output/-o)
- Format Options: JSON, text, compact, pretty-printed
- Error Separation: Errors always go to stderr; output goes to stdout
Error Types
Comprehensive error handling with 19 error variants:
- Io - File I/O errors with path context
- FileTooLarge - Size limit exceeded (configurable)
- IoTimeout - I/O operation timeout
- Parse - HEDL syntax errors with line/column
- Canonicalization - Canonicalization failures
- JsonConversion - JSON conversion errors
- JsonFormat - JSON serialization/deserialization errors
- YamlConversion - YAML conversion errors
- XmlConversion - XML conversion errors
- CsvConversion - CSV conversion errors
- ParquetConversion - Parquet conversion errors
- LintErrors - Linting errors found
- NotCanonical - File is not in canonical form
- InvalidInput - Input validation failures (type names, paths)
- ThreadPoolError - Parallel processing thread pool creation failure
- GlobPattern - Invalid glob pattern syntax
- NoFilesMatched - No files matched the provided patterns
- DirectoryTraversal - Directory traversal failures
- ResourceExhaustion - System resource exhaustion (file handles, memory)
All errors implement std::error::Error, Display, and Clone for detailed messages and parallel error handling.
Use Cases
Configuration Management: Validate and lint HEDL configuration files in CI/CD pipelines, format for canonical diffs, convert to JSON/YAML for runtime.
Data Pipeline Integration: Convert CSV exports to HEDL for structured processing, validate schemas, transform data, export to Parquet for analytics.
Schema Development: Write HEDL schemas with instant validation feedback, lint for best practices, inspect structure, compare token efficiency vs JSON.
Batch Processing: Process directories of HEDL files in parallel (validation, formatting, linting), aggregate results, identify issues across large codebases.
LLM Context Optimization: Analyze token counts with stats command, convert JSON to HEDL for 40-60% token savings, validate compressed output.
Database Export/Import: Export databases to CSV, convert to HEDL with type inference, validate structure, transform with matrix operations, import to Neo4j via Cypher.
What This Crate Doesn't Do
Interactive Editing: Not a REPL or interactive editor—use hedl-lsp with your favorite editor (VS Code, Neovim, Emacs) for interactive development.
Language Server: LSP functionality is in hedl-lsp crate—this CLI focuses on batch operations and one-off conversions.
MCP Server: Model Context Protocol server is in hedl-mcp crate—this CLI is for human-driven workflows and automation scripts.
Data Transformation: Provides format conversion and validation, not arbitrary data transformations—use HEDL's matrix query capabilities or convert to SQL/Cypher for complex transformations.
Performance Characteristics
Command Performance:
- validate: O(n) parsing, ~100-200 MB/s throughput
- format: O(n) parse + canonicalization, ~50-100 MB/s
- lint: O(n) parse + validation rules, ~80-150 MB/s
- stats: Parallel format conversions, ~50-100 MB/s per format
Batch Processing: ~3-5x speedup with parallel execution on multi-core systems.
Memory: O(document_size) per file—loads entire document for parsing. For streaming large files (>100 MB), use hedl-stream crate directly.
Detailed performance benchmarks are available in the HEDL repository benchmark suite.
Dependencies
- hedl-core 1.2 - HEDL parsing and data model
- hedl-c14n 1.2 - Canonicalization
- hedl-lint 1.2 - Best practices linting
- hedl-json, hedl-yaml, hedl-xml, hedl-csv, hedl-parquet, hedl-toon 1.2 - Format conversion
- clap 4.4 - CLI argument parsing
- clap_complete - Shell completion generation
- colored - Terminal coloring
- rayon - Parallel processing
- serde_json - JSON output formatting
- thiserror - Error type definitions
License
Apache-2.0