# hedl-mcp

Model Context Protocol server for HEDL: complete AI/LLM integration with 11 tools, caching, rate limiting, and streaming support.
AI/LLM systems need seamless access to HEDL files: reading documents, validating schemas, converting formats, querying entities, optimizing token usage. hedl-mcp provides a production-grade MCP server that bridges AI systems with the HEDL ecosystem through 11 specialized tools (including batch operations), high-performance caching (2-5x speedup), rate limiting (DoS protection), and streaming support for large files.
This is the official Model Context Protocol server for HEDL. Connect any MCP-compatible AI system (Claude Desktop, custom agents, LLM applications) to HEDL with comprehensive tools for validation, conversion, optimization, and querying.
## What's Implemented
Complete MCP server with production-ready infrastructure:
- 11 MCP Tools: Read, query, validate, optimize, stats, format, write, convert (to/from), stream, batch
- JSON-RPC 2.0 Protocol: Full MCP specification compliance (protocol version 2024-11-05)
- High-Performance Caching: LRU cache with 2-5x speedup on repeated operations
- Rate Limiting: Token bucket algorithm with 200 burst capacity, 100 req/s sustained
- Streaming Support: Memory-efficient pagination for large documents
- Security Features: Path traversal protection, input size limits (10 MB), safe file operations
- Parallel Processing: Configurable thread pool for directory scanning
- Resource Protocol: List and read HEDL files as MCP resources
- Stdio Transport: Both sync and async modes for flexible integration
- Comprehensive Testing: All tools with valid/invalid inputs, edge cases, caching, rate limiting
## Installation

```sh
# From source (assumes the crate is published to crates.io)
cargo install hedl-mcp

# Or build locally from the workspace
cargo build --release -p hedl-mcp
```

Binary location: `target/release/hedl-mcp`
## Quick Start

### Standalone Server

```sh
# Start server with stdio transport
hedl-mcp
# Server listens on stdin/stdout for JSON-RPC 2.0 messages
```
### Claude Desktop Integration

Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or equivalent:
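A minimal configuration sketch (the binary path and `HEDL_ROOT` value are placeholders for your own paths):

```json
{
  "mcpServers": {
    "hedl": {
      "command": "/path/to/hedl-mcp",
      "env": {
        "HEDL_ROOT": "/path/to/hedl/files"
      }
    }
  }
}
```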
Environment Variables:
- `HEDL_ROOT` - Root directory for scoped file operations (default: current directory)
### Custom MCP Client Integration

```rust
// Type and field names below are illustrative; see the crate docs for the exact API.
use hedl_mcp::{McpServer, McpServerConfig};
use std::path::PathBuf;

let config = McpServerConfig {
    root_path: PathBuf::from("/data/hedl"),
    ..Default::default()
};
let server = McpServer::new(config);

server.run_stdio().await?;  // Async mode
// or
server.run_stdio_sync()?;   // Sync mode
```
## MCP Tools (11 Total)

### 1. hedl_read - Read and Parse HEDL Files

Read HEDL files from directory or specific file with optional JSON representation:

Arguments:
- `path` (string, required) - File or directory path (scoped to `root_path`)
- `recursive` (boolean, optional) - Recursively scan directories (default: true)
- `include_json` (boolean, optional) - Include JSON representation in output (default: false)
- `num_threads` (number, optional) - Thread count for parallel processing (default: CPU core count)
Output:
Parallel Processing: Automatically uses parallel processing for directory operations with configurable thread pool. Single-file operations do not use parallelism.
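As a sketch, a JSON-RPC `tools/call` request for this tool might look like the following (the path is hypothetical):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "hedl_read",
    "arguments": {
      "path": "configs/",
      "recursive": true,
      "include_json": false
    }
  }
}
```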
### 2. hedl_query - Query Entity Registry

Query parsed entities by type and ID with graph-aware nested children support:

Arguments:
- `hedl` (string, required) - HEDL document content to query
- `type_name` (string, optional) - Filter by entity type name
- `id` (string, optional) - Filter by entity ID
- `include_children` (boolean, optional) - Include nested children in results (default: true)
Output:
Features:
- Filter by `type_name` only: returns all entities of that type
- Filter by `id` only: returns entities with matching ID across all types
- Recursive children traversal via `%NEST` relationships
- Set `include_children: false` to exclude nested entity information
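For example, a hypothetical request that fetches all `User` entities with their nested children (document content elided):

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "hedl_query",
    "arguments": {
      "hedl": "<HEDL document content>",
      "type_name": "User",
      "include_children": true
    }
  }
}
```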
### 3. hedl_validate - Validate HEDL Documents

Validate syntax, schema, and references with detailed diagnostics:

Arguments:
- `hedl` (string, required) - HEDL document content to validate
- `strict` (boolean, optional) - Treat lint warnings as errors (default: true)
- `lint` (boolean, optional) - Include linting diagnostics (default: true)
Output (valid document):
Output (invalid document):
Validation Levels:
- Syntax: Parse errors (missing brackets, invalid tokens)
- Schema: Type mismatches, field count errors
- References: Unresolved entity references
- Lint: Best practices (unused aliases, missing count hints)
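A sketch of a strict validation request (document content elided):

```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "hedl_validate",
    "arguments": {
      "hedl": "<HEDL document content>",
      "strict": true,
      "lint": true
    }
  }
}
```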
### 4. hedl_optimize - JSON → HEDL Optimization

Convert JSON to optimized HEDL format with token savings statistics:

Arguments:
- `json` (string, required) - JSON content to convert
- `ditto` (boolean, optional) - Enable ditto optimization for repeated values (default: true)
- `compact` (boolean, optional) - Minimize whitespace in output (default: false)
Output:
Reported Capability: 40-60% token savings for typical JSON documents
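An illustrative request; repeated values like the duplicated `role` field below are where ditto optimization pays off:

```json
{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "hedl_optimize",
    "arguments": {
      "json": "[{\"role\":\"admin\",\"active\":true},{\"role\":\"admin\",\"active\":true}]",
      "ditto": true,
      "compact": false
    }
  }
}
```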
### 5. hedl_stats - Token Usage Statistics

Compare HEDL vs JSON token counts with detailed breakdown:

Arguments:
- `hedl` (string, required) - HEDL document content to analyze
- `tokenizer` (string, optional) - Tokenizer to use: "simple" or "cl100k" (default: "simple")
Output:
Tokenizers:
- `simple`: ~4 chars/token heuristic (fast, approximate)
- `cl100k`: OpenAI tiktoken cl100k_base (accurate for GPT models)
### 6. hedl_format - Format to Canonical Form

Canonicalize HEDL with optional ditto optimization:

Arguments:
- `hedl` (string, required) - HEDL document content to format
- `ditto` (boolean, optional) - Enable ditto optimization for repeated values (default: true)
Output:
Canonicalization:
- Normalizes whitespace (2-space indentation)
- Adds spaces after colons and commas
- Alphabetically sorts header directives
- Optionally compresses repeated values with ditto (`^`)
### 7. hedl_write - Write HEDL Content to File

Write HEDL content with optional validation and formatting:

Arguments:
- `path` (string, required) - Output file path (scoped to `root_path`)
- `content` (string, required) - HEDL content to write
- `validate` (boolean, optional) - Validate before writing (default: true)
- `format` (boolean, optional) - Canonicalize before writing (default: false)
- `backup` (boolean, optional) - Create .hedl.bak backup (default: true)
Output:
Safety Features:
- Path traversal protection (canonicalize + prefix checking)
- Optional validation prevents writing invalid HEDL
- Backup creation preserves existing files
### 8. hedl_convert_to - Convert HEDL to Other Formats

Export HEDL to JSON, YAML, XML, CSV, Parquet (base64), Cypher (Neo4j), or TOON:

Arguments:
- `hedl` (string, required) - HEDL document content to convert
- `format` (string, required) - Target format: "json", "yaml", "xml", "csv", "parquet", "cypher", "toon"
- `options` (object, optional) - Format-specific options:
  - For JSON/YAML/XML: `pretty` (bool) - Pretty-print output (default: true)
  - For XML: `include_metadata` (bool) - Include metadata (default: true), `root_element` (string)
  - For CSV: `include_headers` (bool) - Include CSV headers (default: true)
  - For Cypher: `include_constraints` (bool) - Add constraints (default: false)
Output:
Parquet Output: Base64-encoded parquet_base64 field with byte count
Cypher Output: Neo4j CREATE/MERGE statements with optional constraints
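A sketch of a Cypher export request with constraints enabled (document content elided):

```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "tools/call",
  "params": {
    "name": "hedl_convert_to",
    "arguments": {
      "hedl": "<HEDL document content>",
      "format": "cypher",
      "options": { "include_constraints": true }
    }
  }
}
```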
### 9. hedl_convert_from - Convert Other Formats to HEDL

Import JSON, YAML, XML, CSV, Parquet, or TOON into HEDL:

Arguments:
- `content` (string, required) - Source content to convert (base64 for parquet)
- `format` (string, required) - Source format: "json", "yaml", "xml", "csv", "parquet", "toon"
- `options` (object, optional) - Format-specific options:
  - General: `canonicalize` (bool) - Canonicalize output (default: false)
  - CSV: `schema` (array of strings) - Column names (required for headerless CSV)
Output:
CSV Conversion: Requires `schema` option for headerless CSV; auto-detects types for values
Parquet Input: Provide base64-encoded content in `content` field
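For headerless CSV the `schema` option supplies the column names, as in this sketch:

```json
{
  "jsonrpc": "2.0",
  "id": 6,
  "method": "tools/call",
  "params": {
    "name": "hedl_convert_from",
    "arguments": {
      "content": "1,Alice\n2,Bob\n",
      "format": "csv",
      "options": { "schema": ["id", "name"] }
    }
  }
}
```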
### 10. hedl_stream - Stream Parse Large Documents

Memory-efficient parsing with pagination and type filtering:

Arguments:
- `hedl` (string, required) - HEDL document content to stream parse
- `type_filter` (string, optional) - Filter entities by type
- `offset` (number, optional) - Skip first N entities (default: 0)
- `limit` (number, optional) - Maximum entities to return (default: 100)
Output:
Performance: Streaming architecture processes documents efficiently with configurable pagination.
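A sketch requesting the second page of 100 `User` entities:

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "hedl_stream",
    "arguments": {
      "hedl": "<large HEDL document>",
      "type_filter": "User",
      "offset": 100,
      "limit": 100
    }
  }
}
```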
### 11. batch - Execute Multiple Operations

Execute multiple operations in a single request with dependency resolution and parallel execution:

Arguments:
- `operations` (array, required) - List of operations to execute:
  - `id` (string, required) - Unique operation identifier
  - `tool` (string, required) - Tool name to execute
  - `arguments` (object, required) - Tool-specific arguments
  - `depends_on` (array, optional) - Operation IDs this depends on
- `mode` (string, optional) - Execution mode: "continue_on_error" or "stop_on_error" (default: "continue_on_error")
- `parallel` (boolean, optional) - Enable parallel execution for independent operations (default: true)
- `transaction` (boolean, optional) - All-or-nothing transaction semantics (default: false)
- `timeout` (number, optional) - Maximum execution time in seconds (1-3600)
Output:
Features:
- Dependency Resolution: Automatically orders operations based on `depends_on` declarations
- Parallel Execution: Independent operations run concurrently (when `parallel: true`)
- Error Handling: Continue on errors or stop at first failure
- Transaction Semantics: All-or-nothing execution with rollback on failure
- Timeout Protection: Maximum execution time prevents runaway operations
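As an illustration, the request below validates two documents concurrently, then computes stats for the first only after its validation completes (document contents are placeholders):

```json
{
  "jsonrpc": "2.0",
  "id": 8,
  "method": "tools/call",
  "params": {
    "name": "batch",
    "arguments": {
      "operations": [
        { "id": "validate-a", "tool": "hedl_validate", "arguments": { "hedl": "<document A>" } },
        { "id": "validate-b", "tool": "hedl_validate", "arguments": { "hedl": "<document B>" } },
        { "id": "stats-a", "tool": "hedl_stats", "arguments": { "hedl": "<document A>" }, "depends_on": ["validate-a"] }
      ],
      "mode": "continue_on_error",
      "parallel": true
    }
  }
}
```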
## High-Performance Caching

LRU cache for immutable operations provides 2-5x speedup on repeated requests:

### Cached Operations

- `hedl_validate` - Validation results for identical content
- `hedl_query` - Entity lookups for unchanged files
- `hedl_stats` - Token statistics for same content
Cache Key: Operation name + SHA256 hash of inputs
Configuration: exposed via `McpServerConfig`.
Cache Statistics:
Impact: Validation of frequently-accessed documents drops from 50ms to <1ms.
## Rate Limiting: DoS Protection

Token bucket algorithm prevents request flooding:
Configuration: exposed via `McpServerConfig`.
Behavior:
- Burst: Allow up to 200 requests instantly
- Sustained: Refill 100 tokens/second
- Exceeded: Returns error with retry-after time
Error Response:
Use Case: Prevents aggressive clients from overwhelming server with thousands of requests.
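The bucket's behavior can be sketched in a few lines of Rust; this is a minimal illustration of the algorithm with the documented parameters (capacity 200, refill 100/s), not the server's internal implementation:

```rust
use std::time::Instant;

/// Minimal token-bucket sketch: a full bucket admits a burst instantly,
/// then tokens refill at a fixed sustained rate.
struct TokenBucket {
    capacity: f64,
    refill_per_sec: f64,
    tokens: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, refill_per_sec, tokens: capacity, last: Instant::now() }
    }

    /// Try to take one token; returns false when the bucket is empty.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.last = now;
        // Refill proportionally to elapsed time, capped at capacity.
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(200.0, 100.0);
    // The full burst of 200 requests is admitted instantly...
    let admitted = (0..200).filter(|_| bucket.try_acquire()).count();
    // ...and request 201 is rejected until tokens refill.
    let rejected = !bucket.try_acquire();
    println!("admitted={admitted} rejected_201st={rejected}");
}
```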
## Security Features

### Path Traversal Protection

All file operations are scoped to the configured `root_path`:

```rust
// Canonicalize the requested path and verify it stays under root_path
let canonical = std::fs::canonicalize(&path)?;
if !canonical.starts_with(&self.root_path) {
    // Reject the request: the path escapes the configured root
}
```
Blocked Attempts:
/etc/passwd → Error: Path traversal detected
../../../secrets.hedl → Error: Path traversal detected
/data/hedl/users.hedl → OK (within root_path)
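The prefix check itself reduces to `Path::starts_with` on canonicalized paths; a minimal sketch (the helper name `is_within_root` is illustrative):

```rust
use std::path::Path;

/// Component-wise prefix check applied after canonicalization.
/// `Path::starts_with` compares whole path components, so a sibling
/// directory like "/data/hedl2" is correctly rejected for root "/data/hedl".
fn is_within_root(canonical: &Path, root: &Path) -> bool {
    canonical.starts_with(root)
}

fn main() {
    let root = Path::new("/data/hedl");
    assert!(is_within_root(Path::new("/data/hedl/users.hedl"), root));
    assert!(!is_within_root(Path::new("/etc/passwd"), root));
    assert!(!is_within_root(Path::new("/data/hedl2/x.hedl"), root));
    println!("all path checks behave as expected");
}
```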
### Input Size Validation

Maximum Input: 10 MB per request

```rust
if content.len() > 10_000_000 {
    // Reject the request with an input-size error
}
```
Purpose: Prevents memory exhaustion from malicious large inputs.
### Safe File Operations
- All writes create parent directories if needed
- Backup files preserve original content (.hedl.bak)
- Validation prevents writing malformed HEDL
## Resource Protocol Support

List and read HEDL files as MCP resources:
### `resources/list`

Response:

### `resources/read`

Response:
## MCP Protocol Implementation

### Supported Methods
Lifecycle:
- `initialize` - Protocol handshake with capability negotiation
- `initialized` - Client confirmation notification
- `shutdown` - Graceful server termination

Tools:
- `tools/list` - List all 11 available tools with schemas
- `tools/call` - Execute specific tool with arguments

Resources:
- `resources/list` - List HEDL files in `root_path`
- `resources/read` - Read HEDL file content

Health:
- `ping` - Health check endpoint (always returns `pong`)
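For reference, the handshake begins with an `initialize` request such as this sketch (client name and version are placeholders):

```json
{
  "jsonrpc": "2.0",
  "id": 0,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "example-client", "version": "0.1.0" }
  }
}
```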
### Server Capabilities

## Error Handling
Comprehensive error types with MCP error codes:
Error Codes:
- `-32700`: Parse error (invalid JSON-RPC)
- `-32600`: Invalid request (malformed method/params)
- `-32601`: Method not found
- `-32000`: Server error (HEDL-specific errors)
HEDL-Specific Errors:
Tool Errors: Returned as successful responses with `is_error: true`.
## Use Cases
AI-Powered HEDL Editing: LLMs read HEDL files, suggest edits, validate changes, write back canonical HEDL—all through MCP tools.
Automated Optimization: AI agents convert JSON to HEDL, analyze token savings, optimize with ditto, validate output, deploy optimized configs.
Interactive Query Assistant: Chat with HEDL data—ask "Show me all users created in 2024", agent uses hedl_query to fetch entities, formats response.
Multi-Format Pipeline: Agent reads CSV exports, converts to HEDL via hedl_convert_from, validates schemas, exports to Neo4j Cypher, tracks token efficiency.
Bulk Document Processing: AI orchestrates parallel HEDL validation across directories, aggregates lint issues, generates summary reports.
LLM Context Optimization: Analyze HEDL vs JSON token usage with hedl_stats, optimize prompts with hedl_optimize, validate context window limits.
## What This Crate Doesn't Do
Direct File Watching: No file system watching—client must explicitly call tools to check for changes. Use inotify/fsevents in client if needed.
Multi-Document Transactions: Each tool call is independent—no transactional updates across multiple files. Implement transactions in client layer if required.
Schema Evolution: No automatic schema migration—manual handling of %STRUCT changes required. Use hedl-lint for schema consistency validation.
Distributed Coordination: Single-server design—no distributed consensus or multi-server coordination. Deploy multiple instances behind load balancer if needed.
## Performance Characteristics

- Tool Execution: ~50-200 MB/s parsing throughput depending on complexity
- Caching: 2-5x speedup for repeated validation/query operations
- Streaming: O(1) memory per entity regardless of file size
- Rate Limiting: <0.1ms overhead per request (~1% impact)
- Parallel Processing: Linear speedup up to CPU core count for directory scanning
Detailed performance benchmarks are available in the HEDL repository benchmark suite.
## Testing Coverage
Comprehensive test suite covering:
- All 11 Tools: Valid/invalid inputs, edge cases, error conditions
- Batch Operations: Dependency resolution, parallel execution, error handling
- Caching: Hits, misses, LRU eviction, statistics
- Rate Limiting: Burst capacity, sustained rate, token refill, overflow
- Security: Path traversal detection, input size validation
- Parallel Processing: Custom threads, recursive scanning, error collection
- Format Conversions: Round-trip fidelity for all supported formats
## Dependencies

- `hedl-core` 1.0 - HEDL parsing and data model
- `hedl-c14n` 1.0 - Canonicalization
- `hedl-lint` 1.0 - Best practices linting
- `hedl-json`, `hedl-yaml`, `hedl-xml`, `hedl-csv`, `hedl-parquet`, `hedl-neo4j` 1.0 - Format conversions
- `serde` 1.0, `serde_json` 1.0 - JSON serialization
- `tokio` 1.0 - Async runtime
- `dashmap` 6.1 - Concurrent HashMap for caching
- `rayon` 1.10 - Parallel processing
- `thiserror` 1.0 - Error type definitions
## License
Apache-2.0