anyrepair-0.1.7 has been yanked.

AnyRepair

A comprehensive Rust crate for repairing LLM responses across multiple formats including JSON, YAML, Markdown, XML, TOML, CSV, and INI files.

Why AnyRepair?

Critical for Agentic AI and Tool Use

In the era of agentic AI systems, reliable data parsing is essential for:

Tool Function Calls: AI agents must parse structured data from external APIs and tools
MCP (Model Context Protocol): Ensures reliable communication between AI models and external systems
Agent Workflows: Multi-step AI processes depend on consistent data format handling
Error Recovery: When AI outputs malformed data, repair enables graceful recovery
Production Reliability: Real-world AI applications need robust data handling

The Problem with LLM Output

LLMs often generate:

Malformed JSON with missing quotes, trailing commas, or syntax errors
Inconsistent YAML with indentation issues or missing colons
Broken Markdown with malformed headers, links, or code blocks
Invalid XML/TOML/CSV/INI with structural problems

Without repair, these errors cause:

Tool failures in agentic workflows
MCP communication breakdowns
Cascading errors in multi-step AI processes
Poor user experience with unreliable AI applications

AnyRepair's Solution

While there are several JSON repair tools available in Rust, AnyRepair addresses the unique challenges of LLM-generated content across multiple formats:

Multi-Format Support

Unlike single-format tools such as json-repair-rs or json5, AnyRepair handles 7 formats (JSON, YAML, Markdown, XML, TOML, CSV, INI) with auto-detection, which suits LLM responses whose format is not known in advance.

LLM-Specific Optimizations

Intelligent Format Detection: Automatically detects format from damaged content
Context-Aware Repair: Understands LLM output patterns and common errors
Rule-Based Confidence Scoring: Provides repair quality metrics using pattern-based rules (no LLM required)
Parallel Processing: Optimized for batch processing of LLM responses
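
The parallel batch idea can be illustrated with the standard library alone. The sketch below is not AnyRepair's implementation: `repair_stub` and `repair_batch` are hypothetical names, and the stub only trims whitespace where a real repair call would go.

```rust
use std::thread;

/// Hypothetical stand-in for a real repair call: it only trims whitespace.
fn repair_stub(input: &str) -> String {
    input.trim().to_string()
}

/// Repairs each input on its own scoped thread and returns the results
/// in the same order as the inputs.
fn repair_batch(inputs: &[&str]) -> Vec<String> {
    thread::scope(|s| {
        // Spawn every worker first so they run concurrently.
        let handles: Vec<_> = inputs
            .iter()
            .map(|&input| s.spawn(move || repair_stub(input)))
            .collect();
        // Join in spawn order so outputs line up with inputs.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

One thread per input keeps the sketch short; for large batches a bounded thread pool would be the more likely choice.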

Advanced Repair Strategies

Adaptive Repair: Strategies that adapt based on content complexity
Multi-Pass Processing: Applies multiple repair strategies in optimal order
Custom Rules: User-defined repair patterns for specific use cases
Plugin System: Extensible architecture for custom repair logic
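
As a rough illustration of multi-pass processing, the sketch below applies a list of passes in order and repeats until the text stops changing. The names `Pass`, `trim_whitespace`, `normalize_quotes`, and `run_passes` are assumptions made for this example and are not part of the AnyRepair API.

```rust
/// A repair pass takes text and returns a (possibly) modified copy.
type Pass = fn(&str) -> String;

/// Hypothetical pass: removes leading and trailing whitespace.
fn trim_whitespace(s: &str) -> String {
    s.trim().to_string()
}

/// Hypothetical pass: replaces typographic double quotes with ASCII quotes.
fn normalize_quotes(s: &str) -> String {
    s.replace('\u{201C}', "\"").replace('\u{201D}', "\"")
}

/// Applies `passes` in order, repeating the whole sequence until a round
/// leaves the text unchanged or `max_rounds` is reached.
fn run_passes(input: &str, passes: &[Pass], max_rounds: usize) -> String {
    let mut current = input.to_string();
    for _ in 0..max_rounds {
        let before = current.clone();
        for pass in passes {
            current = pass(&current);
        }
        if current == before {
            break; // fixed point: no pass changed anything
        }
    }
    current
}
```

Capping the number of rounds guards against passes that keep undoing each other.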

Production-Ready Features

Comprehensive Testing: 326 test cases (204 library + 4 integration + 26 streaming + 18 complex damage + 18 complex streaming + 36 fuzz + 18 damage scenarios + 2 doctests) with 100% pass rate
High Performance: Regex caching with up to 99.6% performance improvement, optimized binaries (1.5 MB)
CLI & Library: Both command-line tool and Rust library for integration
MCP Server: Model Context Protocol server for Claude and other AI clients
Streaming Support: Process large files with minimal memory overhead
Configuration Management: TOML-based configuration with custom rules
Enterprise Features: Analytics, batch processing, validation rules, and audit logging
Python-Compatible API: Interface modeled on Python's jsonrepair library for easy migration

Comparison with Other Tools

Feature	AnyRepair	json-repair-rs	json5	Other Tools
Multi-format	✅ 7 formats	❌ JSON only	❌ JSON only	❌ Single format
Auto-detection	✅	❌	❌	❌
LLM-optimized	✅	❌	❌	❌
Agentic AI support	✅	❌	❌	❌
MCP integration	✅	❌	❌	❌
Custom rules	✅	❌	❌	❌
Plugin system	✅	❌	❌	❌
Rule-based confidence scoring	✅	❌	❌	❌
Parallel processing	✅	❌	❌	❌
Fuzz testing	✅	❌	❌	❌

Features

Multi-format repair: JSON, YAML, Markdown, XML, TOML, CSV, INI
Auto-detection: Automatically detects format and applies appropriate repairs
High performance: Regex caching with up to 99.6% performance improvement
CLI tool: Command-line interface for easy usage
Comprehensive testing: 326 test cases with snapshot and fuzz testing
Parallel processing: Multi-threaded strategy application for better performance
Advanced strategies: Intelligent format detection, adaptive repair, and context-aware processing
Plugin system: Extensible architecture for custom repair strategies
Custom rules: User-defined repair rules with full CLI management
Fuzz testing: Comprehensive property-based testing for robustness
Configuration: TOML-based configuration with custom rules and plugin settings
Python-compatible API: Interface modeled on Python's jsonrepair library for easy migration

Rule-Based Confidence Scoring

AnyRepair uses sophisticated pattern-based rules to calculate confidence scores without requiring any LLM calls:

JSON Confidence Rules

Structure Detection: Checks for balanced braces {} and brackets []
Key-Value Patterns: Detects colon : separators and quote patterns
Syntax Validation: Validates JSON structure without parsing
Balance Scoring: Rewards properly balanced opening/closing delimiters
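
A balance rule of this kind could look roughly like the following. This is a minimal sketch under stated assumptions, not the crate's scoring code: the function name `json_balance_score` and the exact formula (share of delimiters that are correctly matched) are chosen for illustration.

```rust
/// Returns a score in [0.0, 1.0]. It is 1.0 when every brace and bracket
/// outside string literals is properly nested and closed, and drops as
/// unmatched delimiters accumulate. Text with no delimiters scores 0.0.
fn json_balance_score(input: &str) -> f64 {
    let mut stack: Vec<char> = Vec::new();
    let mut total = 0usize; // delimiters seen outside strings
    let mut mismatches = 0usize; // closers without a matching opener
    let mut in_string = false;
    let mut escaped = false;

    for c in input.chars() {
        if in_string {
            // Skip string contents, honoring backslash escapes.
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' | '[' => {
                total += 1;
                stack.push(c);
            }
            '}' | ']' => {
                total += 1;
                let expected = if c == '}' { '{' } else { '[' };
                if stack.last() == Some(&expected) {
                    stack.pop();
                } else {
                    mismatches += 1;
                }
            }
            _ => {}
        }
    }

    if total == 0 {
        return 0.0;
    }
    // Unmatched closers plus openers that were never closed.
    let bad = mismatches + stack.len();
    1.0 - bad as f64 / total as f64
}
```

Because the scan only tracks delimiters and string boundaries, it runs in a single pass and never builds a parse tree.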

YAML Confidence Rules

Indentation Analysis: Evaluates consistent indentation patterns
Key-Value Detection: Identifies colon : separators and list indicators -
Document Structure: Recognizes YAML document separators ---
Content Patterns: Detects YAML-specific syntax elements
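
The YAML rules above can be approximated with two signals: how many lines look structural, and whether indentation is consistent. The sketch below is illustrative only; `yaml_score` is a hypothetical name and the 0.7/0.3 weights are arbitrary values picked for the example.

```rust
/// Scores YAML-likeness in [0.0, 1.0] from the share of lines that look like
/// `key: value`, `key:`, `- item`, or `---`, plus a bonus when every indent
/// is a multiple of the smallest non-zero indent.
fn yaml_score(input: &str) -> f64 {
    // Ignore blank lines and comment lines.
    let lines: Vec<&str> = input
        .lines()
        .filter(|l| !l.trim().is_empty() && !l.trim_start().starts_with('#'))
        .collect();
    if lines.is_empty() {
        return 0.0;
    }

    let structural = lines
        .iter()
        .filter(|l| {
            let t = l.trim();
            t == "---" || t.starts_with("- ") || t.contains(": ") || t.ends_with(':')
        })
        .count();

    // Width of the leading run of spaces on each line.
    let indents: Vec<usize> = lines
        .iter()
        .map(|l| l.len() - l.trim_start_matches(' ').len())
        .collect();
    let unit = indents.iter().copied().filter(|&n| n > 0).min();
    let consistent = match unit {
        Some(u) => indents.iter().all(|n| n % u == 0),
        None => true, // no indentation at all counts as consistent
    };

    let structure_ratio = structural as f64 / lines.len() as f64;
    0.7 * structure_ratio + if consistent { 0.3 } else { 0.0 }
}
```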

Format-Specific Rules

Each format has tailored confidence rules:

Markdown: Header patterns #, code blocks ```, link syntax []()
XML: Tag structure <>, attribute patterns, entity encoding
TOML: Table headers [], key-value pairs =, array syntax
CSV: Delimiter consistency, quote patterns, row structure
INI: Section headers [], key-value pairs =, comment patterns
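
Markers like these can also drive a first guess at the format of damaged input. The following is a simplified sketch, not AnyRepair's detector: `Format`, `guess_format`, and the helper functions are hypothetical, CSV is not handled, and INI and TOML are not told apart because both use `[section]` headers and `key = value` pairs.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Format {
    Json,
    Yaml,
    Markdown,
    Xml,
    IniOrToml,
    Unknown,
}

/// True for lines such as `[server]`: bracketed, with no commas or quotes.
fn is_section_header(line: &str) -> bool {
    line.starts_with('[') && line.ends_with(']') && !line.contains(',') && !line.contains('"')
}

/// True for Markdown headers such as `# Title` or `## Section`.
fn is_markdown_header(line: &str) -> bool {
    line.starts_with('#') && line.trim_start_matches('#').starts_with(' ')
}

/// Guesses a format from the first non-empty line and a few line-level markers.
/// Checks run from the most to the least distinctive marker.
fn guess_format(input: &str) -> Format {
    let lines: Vec<&str> = input
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty())
        .collect();
    let first = match lines.first() {
        Some(l) => *l,
        None => return Format::Unknown,
    };

    if first.starts_with('<') {
        return Format::Xml;
    }
    if first.starts_with('{') {
        return Format::Json;
    }
    if is_section_header(first) && lines.iter().any(|l| l.contains('=')) {
        return Format::IniOrToml;
    }
    if first.starts_with('[') {
        return Format::Json;
    }
    if lines.iter().any(|l| is_markdown_header(l)) {
        return Format::Markdown;
    }
    if first == "---" || lines.iter().any(|l| l.contains(": ") || l.ends_with(':')) {
        return Format::Yaml;
    }
    Format::Unknown
}
```

The order of the checks matters: a YAML comment such as `# note` would be read as a Markdown header here, which is the kind of ambiguity a scoring approach resolves better than a first-match cascade.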

Benefits of Rule-Based Approach

Fast: No external API calls or network requests
Reliable: Consistent scoring based on deterministic rules
Transparent: Clear understanding of how scores are calculated
Customizable: Rules can be extended through the plugin system

Agentic AI & MCP Integration

MCP Server

AnyRepair now includes a dedicated MCP server for integration with Claude and other MCP-compatible clients:

# Run the MCP server
anyrepair-mcp

# Or integrate with Claude
# Add to claude_desktop_config.json:
# {
#   "mcpServers": {
#     "anyrepair": {
#       "command": "anyrepair-mcp"
#     }
#   }
# }

See MCP_SERVER.md for detailed documentation.

Model Context Protocol (MCP) Support

AnyRepair is designed to work seamlessly with MCP implementations:

use anyrepair::repair;

// MCP tool call response repair
let mcp_response = r#"{"tool": "search", "params": {"query": "AI news"}, "result": "..."}"#;
let repaired = repair(mcp_response)?;

// MCP context data repair
let context_data = r#"name: AI Assistant
version: 1.0
capabilities: [search, analyze, generate]"#;
let repaired_context = repair(context_data)?;

Agentic AI Workflow Integration

AnyRepair fits AI agent systems that need reliable data handling:

// Agent tool execution with repair
async fn execute_agent_tool(tool_call: &str) -> Result<String> {
    let response = call_external_api(tool_call).await?;
    
    // Repair the response before processing
    let repaired = anyrepair::repair(&response)?;
    
    // Parse and use the repaired data
    let parsed: serde_json::Value = serde_json::from_str(&repaired)?;
    process_agent_response(parsed)
}

Use Cases

AI Agent Tool Calls: Repair malformed responses from external APIs
MCP Communication: Ensure reliable data exchange between AI models and tools
Multi-Agent Systems: Handle data format inconsistencies across different agents
Production AI Apps: Robust error handling for real-world AI applications
LLM Output Processing: Clean and validate AI-generated structured data

Enterprise Features

AnyRepair now includes comprehensive enterprise-grade features:

Advanced Analytics

Track repair operations with detailed metrics:

use anyrepair::AnalyticsTracker;
use std::time::Duration;

let tracker = AnalyticsTracker::new();
tracker.record_repair("json", true, Duration::from_millis(10), 0.95);
let metrics = tracker.get_metrics();
println!("Success rate: {}%", tracker.get_success_rate());

Batch Processing

Process multiple files across different formats:

use anyrepair::BatchProcessor;
use std::path::Path;

let processor = BatchProcessor::new();
let results = processor.process_directory(
    Path::new("./data"),
    true,
    Some(&["json", "yaml", "xml"])
)?;
println!("Processed: {}", results.total_files);

Custom Validation Rules

Define and enforce validation rules:

use anyrepair::ValidationRulesEngine;
use anyrepair::validation_rules::{ValidationRule, RuleType};

let mut engine = ValidationRulesEngine::new();
let rule = ValidationRule {
    name: "max_size".to_string(),
    rule_type: RuleType::Length,
    pattern: "10000".to_string(),
    error_message: "Content exceeds maximum size".to_string(),
    enabled: true,
};
engine.add_rule(rule);
let result = engine.validate(content);

Audit Logging

Comprehensive audit logging for compliance:

use anyrepair::AuditLogger;

let logger = AuditLogger::with_file("audit.log");
logger.log_repair("data.json", "json", true, "user@example.com", Some("Automated repair"));
let entries = logger.get_entries_by_type("REPAIR");

Advanced Confidence Scoring

Improved confidence scoring algorithms:

use anyrepair::ConfidenceScorer;

let score = ConfidenceScorer::score_json(content);
let yaml_score = ConfidenceScorer::score_yaml(content);
let xml_score = ConfidenceScorer::score_xml(content);

Installation

[dependencies]
anyrepair = "0.1.5"

Usage

Library

Python jsonrepair Compatible API

AnyRepair provides a Python jsonrepair-compatible interface for easy migration:

Function-based API:

use anyrepair::jsonrepair;

// Simple function call matching Python's jsonrepair
let malformed = r#"{"name": "John", age: 30,}"#;
let repaired = jsonrepair(malformed)?;
println!("{}", repaired); // {"name": "John", "age": 30}

Class-based API:

use anyrepair::JsonRepair;

// Class-like interface matching Python's JsonRepair class
let mut jr = JsonRepair::new();
let malformed = r#"{"key": "value",}"#;
let repaired = jr.jsonrepair(malformed)?;
println!("{}", repaired); // {"key": "value"}

Multi-Format Auto-Detection

use anyrepair::repair;

// Auto-detect format and repair
let content = r#"{"name": "John", "age": 30,}"#;
let repaired = repair(content)?;
println!("{}", repaired); // {"name": "John", "age": 30}

CLI

# Install
cargo install anyrepair

# Repair a file
anyrepair input.json

# Stream repair large files with minimal memory
anyrepair stream --input large_file.json --output repaired.json --format json

# Batch process
anyrepair batch --input-dir ./files --output-dir ./repaired

# Get statistics
anyrepair stats --input-dir ./files

Streaming Repair for Large Files

AnyRepair includes streaming repair capabilities for processing large files with minimal memory overhead:

use anyrepair::StreamingRepair;
use std::fs::File;
use std::io::BufReader;

let input = BufReader::new(File::open("large_file.json")?);
let mut output = File::create("repaired.json")?;

let processor = StreamingRepair::with_buffer_size(8192);
let bytes_processed = processor.process(input, &mut output, "json")?;
println!("Processed {} bytes", bytes_processed);

Benefits:

Process files larger than available RAM
Configurable buffer size for memory optimization
Automatic format detection
Support for streaming from stdin to stdout
Progress tracking via byte count
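
To show why streaming keeps memory bounded, here is a minimal line-oriented sketch using only the standard library. It is not how `StreamingRepair` works internally: `stream_lines` and `strip_trailing_ws` are hypothetical names, and the per-line fix is deliberately trivial. Repairs that span several lines, such as unbalanced JSON braces, would need state carried between lines.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical per-line repair: removes trailing spaces and tabs.
fn strip_trailing_ws(line: &str) -> String {
    line.trim_end().to_string()
}

/// Reads `input` line by line, applies `repair_line` to each line, writes the
/// result to `output`, and returns the number of input bytes consumed.
fn stream_lines<R: BufRead, W: Write>(
    mut input: R,
    output: &mut W,
    repair_line: fn(&str) -> String,
) -> io::Result<u64> {
    let mut line = String::new();
    let mut bytes_read: u64 = 0;
    loop {
        // Reuse one buffer so memory is bounded by the longest line.
        line.clear();
        let n = input.read_line(&mut line)?;
        if n == 0 {
            break; // end of input
        }
        bytes_read += n as u64;

        let had_newline = line.ends_with('\n');
        let content = line.trim_end_matches(&['\r', '\n'][..]);
        output.write_all(repair_line(content).as_bytes())?;
        if had_newline {
            output.write_all(b"\n")?;
        }
    }
    output.flush()?;
    Ok(bytes_read)
}
```

Any `BufRead` source works as input, including a locked stdin handle or a `BufReader` around a file, and the returned byte count can feed a progress display.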

Supported Formats

JSON: Missing quotes, trailing commas, malformed numbers, boolean/null values, nested structures
YAML: Indentation, missing colons, list formatting, complex structures, document separators
Markdown: Headers, code blocks, lists, tables, links, images, bold/italic formatting
XML: Unclosed tags, malformed attributes, missing quotes, invalid characters, entity encoding
TOML: Missing quotes, malformed arrays, table headers, numbers, dates, inline tables
CSV: Unquoted strings, malformed quotes, extra/missing commas, headers, field escaping
INI: Malformed sections, missing equals signs, unquoted values, comments, key-value pairs
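
As one concrete instance of these repairs, trailing commas in JSON can be removed with a single scan that respects string literals. This is an illustrative sketch with a hypothetical function name, not the crate's repair code, and it handles only the trailing-comma case.

```rust
/// Removes commas that are followed only by whitespace and then `}` or `]`.
/// Commas inside string literals are left untouched.
fn strip_trailing_commas(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    let mut in_string = false;
    let mut escaped = false;

    for (i, &c) in chars.iter().enumerate() {
        if in_string {
            // Copy string contents verbatim, honoring backslash escapes.
            out.push(c);
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        if c == '"' {
            in_string = true;
            out.push(c);
            continue;
        }
        if c == ',' {
            // Look past whitespace to the next significant character.
            let next = chars[i + 1..]
                .iter()
                .copied()
                .find(|ch| !ch.is_whitespace());
            if matches!(next, Some('}') | Some(']')) {
                continue; // drop the trailing comma
            }
        }
        out.push(c);
    }
    out
}
```

Whitespace around a removed comma is kept as is, so `[1, 2, ]` becomes `[1, 2 ]`, which is still valid JSON.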

Plugin System

AnyRepair features a powerful plugin system that allows you to extend functionality with custom repair strategies, validators, and repairers.

Plugin Management

# List available plugins
anyrepair plugins list

# Show plugin information
anyrepair plugins info my_plugin

# Enable/disable plugins
anyrepair plugins toggle --id my_plugin --enable

# Show plugin statistics
anyrepair plugins stats

# Discover plugins in directories
anyrepair plugins discover --paths ./plugins,./custom-plugins

Custom Rules

Create and manage custom repair rules:

# Initialize configuration with templates
anyrepair rules init

# Add a custom rule
anyrepair rules add --id "fix_undefined" --name "Fix Undefined" --format "json" --pattern "undefined" --replacement "null"

# Test a rule
anyrepair rules test --id "fix_undefined" --input '{"value": undefined}'

# List all rules
anyrepair rules list
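
Conceptually, a rule like `fix_undefined` pairs a pattern with a replacement for one format. The sketch below models that with a plain struct and a literal string replacement. It is an assumption-laden illustration: the `CustomRule` type and its `apply` method are hypothetical, and the real rule engine may treat the pattern as a regular expression rather than a literal.

```rust
/// Hypothetical in-memory form of a custom rule, mirroring the CLI flags
/// `--id`, `--format`, `--pattern`, and `--replacement`.
#[allow(dead_code)] // `id` is kept for illustration only
struct CustomRule {
    id: String,
    format: String,
    pattern: String,
    replacement: String,
}

impl CustomRule {
    /// Replaces every literal occurrence of `pattern` when `format` matches
    /// the rule's format; otherwise returns the input unchanged.
    fn apply(&self, format: &str, input: &str) -> String {
        if self.format != format {
            return input.to_string();
        }
        input.replace(self.pattern.as_str(), self.replacement.as_str())
    }
}
```

A literal replacement also rewrites matches inside string values, so a production rule would likely need to be aware of string boundaries.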

Plugin Development

See the Plugin Development Guide for detailed information on creating custom plugins.

License

Apache-2.0

Repository

https://github.com/yingkitw/anyrepair
