diffx
🚀 Semantic diff for structured data - Focus on what matters, not formatting
A next-generation diff tool that understands the structure and meaning of your data, not just text changes. Perfect for JSON, YAML, TOML, XML, INI, and CSV files.
# Traditional diff shows formatting noise (key order, trailing commas)
<
>
# diffx shows only semantic changes
✨ Key Features
- 🎯 Semantic Awareness: Ignores formatting, key order, whitespace, and trailing commas
- 🔧 Multiple Formats: JSON, YAML, TOML, XML, INI, CSV support
- 🤖 AI-Friendly: Clean CLI output perfect for automation and AI analysis
- ⚡ Fast: Built in Rust for maximum performance
- 🔗 Meta-Chaining: Compare diff reports to track change evolution
📊 Performance
Real benchmark results on AMD Ryzen 5 PRO 4650U:
# Test files: ~600 bytes JSON with nested config
# Results:
)
)
Why CLI matters for the AI era: As AI tools become essential in development workflows, structured, machine-readable diff output becomes crucial. diffx provides clean, parseable results that AI tools can read and reason about, which suits automated code review, configuration management, and deployment pipelines.
Why diffx?
Traditional diff tools show you formatting noise. diffx shows you what actually changed.
- Focus on meaning: Ignores key order, whitespace, and formatting
- Multiple formats: Works with JSON, YAML, TOML, XML, INI, CSV
- Clean output: Perfect for humans, scripts, and AI analysis
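The contrast between a line-based diff and a semantic comparison can be illustrated without diffx itself. The following is a minimal sketch using only the Python standard library; it is not diffx's implementation, and the sample documents are made up for illustration.

```python
import difflib
import json

# Two renderings of the same configuration: key order and whitespace differ.
old_text = '{"name": "app", "port": 8080}'
new_text = '{\n  "port": 8080,\n  "name": "app"\n}'

# A line-based diff treats the reformatted file as changed.
text_diff = list(difflib.unified_diff(
    old_text.splitlines(), new_text.splitlines(), lineterm=""))

# Comparing the parsed values ignores key order and whitespace.
semantically_equal = json.loads(old_text) == json.loads(new_text)

print("line-based diff reports changes:", len(text_diff) > 0)
print("parsed documents are equal:", semantically_equal)
```

The line-based diff is non-empty even though nothing meaningful changed, which is the formatting noise a semantic diff is meant to suppress.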
Specification
Supported Formats
- JSON
- YAML
- TOML
- XML
- INI
- CSV
Types of Differences
- Key addition/deletion
- Value change
- Array insertion/deletion/modification
- Nested structure differences
- Value type change
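The difference types listed above can be sketched as a small recursive comparison over parsed data. This is an illustrative Python sketch, not the diffx-core algorithm: the function name `semantic_diff`, the tuple layout, and the index-by-index array comparison are all assumptions made for the example.

```python
def semantic_diff(old, new, path=""):
    """Return a list of (kind, path, old_value, new_value) tuples.

    Hypothetical helper for illustration; arrays are compared index by index,
    which is simpler than what a real tool may do.
    """
    # A change of type is reported once, without descending further.
    if type(old) is not type(new):
        return [("type_changed", path, old, new)]

    if isinstance(old, dict):
        changes = []
        # Iterating over the sorted union of keys makes key order irrelevant.
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}.{key}" if path else str(key)
            if key not in new:
                changes.append(("removed", child, old[key], None))
            elif key not in old:
                changes.append(("added", child, None, new[key]))
            else:
                changes.extend(semantic_diff(old[key], new[key], child))
        return changes

    if isinstance(old, list):
        changes = []
        for i in range(max(len(old), len(new))):
            child = f"{path}[{i}]"
            if i >= len(new):
                changes.append(("removed", child, old[i], None))
            elif i >= len(old):
                changes.append(("added", child, None, new[i]))
            else:
                changes.extend(semantic_diff(old[i], new[i], child))
        return changes

    # Scalars of the same type: report only when the value differs.
    return [] if old == new else [("modified", path, old, new)]


old = {"version": "1.0", "port": "8080", "tags": ["a", "b"], "debug": True}
new = {"port": 8080, "version": "1.1", "tags": ["a"], "name": "app"}
result = semantic_diff(old, new)
for change in result:
    print(change)
```

Each entry carries the full path to the changed value, so nested structure differences are reported at the leaf where they occur rather than as a change to the whole parent object.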
Output Formats
diffx outputs differences in the diffx format by default - a semantic diff representation designed specifically for structured data. The diffx format provides the richest expression of structural differences and can be complemented with machine-readable formats for integration:
- diffx Format (Default)
  - The diffx format is a human-readable, semantic diff representation that clearly displays structural differences (additions, changes, deletions, type changes, etc.) using intuitive symbols and hierarchical paths.
  - Differences are represented by + (addition), - (deletion), ~ (change), and ! (type change) symbols with full path context (e.g., database.connection.host).
  - Core Feature: Focuses on semantic changes in data, ignoring changes in key order, whitespace, and formatting. This semantic focus is the fundamental value of both the tool and the diffx format.
- JSON Format
  - Machine-readable format. Used for CI/CD and integration with other programs.
  - Differences detected by diffx are output as a JSON array.
- YAML Format
  - Machine-readable format. Used for CI/CD and integration with other programs, similar to JSON.
  - Differences detected by diffx are output as a YAML array.
- diff-compatible Format (Unified Format)
  - Provided with the --output unified option.
  - Intended for integration with git and existing merge tools.
  - Note: This format only shows the semantic differences detected by diffx in traditional diff format. Changes that are not semantic differences (e.g., key order changes, whitespace changes) are not displayed. This is purely for compatibility with existing tools.
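The symbol-plus-path notation described above can be sketched as a small formatter. This Python sketch only illustrates the idea: the four symbols come from the description above, while the `render` function, the tuple layout, and the exact text after each symbol (including the `->` separator) are assumptions and may not match diffx's real output.

```python
import json

# Symbols as described for the diffx format.
SYMBOLS = {"added": "+", "removed": "-", "modified": "~", "type_changed": "!"}


def render(changes):
    """Render (kind, path, old, new) tuples as one line per difference.

    Hypothetical layout for illustration only.
    """
    lines = []
    for kind, path, old, new in changes:
        symbol = SYMBOLS[kind]
        if kind == "added":
            lines.append(f"{symbol} {path}: {json.dumps(new)}")
        elif kind == "removed":
            lines.append(f"{symbol} {path}: {json.dumps(old)}")
        else:
            # Changes and type changes show both the old and the new value.
            lines.append(f"{symbol} {path}: {json.dumps(old)} -> {json.dumps(new)}")
    return lines


changes = [
    ("modified", "database.connection.host", "localhost", "db.internal"),
    ("added", "database.connection.tls", None, True),
    ("removed", "debug", True, None),
    ("type_changed", "port", "8080", 8080),
]
rendered = render(changes)
print("\n".join(rendered))
```

Keeping the difference list separate from its rendering is what allows the same result to be emitted in a human-readable form or serialized as JSON or YAML for other programs.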
🏗️ Architecture
System Overview
graph TB
subgraph Core["diffx-core"]
B[Format Parsers]
C[Semantic Diff Engine]
D[Output Formatters]
B --> C --> D
end
E[CLI Tool] --> Core
F[NPM Package] --> E
G[Python Package] --> E
H[JSON] --> B
I[YAML] --> B
J[TOML] --> B
K[XML] --> B
L[INI] --> B
M[CSV] --> B
D --> N[CLI Display]
D --> O[JSON Output]
D --> P[YAML Output]
D --> Q[Unified Diff]
Project Structure
diffx/
├── diffx-core/ # Diff extraction library (Crate)
├── diffx-cli/ # CLI wrapper
├── tests/ # All test-related files
│ ├── fixtures/ # Test input data
│ ├── integration/ # CLI integration tests
│ ├── unit/ # Core library unit tests
│ └── output/ # Test intermediate files
├── docs/ # Documentation and specifications
└── ...
Technology Stack
- Rust (Fast, safe, cross-platform)
- serde_json, serde_yml, toml, configparser, quick-xml, csv parsers
- clap (CLI argument parsing)
- colored (CLI output coloring)
- similar (Unified Format output)
🔗 Meta-Chaining
Compare diff reports to track how changes evolve over time:
graph LR
A[config_v1.json] --> D1[diffx]
B[config_v2.json] --> D1
D1 --> R1[diff_report_v1.json]
B --> D2[diffx]
C[config_v3.json] --> D2
D2 --> R2[diff_report_v2.json]
R1 --> D3[diffx]
R2 --> D3
D3 --> M[Meta-Diff Report]
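The flow in the diagram works because a diff report is itself structured data, so two reports can be compared the same way two configs are. The following Python sketch mimics that flow with a deliberately simple flat diff; `flat_diff` and the report layout are hypothetical and stand in for what diffx would produce.

```python
def flat_diff(old, new):
    """Diff two flat dicts into a report keyed by the changed key.

    Hypothetical report layout, chosen so a report can be diffed again.
    """
    report = {}
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            report[key] = {"added": new[key]}
        elif key not in new:
            report[key] = {"removed": old[key]}
        elif old[key] != new[key]:
            report[key] = {"modified": [old[key], new[key]]}
    return report


config_v1 = {"host": "localhost", "port": 8080}
config_v2 = {"host": "localhost", "port": 9090}
config_v3 = {"host": "db.internal", "port": 9090, "tls": True}

report_v1 = flat_diff(config_v1, config_v2)  # what changed from v1 to v2
report_v2 = flat_diff(config_v2, config_v3)  # what changed from v2 to v3

# Meta-diff: how the set of changes itself differs between the two steps.
meta_report = flat_diff(report_v1, report_v2)
print(meta_report)
```

Here the meta-report shows that the port change belongs only to the first step, while the host change and the new tls key belong only to the second.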
🚀 Quick Start
Installation
# Rust (recommended - native performance)
# Node.js ecosystem (⚡ offline-ready with all platform binaries)
# Python ecosystem (🆕 self-contained wheel with embedded binary)
# Or download pre-built binaries from GitHub Releases
For detailed usage and examples, see the documentation.
Quick Documentation Links
- Getting Started - Learn the basics
- Installation Guide - Platform-specific setup
- CLI Reference - Complete command reference
- Real-World Examples - Industry use cases
- Integration Guide - CI/CD and automation
Basic Usage
# Compare JSON files
# Compare with different output formats
# Advanced filtering options
# High-demand practical options
&&
# Performance optimization for large files
# Directory comparison
# Meta-chaining for change tracking
Integration Examples
CI/CD Pipeline:
- name: Check configuration changes
  run: |
    diffx config/prod.yaml config/staging.yaml --output json > changes.json
    # Process changes.json for deployment validation
- name: Quick file change detection
  run: |
    if ! diffx config/current.json config/new.json --quiet; then
      echo "Configuration changed, triggering deployment"
    fi
- name: Compare with ignore options for cleaner diffs
  run: |
    diffx api_old.json api_new.json --ignore-case --ignore-whitespace --output json > api_changes.json
    # Focus on semantic changes, ignore formatting
- name: Compare large datasets efficiently
  run: |
    diffx large_prod_data.json large_staging_data.json --output json > data_changes.json
    # Optimized processing for large files in CI
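The "Process changes.json for deployment validation" step could be a short script that fails the job when protected settings change. The sketch below is a guess at how that might look in Python: the report schema (a JSON array of single-key objects whose value is a list starting with the path) is an assumption and should be checked against the actual `--output json` result, and `changed_paths` and `violations` are hypothetical helper names.

```python
import json


def changed_paths(report_text):
    """Extract (kind, path) pairs from a diff report.

    ASSUMED schema: [{"Modified": ["database.host", old, new]}, ...].
    Verify against real diffx output before relying on this.
    """
    pairs = []
    for entry in json.loads(report_text):
        for kind, payload in entry.items():
            pairs.append((kind, payload[0]))
    return pairs


def violations(report_text, protected_prefixes):
    """Return changed paths that equal or fall under a protected prefix."""
    return [
        path
        for _, path in changed_paths(report_text)
        if any(path == p or path.startswith(p + ".") for p in protected_prefixes)
    ]


# Stand-in for the contents of changes.json.
sample_report = json.dumps([
    {"Modified": ["database.host", "localhost", "db.internal"]},
    {"Added": ["logging.level", "debug"]},
])

blocked = violations(sample_report, ["database"])
print(blocked)
```

In a pipeline, the script would read changes.json from disk and exit with a non-zero status when the returned list is non-empty, which stops the deployment step.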
Git Hook:
#!/bin/bash
# pre-commit hook
if | ; then
fi
🌍 Multi-Language Support
diffx is available across multiple ecosystems:
# Rust (native CLI)
# Node.js wrapper
# Python wrapper
All packages provide the same semantic diff capabilities:
- Rust: Source-based compilation
- npm: Universal package with all platform binaries (offline-ready)
- Python: Self-contained wheels with embedded binaries
🔮 Future Plans
- Interactive TUI (diffx-tui): A powerful viewer showcasing diffx capabilities with side-by-side data display
- AI agent integration: Automated diff summarization and explanation
- Web UI version (diffx-web)
- VSCode extension (diffx-vscode)
- Advanced CI/CD templates: Pre-built workflows for common use cases
🤝 Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
📄 License
MIT License - see LICENSE for details.