diffo 0.2.0

# diffo

**Semantic diffing for Rust structs via serde.**

[![Crates.io](https://img.shields.io/crates/v/diffo)](https://crates.io/crates/diffo)
[![Documentation](https://docs.rs/diffo/badge.svg)](https://docs.rs/diffo)
[![License](https://img.shields.io/crates/l/diffo)](LICENSE)

Compare any two Rust values that implement `Serialize` and get a detailed, human-readable diff showing exactly what changed. Perfect for audit logs, API testing, config validation, and debugging.

## Installation

```toml
[dependencies]
diffo = "0.2"
```

## Quick Start

```rust
use diffo::diff;
use serde::Serialize;

#[derive(Serialize)]
struct User {
    name: String,
    email: String,
}

let old = User {
    name: "Alice".into(),
    email: "alice@old.com".into()
};

let new = User {
    name: "Alice Smith".into(),
    email: "alice@new.com".into()
};

let d = diff(&old, &new).unwrap();
println!("{}", d.to_pretty());
```

**Output:**
```
name
  - "Alice"
  + "Alice Smith"
email
  - "alice@old.com"
  + "alice@new.com"
```

## Common Use Cases

- **Audit Logs** - Track what changed in your entities over time
- **API Testing** - Compare expected vs actual API responses
- **Config Validation** - Detect configuration drift between environments
- **Undo/Redo** - Apply diffs to reconstruct previous states
- **Database Migrations** - Verify data transformations
- **CI/CD** - Catch unintended changes in generated files

## Features

- 🚀 **Zero boilerplate** - works with any `Serialize` type
- 🎯 **Path-based changes** - precise paths like `user.roles[2].name`
- 🎨 **Multiple formats** - pretty-print, JSON, JSON Patch (RFC 6902), Markdown
- 🔒 **Secret masking** - hide sensitive fields like passwords
- 🔧 **Configurable** - float tolerance, depth limits, collection size caps
- 🧬 **Multiple diff algorithms** - choose between speed and accuracy
- 🎭 **Custom comparators** - define your own equality logic per path
- 🔄 **Diff application** - apply diffs to produce new values
- ⚡ **Fast** - optimized allocations and configurable algorithms

## Examples

### Working with Diffs

```rust
use diffo::diff;
use serde::Serialize;

#[derive(Serialize)]
struct User {
    name: String,
    email: String,
}

let old = User {
    name: "Alice".into(),
    email: "alice@old.com".into()
};

let new = User {
    name: "Alice Smith".into(),
    email: "alice@new.com".into()
};

let d = diff(&old, &new).unwrap();

// Check what changed
if !d.is_empty() {
    println!("Found {} changes", d.len());

    // Check specific paths
    if let Some(change) = d.get("email") {
        println!("Email changed: {:?}", change);
    }
}
```

### Applying Diffs (Undo/Redo)

```rust
use diffo::{diff, apply};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, PartialEq)]
struct Config {
    version: String,
    enabled: bool,
}

let v1 = Config { version: "1.0".into(), enabled: true };
let v2 = Config { version: "2.0".into(), enabled: false };

// Create diff from v1 to v2
let forward = diff(&v1, &v2).unwrap();

// Apply forward diff: v1 + diff = v2
let result: Config = apply(&v1, &forward).unwrap();
assert_eq!(result, v2);

// Create reverse diff for undo
let backward = diff(&v2, &v1).unwrap();
let undone: Config = apply(&v2, &backward).unwrap();
assert_eq!(undone, v1);
```
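The round trip above (`v1 + diff = v2`, then a reverse diff for undo) can be illustrated without the crate. The following is a minimal sketch over flat string maps using only the standard library. The names `Change`, `Flat`, `flat_diff`, and `flat_apply` are hypothetical and are not part of diffo's API; the crate's internal representation may differ.

```rust
use std::collections::BTreeMap;

/// One recorded change for a key (hypothetical type, not diffo's).
#[derive(Debug, Clone, PartialEq)]
enum Change {
    Added(String),
    Removed(String),
    Modified { old: String, new: String },
}

/// A flat "path -> value" map standing in for a serialized struct.
type Flat = BTreeMap<String, String>;

/// Record every key whose value differs between `old` and `new`.
fn flat_diff(old: &Flat, new: &Flat) -> BTreeMap<String, Change> {
    let mut out = BTreeMap::new();
    for (key, old_value) in old {
        match new.get(key) {
            None => {
                out.insert(key.clone(), Change::Removed(old_value.clone()));
            }
            Some(new_value) if new_value != old_value => {
                out.insert(
                    key.clone(),
                    Change::Modified { old: old_value.clone(), new: new_value.clone() },
                );
            }
            Some(_) => {} // unchanged
        }
    }
    for (key, new_value) in new {
        if !old.contains_key(key) {
            out.insert(key.clone(), Change::Added(new_value.clone()));
        }
    }
    out
}

/// Replay the recorded changes on top of `base`.
fn flat_apply(base: &Flat, changes: &BTreeMap<String, Change>) -> Flat {
    let mut out = base.clone();
    for (key, change) in changes {
        match change {
            Change::Added(value) | Change::Modified { new: value, .. } => {
                out.insert(key.clone(), value.clone());
            }
            Change::Removed(_) => {
                out.remove(key);
            }
        }
    }
    out
}
```

With these helpers, `flat_apply(&v1, &flat_diff(&v1, &v2))` reproduces `v2`, and swapping the arguments to `flat_diff` yields the undo direction.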

### Masking Secrets

```rust
use diffo::{diff_with, DiffConfig};

let config = DiffConfig::new()
    .mask("*.password")
    .mask("api.secret");

let diff = diff_with(&old, &new, &config).unwrap();
```

### Float Tolerance

```rust
use diffo::{diff_with, DiffConfig};

let config = DiffConfig::new()
    .float_tolerance("metrics.*", 1e-6)
    .default_float_tolerance(1e-9);

let diff = diff_with(&old, &new, &config).unwrap();
```
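To make the idea of a tolerance concrete, here is a minimal standalone sketch of a tolerant float comparison. It assumes an absolute tolerance (`|a - b| <= tolerance`); whether diffo uses an absolute or a relative tolerance is not stated here, so treat that as an assumption. The function name `floats_equal` is hypothetical. The NaN and signed-zero behaviour follows the "Edge Cases Handled" list in this README.

```rust
/// Compare two floats under an absolute tolerance (assumed semantics).
/// NaN is treated as equal to NaN, and -0.0 as equal to +0.0.
fn floats_equal(a: f64, b: f64, tolerance: f64) -> bool {
    if a.is_nan() || b.is_nan() {
        // Equal only when both sides are NaN.
        return a.is_nan() && b.is_nan();
    }
    if a == b {
        // Covers -0.0 == +0.0 and equal infinities.
        return true;
    }
    (a - b).abs() <= tolerance
}
```

For example, `floats_equal(1.0, 1.0 + 5e-7, 1e-6)` holds, while the same pair under a tolerance of `1e-9` is reported as different.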

### Sequence Diff Algorithms

Choose the right algorithm for your use case:

```rust
use diffo::{diff_with, DiffConfig, SequenceDiffAlgorithm};

let config = DiffConfig::new()
    // IndexBased: O(n) simple comparison (default, backward compatible)
    .default_sequence_algorithm(SequenceDiffAlgorithm::IndexBased)

    // Myers: O(ND) optimal edit distance
    .sequence_algorithm("commits", SequenceDiffAlgorithm::Myers)

    // Patience: O(N log N) human-intuitive, great for code/reorderings
    .sequence_algorithm("users", SequenceDiffAlgorithm::Patience);

let diff = diff_with(&old, &new, &config).unwrap();
```

**Algorithm comparison:**
- **IndexBased** (default): Simple index-by-index comparison. Fast and predictable. Best when elements are modified in place rather than inserted or removed.
- **Myers**: Finds minimal edit distance. Best when you need optimal/shortest diff.
- **Patience**: Uses unique elements as anchors. Best for code diffs and when order changes significantly.
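The trade-off is easiest to see on a list with one element inserted at the front. The sketch below is a standalone illustration of index-by-index comparison using only the standard library. `IndexChange` and `index_based_diff` are hypothetical names, not diffo's types, and the crate's actual output may be shaped differently.

```rust
/// A change reported by position (hypothetical type, not diffo's).
#[derive(Debug, PartialEq)]
enum IndexChange<'a> {
    Modified { index: usize, old: &'a str, new: &'a str },
    Removed { index: usize, old: &'a str },
    Added { index: usize, new: &'a str },
}

/// Walk both slices once and compare the elements at each index: O(n).
fn index_based_diff<'a>(old: &[&'a str], new: &[&'a str]) -> Vec<IndexChange<'a>> {
    let mut changes = Vec::new();
    for i in 0..old.len().max(new.len()) {
        match (old.get(i), new.get(i)) {
            (Some(o), Some(n)) if o != n => {
                changes.push(IndexChange::Modified { index: i, old: *o, new: *n })
            }
            (Some(o), None) => changes.push(IndexChange::Removed { index: i, old: *o }),
            (None, Some(n)) => changes.push(IndexChange::Added { index: i, new: *n }),
            _ => {} // same element at this index
        }
    }
    changes
}
```

Comparing `["a", "b", "c"]` with `["x", "a", "b", "c"]` this way reports four changes (three modifications and one addition), because every element shifted by one position. An edit-distance algorithm such as Myers would describe the same change as a single insertion.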

### Custom Comparators

Define custom comparison logic for specific paths:

```rust
use diffo::{diff_with, DiffConfig, ValueExt};
use std::rc::Rc;

let config = DiffConfig::new()
    // Ignore timestamp fields
    .comparator("", Rc::new(|old, new| {
        old.get_field("id") == new.get_field("id") &&
        old.get_field("name") == new.get_field("name")
        // timestamp field ignored
    }))

    // Case-insensitive URL comparison
    .comparator("url", Rc::new(|old, new| {
        if let (Some(a), Some(b)) = (old.as_string(), new.as_string()) {
            a.to_lowercase() == b.to_lowercase()
        } else {
            old == new
        }
    }));

let diff = diff_with(&old, &new, &config).unwrap();
```

**Use cases:**
- Ignore volatile fields (timestamps, cache values)
- Compare by ID only (treat objects equal if IDs match)
- Case-insensitive comparisons
- Normalized URL comparisons

**Array element limitation:**
Due to glob pattern syntax, array indices like `[0]`, `[1]` require workaround patterns:
```rust
// For small, known-size arrays
.comparator("?0?", comparator)  // Matches [0]
.comparator("?1?", comparator)  // Matches [1]
// Note: ?0? = 3 chars, so won't match [10] (4 chars)
```
For dynamic arrays, consider comparing at the parent level or using `.ignore()` for specific indices.
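To see why `?0?` matches `[0]` but not `[10]`, the following standalone sketch implements a tiny glob matcher in which `?` matches exactly one character and `*` matches any run of characters. This is an assumption about generic glob semantics, not diffo's actual matcher, and `glob_match` is a hypothetical helper. In common glob dialects `[` and `]` delimit a character class, which is presumably why the brackets cannot be written literally.

```rust
/// Minimal glob matcher: `?` matches exactly one character, `*` matches any
/// run of characters (including none). All other characters match themselves.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position of the most recent `*` and the text index it currently covers up to.
    let mut star: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star; first try matching it against nothing.
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = star {
            // Mismatch: let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            star = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}
```

Under these semantics `?0?` also matches any other three-character segment with `0` in the middle (for example `a0b`), so the workaround is only safe when no sibling path segment has that shape.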

### Output Formats

```rust
use diffo::diff;

let diff = diff(&old, &new).unwrap();

// Pretty format (human-readable)
println!("{}", diff.to_pretty());

// JSON format
let json = diff.to_json()?;

// JSON Patch (RFC 6902)
let patch = diff.to_json_patch()?;

// Markdown table (great for PRs)
let markdown = diff.to_markdown()?;
```

**JSON Patch Output:**
```json
[
  {
    "op": "replace",
    "path": "/name",
    "value": "Alice Smith"
  },
  {
    "op": "replace",
    "path": "/email",
    "value": "alice@new.com"
  }
]
```
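JSON Patch addresses values with JSON Pointers (RFC 6901), which is why the paths above read `/name` rather than `name`. As an illustration, here is a minimal sketch that converts a dotted path such as `user.roles[2].name` into a pointer. The helper `to_json_pointer` is hypothetical and not part of diffo's API, and it assumes field names contain no `.`, `[`, or `]`. The escaping rules (`~` becomes `~0`, `/` becomes `~1`) come from RFC 6901.

```rust
/// Convert a dotted path like `user.roles[2].name` into an RFC 6901 JSON Pointer.
/// Assumes field names contain no `.`, `[` or `]`.
fn to_json_pointer(path: &str) -> String {
    let mut pointer = String::new();
    for segment in path.split(|c: char| c == '.' || c == '[' || c == ']') {
        if segment.is_empty() {
            // Produced by `].` sequences, a trailing `]`, or an empty path.
            continue;
        }
        pointer.push('/');
        // Escape `~` first so that the `~1` produced for `/` is not re-escaped.
        pointer.push_str(&segment.replace('~', "~0").replace('/', "~1"));
    }
    pointer
}
```

For example, `to_json_pointer("user.roles[2].name")` yields `/user/roles/2/name`, and an empty path yields the empty pointer, which refers to the whole document.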

**Markdown Output:**

| Path | Change | Old Value | New Value |
|------|--------|-----------|-----------|
| name | Modified | "Alice" | "Alice Smith" |
| email | Modified | "alice@old.com" | "alice@new.com" |

## Advanced Configuration

```rust
use diffo::{diff_with, DiffConfig, SequenceDiffAlgorithm};

let config = DiffConfig::new()
    // Ignore specific paths
    .ignore("*.internal")
    .ignore("metadata.timestamp")

    // Mask sensitive data
    .mask("*.password")
    .mask("*.secret")

    // Float comparison tolerance
    .float_tolerance("metrics.*.value", 1e-6)
    .default_float_tolerance(1e-9)

    // Limit depth (prevent stack overflow)
    .max_depth(32)

    // Limit collection size
    .collection_limit(500)

    // Sequence diff algorithms (per-path or default)
    .sequence_algorithm("logs", SequenceDiffAlgorithm::IndexBased)
    .sequence_algorithm("commits", SequenceDiffAlgorithm::Myers)
    .default_sequence_algorithm(SequenceDiffAlgorithm::Patience);

let diff = diff_with(&old, &new, &config)?;
```

## Edge Cases Handled

- **Floats**: NaN compares equal to NaN, and -0.0 compares equal to +0.0
- **Large collections**: Automatic elision with configurable limits
- **Large byte arrays**: Hex previews instead of full dumps
- **Deep nesting**: Configurable depth limits
- **Type mismatches**: Clear reporting when types differ
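
As an illustration of the hex-preview idea for large byte arrays, here is a minimal standalone sketch. The function name `hex_preview`, the cutoff parameter, and the exact output format are assumptions for illustration only; diffo's real preview format may differ.

```rust
/// Render at most `max_bytes` bytes as lowercase hex and note how many were omitted.
fn hex_preview(bytes: &[u8], max_bytes: usize) -> String {
    let shown = bytes.len().min(max_bytes);
    let mut out = String::with_capacity(shown * 2);
    for byte in &bytes[..shown] {
        out.push_str(&format!("{:02x}", byte));
    }
    if bytes.len() > shown {
        // Summarize the remainder instead of dumping it.
        out.push_str(&format!("... ({} more bytes)", bytes.len() - shown));
    }
    out
}
```

A four-byte input under an eight-byte cap is printed in full, while a longer input is cut off after the cap with a count of the remaining bytes.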

## Performance

Diffo is designed for production use with configurable algorithms:

- **IndexBased**: O(n) - fastest, simple comparison
- **Myers**: O(ND) - optimal edit distance (D = number of differences)
- **Patience**: O(N log N) - human-intuitive results
- **Path storage**: efficient path representation with `BTreeMap`

**Algorithm selection guide:**
- Use **IndexBased** for simple cases or when speed is critical
- Use **Myers** when you need the minimal/optimal diff
- Use **Patience** for code, reorderings, or human review
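
To give a feel for where the O(ND) bound comes from, the sketch below computes the length D of the shortest edit script with the greedy search from Myers' algorithm, using only the standard library. It is an independent illustration and not diffo's implementation; `myers_distance` is a hypothetical name. It counts insertions and deletions only, so replacing one element costs 2.

```rust
/// Length of the shortest edit script (insertions + deletions) that turns `a`
/// into `b`, found with the greedy furthest-reaching search from Myers' paper.
fn myers_distance<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    let n = a.len() as isize;
    let m = b.len() as isize;
    let max = n + m;
    if max == 0 {
        return 0;
    }
    // v[k + offset] holds the furthest x reached on diagonal k (where k = x - y).
    let offset = max;
    let mut v = vec![0isize; (2 * max + 1) as usize];

    for d in 0..=max {
        let mut k = -d;
        while k <= d {
            let idx = (k + offset) as usize;
            // Extend from diagonal k+1 (a step down, i.e. an insertion) or from
            // diagonal k-1 (a step right, i.e. a deletion), whichever reaches further.
            let mut x = if k == -d || (k != d && v[idx - 1] < v[idx + 1]) {
                v[idx + 1]
            } else {
                v[idx - 1] + 1
            };
            let mut y = x - k;
            // Follow the "snake": consume matching elements for free.
            while x < n && y < m && a[x as usize] == b[y as usize] {
                x += 1;
                y += 1;
            }
            v[idx] = x;
            if x >= n && y >= m {
                return d as usize;
            }
            k += 2;
        }
    }
    max as usize
}
```

The outer loop runs D + 1 times and each pass touches at most 2D + 1 diagonals, so near-identical inputs finish quickly regardless of their length. For the classic pair `ABCABBA` and `CBABAC` the result is 5.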

## Comparison with Alternatives

| Feature | diffo | serde-diff | json-diff |
|---------|-------|------------|-----------|
| Path notation ||||
| Secret masking ||||
| JSON Patch (RFC 6902) ||||
| Float tolerance ||||
| Works with any Serialize ||| JSON only |
| Multiple formatters ||||
| Collection limits ||||
| Multiple diff algorithms ||||
| Custom comparators ||||

## Roadmap

### v0.2 ✅
- [x] Multiple sequence diff algorithms (IndexBased, Myers, Patience)
- [x] Per-path algorithm configuration
- [x] Custom comparator functions per path
- [x] Diff application (apply changes to produce new value)

### Future releases
- [ ] Path interning for performance
- [ ] Streaming diff for very large structures

### v1.0
- [ ] Stable API guarantee
- [ ] Comprehensive benchmarks vs alternatives
- [ ] Performance optimization guide

## Contributing

Contributions are welcome! Please open an issue or PR on GitHub.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

Built with:
- [serde](https://serde.rs/) - Serialization framework
- [serde-value](https://github.com/arcnmx/serde-value) - Generic value representation