xml-3dm 0.1.1

3-way XML diff and merge tool
Documentation
# xml-3dm

A Rust implementation of the 3DM (3-way Diff/Merge) algorithm for structure-aware XML differencing and merging.

## Overview

`xml-3dm` is a library for performing 3-way merges on XML documents. The library operates on XML tree structure and handles the following scenarios:

- Merging changes when one branch moves content and another edits it
- Detecting and preserving copy operations across the tree
- Resolving conflicts at the element and attribute level
- Generating compact diff representations with copy detection

This is a Rust port of the original [3DM tool](https://www.cs.hut.fi/~ctl/3dm/) created by Tancred Lindholm as part of his Master's thesis at Helsinki University of Technology.

## Features

- **Structure-aware merging**: Understands XML tree structure, not just text lines
- **Heuristic matching**: Uses Q-gram distance and tree structure to match corresponding nodes
- **Copy detection**: Identifies copied subtrees and encodes them efficiently
- **Conflict resolution**: Provides detailed conflict and warning logs
- **Full XML support**: Handles elements, attributes, text nodes, and mixed content

## Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
xml-3dm = "0.1"
```

## Usage

### 3-Way Merge

```rust
use xml_3dm::{parse_file, BaseNodeFactory, BranchNodeFactory, XmlParser, Merge, TriMatching};
use std::io;

// Parse the three XML files
let base_parser = XmlParser::new(BaseNodeFactory);
let branch_parser = XmlParser::new(BranchNodeFactory);

let base = base_parser.parse_file("base.xml")?;
let left = branch_parser.parse_file("branch1.xml")?;
let right = branch_parser.parse_file("branch2.xml")?;

// Build matching between the three trees
let matching = TriMatching::new(left, base, right);

// Perform the merge
let mut merger = Merge::new(matching);
merger.merge(&mut io::stdout())?;

// Check for conflicts
if merger.conflict_log.has_conflicts() {
    eprintln!("Conflicts detected:");
    for conflict in merger.conflict_log.conflicts() {
        eprintln!("  {}", conflict);
    }
}
```

### Diff Generation

```rust
use xml_3dm::{parse_file, BaseNodeFactory, BranchNodeFactory, XmlParser, Diff, DiffMatching};
use std::io;

// Parse base and modified versions
let base_parser = XmlParser::new(BaseNodeFactory);
let branch_parser = XmlParser::new(BranchNodeFactory);

let base = base_parser.parse_file("base.xml")?;
let branch = branch_parser.parse_file("modified.xml")?;

// Build matching optimized for diff
let matching = DiffMatching::new(base, branch);

// Generate diff
if let Some(diff) = Diff::new(&matching) {
    diff.diff(&mut io::stdout())?;
}
```

### Patch Application

```rust
use xml_3dm::{parse_file, BaseNodeFactory, XmlParser, Patch};
use std::io;

// Parse base and diff
let parser = XmlParser::new(BaseNodeFactory);
let base = parser.parse_file("base.xml")?;
let diff = parser.parse_file("changes.xml")?;

// Apply patch
let patcher = Patch::new();
patcher.patch(&base, &diff, &mut io::stdout())?;
```

## Algorithm

The 3DM algorithm works in several phases:

1. **Parsing**: Builds tree representations of the XML documents
2. **Matching**: Uses heuristic matching to find corresponding nodes across trees
   - Exact matching by content hash
   - Fuzzy matching using Q-gram distance
   - Structural matching of children
3. **Merge**: Combines changes from both branches relative to the base
   - Detects inserts, deletes, moves, and updates
   - Resolves operation conflicts using a conflict table
   - Merges element content and attributes
4. **Output**: Serializes the merged tree back to XML

## Configuration

### Copy Detection Threshold

The minimum size (in information bytes) for copy detection can be configured:

```rust
use xml_3dm::TriMatching;

// Disable copy detection
let matching = TriMatching::with_copy_threshold(left, base, right, 0);

// Increase threshold (default is 128)
let matching = TriMatching::with_copy_threshold(left, base, right, 256);
```

### Custom Matching Algorithms

You can provide custom matching implementations:

```rust
use xml_3dm::{TriMatching, HeuristicMatching};

let left_matching = HeuristicMatching::new(128);
let right_matching = HeuristicMatching::new(128);

let matching = TriMatching::with_matching(
    left,
    base,
    right,
    left_matching,
    right_matching,
);
```

## Limitations

- No namespace awareness (attributes treated as simple name-value pairs)
- Processing instructions are not preserved (comments ARE preserved)
- Limited support for mixed content (element + text siblings)
- Requires well-formed XML input

## License

Licensed under the GNU Lesser General Public License v2.1 or later (LGPL-2.1-or-later), matching the original 3DM implementation.

## Credits

Original 3DM implementation by Tancred Lindholm (Helsinki University of Technology, 2001).