lling-llang 0.1.0

WFST framework for text normalization and grammar correction
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
# Path Extraction Algorithms

Path extraction finds optimal paths through lattices. lling-llang provides three algorithms with different trade-offs between exactness, speed, and memory usage.

## Concepts

### What is Path Extraction?

Given a lattice with weighted edges, path extraction finds the "best" paths from start to end. What "best" means depends on the semiring:

- **TropicalWeight**: Lowest total cost (sum of edge weights)
- **LogWeight**: Highest probability (product of probabilities)
- **BoolWeight**: Any valid path

```
            ┌───the(0.5)───┐
   start ──►│              ├───quick(0.5)───►end
            └───teh(0.0)───┘

   Path 1: the + quick = 0.5 + 0.5 = 1.0 (tropical)
   Path 2: teh + quick = 0.0 + 0.5 = 0.5 (tropical)

   Best path (lowest weight): "teh quick" (0.5)
```

### Algorithm Selection Guide

| Algorithm | Time | Space | Exact? | Use Case |
|-----------|------|-------|--------|----------|
| Viterbi | O(V + E) | O(V) | Yes | Single best path |
| N-best | O(k log k)* | O(k × L) | Yes | Top-k paths |
| Beam search | O(V × B × D) | O(B × L) | No | Approximate top-k |

Where:
- V = nodes, E = edges, L = path length
- k = number of paths, B = beam width, D = average out-degree
- *After initial O(V + E) topological sort

### Core Types

| Type | Description |
|------|-------------|
| `LatticePath<W>` | A path through the lattice with edges and weight |
| `ViterbiResult<W>` | Result of Viterbi decoding |
| `NBestIterator<W, B>` | Lazy iterator for N-best paths |
| `BeamSearchConfig` | Configuration for beam search |

## Viterbi Algorithm

The **Viterbi algorithm** finds the single best path using dynamic programming. It's the go-to algorithm when you only need the optimal path.

### How It Works

1. **Forward pass**: Process nodes in topological order, computing the best score to reach each node
2. **Backward pass**: Trace back from the end node to reconstruct the path

```rust
use lling_llang::path::viterbi;
use lling_llang::lattice::{LatticeBuilder, EdgeMetadata};
use lling_llang::backend::HashMapBackend;
use lling_llang::semiring::TropicalWeight;

let backend = HashMapBackend::new();
let mut builder = LatticeBuilder::new(backend);

builder.add_correction(0, 1, "the", TropicalWeight::new(0.5), EdgeMetadata::default());
builder.add_correction(0, 1, "a", TropicalWeight::new(1.0), EdgeMetadata::default());
builder.add_correction(1, 2, "quick", TropicalWeight::new(0.5), EdgeMetadata::default());

let mut lattice = builder.build(2);
let result = viterbi(&mut lattice);

if result.success {
    println!("Best path: {:?}", result.path.to_words(&lattice));
    println!("Total weight: {:?}", result.path.weight.value());
}
```

### ViterbiResult

```rust
pub struct ViterbiResult<W: Semiring> {
    /// The best path through the lattice.
    pub path: LatticePath<W>,
    /// Whether a valid path was found.
    pub success: bool,
}
```

**When `success` is `false`**:
- The lattice has no path from start to end
- The lattice contains a cycle (not a DAG)
- The end node is unreachable

### Complexity

- **Time**: O(V + E) - each node and edge visited exactly once
- **Space**: O(V) - stores best score and backpointer for each node

### When to Use Viterbi

- You only need the single best path
- You need guaranteed optimality
- Memory is not constrained

## N-Best Extraction

**N-best extraction** finds the top-k paths in order of weight. It uses a priority queue for lazy enumeration, based on Huang & Chiang (2005).

### How It Works

1. Initialize priority queue with the start node
2. Pop the best partial path
3. If it reaches the end, yield it
4. Otherwise, extend it with all outgoing edges and push back
5. Repeat until k paths found or queue exhausted

```rust
use lling_llang::path::nbest;

let mut lattice = build_lattice();  // Your lattice

// Get top 5 paths
let top5 = nbest(&mut lattice, 5);

for (i, path) in top5.iter().enumerate() {
    println!("Path {}: {:?} (weight: {:.2})",
        i + 1,
        path.to_words(&lattice),
        path.weight.value());
}
```

### Lazy Iterator

For streaming results, use `NBestIterator` directly:

```rust
use lling_llang::path::NBestIterator;

let lattice = build_lattice();
let mut iter = NBestIterator::new(&lattice, 100);

// Process paths as they're discovered
for path in iter {
    let words = path.to_words(&lattice);
    if is_acceptable(&words) {
        println!("Found acceptable path: {:?}", words);
        break;  // Stop early
    }
}
```

### Complexity

- **Time**: O(k log k) for extracting k paths (after O(V+E) setup)
- **Space**: O(k × L) where L is average path length

The key insight is that paths are enumerated lazily - if you only need 3 paths, only those 3 are fully computed.

### When to Use N-Best

- You need exactly the top-k paths
- You can afford the memory for storing partial paths
- Path diversity matters (e.g., showing alternatives to users)

## Beam Search

**Beam search** is an approximate algorithm that limits memory by keeping only the top `beam_width` hypotheses at each step.

### How It Works

1. Start with single hypothesis at start node
2. For each step:
   a. Expand all hypotheses with outgoing edges
   b. Sort by weight
   c. Keep only top `beam_width` hypotheses
3. Collect completed paths when they reach the end

```rust
use lling_llang::path::beam_search;

let mut lattice = build_lattice();

// Beam search with width 10
let paths = beam_search(&mut lattice, 10);

for path in &paths {
    println!("{:?}", path.to_words(&lattice));
}
```

### Configuration

For more control, use `BeamSearchConfig`:

```rust
use lling_llang::path::{beam_search_with_config, BeamSearchConfig};

let config = BeamSearchConfig::new(20)  // beam width 20
    .with_max_results(5)                 // return at most 5 paths
    .with_duplicates(false);             // no duplicate word sequences

let paths = beam_search_with_config(&mut lattice, config);
```

**Configuration options**:

| Option | Default | Description |
|--------|---------|-------------|
| `beam_width` | 10 | Max hypotheses per step |
| `max_results` | 10 | Max paths to return |
| `allow_duplicates` | false | Allow same word sequence |

### Complexity

- **Time**: O(V × B × D) where B = beam width, D = average out-degree
- **Space**: O(B × L) where L = path length

### Trade-offs

**Advantages**:
- Bounded memory usage
- Fast for large lattices
- Good approximation in practice

**Disadvantages**:
- May miss the optimal path
- No guarantees on path ranking
- Sensitive to beam width choice

### When to Use Beam Search

- Memory is constrained
- Approximate solutions are acceptable
- Lattice has many alternatives per position

## LatticePath

All algorithms return `LatticePath` objects:

```rust
pub struct LatticePath<W: Semiring> {
    /// The edges traversed in order.
    pub edges: SmallVec<[EdgeId; 16]>,
    /// The accumulated weight of the path.
    pub weight: W,
    /// Whether this is a complete path (reaches the end node).
    pub is_complete: bool,
}

impl<W: Semiring> LatticePath<W> {
    /// Get the number of edges in the path.
    pub fn len(&self) -> usize;

    /// Check if the path is empty.
    pub fn is_empty(&self) -> bool;

    /// Get vocabulary IDs for this path.
    pub fn labels<'a, B>(&'a self, lattice: &'a Lattice<W, B>) -> impl Iterator<Item = VocabId>;

    /// Get words for this path.
    pub fn words<'a, B>(&'a self, lattice: &'a Lattice<W, B>) -> impl Iterator<Item = &'a str>;

    /// Convert to a vector of words.
    pub fn to_words<B>(&self, lattice: &Lattice<W, B>) -> Vec<String>;
}
```

### Path Inspection

```rust
let result = viterbi(&mut lattice);
let path = result.path;

// Check completion
assert!(path.is_complete);

// Get weight
println!("Weight: {}", path.weight.value());

// Get length
println!("Edges: {}", path.len());

// Get words
let words = path.to_words(&lattice);
println!("Words: {:?}", words);

// Iterate over vocabulary IDs
for vocab_id in path.labels(&lattice) {
    println!("VocabId: {}", vocab_id);
}
```

## Details

### Topological Order Requirement

All path extraction algorithms require the lattice to be a DAG (directed acyclic graph). This is checked via topological sorting:

```rust
let order = lattice.topological_order();
if order.is_none() {
    // Lattice contains a cycle - cannot extract paths
}
```

Lattices built with `LatticeBuilder` are always DAGs (edges can only go forward in position).

### Semiring Operations

Path extraction uses semiring operations to combine weights:

```rust
// Sequential edges: use times (⊗)
path_weight = edge1.weight.times(&edge2.weight);

// For TropicalWeight: 0.5 ⊗ 0.3 = 0.5 + 0.3 = 0.8

// Parallel paths: use plus (⊕) in forward scores
best = old_score.plus(&new_score);

// For TropicalWeight: 1.0 ⊕ 0.8 = min(1.0, 0.8) = 0.8
```

### Natural Ordering

The `natural_less` method determines what "better" means:

```rust
// For TropicalWeight: lower is better
assert!(TropicalWeight::new(0.5).natural_less(&TropicalWeight::new(1.0)) == Some(true));

// For BoolWeight: true is better than false
assert!(BoolWeight::new(true).natural_less(&BoolWeight::new(false)) == Some(true));
```

### Empty Lattice Handling

All algorithms handle empty lattices (start == end):

```rust
let builder = LatticeBuilder::new(backend);
let mut lattice = builder.build(0);  // Empty lattice

let result = viterbi(&mut lattice);
assert!(result.success);  // Empty path is valid
assert!(result.path.is_empty());
```

### SmallVec Optimization

`LatticePath` uses `SmallVec<[EdgeId; 16]>` for edges:
- Paths with ≤16 edges use stack allocation
- Longer paths automatically spill to heap
- Avoids allocation for typical short paths

## Common Patterns

### Early Termination

Stop N-best when finding an acceptable path:

```rust
let acceptable = NBestIterator::new(&lattice, 1000)
    .find(|path| {
        let words = path.to_words(&lattice);
        is_grammatically_correct(&words)
    });
```

### Comparing Algorithms

Run all three for analysis:

```rust
use std::time::Instant;

// Viterbi (single best)
let start = Instant::now();
let viterbi_result = viterbi(&mut lattice);
let viterbi_time = start.elapsed();

// N-best (exact top 10)
let start = Instant::now();
let nbest_result = nbest(&mut lattice, 10);
let nbest_time = start.elapsed();

// Beam search (approximate top 10)
let start = Instant::now();
let beam_result = beam_search(&mut lattice, 10);
let beam_time = start.elapsed();

println!("Viterbi: {:?} ({:?})", viterbi_result.path.weight, viterbi_time);
println!("N-best: {} paths ({:?})", nbest_result.len(), nbest_time);
println!("Beam: {} paths ({:?})", beam_result.len(), beam_time);
```

### Path Diversity

Get diverse paths by filtering duplicates:

```rust
use std::collections::HashSet;

let mut seen: HashSet<Vec<String>> = HashSet::new();
let diverse_paths: Vec<_> = NBestIterator::new(&lattice, 100)
    .filter(|path| {
        let words = path.to_words(&lattice);
        seen.insert(words)  // Returns false if already seen
    })
    .take(10)
    .collect();
```

### Batch Processing

Process multiple lattices efficiently:

```rust
fn process_batch(lattices: &mut [Lattice<TropicalWeight, HashMapBackend>]) -> Vec<Vec<String>> {
    lattices.iter_mut()
        .map(|lattice| {
            let result = viterbi(lattice);
            if result.success {
                result.path.to_words(lattice)
            } else {
                Vec::new()
            }
        })
        .collect()
}
```

## Next Steps

- [Composition]composition.md: Lazy lattice-grammar composition
- [Topological Sort]topological-sort.md: DAG ordering algorithm
- [Lattices]../architecture/lattices.md: Lattice construction
- [Semirings]../architecture/semirings.md: Weight algebra