tldr-core 0.1.4

Core analysis engine for TLDR code analysis tool
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
# Dataflow Analysis Specification

**Created**: 2026-02-03
**Author**: architect-agent
**Source**: Gap Analysis session13-available-abstract-gap.yaml
**Total Capabilities**: 34 (12 Available Expressions + 22 Abstract Interpretation)

## Overview

This specification defines the Rust port of two dataflow analyses from tldr-code v2:

1. **Available Expressions Analysis**: Forward MUST (intersection) dataflow analysis for CSE detection
2. **Abstract Interpretation**: Forward dataflow with widening for range tracking and safety checks

Both analyses build on the existing `tldr-core` infrastructure:
- `cfg` module for control flow graphs
- `dfg` module for variable references (VarRef)
- `ssa` module for dominance and def-use chains

## Module Structure

```
src/dataflow/
├── mod.rs              # Module entry point, re-exports
├── types.rs            # Shared types (BlockId, predecessors helper)
├── available.rs        # Available expressions analysis (CAP-AE-01 through CAP-AE-12)
├── abstract_interp.rs  # Abstract interpretation (CAP-AI-01 through CAP-AI-22)
└── tests/
    ├── available_test.rs
    └── abstract_interp_test.rs
```

---

## Part 1: Available Expressions Analysis

### Data Types

#### CAP-AE-01: Expression (frozen/hashable)

```rust
/// Represents a computed expression for availability analysis.
/// 
/// Immutable and hashable by text only (line-independent equality).
/// Two expressions with the same normalized text are equal regardless of line.
#[derive(Debug, Clone, Eq, Serialize, Deserialize)]
pub struct Expression {
    /// Normalized expression string (e.g., "a + b")
    pub text: String,
    
    /// Variables used in this expression (sorted for consistency)
    pub operands: Vec<String>,
    
    /// Line where expression first appears
    pub line: u32,
}

impl PartialEq for Expression {
    fn eq(&self, other: &Self) -> bool {
        self.text == other.text  // Equality by text only
    }
}

impl Hash for Expression {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.text.hash(state);  // Hash by text only
    }
}

impl Expression {
    /// Check if redefining var kills this expression.
    /// 
    /// An expression is killed when any operand is redefined.
    pub fn is_killed_by(&self, var: &str) -> bool {
        self.operands.iter().any(|op| op == var)
    }
}
```

#### CAP-AE-02: Commutative Expression Normalization

```rust
/// Commutative operators that allow operand reordering.
pub const COMMUTATIVE_OPS: &[&str] = &["+", "*", "==", "!=", "and", "or", "&", "|", "^"];

/// Normalize binary expression to canonical form.
/// 
/// For commutative operators, sort operands alphabetically.
/// This ensures "a + b" and "b + a" produce the same text.
pub fn normalize_expression(op: &str, left: &str, right: &str) -> String {
    if COMMUTATIVE_OPS.contains(&op) {
        let mut operands = [left.trim(), right.trim()];
        operands.sort();
        format!("{} {} {}", operands[0], op, operands[1])
    } else {
        format!("{} {} {}", left.trim(), op, right.trim())
    }
}
```

#### CAP-AE-03 to CAP-AE-07: BlockExpressions (Gen/Kill)

```rust
/// Expressions generated and killed in a single CFG block.
#[derive(Debug, Default)]
struct BlockExpressions {
    /// Expressions computed in this block (before any operand is killed)
    gen: HashSet<Expression>,
    
    /// Variables redefined in this block (kills expressions using that var)
    kill: HashSet<String>,
}
```

#### CAP-AE-08 to CAP-AE-11: AvailableExprsInfo

```rust
/// Available expressions analysis results.
/// 
/// An expression is available at a point if:
/// 1. It has been computed on EVERY path reaching that point (MUST analysis)
/// 2. None of its operands have been redefined since computation
#[derive(Debug, Serialize, Deserialize)]
pub struct AvailableExprsInfo {
    /// Expressions available at block entry.
    /// avail_in[b] = intersection(avail_out[p] for p in predecessors[b])
    pub avail_in: HashMap<u32, HashSet<Expression>>,
    
    /// Expressions available at block exit.
    /// avail_out[b] = gen[b] | (avail_in[b] - killed_by_block[b])
    pub avail_out: HashMap<u32, HashSet<Expression>>,
    
    /// All unique expressions found in the function
    pub all_exprs: HashSet<Expression>,
    
    /// Entry block ID
    pub entry_block: u32,
    
    /// All expression instances including duplicates (for CSE detection)
    #[serde(skip)]
    pub expr_instances: Vec<Expression>,
}

impl AvailableExprsInfo {
    /// Check if expression is available at entry to block.
    pub fn is_available(&self, block: u32, expr: &Expression) -> bool {
        self.avail_in.get(&block).map_or(false, |set| set.contains(expr))
    }
    
    /// Check if expression is available at exit of block.
    pub fn is_available_at_exit(&self, block: u32, expr: &Expression) -> bool {
        self.avail_out.get(&block).map_or(false, |set| set.contains(expr))
    }
    
    /// CAP-AE-06: Find expressions computed when already available (CSE opportunities).
    /// 
    /// Returns: Vec<(expr_text, original_line, redundant_line)>
    pub fn redundant_computations(&self) -> Vec<(String, u32, u32)> {
        // Implementation handles intra-block kills carefully
        // See Python source lines 162-217 for algorithm
    }
    
    /// CAP-AE-10: Get expressions available at a specific source line.
    pub fn get_available_at_line(&self, line: u32, cfg: &CfgInfo) -> HashSet<Expression> {
        for block in &cfg.blocks {
            if block.start_line <= line && line <= block.end_line {
                return self.avail_in.get(&block.id).cloned().unwrap_or_default();
            }
        }
        HashSet::new()
    }
    
    /// CAP-AE-11: Serialize to JSON-compatible structure.
    pub fn to_json(&self) -> serde_json::Value {
        // Output format:
        // {
        //   "avail_in": {"0": [{"text": "a + b", "operands": ["a", "b"], "line": 2}], ...},
        //   "avail_out": {...},
        //   "all_expressions": [...],
        //   "entry_block": 0,
        //   "redundant_computations": [{"expr": "a + b", "first_at": 2, "redundant_at": 4}]
        // }
    }
}
```

### CAP-AE-04, CAP-AE-05: Algorithm (MUST Analysis with Fixpoint)

```rust
/// Compute available expressions using forward MUST dataflow analysis.
/// 
/// Algorithm:
/// 1. Extract all expressions and their operands from DFG
/// 2. Compute gen/kill sets for each block
/// 3. Initialize: entry = {}, others = ALL_EXPRS
/// 4. Iterate until fixed point:
///    - avail_in[b] = intersection(avail_out[p] for p in predecessors[b])
///    - avail_out[b] = gen[b] | (avail_in[b] - killed_by_block[b])
/// 5. Return AvailableExprsInfo with redundant_computations()
/// 
/// FM1 Mitigation: Commutative expression normalization
/// FM2 Mitigation: Function calls excluded from CSE (CAP-AE-12)
/// FM3 Mitigation: ALL_EXPRS computed before init, proper empty set for entry
pub fn compute_available_exprs(
    cfg: &CfgInfo,
    dfg: &DfgInfo,
) -> Result<AvailableExprsInfo, TldrError> {
    // Step 1: Extract expressions from variable uses (pairs on same line)
    let (all_exprs, block_info, expr_instances) = extract_expressions_from_refs(cfg, dfg)?;
    
    // Early return if no expressions
    if all_exprs.is_empty() {
        return Ok(AvailableExprsInfo::empty(cfg.entry_block_id));
    }
    
    // Step 2: Build predecessor map
    let predecessors = build_predecessors(cfg);
    
    // Step 3: Initialize (MUST analysis: start optimistic except entry)
    let mut avail_in: HashMap<u32, HashSet<Expression>> = HashMap::new();
    let mut avail_out: HashMap<u32, HashSet<Expression>> = HashMap::new();
    
    let entry = cfg.entry_block_id;
    avail_in.insert(entry, HashSet::new());  // Nothing available at entry
    avail_out.insert(entry, block_info[&entry].gen.clone());
    
    for block in &cfg.blocks {
        if block.id != entry {
            avail_in.insert(block.id, all_exprs.clone());
            avail_out.insert(block.id, all_exprs.clone());
        }
    }
    
    // Step 4: Iterate until fixed point
    let mut changed = true;
    let max_iterations = cfg.blocks.len() * all_exprs.len() + 10;
    let mut iteration = 0;
    
    while changed && iteration < max_iterations {
        changed = false;
        iteration += 1;
        
        for block in &cfg.blocks {
            if block.id == entry {
                continue;
            }
            
            // avail_in = INTERSECTION of all predecessor's avail_out
            let preds = predecessors.get(&block.id).unwrap_or(&vec![]);
            let new_in = if preds.is_empty() {
                HashSet::new()
            } else {
                preds.iter()
                    .map(|p| avail_out.get(p).unwrap())
                    .fold(avail_out[&preds[0]].clone(), |acc, set| {
                        acc.intersection(set).cloned().collect()  // MUST = intersection
                    })
            };
            
            // avail_out = gen | (avail_in - killed)
            let info = &block_info[&block.id];
            let killed: HashSet<_> = new_in.iter()
                .filter(|e| info.kill.iter().any(|k| e.is_killed_by(k)))
                .cloned()
                .collect();
            let new_out: HashSet<_> = info.gen.union(&new_in.difference(&killed).cloned().collect())
                .cloned()
                .collect();
            
            if avail_in[&block.id] != new_in || avail_out[&block.id] != new_out {
                changed = true;
                avail_in.insert(block.id, new_in);
                avail_out.insert(block.id, new_out);
            }
        }
    }
    
    Ok(AvailableExprsInfo {
        avail_in,
        avail_out,
        all_exprs,
        entry_block: entry,
        expr_instances,
    })
}
```

### Behavioral Contracts

| ID | Contract | Test Case |
|----|----------|-----------|
| CAP-AE-01 | Expression equality based on text only | `expr1(line=1) == expr2(line=5)` if same text |
| CAP-AE-02 | Commutative normalization | `"a + b"` equals `"b + a"` after normalization |
| CAP-AE-03 | Gen: expressions before operand killed | `x=a+b; a=5` -> gen contains `a+b` with original line |
| CAP-AE-04 | MUST semantics: intersection at joins | Diamond CFG: single-branch expr NOT available at merge |
| CAP-AE-05 | Fixpoint terminates | Max iterations = blocks * expressions + 10 |
| CAP-AE-06 | redundant_computations() accuracy | `(expr_text, first_line, redundant_line)` tuples |
| CAP-AE-07 | Intra-block kill handling | `x=a+b; a=5; y=a+b` -> second NOT redundant |
| CAP-AE-08 | is_available() at entry | True iff expr in avail_in[block] |
| CAP-AE-09 | is_available_at_exit() | True iff expr in avail_out[block] |
| CAP-AE-10 | get_available_at_line() | Maps line to containing block's avail_in |
| CAP-AE-11 | to_json() serializable | All fields JSON-compatible |
| CAP-AE-12 | Function calls excluded | `foo(x)` not tracked as available expression |

---

## Part 2: Abstract Interpretation Analysis

### Data Types

#### CAP-AI-01: Nullability Enum

```rust
/// Nullability lattice: NEVER < MAYBE < ALWAYS
/// 
/// Used to track whether a variable may be null/None at a program point.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum Nullability {
    /// Definitely not null
    Never,
    /// Could be null or non-null
    Maybe,
    /// Definitely null
    Always,
}

impl Default for Nullability {
    fn default() -> Self {
        Nullability::Maybe
    }
}
```

#### CAP-AI-02 to CAP-AI-06: AbstractValue

```rust
/// Abstract representation of a variable's value at a program point.
/// 
/// Tracks four dimensions:
/// - type_: Inferred type (str, int, list, etc.) or None if unknown
/// - range_: Value range [min, max] for numeric types, None for unbounded
/// - nullable: Whether the value can be null/None
/// - constant: If value is a known constant, the value itself
#[derive(Debug, Clone, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct AbstractValue {
    /// Inferred type name (e.g., "int", "str") or None if unknown
    #[serde(rename = "type")]
    pub type_: Option<String>,
    
    /// Value range [min, max] for numerics. None bounds mean infinity.
    /// For strings, tracks length.
    pub range_: Option<(Option<i64>, Option<i64>)>,
    
    /// Nullability status
    pub nullable: Nullability,
    
    /// Known constant value (not hashed, used for propagation)
    #[serde(skip_serializing_if = "Option::is_none")]
    pub constant: Option<ConstantValue>,
}

/// Constant values that can be tracked
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
#[serde(untagged)]
pub enum ConstantValue {
    Int(i64),
    Float(f64),
    String(String),
    Bool(bool),
    Null,
}

impl AbstractValue {
    /// CAP-AI-04: Top of lattice - no information known
    pub fn top() -> Self {
        AbstractValue {
            type_: None,
            range_: None,
            nullable: Nullability::Maybe,
            constant: None,
        }
    }
    
    /// CAP-AI-04: Bottom of lattice - contradiction (unreachable)
    pub fn bottom() -> Self {
        AbstractValue {
            type_: Some("<bottom>".to_string()),
            range_: Some((None, None)),
            nullable: Nullability::Never,
            constant: None,
        }
    }
    
    /// CAP-AI-03: Create from known constant
    pub fn from_constant(value: ConstantValue) -> Self {
        match value {
            ConstantValue::Int(v) => AbstractValue {
                type_: Some("int".to_string()),
                range_: Some((Some(v), Some(v))),
                nullable: Nullability::Never,
                constant: Some(ConstantValue::Int(v)),
            },
            ConstantValue::Float(v) => AbstractValue {
                type_: Some("float".to_string()),
                range_: None,  // Float ranges less useful
                nullable: Nullability::Never,
                constant: Some(ConstantValue::Float(v)),
            },
            ConstantValue::String(ref s) => AbstractValue {
                type_: Some("str".to_string()),
                range_: Some((Some(s.len() as i64), Some(s.len() as i64))),  // CAP-AI-18: Track length
                nullable: Nullability::Never,
                constant: Some(value),
            },
            ConstantValue::Bool(v) => AbstractValue {
                type_: Some("bool".to_string()),
                range_: Some((Some(v as i64), Some(v as i64))),
                nullable: Nullability::Never,
                constant: Some(ConstantValue::Bool(v)),
            },
            ConstantValue::Null => AbstractValue {
                type_: Some("NoneType".to_string()),
                range_: None,
                nullable: Nullability::Always,
                constant: None,
            },
        }
    }
    
    /// CAP-AI-05: Check if value could be zero (for division check)
    pub fn may_be_zero(&self) -> bool {
        match &self.range_ {
            None => true,  // Unknown range, conservatively true
            Some((low, high)) => {
                let low = low.unwrap_or(i64::MIN);
                let high = high.unwrap_or(i64::MAX);
                low <= 0 && 0 <= high
            }
        }
    }
    
    /// CAP-AI-06: Check if value could be null/None
    pub fn may_be_null(&self) -> bool {
        self.nullable != Nullability::Never
    }
    
    /// Check if this is a known constant value
    pub fn is_constant(&self) -> bool {
        self.constant.is_some()
    }
}
```

#### CAP-AI-07: AbstractState

```rust
/// Abstract state at a program point: mapping from variables to abstract values.
#[derive(Debug, Clone, Default, PartialEq, Eq, Serialize, Deserialize)]
pub struct AbstractState {
    pub values: HashMap<String, AbstractValue>,
}

impl AbstractState {
    /// Get abstract value for variable, defaulting to top (unknown)
    pub fn get(&self, var: &str) -> AbstractValue {
        self.values.get(var).cloned().unwrap_or_else(AbstractValue::top)
    }
    
    /// Return new state with updated variable value (immutable style)
    pub fn set(&self, var: &str, value: AbstractValue) -> Self {
        let mut new_values = self.values.clone();
        new_values.insert(var.to_string(), value);
        AbstractState { values: new_values }
    }
    
    /// Create a copy of this state
    pub fn copy(&self) -> Self {
        self.clone()
    }
}
```

#### CAP-AI-21, CAP-AI-22: AbstractInterpInfo

```rust
/// Abstract interpretation analysis results for a function.
#[derive(Debug, Serialize, Deserialize)]
pub struct AbstractInterpInfo {
    /// Abstract state at entry of each block
    pub state_in: HashMap<u32, AbstractState>,
    
    /// Abstract state at exit of each block
    pub state_out: HashMap<u32, AbstractState>,
    
    /// CAP-AI-10: Potential division-by-zero warnings (line, var)
    pub potential_div_zero: Vec<(u32, String)>,
    
    /// CAP-AI-11: Potential null dereference warnings (line, var)
    pub potential_null_deref: Vec<(u32, String)>,
    
    /// Function name
    pub function_name: String,
}

impl AbstractInterpInfo {
    /// Get abstract value of variable at entry to block
    pub fn value_at(&self, block: u32, var: &str) -> AbstractValue {
        self.state_in.get(&block)
            .map(|s| s.get(var))
            .unwrap_or_else(AbstractValue::top)
    }
    
    /// Get abstract value of variable at exit of block
    pub fn value_at_exit(&self, block: u32, var: &str) -> AbstractValue {
        self.state_out.get(&block)
            .map(|s| s.get(var))
            .unwrap_or_else(AbstractValue::top)
    }
    
    /// Get the value range for variable at block entry
    pub fn range_at(&self, block: u32, var: &str) -> Option<(Option<i64>, Option<i64>)> {
        self.value_at(block, var).range_
    }
    
    /// Get the inferred type for variable at block entry
    pub fn type_at(&self, block: u32, var: &str) -> Option<String> {
        self.value_at(block, var).type_
    }
    
    /// Check if variable is definitely non-null at block entry
    pub fn is_definitely_not_null(&self, block: u32, var: &str) -> bool {
        self.value_at(block, var).nullable == Nullability::Never
    }
    
    /// CAP-AI-12: Get all variables with known constant values at function exit
    pub fn get_constants(&self) -> HashMap<String, ConstantValue> {
        let mut constants = HashMap::new();
        for state in self.state_out.values() {
            for (var, val) in &state.values {
                if let Some(c) = &val.constant {
                    constants.insert(var.clone(), c.clone());
                }
            }
        }
        constants
    }
    
    /// CAP-AI-22: Serialize to JSON-compatible structure
    pub fn to_json(&self) -> serde_json::Value {
        // Output format per v1 CLI
    }
}
```

### CAP-AI-08, CAP-AI-09: State Join and Widening

```rust
/// CAP-AI-08: Join multiple abstract states at a merge point.
/// 
/// For each variable:
/// - range: take union (widest bounds)
/// - type: take common type or unknown
/// - nullable: MAYBE if any is MAYBE, else max
/// - constant: keep only if all agree
fn join_states(states: &[AbstractState]) -> AbstractState {
    if states.is_empty() {
        return AbstractState::default();
    }
    if states.len() == 1 {
        return states[0].clone();
    }
    
    let all_vars: HashSet<_> = states.iter()
        .flat_map(|s| s.values.keys().cloned())
        .collect();
    
    let mut result = HashMap::new();
    for var in all_vars {
        let values: Vec<_> = states.iter().map(|s| s.get(&var)).collect();
        result.insert(var, join_values(&values));
    }
    
    AbstractState { values: result }
}

fn join_values(values: &[AbstractValue]) -> AbstractValue {
    // Range: union (widest bounds)
    let ranges: Vec<_> = values.iter()
        .filter_map(|v| v.range_)
        .collect();
    
    let joined_range = if ranges.is_empty() {
        None
    } else {
        let lows: Vec<_> = ranges.iter().filter_map(|r| r.0).collect();
        let highs: Vec<_> = ranges.iter().filter_map(|r| r.1).collect();
        Some((
            lows.iter().min().copied(),
            highs.iter().max().copied(),
        ))
    };
    
    // Type: common type or None
    let types: Vec<_> = values.iter().filter_map(|v| v.type_.clone()).collect();
    let joined_type = if types.windows(2).all(|w| w[0] == w[1]) && !types.is_empty() {
        Some(types[0].clone())
    } else {
        None
    };
    
    // Nullable: MAYBE if any is MAYBE or disagreement
    let nulls: Vec<_> = values.iter().map(|v| v.nullable).collect();
    let joined_null = if nulls.contains(&Nullability::Maybe) {
        Nullability::Maybe
    } else if nulls.windows(2).all(|w| w[0] == w[1]) {
        nulls[0]
    } else {
        Nullability::Maybe
    };
    
    // Constant: only if all agree
    let constants: Vec<_> = values.iter().filter_map(|v| v.constant.clone()).collect();
    let joined_const = if constants.len() == values.len() 
        && constants.windows(2).all(|w| w[0] == w[1]) {
        constants.into_iter().next()
    } else {
        None
    };
    
    AbstractValue {
        type_: joined_type,
        range_: joined_range,
        nullable: joined_null,
        constant: joined_const,
    }
}

/// CAP-AI-09: Apply widening to ensure termination on loops.
/// 
/// For each variable, if the range is growing, widen to infinity.
fn widen_state(old: &AbstractState, new: &AbstractState) -> AbstractState {
    let all_vars: HashSet<_> = old.values.keys()
        .chain(new.values.keys())
        .cloned()
        .collect();
    
    let mut result = HashMap::new();
    for var in all_vars {
        let old_val = old.get(&var);
        let new_val = new.get(&var);
        result.insert(var, widen_value(&old_val, &new_val));
    }
    
    AbstractState { values: result }
}

fn widen_value(old: &AbstractValue, new: &AbstractValue) -> AbstractValue {
    let widened_range = match (&old.range_, &new.range_) {
        (None, None) => None,
        (None, r) => r.clone(),
        (_, None) => None,
        (Some((old_low, old_high)), Some((new_low, new_high))) => {
            // Widen low: if growing downward, widen to -inf
            let widened_low = match (old_low, new_low) {
                (None, _) => None,  // Already widened
                (_, None) => None,  // Widen
                (Some(o), Some(n)) if *n < *o => None,  // Growing down
                (_, n) => *n,
            };
            
            // Widen high: if growing upward, widen to +inf
            let widened_high = match (old_high, new_high) {
                (None, _) => None,  // Already widened
                (_, None) => None,  // Widen
                (Some(o), Some(n)) if *n > *o => None,  // Growing up
                (_, n) => *n,
            };
            
            Some((widened_low, widened_high))
        }
    };
    
    AbstractValue {
        type_: new.type_.clone(),
        range_: widened_range,
        nullable: new.nullable,
        constant: None,  // Constant lost after widening
    }
}
```

### CAP-AI-13, CAP-AI-14, CAP-AI-19: Transfer Function and RHS Parsing

```rust
/// CAP-AI-14: Parse RHS of assignment and compute abstract value.
/// 
/// Handles:
/// - Integer literals: x = 5
/// - String literals: x = "hello"
/// - None/null: x = None (language-specific keywords)
/// - Boolean: True/False or true/false (language-specific)
/// - Simple arithmetic: x = a + 1, x = b * 2 (CAP-AI-13)
/// - Variable copies: x = y (CAP-AI-19)
pub fn parse_rhs_abstract(
    line: &str,
    var: &str,
    state: &AbstractState,
    language: Language,
) -> AbstractValue {
    // Get language-specific patterns
    let comment_pattern = get_comment_pattern(language);
    let null_keywords = get_null_keywords(language);  // CAP-AI-15
    let bool_keywords = get_boolean_keywords(language);  // CAP-AI-16
    
    // Strip comments (CAP-AI-17)
    let line = comment_pattern.split(line).next().unwrap_or("");
    
    // Find assignment: var = ... or var := ...
    let rhs = extract_rhs(line, var)?;
    
    // Integer literal
    if let Ok(v) = rhs.parse::<i64>() {
        return AbstractValue::from_constant(ConstantValue::Int(v));
    }
    
    // Float literal
    if let Ok(v) = rhs.parse::<f64>() {
        return AbstractValue::from_constant(ConstantValue::Float(v));
    }
    
    // String literal
    if (rhs.starts_with('"') && rhs.ends_with('"')) ||
       (rhs.starts_with('\'') && rhs.ends_with('\'')) {
        let s = rhs[1..rhs.len()-1].to_string();
        return AbstractValue::from_constant(ConstantValue::String(s));
    }
    
    // Null keywords (CAP-AI-15)
    if null_keywords.contains(&rhs) {
        return AbstractValue::from_constant(ConstantValue::Null);
    }
    
    // Boolean keywords (CAP-AI-16)
    if let Some(&b) = bool_keywords.get(rhs) {
        return AbstractValue::from_constant(ConstantValue::Bool(b));
    }
    
    // Variable copy (CAP-AI-19)
    if is_identifier(rhs) {
        return state.get(rhs);
    }
    
    // Simple arithmetic: x = a + N or x = a - N (CAP-AI-13)
    if let Some((operand, op, const_val)) = parse_simple_arithmetic(rhs) {
        return apply_arithmetic(state.get(&operand), op, const_val);
    }
    
    // Unknown RHS
    AbstractValue::top()
}

/// CAP-AI-13: Apply abstract arithmetic
fn apply_arithmetic(operand: AbstractValue, op: char, constant: i64) -> AbstractValue {
    let new_range = operand.range_.map(|(low, high)| {
        match op {
            '+' => (low.map(|l| l + constant), high.map(|h| h + constant)),
            '-' => (low.map(|l| l - constant), high.map(|h| h - constant)),
            '*' => {
                let vals = [
                    low.map(|l| l * constant),
                    high.map(|h| h * constant),
                ];
                (vals.iter().filter_map(|&v| v).min(), 
                 vals.iter().filter_map(|&v| v).max())
            }
            _ => (None, None),  // Unknown op -> unbounded
        }
    });
    
    AbstractValue {
        type_: operand.type_,
        range_: new_range,
        nullable: operand.nullable,
        constant: if operand.is_constant() && new_range.map(|(l, h)| l == h).unwrap_or(false) {
            new_range.and_then(|(l, _)| l.map(ConstantValue::Int))
        } else {
            None
        },
    }
}
```

### CAP-AI-15, CAP-AI-16, CAP-AI-17: Multi-Language Support

```rust
/// CAP-AI-15: Get null keywords for language
fn get_null_keywords(language: Language) -> &'static [&'static str] {
    match language {
        Language::Python => &["None"],
        Language::TypeScript | Language::JavaScript => &["null", "undefined"],
        Language::Go => &["nil"],
        Language::Rust => &[],  // Rust has no null (None is Option::None)
        Language::Java | Language::Kotlin | Language::CSharp => &["null"],
        Language::Swift => &["nil"],
        _ => &["null", "nil", "None"],  // Fallback
    }
}

/// CAP-AI-16: Get boolean keywords for language
fn get_boolean_keywords(language: Language) -> HashMap<&'static str, bool> {
    match language {
        Language::Python => [("True", true), ("False", false)].into(),
        Language::TypeScript | Language::JavaScript | Language::Go | Language::Rust => 
            [("true", true), ("false", false)].into(),
        _ => [("True", true), ("False", false), ("true", true), ("false", false)].into(),
    }
}

/// CAP-AI-17: Get comment pattern for language
fn get_comment_pattern(language: Language) -> &'static str {
    match language {
        Language::Python => "#",
        Language::TypeScript | Language::JavaScript | Language::Go | 
        Language::Rust | Language::Java | Language::CSharp | Language::Kotlin | Language::Swift => "//",
        _ => "#",  // Fallback
    }
}
```

### Main Algorithm

```rust
/// Compute abstract interpretation using forward dataflow with widening.
/// 
/// Algorithm:
/// 1. Initialize entry block with parameter values (top)
/// 2. Iterate in reverse postorder until fixpoint:
///    - state_in[b] = join(state_out[p] for p in predecessors[b])
///    - Apply widening at loop headers (CAP-AI-09)
///    - state_out[b] = transfer(state_in[b], block[b])
/// 3. Detect potential issues:
///    - Division by zero: divisor.may_be_zero() (CAP-AI-10)
///    - Null dereference: obj.may_be_null() at attribute access (CAP-AI-11)
/// 4. Return AbstractInterpInfo
pub fn compute_abstract_interp(
    cfg: &CfgInfo,
    refs: &[VarRef],
    source_lines: Option<&[String]>,
    language: Language,
) -> Result<AbstractInterpInfo, TldrError> {
    let predecessors = build_predecessors(cfg);
    let loop_headers = find_back_edges(cfg);
    let block_order = reverse_postorder(cfg);
    
    // Initialize
    let mut state_in: HashMap<u32, AbstractState> = HashMap::new();
    let mut state_out: HashMap<u32, AbstractState> = HashMap::new();
    
    let entry = cfg.entry_block_id;
    let init_state = init_params(cfg, refs);
    state_in.insert(entry, init_state.clone());
    state_out.insert(entry, init_state);
    
    for block in &cfg.blocks {
        if block.id != entry {
            state_in.insert(block.id, AbstractState::default());
            state_out.insert(block.id, AbstractState::default());
        }
    }
    
    // Iterate until fixpoint
    let mut changed = true;
    let max_iterations = cfg.blocks.len() * 10 + 100;
    let mut iteration = 0;
    
    while changed && iteration < max_iterations {
        changed = false;
        iteration += 1;
        
        for &block_id in &block_order {
            if block_id == entry {
                continue;
            }
            
            let block = cfg.blocks.iter().find(|b| b.id == block_id).unwrap();
            let preds = predecessors.get(&block_id).unwrap_or(&vec![]);
            
            // Join predecessors
            let pred_states: Vec<_> = preds.iter()
                .map(|p| state_out[p].clone())
                .collect();
            let mut new_in = join_states(&pred_states);
            
            // Apply widening at loop headers
            if loop_headers.contains(&block_id) {
                new_in = widen_state(&state_in[&block_id], &new_in);
            }
            
            // Transfer function
            let new_out = transfer_block(&new_in, block, refs, source_lines, language);
            
            if state_in[&block_id] != new_in || state_out[&block_id] != new_out {
                changed = true;
                state_in.insert(block_id, new_in);
                state_out.insert(block_id, new_out);
            }
        }
    }
    
    // Detect issues
    let potential_div_zero = find_div_zero(cfg, refs, &state_in, source_lines, &state_out);  // CAP-AI-20
    let potential_null_deref = find_null_deref(cfg, refs, &state_in);
    
    Ok(AbstractInterpInfo {
        state_in,
        state_out,
        potential_div_zero,
        potential_null_deref,
        function_name: cfg.function_name.clone(),
    })
}
```

### Behavioral Contracts

| ID | Contract | Test Case |
|----|----------|-----------|
| CAP-AI-01 | Nullability enum values | NEVER="never", MAYBE="maybe", ALWAYS="always" |
| CAP-AI-02 | AbstractValue is frozen/hashable | Can be used in HashSet |
| CAP-AI-03 | from_constant(5) -> range=[5,5], constant=5 | Integer constant |
| CAP-AI-04 | top() -> None type/range, MAYBE nullable | Unknown value |
| CAP-AI-05 | may_be_zero(): range [-5,5] -> true | Division check |
| CAP-AI-06 | may_be_null(): NEVER -> false | Null check |
| CAP-AI-07 | AbstractState.get() -> top() for missing | Default behavior |
| CAP-AI-08 | Join [1,1] and [10,10] -> [1,10] | Range union |
| CAP-AI-09 | Growing upper bound -> widen to +inf | Loop termination |
| CAP-AI-10 | Division by may_be_zero() divisor flagged | Safety check |
| CAP-AI-11 | Attribute access on may_be_null() flagged | Safety check |
| CAP-AI-12 | Constant propagation through arithmetic | x=5; y=x+1 -> y=[6,6] |
| CAP-AI-13 | Arithmetic on ranges | [5,5] + 3 -> [8,8] |
| CAP-AI-14 | RHS parsing: int, float, str, None | Literal extraction |
| CAP-AI-15 | Python None, TS null/undefined, Go nil | Multi-language null |
| CAP-AI-16 | Python True/False, TS true/false | Multi-language bool |
| CAP-AI-17 | Python #, TS // comment stripping | Parse accuracy |
| CAP-AI-18 | String "hello" -> range=[5,5] length | String tracking |
| CAP-AI-19 | y = x copies abstract value | Variable flow |
| CAP-AI-20 | x=5; y=1/x no warning (intra-block) | False positive prevention |
| CAP-AI-21 | Query methods return correct values | API contract |
| CAP-AI-22 | to_json() JSON-serializable | CLI output |

---

## CLI Interface

### Available Expressions Command

```bash
tldr available <file> <function> [--lang <lang>] [--check <expr>] [--at-line <n>]
```

**Output format:**
```json
{
  "function": "example",
  "avail_in": {
    "0": [],
    "1": [{"text": "a + b", "operands": ["a", "b"], "line": 2}]
  },
  "avail_out": {
    "0": [{"text": "a + b", "operands": ["a", "b"], "line": 2}],
    "1": [{"text": "a + b", "operands": ["a", "b"], "line": 2}]
  },
  "all_expressions": [
    {"text": "a + b", "operands": ["a", "b"], "line": 2}
  ],
  "redundant_computations": [
    {"expr": "a + b", "first_at": 2, "redundant_at": 4}
  ]
}
```

### Abstract Interpretation Command

```bash
tldr abstract-interp <file> <function> [--lang <lang>] [--var <name>] [--line <n>]
```

**Output format:**
```json
{
  "function": "example",
  "state_in": {
    "0": {},
    "1": {
      "x": {"type": "int", "range": [5, 5], "nullable": "never", "constant": 5}
    }
  },
  "state_out": {
    "0": {
      "x": {"type": "int", "range": [5, 5], "nullable": "never", "constant": 5}
    }
  },
  "potential_div_zero": [
    {"line": 10, "var": "divisor"}
  ],
  "potential_null_deref": [
    {"line": 15, "var": "obj"}
  ]
}
```

---

## Error Handling

```rust
/// Dataflow analysis errors
#[derive(Debug, thiserror::Error)]
pub enum DataflowError {
    #[error("CFG has no blocks")]
    EmptyCfg,
    
    #[error("Entry block {0} not found in CFG")]
    EntryBlockNotFound(u32),
    
    #[error("Analysis did not converge after {0} iterations")]
    NoConvergence(usize),
    
    #[error("Source line {0} out of range")]
    LineOutOfRange(u32),
    
    #[error("Block {0} not found")]
    BlockNotFound(u32),
}
```

---

## Edge Cases (from tests)

### Available Expressions

1. **Empty function** - No expressions, empty avail_in/avail_out
2. **Unreachable block** - Still included in results (entry/exit present)
3. **Self-loop** - Handles gracefully without infinite loop
4. **Multiple expressions** - Tracked independently
5. **Diamond CFG single branch** - Expression NOT available at merge (MUST)
6. **Diamond CFG both branches** - Expression IS available at merge

### Abstract Interpretation

1. **Empty function** - Empty state, no warnings
2. **Unknown RHS** - Defaults to top()
3. **Parameter starts as top** - Unknown input
4. **Negative constant** - Range [-5, -5]
5. **String constant tracks length** - "hello" -> [5, 5]
6. **Nested loops** - Terminates via widening
7. **Rust has no null keyword** - None is Option::None, not null

---

## Implementation Phases

### Phase 1: Foundation (types.rs)
- [ ] BlockId type alias
- [ ] build_predecessors() helper
- [ ] find_back_edges() helper
- [ ] reverse_postorder() helper

### Phase 2: Available Expressions (available.rs)
- [ ] Expression struct (CAP-AE-01)
- [ ] normalize_expression() (CAP-AE-02)
- [ ] BlockExpressions (CAP-AE-03)
- [ ] AvailableExprsInfo (CAP-AE-08-11)
- [ ] compute_available_exprs() (CAP-AE-04-05)
- [ ] redundant_computations() (CAP-AE-06-07)
- [ ] Function call exclusion (CAP-AE-12)

### Phase 3: Abstract Interpretation (abstract_interp.rs)
- [ ] Nullability enum (CAP-AI-01)
- [ ] AbstractValue struct (CAP-AI-02-06)
- [ ] AbstractState struct (CAP-AI-07)
- [ ] join_states(), widen_state() (CAP-AI-08-09)
- [ ] parse_rhs_abstract() (CAP-AI-14)
- [ ] apply_arithmetic() (CAP-AI-13)
- [ ] Multi-language keywords (CAP-AI-15-17)
- [ ] compute_abstract_interp()
- [ ] find_div_zero(), find_null_deref() (CAP-AI-10-11)
- [ ] AbstractInterpInfo (CAP-AI-21-22)

### Phase 4: Integration
- [ ] Add `pub mod dataflow;` to lib.rs
- [ ] Re-export main types
- [ ] CLI commands in tldr-cli

### Phase 5: Testing
- [ ] Port all 32 available_exprs tests
- [ ] Port all 56 abstract_interp tests
- [ ] Coverage target: 80%

---

## Success Criteria

1. All 34 capabilities implemented
2. All 88 test cases ported and passing
3. JSON output matches v1 CLI format
4. Multi-language support (Python, TypeScript, Go, Rust)
5. Performance: O(blocks * expressions) for available, O(blocks * vars * iterations) for abstract
6. No panics on edge cases (empty CFG, unreachable blocks, self-loops)

---

## References

- Dragon Book Ch. 10: Available Expressions
- Cooper & Torczon: "Engineering a Compiler" Ch. 9
- Cousot & Cousot (1977): "Abstract Interpretation: A Unified Lattice Model"
- Python source: packages/tldr-code/available_exprs.py (538 lines)
- Python source: packages/tldr-code/abstract_interp.py (980 lines)
- Gap analysis: thoughts/shared/gap-analysis/session13-available-abstract-gap.yaml