tldr-core 0.1.4

# Dataflow Analysis Specification

**Created**: 2026-02-03
**Author**: architect-agent
**Source**: Gap Analysis session13-available-abstract-gap.yaml
**Total Capabilities**: 34 (12 Available Expressions + 22 Abstract Interpretation)

## Overview

This specification defines the Rust port of two dataflow analyses from tldr-code v2:

1. **Available Expressions Analysis**: Forward MUST (intersection) dataflow analysis for CSE detection
2. **Abstract Interpretation**: Forward dataflow with widening for range tracking and safety checks

Both analyses build on the existing `tldr-core` infrastructure:
- `cfg` module for control flow graphs
- `dfg` module for variable references (VarRef)
- `ssa` module for dominance and def-use chains

## Module Structure

```
src/dataflow/
├── mod.rs              # Module entry point, re-exports
├── types.rs            # Shared types (BlockId, predecessors helper)
├── available.rs        # Available expressions analysis (CAP-AE-01 through CAP-AE-12)
├── abstract_interp.rs  # Abstract interpretation (CAP-AI-01 through CAP-AI-22)
└── tests/
    ├── available_test.rs
    └── abstract_interp_test.rs
```

---

## Part 1: Available Expressions Analysis

### Data Types

#### CAP-AE-01: Expression (frozen/hashable)

```rust
/// Represents a computed expression for availability analysis.
/// 
/// Immutable and hashable by text only (line-independent equality).
/// Two expressions with the same normalized text are equal regardless of line.
#[derive(Debug, Clone, Eq, Serialize, Deserialize)]
pub struct Expression {
    /// Normalized expression string (e.g., "a + b")
    pub text: String,
    
    /// Variables used in this expression (sorted for consistency)
    pub operands: Vec<String>,
    
    /// Line where expression first appears
    pub line: u32,
}

impl PartialEq for Expression {
    fn eq(&self, other: &Self) -> bool {
        self.text == other.text  // Equality by text only
    }
}

impl Hash for Expression {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.text.hash(state);  // Hash by text only
    }
}

impl Expression {
    /// Check if redefining var kills this expression.
    /// 
    /// An expression is killed when any operand is redefined.
    pub fn is_killed_by(&self, var: &str) -> bool {
        self.operands.iter().any(|op| op == var)
    }
}
```

#### CAP-AE-02: Commutative Expression Normalization

```rust
/// Commutative operators that allow operand reordering.
pub const COMMUTATIVE_OPS: &[&str] = &["+", "*", "==", "!=", "and", "or", "&", "|", "^"];

/// Normalize binary expression to canonical form.
/// 
/// For commutative operators, sort operands alphabetically.
/// This ensures "a + b" and "b + a" produce the same text.
pub fn normalize_expression(op: &str, left: &str, right: &str) -> String {
    if COMMUTATIVE_OPS.contains(&op) {
        let mut operands = [left.trim(), right.trim()];
        operands.sort();
        format!("{} {} {}", operands[0], op, operands[1])
    } else {
        format!("{} {} {}", left.trim(), op, right.trim())
    }
}
```

#### CAP-AE-03 to CAP-AE-07: BlockExpressions (Gen/Kill)

```rust
/// Expressions generated and killed in a single CFG block.
#[derive(Debug, Default)]
struct BlockExpressions {
    /// Expressions computed in this block (before any operand is killed)
    gen: HashSet<Expression>,
    
    /// Variables redefined in this block (kills expressions using that var)
    kill: HashSet<String>,
}
```

#### CAP-AE-08 to CAP-AE-11: AvailableExprsInfo

```rust
/// Available expressions analysis results.
/// 
/// An expression is available at a point if:
/// 1. It has been computed on EVERY path reaching that point (MUST analysis)
/// 2. None of its operands have been redefined since computation
#[derive(Debug, Serialize, Deserialize)]
pub struct AvailableExprsInfo {
    /// Expressions available at block entry.
    /// avail_in[b] = intersection(avail_out[p] for p in predecessors[b])
    pub avail_in: HashMap<u32, HashSet<Expression>>,
    
    /// Expressions available at block exit.
    /// avail_out[b] = gen[b] | (avail_in[b] - killed_by_block[b])
    pub avail_out: HashMap<u32, HashSet<Expression>>,
    
    /// All unique expressions found in the function
    pub all_exprs: HashSet<Expression>,
    
    /// Entry block ID
    pub entry_block: u32,
    
    /// All expression instances including duplicates (for CSE detection)
    #[serde(skip)]
    pub expr_instances: Vec<Expression>,
}

impl AvailableExprsInfo {
    /// Check if expression is available at entry to block.
    pub fn is_available(&self, block: u32, expr: &Expression) -> bool {
        self.avail_in.get(&block).map_or(false, |set| set.contains(expr))
    }
    
    /// Check if expression is available at exit of block.
    pub fn is_available_at_exit(&self, block: u32, expr: &Expression) -> bool {
        self.avail_out.get(&block).map_or(false, |set| set.contains(expr))
    }
    
    /// CAP-AE-06: Find expressions computed when already available (CSE opportunities).
    /// 
    /// Returns: Vec<(expr_text, original_line, redundant_line)>
    pub fn redundant_computations(&self) -> Vec<(String, u32, u32)> {
        // Implementation handles intra-block kills carefully
        // See Python source lines 162-217 for algorithm
    }
    
    /// CAP-AE-10: Get expressions available at a specific source line.
    pub fn get_available_at_line(&self, line: u32, cfg: &CfgInfo) -> HashSet<Expression> {
        for block in &cfg.blocks {
            if block.start_line <= line && line <= block.end_line {
                return self.avail_in.get(&block.id).cloned().unwrap_or_default();
            }
        }
        HashSet::new()
    }
    
    /// CAP-AE-11: Serialize to JSON-compatible structure.
    pub fn to_json(&self) -> serde_json::Value {
        // Output format:
        // {
        //   "avail_in": {"0": [{"text": "a + b", "operands": ["a", "b"], "line": 2}], ...},
        //   "avail_out": {...},
        //   "all_expressions": [...],
        //   "entry_block": 0,
        //   "redundant_computations": [{"expr": "a + b", "first_at": 2, "redundant_at": 4}]
        // }
    }
}
```

### CAP-AE-04, CAP-AE-05: Algorithm (MUST Analysis with Fixpoint)

```rust
/// Compute available expressions using forward MUST dataflow analysis.
/// 
/// Algorithm:
/// 1. Extract all expressions and their operands from DFG
/// 2. Compute gen/kill sets for each block
/// 3. Initialize: entry = {}, others = ALL_EXPRS
/// 4. Iterate until fixed point:
///    - avail_in[b] = intersection(avail_out[p] for p in predecessors[b])
///    - avail_out[b] = gen[b] | (avail_in[b] - killed_by_block[b])
/// 5. Return AvailableExprsInfo with redundant_computations()
/// 
/// FM1 Mitigation: Commutative expression normalization
/// FM2 Mitigation: Function calls excluded from CSE (CAP-AE-12)
/// FM3 Mitigation: ALL_EXPRS computed before init, proper empty set for entry
pub fn compute_available_exprs(
    cfg: &CfgInfo,
    dfg: &DfgInfo,
) -> Result<AvailableExprsInfo, TldrError> {
    // Step 1: Extract expressions from variable uses (pairs on same line)
    let (all_exprs, block_info, expr_instances) = extract_expressions_from_refs(cfg, dfg)?;
    
    // Early return if no expressions
    if all_exprs.is_empty() {
        return Ok(AvailableExprsInfo::empty(cfg.entry_block_id));
    }
    
    // Step 2: Build predecessor map
    let predecessors = build_predecessors(cfg);
    
    // Step 3: Initialize (MUST analysis: start optimistic except entry)
    let mut avail_in: HashMap<u32, HashSet<Expression>> = HashMap::new();
    let mut avail_out: HashMap<u32, HashSet<Expression>> = HashMap::new();
    
    let entry = cfg.entry_block_id;
    avail_in.insert(entry, HashSet::new());  // Nothing available at entry
    avail_out.insert(entry, block_info[&entry].gen.clone());
    
    for block in &cfg.blocks {
        if block.id != entry {
            avail_in.insert(block.id, all_exprs.clone());
            avail_out.insert(block.id, all_exprs.clone());
        }
    }
    
    // Step 4: Iterate until fixed point
    let mut changed = true;
    let max_iterations = cfg.blocks.len() * all_exprs.len() + 10;
    let mut iteration = 0;
    
    while changed && iteration < max_iterations {
        changed = false;
        iteration += 1;
        
        for block in &cfg.blocks {
            if block.id == entry {
                continue;
            }
            
            // avail_in = INTERSECTION of all predecessor's avail_out
            let preds = predecessors.get(&block.id).unwrap_or(&vec![]);
            let new_in = if preds.is_empty() {
                HashSet::new()
            } else {
                preds.iter()
                    .map(|p| avail_out.get(p).unwrap())
                    .fold(avail_out[&preds[0]].clone(), |acc, set| {
                        acc.intersection(set).cloned().collect()  // MUST = intersection
                    })
            };
            
            // avail_out = gen | (avail_in - killed)
            let info = &block_info[&block.id];
            let killed: HashSet<_> = new_in.iter()
                .filter(|e| info.kill.iter().any(|k| e.is_killed_by(k)))
                .cloned()
                .collect();
            let new_out: HashSet<_> = info.gen.union(&new_in.difference(&killed).cloned().collect())
                .cloned()
                .collect();
            
            if avail_in[&block.id] != new_in || avail_out[&block.id] != new_out {
                changed = true;
                avail_in.insert(block.id, new_in);
                avail_out.insert(block.id, new_out);
            }
        }
    }
    
    Ok(AvailableExprsInfo {
        avail_in,
        avail_out,
        all_exprs,
        entry_block: entry,
        expr_instances,
    })
}
```

### Behavioral Contracts

| ID | Contract | Test Case |
|----|----------|-----------|
| CAP-AE-01 | Expression equality based on text only | `expr1(line=1) == expr2(line=5)` if same text |
| CAP-AE-02 | Commutative normalization | `"a + b"` equals `"b + a"` after normalization |
| CAP-AE-03 | Gen: expressions before operand killed | `x=a+b; a=5` -> gen contains `a+b` with original line |
| CAP-AE-04 | MUST semantics: intersection at joins | Diamond CFG: single-branch expr NOT available at merge |
| CAP-AE-05 | Fixpoint terminates | Max iterations = blocks * expressions + 10 |
| CAP-AE-06 | redundant_computations() accuracy | `(expr_text, first_line, redundant_line)` tuples |
| CAP-AE-07 | Intra-block kill handling | `x=a+b; a=5; y=a+b` -> second NOT redundant |
| CAP-AE-08 | is_available() at entry | True iff expr in avail_in[block] |
| CAP-AE-09 | is_available_at_exit() | True iff expr in avail_out[block] |
| CAP-AE-10 | get_available_at_line() | Maps line to containing block's avail_in |
| CAP-AE-11 | to_json() serializable | All fields JSON-compatible |
| CAP-AE-12 | Function calls excluded | `foo(x)` not tracked as available expression |

---

## Part 2: Abstract Interpretation Analysis

### Data Types

#### CAP-AI-01: Nullability Enum

```rust
/// Nullability lattice: NEVER < MAYBE < ALWAYS
/// 
/// Used to track whether a variable may be null/None at a program point.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum Nullability {
    /// Definitely not null
    Never,
    /// Could be null or non-null
    Maybe,
    /// Definitely null
    Always,
}

impl Default for Nullability {
    fn default() -> Self {
        Nullability::Maybe
    }
}
```

#### CAP-AI-02 to CAP-AI-06: AbstractValue

```rust
/// Abstract representation of a variable's value at a program point.
/// 
/// Tracks four dimensions:
/// - type_: Inferred type (str, int, list, etc.) or None if unknown
/// - range_: Value range [min, max] for numeric types, None for unbounded
/// - nullable: Whether the value can be null/None
/// - constant: If value is a known constant, the value itself
#[derive(Debug, Clone, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct AbstractValue {
    /// Inferred type name (e.g., "int", "str") or None if unknown
    #[serde(rename = "type")]
    pub type_: Option<String>,
    
    /// Value range [min, max] for numerics. None bounds mean infinity.
    /// For strings, tracks length.
    pub range_: Option<(Option<i64>, Option<i64>)>,
    
    /// Nullability status
    pub nullable: Nullability,
    
    /// Known constant value (not hashed, used for propagation)
    #[serde(skip_serializing_if = "Option::is_none")]
    pub constant: Option<ConstantValue>,
}

/// Constant values that can be tracked
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
#[serde(untagged)]
pub enum ConstantValue {
    Int(i64),
    Float(f64),
    String(String),
    Bool(bool),
    Null,
}

impl AbstractValue {
    /// CAP-AI-04: Top of lattice - no information known
    pub fn top() -> Self {
        AbstractValue {
            type_: None,
            range_: None,
            nullable: Nullability::Maybe,
            constant: None,
        }
    }
    
    /// CAP-AI-04: Bottom of lattice - contradiction (unreachable)
    pub fn bottom() -> Self {
        AbstractValue {
            type_: Some("<bottom>".to_string()),
            range_: Some((None, None)),
            nullable: Nullability::Never,
            constant: None,
        }
    }
    
    /// CAP-AI-03: Create from known constant
    pub fn from_constant(value: ConstantValue) -> Self {
        match value {
            ConstantValue::Int(v) => AbstractValue {
                type_: Some("int".to_string()),
                range_: Some((Some(v), Some(v))),
                nullable: Nullability::Never,
                constant: Some(ConstantValue::Int(v)),
            },
            ConstantValue::Float(v) => AbstractValue {
                type_: Some("float".to_string()),
                range_: None,  // Float ranges less useful
                nullable: Nullability::Never,
                constant: Some(ConstantValue::Float(v)),
            },
            ConstantValue::String(ref s) => AbstractValue {
                type_: Some("str".to_string()),
                range_: Some((Some(s.len() as i64), Some(s.len() as i64))),  // CAP-AI-18: Track length
                nullable: Nullability::Never,
                constant: Some(value),
            },
            ConstantValue::Bool(v) => AbstractValue {
                type_: Some("bool".to_string()),
                range_: Some((Some(v as i64), Some(v as i64))),
                nullable: Nullability::Never,
                constant: Some(ConstantValue::Bool(v)),
            },
            ConstantValue::Null => AbstractValue {
                type_: Some("NoneType".to_string()),
                range_: None,
                nullable: Nullability::Always,
                constant: None,
            },
        }
    }
    
    /// CAP-AI-05: Check if value could be zero (for division check)
    pub fn may_be_zero(&self) -> bool {
        match &self.range_ {
            None => true,  // Unknown range, conservatively true
            Some((low, high)) => {
                let low = low.unwrap_or(i64::MIN);
                let high = high.unwrap_or(i64::MAX);
                low <= 0 && 0 <= high
            }
        }
    }
    
    /// CAP-AI-06: Check if value could be null/None
    pub fn may_be_null(&self) -> bool {
        self.nullable != Nullability::Never
    }
    
    /// Check if this is a known constant value
    pub fn is_constant(&self) -> bool {
        self.constant.is_some()
    }
}
```

#### CAP-AI-07: AbstractState

```rust
/// Abstract state at a program point: mapping from variables to abstract values.
#[derive(Debug, Clone, Default, PartialEq, Eq, Serialize, Deserialize)]
pub struct AbstractState {
    pub values: HashMap<String, AbstractValue>,
}

impl AbstractState {
    /// Get abstract value for variable, defaulting to top (unknown)
    pub fn get(&self, var: &str) -> AbstractValue {
        self.values.get(var).cloned().unwrap_or_else(AbstractValue::top)
    }
    
    /// Return new state with updated variable value (immutable style)
    pub fn set(&self, var: &str, value: AbstractValue) -> Self {
        let mut new_values = self.values.clone();
        new_values.insert(var.to_string(), value);
        AbstractState { values: new_values }
    }
    
    /// Create a copy of this state
    pub fn copy(&self) -> Self {
        self.clone()
    }
}
```

#### CAP-AI-21, CAP-AI-22: AbstractInterpInfo

```rust
/// Abstract interpretation analysis results for a function.
#[derive(Debug, Serialize, Deserialize)]
pub struct AbstractInterpInfo {
    /// Abstract state at entry of each block
    pub state_in: HashMap<u32, AbstractState>,
    
    /// Abstract state at exit of each block
    pub state_out: HashMap<u32, AbstractState>,
    
    /// CAP-AI-10: Potential division-by-zero warnings (line, var)
    pub potential_div_zero: Vec<(u32, String)>,
    
    /// CAP-AI-11: Potential null dereference warnings (line, var)
    pub potential_null_deref: Vec<(u32, String)>,
    
    /// Function name
    pub function_name: String,
}

impl AbstractInterpInfo {
    /// Get abstract value of variable at entry to block
    pub fn value_at(&self, block: u32, var: &str) -> AbstractValue {
        self.state_in.get(&block)
            .map(|s| s.get(var))
            .unwrap_or_else(AbstractValue::top)
    }
    
    /// Get abstract value of variable at exit of block
    pub fn value_at_exit(&self, block: u32, var: &str) -> AbstractValue {
        self.state_out.get(&block)
            .map(|s| s.get(var))
            .unwrap_or_else(AbstractValue::top)
    }
    
    /// Get the value range for variable at block entry
    pub fn range_at(&self, block: u32, var: &str) -> Option<(Option<i64>, Option<i64>)> {
        self.value_at(block, var).range_
    }
    
    /// Get the inferred type for variable at block entry
    pub fn type_at(&self, block: u32, var: &str) -> Option<String> {
        self.value_at(block, var).type_
    }
    
    /// Check if variable is definitely non-null at block entry
    pub fn is_definitely_not_null(&self, block: u32, var: &str) -> bool {
        self.value_at(block, var).nullable == Nullability::Never
    }
    
    /// CAP-AI-12: Get all variables with known constant values at function exit
    pub fn get_constants(&self) -> HashMap<String, ConstantValue> {
        let mut constants = HashMap::new();
        for state in self.state_out.values() {
            for (var, val) in &state.values {
                if let Some(c) = &val.constant {
                    constants.insert(var.clone(), c.clone());
                }
            }
        }
        constants
    }
    
    /// CAP-AI-22: Serialize to JSON-compatible structure
    pub fn to_json(&self) -> serde_json::Value {
        // Output format per v1 CLI
    }
}
```

### CAP-AI-08, CAP-AI-09: State Join and Widening

```rust
/// CAP-AI-08: Join multiple abstract states at a merge point.
/// 
/// For each variable:
/// - range: take union (widest bounds)
/// - type: take common type or unknown
/// - nullable: MAYBE if any is MAYBE, else max
/// - constant: keep only if all agree
fn join_states(states: &[AbstractState]) -> AbstractState {
    if states.is_empty() {
        return AbstractState::default();
    }
    if states.len() == 1 {
        return states[0].clone();
    }
    
    let all_vars: HashSet<_> = states.iter()
        .flat_map(|s| s.values.keys().cloned())
        .collect();
    
    let mut result = HashMap::new();
    for var in all_vars {
        let values: Vec<_> = states.iter().map(|s| s.get(&var)).collect();
        result.insert(var, join_values(&values));
    }
    
    AbstractState { values: result }
}

fn join_values(values: &[AbstractValue]) -> AbstractValue {
    // Range: union (widest bounds)
    let ranges: Vec<_> = values.iter()
        .filter_map(|v| v.range_)
        .collect();
    
    let joined_range = if ranges.is_empty() {
        None
    } else {
        let lows: Vec<_> = ranges.iter().filter_map(|r| r.0).collect();
        let highs: Vec<_> = ranges.iter().filter_map(|r| r.1).collect();
        Some((
            lows.iter().min().copied(),
            highs.iter().max().copied(),
        ))
    };
    
    // Type: common type or None
    let types: Vec<_> = values.iter().filter_map(|v| v.type_.clone()).collect();
    let joined_type = if types.windows(2).all(|w| w[0] == w[1]) && !types.is_empty() {
        Some(types[0].clone())
    } else {
        None
    };
    
    // Nullable: MAYBE if any is MAYBE or disagreement
    let nulls: Vec<_> = values.iter().map(|v| v.nullable).collect();
    let joined_null = if nulls.contains(&Nullability::Maybe) {
        Nullability::Maybe
    } else if nulls.windows(2).all(|w| w[0] == w[1]) {
        nulls[0]
    } else {
        Nullability::Maybe
    };
    
    // Constant: only if all agree
    let constants: Vec<_> = values.iter().filter_map(|v| v.constant.clone()).collect();
    let joined_const = if constants.len() == values.len() 
        && constants.windows(2).all(|w| w[0] == w[1]) {
        constants.into_iter().next()
    } else {
        None
    };
    
    AbstractValue {
        type_: joined_type,
        range_: joined_range,
        nullable: joined_null,
        constant: joined_const,
    }
}

/// CAP-AI-09: Apply widening to ensure termination on loops.
/// 
/// For each variable, if the range is growing, widen to infinity.
fn widen_state(old: &AbstractState, new: &AbstractState) -> AbstractState {
    let all_vars: HashSet<_> = old.values.keys()
        .chain(new.values.keys())
        .cloned()
        .collect();
    
    let mut result = HashMap::new();
    for var in all_vars {
        let old_val = old.get(&var);
        let new_val = new.get(&var);
        result.insert(var, widen_value(&old_val, &new_val));
    }
    
    AbstractState { values: result }
}

fn widen_value(old: &AbstractValue, new: &AbstractValue) -> AbstractValue {
    let widened_range = match (&old.range_, &new.range_) {
        (None, None) => None,
        (None, r) => r.clone(),
        (_, None) => None,
        (Some((old_low, old_high)), Some((new_low, new_high))) => {
            // Widen low: if growing downward, widen to -inf
            let widened_low = match (old_low, new_low) {
                (None, _) => None,  // Already widened
                (_, None) => None,  // Widen
                (Some(o), Some(n)) if *n < *o => None,  // Growing down
                (_, n) => *n,
            };
            
            // Widen high: if growing upward, widen to +inf
            let widened_high = match (old_high, new_high) {
                (None, _) => None,  // Already widened
                (_, None) => None,  // Widen
                (Some(o), Some(n)) if *n > *o => None,  // Growing up
                (_, n) => *n,
            };
            
            Some((widened_low, widened_high))
        }
    };
    
    AbstractValue {
        type_: new.type_.clone(),
        range_: widened_range,
        nullable: new.nullable,
        constant: None,  // Constant lost after widening
    }
}
```

### CAP-AI-13, CAP-AI-14, CAP-AI-19: Transfer Function and RHS Parsing

```rust
/// CAP-AI-14: Parse RHS of assignment and compute abstract value.
/// 
/// Handles:
/// - Integer literals: x = 5
/// - String literals: x = "hello"
/// - None/null: x = None (language-specific keywords)
/// - Boolean: True/False or true/false (language-specific)
/// - Simple arithmetic: x = a + 1, x = b * 2 (CAP-AI-13)
/// - Variable copies: x = y (CAP-AI-19)
pub fn parse_rhs_abstract(
    line: &str,
    var: &str,
    state: &AbstractState,
    language: Language,
) -> AbstractValue {
    // Get language-specific patterns
    let comment_pattern = get_comment_pattern(language);
    let null_keywords = get_null_keywords(language);  // CAP-AI-15
    let bool_keywords = get_boolean_keywords(language);  // CAP-AI-16
    
    // Strip comments (CAP-AI-17)
    let line = comment_pattern.split(line).next().unwrap_or("");
    
    // Find assignment: var = ... or var := ...
    let rhs = extract_rhs(line, var)?;
    
    // Integer literal
    if let Ok(v) = rhs.parse::<i64>() {
        return AbstractValue::from_constant(ConstantValue::Int(v));
    }
    
    // Float literal
    if let Ok(v) = rhs.parse::<f64>() {
        return AbstractValue::from_constant(ConstantValue::Float(v));
    }
    
    // String literal
    if (rhs.starts_with('"') && rhs.ends_with('"')) ||
       (rhs.starts_with('\'') && rhs.ends_with('\'')) {
        let s = rhs[1..rhs.len()-1].to_string();
        return AbstractValue::from_constant(ConstantValue::String(s));
    }
    
    // Null keywords (CAP-AI-15)
    if null_keywords.contains(&rhs) {
        return AbstractValue::from_constant(ConstantValue::Null);
    }
    
    // Boolean keywords (CAP-AI-16)
    if let Some(&b) = bool_keywords.get(rhs) {
        return AbstractValue::from_constant(ConstantValue::Bool(b));
    }
    
    // Variable copy (CAP-AI-19)
    if is_identifier(rhs) {
        return state.get(rhs);
    }
    
    // Simple arithmetic: x = a + N or x = a - N (CAP-AI-13)
    if let Some((operand, op, const_val)) = parse_simple_arithmetic(rhs) {
        return apply_arithmetic(state.get(&operand), op, const_val);
    }
    
    // Unknown RHS
    AbstractValue::top()
}

/// CAP-AI-13: Apply abstract arithmetic
fn apply_arithmetic(operand: AbstractValue, op: char, constant: i64) -> AbstractValue {
    let new_range = operand.range_.map(|(low, high)| {
        match op {
            '+' => (low.map(|l| l + constant), high.map(|h| h + constant)),
            '-' => (low.map(|l| l - constant), high.map(|h| h - constant)),
            '*' => {
                let vals = [
                    low.map(|l| l * constant),
                    high.map(|h| h * constant),
                ];
                (vals.iter().filter_map(|&v| v).min(), 
                 vals.iter().filter_map(|&v| v).max())
            }
            _ => (None, None),  // Unknown op -> unbounded
        }
    });
    
    AbstractValue {
        type_: operand.type_,
        range_: new_range,
        nullable: operand.nullable,
        constant: if operand.is_constant() && new_range.map(|(l, h)| l == h).unwrap_or(false) {
            new_range.and_then(|(l, _)| l.map(ConstantValue::Int))
        } else {
            None
        },
    }
}
```

### CAP-AI-15, CAP-AI-16, CAP-AI-17: Multi-Language Support

```rust
/// CAP-AI-15: Get null keywords for language
fn get_null_keywords(language: Language) -> &'static [&'static str] {
    match language {
        Language::Python => &["None"],
        Language::TypeScript | Language::JavaScript => &["null", "undefined"],
        Language::Go => &["nil"],
        Language::Rust => &[],  // Rust has no null (None is Option::None)
        Language::Java | Language::Kotlin | Language::CSharp => &["null"],
        Language::Swift => &["nil"],
        _ => &["null", "nil", "None"],  // Fallback
    }
}

/// CAP-AI-16: Get boolean keywords for language
fn get_boolean_keywords(language: Language) -> HashMap<&'static str, bool> {
    match language {
        Language::Python => [("True", true), ("False", false)].into(),
        Language::TypeScript | Language::JavaScript | Language::Go | Language::Rust => 
            [("true", true), ("false", false)].into(),
        _ => [("True", true), ("False", false), ("true", true), ("false", false)].into(),
    }
}

/// CAP-AI-17: Get comment pattern for language
fn get_comment_pattern(language: Language) -> &'static str {
    match language {
        Language::Python => "#",
        Language::TypeScript | Language::JavaScript | Language::Go | 
        Language::Rust | Language::Java | Language::CSharp | Language::Kotlin | Language::Swift => "//",
        _ => "#",  // Fallback
    }
}
```

### Main Algorithm

```rust
/// Compute abstract interpretation using forward dataflow with widening.
/// 
/// Algorithm:
/// 1. Initialize entry block with parameter values (top)
/// 2. Iterate in reverse postorder until fixpoint:
///    - state_in[b] = join(state_out[p] for p in predecessors[b])
///    - Apply widening at loop headers (CAP-AI-09)
///    - state_out[b] = transfer(state_in[b], block[b])
/// 3. Detect potential issues:
///    - Division by zero: divisor.may_be_zero() (CAP-AI-10)
///    - Null dereference: obj.may_be_null() at attribute access (CAP-AI-11)
/// 4. Return AbstractInterpInfo
pub fn compute_abstract_interp(
    cfg: &CfgInfo,
    refs: &[VarRef],
    source_lines: Option<&[String]>,
    language: Language,
) -> Result<AbstractInterpInfo, TldrError> {
    let predecessors = build_predecessors(cfg);
    let loop_headers = find_back_edges(cfg);
    let block_order = reverse_postorder(cfg);
    
    // Initialize
    let mut state_in: HashMap<u32, AbstractState> = HashMap::new();
    let mut state_out: HashMap<u32, AbstractState> = HashMap::new();
    
    let entry = cfg.entry_block_id;
    let init_state = init_params(cfg, refs);
    state_in.insert(entry, init_state.clone());
    state_out.insert(entry, init_state);
    
    for block in &cfg.blocks {
        if block.id != entry {
            state_in.insert(block.id, AbstractState::default());
            state_out.insert(block.id, AbstractState::default());
        }
    }
    
    // Iterate until fixpoint
    let mut changed = true;
    let max_iterations = cfg.blocks.len() * 10 + 100;
    let mut iteration = 0;
    
    while changed && iteration < max_iterations {
        changed = false;
        iteration += 1;
        
        for &block_id in &block_order {
            if block_id == entry {
                continue;
            }
            
            let block = cfg.blocks.iter().find(|b| b.id == block_id).unwrap();
            let preds = predecessors.get(&block_id).unwrap_or(&vec![]);
            
            // Join predecessors
            let pred_states: Vec<_> = preds.iter()
                .map(|p| state_out[p].clone())
                .collect();
            let mut new_in = join_states(&pred_states);
            
            // Apply widening at loop headers
            if loop_headers.contains(&block_id) {
                new_in = widen_state(&state_in[&block_id], &new_in);
            }
            
            // Transfer function
            let new_out = transfer_block(&new_in, block, refs, source_lines, language);
            
            if state_in[&block_id] != new_in || state_out[&block_id] != new_out {
                changed = true;
                state_in.insert(block_id, new_in);
                state_out.insert(block_id, new_out);
            }
        }
    }
    
    // Detect issues
    let potential_div_zero = find_div_zero(cfg, refs, &state_in, source_lines, &state_out);  // CAP-AI-20
    let potential_null_deref = find_null_deref(cfg, refs, &state_in);
    
    Ok(AbstractInterpInfo {
        state_in,
        state_out,
        potential_div_zero,
        potential_null_deref,
        function_name: cfg.function_name.clone(),
    })
}
```

### Behavioral Contracts

| ID | Contract | Test Case |
|----|----------|-----------|
| CAP-AI-01 | Nullability enum values | NEVER="never", MAYBE="maybe", ALWAYS="always" |
| CAP-AI-02 | AbstractValue is frozen/hashable | Can be used in HashSet |
| CAP-AI-03 | from_constant(5) -> range=[5,5], constant=5 | Integer constant |
| CAP-AI-04 | top() -> None type/range, MAYBE nullable | Unknown value |
| CAP-AI-05 | may_be_zero(): range [-5,5] -> true | Division check |
| CAP-AI-06 | may_be_null(): NEVER -> false | Null check |
| CAP-AI-07 | AbstractState.get() -> top() for missing | Default behavior |
| CAP-AI-08 | Join [1,1] and [10,10] -> [1,10] | Range union |
| CAP-AI-09 | Growing upper bound -> widen to +inf | Loop termination |
| CAP-AI-10 | Division by may_be_zero() divisor flagged | Safety check |
| CAP-AI-11 | Attribute access on may_be_null() flagged | Safety check |
| CAP-AI-12 | Constant propagation through arithmetic | x=5; y=x+1 -> y=[6,6] |
| CAP-AI-13 | Arithmetic on ranges | [5,5] + 3 -> [8,8] |
| CAP-AI-14 | RHS parsing: int, float, str, None | Literal extraction |
| CAP-AI-15 | Python None, TS null/undefined, Go nil | Multi-language null |
| CAP-AI-16 | Python True/False, TS true/false | Multi-language bool |
| CAP-AI-17 | Python #, TS // comment stripping | Parse accuracy |
| CAP-AI-18 | String "hello" -> range=[5,5] (length) | String tracking |
| CAP-AI-19 | y = x copies abstract value | Variable flow |
| CAP-AI-20 | x=5; y=1/x no warning (intra-block) | False positive prevention |
| CAP-AI-21 | Query methods return correct values | API contract |
| CAP-AI-22 | to_json() JSON-serializable | CLI output |

---

## CLI Interface

### Available Expressions Command

```bash
tldr available <file> <function> [--lang <lang>] [--check <expr>] [--at-line <n>]
```

**Output format:**
```json
{
  "function": "example",
  "avail_in": {
    "0": [],
    "1": [{"text": "a + b", "operands": ["a", "b"], "line": 2}]
  },
  "avail_out": {
    "0": [{"text": "a + b", "operands": ["a", "b"], "line": 2}],
    "1": [{"text": "a + b", "operands": ["a", "b"], "line": 2}]
  },
  "all_expressions": [
    {"text": "a + b", "operands": ["a", "b"], "line": 2}
  ],
  "redundant_computations": [
    {"expr": "a + b", "first_at": 2, "redundant_at": 4}
  ]
}
```

### Abstract Interpretation Command

```bash
tldr abstract-interp <file> <function> [--lang <lang>] [--var <name>] [--line <n>]
```

**Output format:**
```json
{
  "function": "example",
  "state_in": {
    "0": {},
    "1": {
      "x": {"type": "int", "range": [5, 5], "nullable": "never", "constant": 5}
    }
  },
  "state_out": {
    "0": {
      "x": {"type": "int", "range": [5, 5], "nullable": "never", "constant": 5}
    }
  },
  "potential_div_zero": [
    {"line": 10, "var": "divisor"}
  ],
  "potential_null_deref": [
    {"line": 15, "var": "obj"}
  ]
}
```

---

## Error Handling

```rust
/// Dataflow analysis errors
#[derive(Debug, thiserror::Error)]
pub enum DataflowError {
    #[error("CFG has no blocks")]
    EmptyCfg,
    
    #[error("Entry block {0} not found in CFG")]
    EntryBlockNotFound(u32),
    
    #[error("Analysis did not converge after {0} iterations")]
    NoConvergence(usize),
    
    #[error("Source line {0} out of range")]
    LineOutOfRange(u32),
    
    #[error("Block {0} not found")]
    BlockNotFound(u32),
}
```

---

## Edge Cases (from tests)

### Available Expressions

1. **Empty function** - No expressions, empty avail_in/avail_out
2. **Unreachable block** - Still included in results (entry/exit present)
3. **Self-loop** - Handles gracefully without infinite loop
4. **Multiple expressions** - Tracked independently
5. **Diamond CFG single branch** - Expression NOT available at merge (MUST)
6. **Diamond CFG both branches** - Expression IS available at merge

### Abstract Interpretation

1. **Empty function** - Empty state, no warnings
2. **Unknown RHS** - Defaults to top()
3. **Parameter starts as top** - Unknown input
4. **Negative constant** - Range [-5, -5]
5. **String constant tracks length** - "hello" -> [5, 5]
6. **Nested loops** - Terminates via widening
7. **Rust has no null keyword** - None is Option::None, not null

---

## Implementation Phases

### Phase 1: Foundation (types.rs)
- [ ] BlockId type alias
- [ ] build_predecessors() helper
- [ ] find_back_edges() helper
- [ ] reverse_postorder() helper

### Phase 2: Available Expressions (available.rs)
- [ ] Expression struct (CAP-AE-01)
- [ ] normalize_expression() (CAP-AE-02)
- [ ] BlockExpressions (CAP-AE-03)
- [ ] AvailableExprsInfo (CAP-AE-08-11)
- [ ] compute_available_exprs() (CAP-AE-04-05)
- [ ] redundant_computations() (CAP-AE-06-07)
- [ ] Function call exclusion (CAP-AE-12)

### Phase 3: Abstract Interpretation (abstract_interp.rs)
- [ ] Nullability enum (CAP-AI-01)
- [ ] AbstractValue struct (CAP-AI-02-06)
- [ ] AbstractState struct (CAP-AI-07)
- [ ] join_states(), widen_state() (CAP-AI-08-09)
- [ ] parse_rhs_abstract() (CAP-AI-14)
- [ ] apply_arithmetic() (CAP-AI-13)
- [ ] Multi-language keywords (CAP-AI-15-17)
- [ ] compute_abstract_interp()
- [ ] find_div_zero(), find_null_deref() (CAP-AI-10-11)
- [ ] AbstractInterpInfo (CAP-AI-21-22)

### Phase 4: Integration
- [ ] Add `pub mod dataflow;` to lib.rs
- [ ] Re-export main types
- [ ] CLI commands in tldr-cli

### Phase 5: Testing
- [ ] Port all 32 available_exprs tests
- [ ] Port all 56 abstract_interp tests
- [ ] Coverage target: 80%

---

## Success Criteria

1. All 34 capabilities implemented
2. All 88 test cases ported and passing
3. JSON output matches v1 CLI format
4. Multi-language support (Python, TypeScript, Go, Rust)
5. Performance: O(blocks * expressions) for available, O(blocks * vars * iterations) for abstract
6. No panics on edge cases (empty CFG, unreachable blocks, self-loops)

---

## References

- Dragon Book Ch. 10: Available Expressions
- Cooper & Torczon: "Engineering a Compiler" Ch. 9
- Cousot & Cousot (1977): "Abstract Interpretation: A Unified Lattice Model"
- Python source: packages/tldr-code/available_exprs.py (538 lines)
- Python source: packages/tldr-code/abstract_interp.py (980 lines)
- Gap analysis: thoughts/shared/gap-analysis/session13-available-abstract-gap.yaml