KiThe 0.3.2

A numerical suite for chemical kinetics and thermodynamics, combustion, heat and mass transfer, and chemical engineering. Work in progress. Advice and contributions are appreciated.
# TGA Dataset Operation History System - Implementation Guide

## Overview
This document describes the complete implementation of the operation history tracking system for TGADataset.

## Core Data Structures

### 1. OperationRecord
```rust
#[derive(Clone, Debug)]
pub struct OperationRecord {
    pub timestamp: usize,           // Sequential counter
    pub operation_name: String,     // Function name
    pub affected_columns: AffectedColumns,
    pub expr: Option<Expr>,         // Polars expression if applicable
    pub description: String,        // Human-readable description
    pub reversible: bool,           // Can operation be undone?
}
```

### 2. AffectedColumns
```rust
#[derive(Clone, Debug)]
pub enum AffectedColumns {
    Specific(Vec<String>),          // Explicit column names
    All,                             // All columns affected
    Semantic(Vec<ColumnTypes>),     // Semantic column types
}
```

### 3. ColumnHistory
```rust
#[derive(Clone, Debug)]
pub struct ColumnHistory {
    pub column_name: String,
    pub operations: Vec<OperationRecord>,
}

impl ColumnHistory {
    pub fn reversible_count(&self) -> usize;
    pub fn irreversible_count(&self) -> usize;
    pub fn has_irreversible(&self) -> bool;
}
```
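The three `ColumnHistory` methods are listed above as signatures only. A minimal sketch of how they could be implemented follows, assuming each one simply scans the `reversible` flag. The `OperationRecord` here is a simplified stand-in without `affected_columns` and `expr`, and `sample_history` is a hypothetical helper that exists only for illustration.

```rust
// Simplified stand-in for the guide's OperationRecord: the
// `affected_columns` and `expr` fields are omitted for brevity.
#[derive(Clone, Debug)]
pub struct OperationRecord {
    pub timestamp: usize,
    pub operation_name: String,
    pub reversible: bool,
}

#[derive(Clone, Debug)]
pub struct ColumnHistory {
    pub column_name: String,
    pub operations: Vec<OperationRecord>,
}

impl ColumnHistory {
    /// Number of recorded operations flagged as reversible.
    pub fn reversible_count(&self) -> usize {
        self.operations.iter().filter(|op| op.reversible).count()
    }

    /// Number of recorded operations flagged as irreversible.
    pub fn irreversible_count(&self) -> usize {
        self.operations.iter().filter(|op| !op.reversible).count()
    }

    /// True if at least one recorded operation is irreversible.
    pub fn has_irreversible(&self) -> bool {
        self.operations.iter().any(|op| !op.reversible)
    }
}

// Hypothetical sample data: a trim followed by a derived column.
pub fn sample_history() -> ColumnHistory {
    let record = |timestamp: usize, name: &str, reversible: bool| OperationRecord {
        timestamp,
        operation_name: name.to_string(),
        reversible,
    };
    ColumnHistory {
        column_name: "alpha".to_string(),
        operations: vec![
            record(0, "trim_edges", false),
            record(3, "dimensionless_mass", true),
        ],
    }
}

fn main() {
    let history = sample_history();
    println!(
        "{}: {} reversible, {} irreversible",
        history.column_name,
        history.reversible_count(),
        history.irreversible_count()
    );
}
```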

## TGADataset Integration

### Updated Structure
```rust
pub struct TGADataset {
    pub frame: LazyFrame,
    pub schema: TGASchema,
    pub oneframeplot: Option<OneFramePlot>,
    pub history_of_operations: Vec<OperationRecord>,  // NEW
}
```

## Core Methods

### 1. Logging Operations
```rust
fn log_operation(
    &mut self,
    name: &str,
    affected: AffectedColumns,
    expr: Option<Expr>,
    description: String,
    reversible: bool,
)
```

**Usage Example:**
```rust
self.log_operation(
    "trim_edges",
    AffectedColumns::All,
    None,
    format!("Trimmed {} rows from left, {} from right", left, right),
    false,  // irreversible - data is removed
);
```
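The body of `log_operation` is not shown in this guide. One plausible sketch is given below, under the assumption that the sequential `timestamp` is simply the record's index in `history_of_operations`. `Dataset` is a stand-in for `TGADataset` that holds only the history, and the `expr` parameter and the `Semantic` variant are left out so the example compiles without Polars.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum AffectedColumns {
    Specific(Vec<String>),
    All,
}

#[derive(Clone, Debug)]
pub struct OperationRecord {
    pub timestamp: usize,
    pub operation_name: String,
    pub affected_columns: AffectedColumns,
    pub description: String,
    pub reversible: bool,
}

// Stand-in for TGADataset: only the history field is modelled.
#[derive(Default)]
pub struct Dataset {
    pub history_of_operations: Vec<OperationRecord>,
}

impl Dataset {
    pub fn log_operation(
        &mut self,
        name: &str,
        affected: AffectedColumns,
        description: String,
        reversible: bool,
    ) {
        // Assumption: the sequential counter equals the number of
        // records already stored, so timestamps run 0, 1, 2, ...
        let timestamp = self.history_of_operations.len();
        self.history_of_operations.push(OperationRecord {
            timestamp,
            operation_name: name.to_string(),
            affected_columns: affected,
            description,
            reversible,
        });
    }
}

fn main() {
    let mut dataset = Dataset::default();
    dataset.log_operation(
        "trim_edges",
        AffectedColumns::All,
        format!("Trimmed {} rows from left, {} from right", 5, 5),
        false,
    );
    dataset.log_operation(
        "scale_column",
        AffectedColumns::Specific(vec!["mass".to_string()]),
        "Scaled mass".to_string(),
        true,
    );
    for op in &dataset.history_of_operations {
        println!("{:?}", op);
    }
}
```

Deriving the counter from the vector length keeps timestamps consistent without a separate field, but it assumes records are never removed from the history.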

### 2. Query Operations
```rust
// Get all operations affecting a specific column
pub fn operations_on_column(&self, col: &str) -> Vec<&OperationRecord>

// Get complete history for a column
pub fn get_column_history(&self, col: &str) -> ColumnHistory
```
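How a record is matched to a column depends on the `AffectedColumns` variant. The sketch below shows one way `operations_on_column` might do this: `All` matches every column, `Specific` matches by name, and `Semantic` is resolved through the schema. The `Schema` struct, its `column_for` method, and `sample_dataset` are assumptions made for this example and are not the crate's actual `TGASchema` API.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ColumnTypes {
    Time,
    Temperature,
    Mass,
}

#[derive(Clone, Debug)]
pub enum AffectedColumns {
    Specific(Vec<String>),
    All,
    Semantic(Vec<ColumnTypes>),
}

#[derive(Clone, Debug)]
pub struct OperationRecord {
    pub timestamp: usize,
    pub operation_name: String,
    pub affected_columns: AffectedColumns,
    pub reversible: bool,
}

// Hypothetical stand-in for TGASchema: maps semantic roles to column names.
pub struct Schema {
    pub time: Option<String>,
    pub temperature: Option<String>,
    pub mass: Option<String>,
}

impl Schema {
    fn column_for(&self, kind: ColumnTypes) -> Option<&str> {
        match kind {
            ColumnTypes::Time => self.time.as_deref(),
            ColumnTypes::Temperature => self.temperature.as_deref(),
            ColumnTypes::Mass => self.mass.as_deref(),
        }
    }
}

pub struct Dataset {
    pub schema: Schema,
    pub history_of_operations: Vec<OperationRecord>,
}

impl Dataset {
    pub fn operations_on_column(&self, col: &str) -> Vec<&OperationRecord> {
        self.history_of_operations
            .iter()
            .filter(|op| match &op.affected_columns {
                AffectedColumns::All => true,
                AffectedColumns::Specific(names) => names.iter().any(|n| n == col),
                AffectedColumns::Semantic(kinds) => kinds
                    .iter()
                    .any(|kind| self.schema.column_for(*kind) == Some(col)),
            })
            .collect()
    }
}

// Hypothetical history: trim, unit conversion, rate, dimensionless mass.
pub fn sample_dataset() -> Dataset {
    let record = |timestamp: usize, name: &str, affected: AffectedColumns, reversible: bool| {
        OperationRecord {
            timestamp,
            operation_name: name.to_string(),
            affected_columns: affected,
            reversible,
        }
    };
    Dataset {
        schema: Schema {
            time: Some("time".to_string()),
            temperature: Some("temp".to_string()),
            mass: Some("mass".to_string()),
        },
        history_of_operations: vec![
            record(0, "trim_edges", AffectedColumns::All, false),
            record(1, "celsius_to_kelvin", AffectedColumns::Semantic(vec![ColumnTypes::Temperature]), true),
            record(2, "derive_rate", AffectedColumns::Specific(vec!["dm_dt".to_string()]), true),
            record(3, "dimensionless_mass", AffectedColumns::Specific(vec!["alpha".to_string()]), true),
        ],
    }
}

fn main() {
    let dataset = sample_dataset();
    for op in dataset.operations_on_column("temp") {
        println!("[{}] {}", op.timestamp, op.operation_name);
    }
}
```

Because matching is by name, a record logged before a `rename_column` call would no longer match the renamed column unless names are tracked across renames. This sketch does not attempt that.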

### 3. Display Methods
```rust
// Print all operations
pub fn print_operation_history(&self)

// Print operations for specific column
pub fn print_column_history(&self, col: &str)
```
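The display methods print one header line and two lines per record. A rough sketch of the formatting step is shown below. It builds a `String` instead of printing directly so the layout can be checked in tests. The function name `format_operation_history` and the helper `sample_history` are hypothetical, and the types are simplified stand-ins without the Polars `Expr` field.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ColumnTypes {
    Time,
    Temperature,
    Mass,
}

#[derive(Clone, Debug)]
pub enum AffectedColumns {
    Specific(Vec<String>),
    All,
    Semantic(Vec<ColumnTypes>),
}

#[derive(Clone, Debug)]
pub struct OperationRecord {
    pub timestamp: usize,
    pub operation_name: String,
    pub affected_columns: AffectedColumns,
    pub description: String,
    pub reversible: bool,
}

/// Render the history as text: a header, then two lines per record.
pub fn format_operation_history(history: &[OperationRecord]) -> String {
    let mut out = String::from("=== Operation History ===\n");
    for op in history {
        out.push_str(&format!(
            "[{}] {} - {} (reversible: {})\n",
            op.timestamp, op.operation_name, op.description, op.reversible
        ));
        // Debug formatting yields ["name"] for strings and [Variant] for enums.
        let affected = match &op.affected_columns {
            AffectedColumns::All => "  Columns: ALL".to_string(),
            AffectedColumns::Specific(names) => format!("  Columns: {:?}", names),
            AffectedColumns::Semantic(kinds) => format!("  Semantic: {:?}", kinds),
        };
        out.push_str(&affected);
        out.push('\n');
    }
    out
}

// Hypothetical sample records used only for this illustration.
pub fn sample_history() -> Vec<OperationRecord> {
    let record = |timestamp: usize, name: &str, affected: AffectedColumns, desc: &str, reversible: bool| {
        OperationRecord {
            timestamp,
            operation_name: name.to_string(),
            affected_columns: affected,
            description: desc.to_string(),
            reversible,
        }
    };
    vec![
        record(0, "trim_edges", AffectedColumns::All,
               "Trimmed 5 rows from left, 5 from right", false),
        record(1, "celsius_to_kelvin", AffectedColumns::Semantic(vec![ColumnTypes::Temperature]),
               "Converted temp from Celsius to Kelvin", true),
        record(2, "derive_rate", AffectedColumns::Specific(vec!["dm_dt".to_string()]),
               "Computed rate dm_dt from mass", true),
    ]
}

fn main() {
    print!("{}", format_operation_history(&sample_history()));
}
```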

## Operation Classification

### Reversible Operations (reversible: true)
- **Column transformations**: scale, offset, unit conversions
- **Derived columns**: rates, dimensionless values
- **Algebraic operations**: exp, ln (with new column)
- **Renaming**: column name changes

**Characteristics:**
- Original data preserved
- Can be mathematically inverted
- No data loss

### Irreversible Operations (reversible: false)
- **Row removal**: trim_edges, filter_rows, cut_interval
- **Column deletion**: drop_column
- **Data aggregation**: operations that lose information

**Characteristics:**
- Data permanently removed
- Cannot be undone
- Information loss
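
The distinction between the two classes can be illustrated on plain vectors, without Polars. In this sketch, `offset` and `trim_edges` are free functions written for the example, not the dataset methods of the same name. An offset is undone by applying the negated offset, whereas the rows dropped by a trim cannot be recovered from the result.

```rust
/// Add a constant to every value; invertible by applying `-delta`.
pub fn offset(values: &[f64], delta: f64) -> Vec<f64> {
    values.iter().map(|v| v + delta).collect()
}

/// Drop `left` leading and `right` trailing values; the dropped
/// values are not recoverable from the returned vector.
pub fn trim_edges(values: &[f64], left: usize, right: usize) -> Vec<f64> {
    let length = values.len().saturating_sub(left + right);
    values.iter().skip(left).take(length).copied().collect()
}

fn main() {
    let celsius = vec![25.0, 100.0, 250.0, 400.0];

    // Reversible: Celsius -> Kelvin -> Celsius restores the input
    // up to floating-point rounding.
    let kelvin = offset(&celsius, 273.15);
    let restored = offset(&kelvin, -273.15);
    println!("restored: {:?}", restored);

    // Irreversible: the first and last rows are gone for good.
    let trimmed = trim_edges(&celsius, 1, 1);
    println!("trimmed: {:?}", trimmed);
}
```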

## Implementation Examples

### Example 1: Reversible Operation (Unit Conversion)
```rust
pub fn celsius_to_kelvin(mut self) -> Self {
    let col_name = self.schema.temperature.as_ref().unwrap().clone();
    let expr = (col(&col_name) + lit(273.15)).alias(&col_name);
    self.frame = self.frame.with_column(expr.clone());
    
    // Update metadata
    if let Some(meta) = self.schema.columns.get_mut(&col_name) {
        meta.unit = Unit::Kelvin;
        meta.origin = ColumnOrigin::PolarsDerived;
    }

    // Log operation
    self.log_operation(
        "celsius_to_kelvin",
        AffectedColumns::Semantic(vec![ColumnTypes::Temperature]),
        Some(expr),
        format!("Converted {} from Celsius to Kelvin", col_name),
        true,  // reversible: can subtract 273.15
    );

    self
}
```

### Example 2: Irreversible Operation (Trim Edges)
```rust
pub fn trim_edges(mut self, left: usize, right: usize) -> Self {
    // Materialize the lazy frame so rows can be sliced by position
    let df = self.frame.collect().unwrap();
    let total = df.height();
    let length = total.saturating_sub(left + right);
    // Reassign the field before logging: `collect()` moved `self.frame`,
    // and `log_operation` needs to borrow all of `self`
    self.frame = df.slice(left as i64, length).lazy();
    self.oneframeplot = None;

    self.log_operation(
        "trim_edges",
        AffectedColumns::All,
        None,  // No Expr - physical row removal
        format!("Trimmed {} rows from left, {} from right", left, right),
        false,  // irreversible: data is permanently removed
    );

    self
}
```

### Example 3: Derived Column (Rate Calculation)
```rust
pub fn derive_rate(mut self, source_col: &str, new_col: &str) -> Result<Self, TGADomainError> {
    // ... calculation logic ...
    
    let dv = col(source_col).shift(lit(-1)) - col(source_col).shift(lit(1));
    let dt = col(&time).shift(lit(-1)) - col(&time).shift(lit(1));
    let rate_expr = (dv.clone() / dt.clone()).alias(new_col);
    
    self.frame = self.frame.with_column(rate_expr.clone());
    
    // ... metadata update ...

    self.log_operation(
        "derive_rate",
        AffectedColumns::Specific(vec![new_col.to_string()]),
        Some(rate_expr),
        format!("Computed rate {} from {} with unit {:?}", new_col, source_col, out_unit),
        true,  // reversible: new column, original preserved
    );

    Ok(self)
}
```

## Usage Patterns

### Pattern 1: Track Processing Pipeline
```rust
let dataset = TGADataset::from_csv("data.csv", "time", "temp", "mass")?
    .trim_edges(5, 5)
    .celsius_to_kelvin()
    .derive_rate("mass", "dm_dt")?
    .dimensionless_mass(0.0, 10.0, "alpha")?;

// View complete history
dataset.print_operation_history();
```

**Output:**
```
=== Operation History ===
[0] trim_edges - Trimmed 5 rows from left, 5 from right (reversible: false)
  Columns: ALL
[1] celsius_to_kelvin - Converted temp from Celsius to Kelvin (reversible: true)
  Semantic: [Temperature]
[2] derive_rate - Computed rate dm_dt from mass with unit MilligramPerSecond (reversible: true)
  Columns: ["dm_dt"]
[3] dimensionless_mass - Computed dimensionless mass alpha from mass (m0=10.5) (reversible: true)
  Columns: ["alpha"]
```

### Pattern 2: Column-Specific History
```rust
// Get history for specific column
dataset.print_column_history("alpha");
```

**Output:**
```
=== History for column 'alpha' ===
Total operations: 2
Reversible: 1
Irreversible: 1
  [0] trim_edges - Trimmed 5 rows from left, 5 from right (reversible: false)
  [3] dimensionless_mass - Computed dimensionless mass alpha from mass (m0=10.5) (reversible: true)
```

### Pattern 3: Programmatic Query
```rust
let history = dataset.get_column_history("mass");

if history.has_irreversible() {
    println!("Warning: Column 'mass' has undergone irreversible operations!");
}

println!("Total transformations: {}", history.operations.len());
println!("Reversible: {}", history.reversible_count());
println!("Irreversible: {}", history.irreversible_count());
```

## Integration Checklist

### For Each Column-Modifying Function:

1. **Identify operation type**:
   - Does it remove data? → `reversible: false`
   - Does it transform data? → `reversible: true`

2. **Determine affected columns**:
   - Specific columns? → `AffectedColumns::Specific(vec![...])`
   - All columns? → `AffectedColumns::All`
   - Semantic type? → `AffectedColumns::Semantic(vec![...])`

3. **Capture expression** (if applicable):
   - Polars operation? → `Some(expr.clone())`
   - Physical operation? → `None`

4. **Add logging call**:
   ```rust
   self.log_operation(
       "function_name",
       affected_columns,
       optional_expr,
       "Human-readable description".to_string(),
       is_reversible,
   );
   ```

## Functions Updated with Logging

### In one_experiment_dataset.rs:
- `derive_rate0`
- `derive_rate`
- `add_numeric_column`

### In exp_kinetics_column_manipulation.rs (see exp_kinetics_column_manipulation_logging.rs):
- `with_column_expr`
- `filter_rows`
- `trim_edges`
- `trim_null_edges`
- `rename_column`
- `drop_column`
- `celsius_to_kelvin`
- `seconds_to_hours`
- `scale_columns`
- `scale_column`
- `offset_column`
- `dimensionless_mass`
- `conversion`

### Functions to Update (TODO):
- `cut_interval`, `cut_time_interval`, `cut_temperature_interval`, `cut_mass_interval`
- `trim_range`, `trim_range_inverse`
- `trim_null_edges_for_columns`
- `scale_column_in_range_by_reference`, `scale_column_in_its_range`
- `offset_column_in_range_by_reference`, `offset_column_in_its_range`
- `calibrate_mass_from_voltage`, `calibrate_mass`
- `unary_column_op`
- `exp_column`, `ln_column`
- `derive_dimensionless_mass`, `derive_conversion`

## Benefits

1. **Transparency**: Complete audit trail of all transformations
2. **Debugging**: Trace issues back to specific operations
3. **Reproducibility**: Understand exact processing pipeline
4. **Safety**: See which operations were irreversible and which columns they touched
5. **Documentation**: Self-documenting data processing workflow

## Future Enhancements

1. **Undo/Redo**: Implement operation reversal for reversible operations
2. **Export**: Save operation history to JSON/YAML
3. **Replay**: Reconstruct processing pipeline from history
4. **Validation**: Check operation compatibility before execution
5. **Visualization**: Generate flowchart of processing pipeline