//! Core analyzer framework for computing metrics from data.
//!
//! This module provides the foundational traits and types for building analyzers
//! that compute metrics independently of validation checks. Analyzers support
//! incremental computation through state management and can be efficiently
//! combined by the AnalysisRunner.
//!
//! ## Available Analyzers
//!
//! - **Basic Analyzers** (`basic`): Fundamental metrics like count, mean, min/max
//! - **Advanced Analyzers** (`advanced`): Complex metrics like entropy, correlation
//! - **Column Profiler** (`profiler`): Three-pass algorithm for comprehensive column analysis
//! - **Type Inference Engine** (`inference`): Robust data type detection from string data
//! - **Constraint Suggestions** (`suggestions`): Intelligent recommendations for data quality checks
//!
//! ## Key Features
//!
//! ### Type Inference Engine
//! Automatically detects column data types with confidence scores:
//! - Numeric types (Integer, Float, Decimal with precision/scale)
//! - Temporal types (Date, DateTime, Time with format detection)
//! - Boolean values (various representations: true/false, yes/no, 1/0, etc.)
//! - Categorical vs. free text distinction
//! - Mixed type columns with graceful handling
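//!
//! As a rough illustration of confidence-scored inference, the sketch below
//! classifies string samples by majority vote and reports the winning share as
//! the confidence. `SketchType`, `classify`, and `infer` are hypothetical names
//! for this example only; they are not this crate's API, and the real engine
//! covers more types (dates, decimals, categorical text) than shown here.
//!
//! ```rust
//! /// Hypothetical type tag for this sketch (not the crate's `DetectedDataType`).
//! #[derive(Clone, Copy, Debug, PartialEq)]
//! enum SketchType {
//!     Integer,
//!     Float,
//!     Boolean,
//!     Text,
//! }
//!
//! /// Classifies a single string sample.
//! fn classify(value: &str) -> SketchType {
//!     let v = value.trim();
//!     if v.parse::<i64>().is_ok() {
//!         SketchType::Integer
//!     } else if v.parse::<f64>().is_ok() {
//!         SketchType::Float
//!     } else if matches!(v.to_ascii_lowercase().as_str(), "true" | "false" | "yes" | "no") {
//!         SketchType::Boolean
//!     } else {
//!         SketchType::Text
//!     }
//! }
//!
//! /// Returns the most frequent type and the fraction of samples that match it.
//! fn infer(samples: &[&str]) -> Option<(SketchType, f64)> {
//!     if samples.is_empty() {
//!         return None;
//!     }
//!     let kinds = [
//!         SketchType::Integer,
//!         SketchType::Float,
//!         SketchType::Boolean,
//!         SketchType::Text,
//!     ];
//!     // One counter per variant, indexed by the variant's discriminant.
//!     let mut counts = [0usize; 4];
//!     for &s in samples {
//!         counts[classify(s) as usize] += 1;
//!     }
//!     // Pick the first variant with the highest count.
//!     let mut best = 0;
//!     for i in 1..counts.len() {
//!         if counts[i] > counts[best] {
//!             best = i;
//!         }
//!     }
//!     Some((kinds[best], counts[best] as f64 / samples.len() as f64))
//! }
//!
//! fn main() {
//!     let samples = ["1", "2", "3", "n/a"];
//!     if let Some((kind, confidence)) = infer(&samples) {
//!         // prints "Integer (confidence: 0.75)"
//!         println!("{:?} (confidence: {:.2})", kind, confidence);
//!     }
//! }
//! ```
//!
//! A mixed column therefore still gets a dominant type, with the confidence
//! reflecting how many samples disagree.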
//!
//! ### Column Profiler
//! Efficient three-pass profiling algorithm:
//! - Pass 1: Basic statistics and type sampling
//! - Pass 2: Histogram computation for low-cardinality columns
//! - Pass 3: Distribution analysis for numeric columns
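//!
//! The three passes can be pictured with a small in-memory sketch. It assumes
//! the column is a slice of optional strings and that later passes are skipped
//! when they do not apply; `SketchProfile`, `profile`, and the cardinality
//! cutoff are hypothetical and do not describe how `ColumnProfiler` is
//! actually implemented.
//!
//! ```rust
//! use std::collections::{HashMap, HashSet};
//!
//! /// Hypothetical profile for this sketch (not the crate's `ColumnProfile`).
//! #[derive(Debug, Default)]
//! struct SketchProfile {
//!     row_count: usize,
//!     null_count: usize,
//!     histogram: Option<HashMap<String, usize>>,
//!     mean: Option<f64>,
//!     passes_executed: Vec<u8>,
//! }
//!
//! /// Profiles one column held in memory; `None` marks a null value.
//! fn profile(column: &[Option<&str>], max_histogram_cardinality: usize) -> SketchProfile {
//!     let mut p = SketchProfile::default();
//!
//!     // Pass 1: basic statistics plus the set of distinct values.
//!     p.row_count = column.len();
//!     p.null_count = column.iter().filter(|v| v.is_none()).count();
//!     let distinct: HashSet<&str> = column.iter().flatten().copied().collect();
//!     p.passes_executed.push(1);
//!
//!     // Pass 2: histogram, only for low-cardinality columns.
//!     if distinct.len() <= max_histogram_cardinality {
//!         let mut hist = HashMap::new();
//!         for v in column.iter().flatten() {
//!             *hist.entry(v.to_string()).or_insert(0) += 1;
//!         }
//!         p.histogram = Some(hist);
//!         p.passes_executed.push(2);
//!     }
//!
//!     // Pass 3: distribution summary (here only the mean), and only when
//!     // every non-null value parses as a number.
//!     let numbers: Vec<f64> = column
//!         .iter()
//!         .flatten()
//!         .filter_map(|v| v.parse().ok())
//!         .collect();
//!     if !numbers.is_empty() && numbers.len() == p.row_count - p.null_count {
//!         p.mean = Some(numbers.iter().sum::<f64>() / numbers.len() as f64);
//!         p.passes_executed.push(3);
//!     }
//!     p
//! }
//!
//! fn main() {
//!     let column = [Some("1"), Some("2"), None, Some("3")];
//!     let p = profile(&column, 10);
//!     println!("{:?}", p);
//! }
//! ```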
//!
//! ### Constraint Suggestion Engine
//! Rule-based system that analyzes column profiles to recommend data quality constraints:
//! - **Completeness**: Suggests null checks based on current completeness levels
//! - **Uniqueness**: Identifies potential primary keys and unique constraints
//! - **Patterns**: Detects common formats (email, date, phone)
//! - **Ranges**: Recommends min/max bounds for numeric data
//! - **Data Types**: Ensures type consistency across columns
//! - **Cardinality**: Identifies categorical columns and monitors distinct values
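//!
//! To make the rule-based idea concrete, here is a minimal sketch of what a
//! completeness rule might compute from row and null counts. The
//! `SketchSuggestion` type, the check names, the 0.9 cutoff, and the
//! confidence values are all illustrative assumptions, not the behavior of
//! this crate's `CompletenessRule`.
//!
//! ```rust
//! /// Hypothetical suggestion record for this sketch (not the crate's type).
//! #[derive(Debug, PartialEq)]
//! struct SketchSuggestion {
//!     check_type: String,
//!     confidence: f64,
//!     rationale: String,
//! }
//!
//! /// Suggests a completeness check, or nothing when too many values are null.
//! fn suggest_completeness(row_count: u64, null_count: u64) -> Option<SketchSuggestion> {
//!     if row_count == 0 {
//!         return None;
//!     }
//!     let completeness = 1.0 - null_count as f64 / row_count as f64;
//!     if null_count == 0 {
//!         // Fully populated column: suggest a strict not-null check.
//!         Some(SketchSuggestion {
//!             check_type: "is_complete".to_string(),
//!             confidence: 0.9,
//!             rationale: "no null values observed".to_string(),
//!         })
//!     } else if completeness >= 0.9 {
//!         // Mostly populated column: suggest a completeness threshold.
//!         Some(SketchSuggestion {
//!             check_type: "has_completeness".to_string(),
//!             confidence: 0.8,
//!             rationale: format!("{:.1}% of values are non-null", completeness * 100.0),
//!         })
//!     } else {
//!         None
//!     }
//! }
//!
//! fn main() {
//!     // Keep only suggestions at or above a confidence threshold.
//!     let threshold = 0.7;
//!     if let Some(s) = suggest_completeness(1000, 10).filter(|s| s.confidence >= threshold) {
//!         println!("Suggested: {} (confidence: {:.2})", s.check_type, s.confidence);
//!         println!(" Rationale: {}", s.rationale);
//!     }
//! }
//! ```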
//!
//! ## Example Usage
//!
//! ```rust,ignore
//! use term_guard::analyzers::{SuggestionEngine, CompletenessRule, ColumnProfile, BasicStatistics, DetectedDataType};
//! use term_guard::test_fixtures::create_minimal_tpc_h_context;
//!
//! # tokio::runtime::Runtime::new().unwrap().block_on(async {
//! let ctx = create_minimal_tpc_h_context().await.unwrap();
//!
//! // Create a mock profile for demonstration
//! // In a real scenario, this would come from the ColumnProfiler
//! let profile = ColumnProfile {
//!     column_name: "l_orderkey".to_string(),
//!     data_type: DetectedDataType::Integer,
//!     basic_stats: BasicStatistics {
//!         row_count: 1000,
//!         null_count: 10,
//!         null_percentage: 0.01,
//!         approximate_cardinality: 980,
//!         min_value: Some("1".to_string()),
//!         max_value: Some("1000".to_string()),
//!         sample_values: vec!["1".to_string(), "500".to_string(), "1000".to_string()],
//!     },
//!     categorical_histogram: None,
//!     numeric_distribution: None,
//!     passes_executed: vec![1, 2, 3],
//!     profiling_time_ms: 50,
//! };
//!
//! // Constraint suggestions
//! let suggestion_engine = SuggestionEngine::new()
//!     .add_rule(Box::new(CompletenessRule::new()))
//!     .confidence_threshold(0.7);
//!
//! let suggestions = suggestion_engine.suggest_constraints(&profile);
//! for suggestion in suggestions {
//!     println!("Suggested: {} (confidence: {:.2})",
//!         suggestion.check_type, suggestion.confidence);
//!     println!(" Rationale: {}", suggestion.rationale);
//! }
//! # })
//! ```
pub use ;
pub use AnalyzerContext;
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use AnalysisRunner;
pub use ;
pub use ;
pub use ;
pub use ;