//! Traditional descriptive statistics for vector data.
//!
//! This module provides high-level statistical operations built on top of
//! Trueno's SIMD-optimized primitives. Key features:
//!
//! - Quantiles and percentiles using the R-7 method (Hyndman & Fan 1996)
//! - Five-number summary (min, Q1, median, Q3, max)
//! - Histograms with multiple bin selection methods
//! - Hypothesis testing (t-tests, chi-square, ANOVA)
//! - Covariance and correlation matrices
//! - Optimized following Toyota Way principles (`QuickSelect` for average-case O(n) quantiles)
//!
//! # Examples
//!
//! ```
//! use aprender::stats::DescriptiveStats;
//! use trueno::Vector;
//!
//! let data = Vector::from_slice(&[1.0, 2.0, 3.0, 4.0, 5.0]);
//! let stats = DescriptiveStats::new(&data);
//!
//! assert_eq!(stats.quantile(0.5).expect("median should be computable for valid data"), 3.0); // median
//! assert_eq!(stats.quantile(0.0).expect("min quantile should be computable for valid data"), 1.0); // min
//! assert_eq!(stats.quantile(1.0).expect("max quantile should be computable for valid data"), 5.0); // max
//! ```
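//!
//! The R-7 rule named in the feature list can also be sketched without this
//! crate. The block below is a minimal illustration under stated assumptions,
//! not this module's implementation: `quantile_r7` is a hypothetical helper,
//! and the standard library's `select_nth_unstable_by` stands in for
//! `QuickSelect`.
//!
//! ```rust
//! // Hypothetical helper (not part of this crate): R-7 quantile for q in [0, 1].
//! fn quantile_r7(data: &[f32], q: f32) -> Option<f32> {
//!     if data.is_empty() || !(0.0..=1.0).contains(&q) {
//!         return None;
//!     }
//!     let mut buf = data.to_vec();
//!     // R-7 places the quantile at the fractional index h = (n - 1) * q.
//!     let h = (buf.len() - 1) as f32 * q;
//!     let lo = h.floor() as usize;
//!     let frac = h - lo as f32;
//!     // Partial selection: after this call the lo-th smallest value sits at
//!     // index lo and every element to its right is >= it (average O(n)).
//!     let (_, lower, upper) = buf.select_nth_unstable_by(lo, |a, b| a.total_cmp(b));
//!     let lower = *lower;
//!     if frac == 0.0 || upper.is_empty() {
//!         return Some(lower);
//!     }
//!     // The next order statistic is the minimum of the right partition.
//!     let next = upper.iter().copied().fold(f32::INFINITY, f32::min);
//!     // Linear interpolation between the two neighbouring order statistics.
//!     Some(lower + frac * (next - lower))
//! }
//! ```
//!
//! For `[1.0, 2.0, 3.0, 4.0]` and `q = 0.5` the fractional index is 1.5, so
//! the sketch interpolates halfway between 2.0 and 3.0.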
pub use ;
pub use ;
use Vector;
/// Descriptive statistics computed on a vector of f32 values.
///
/// Holds a reference to the data vector to avoid unnecessary copying.
/// Uses lazy evaluation and caching for repeated computations.

/// Five-number summary: minimum, Q1, median, Q3, maximum.
///
/// This is the foundation for box plots and outlier detection.
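///
/// How a five-number summary feeds outlier detection can be sketched as
/// follows. Both functions are hypothetical helpers, not this crate's API.
/// The sketch assumes R-7 interpolation on a fully sorted copy and Tukey's
/// conventional 1.5 * IQR fences; the crate's own rule may differ.
///
/// ```rust
/// // Hypothetical helper: returns [min, Q1, median, Q3, max].
/// fn five_number_summary(data: &[f32]) -> Option<[f32; 5]> {
///     if data.is_empty() {
///         return None;
///     }
///     let mut sorted = data.to_vec();
///     sorted.sort_by(|a, b| a.total_cmp(b));
///     // R-7 quantile on sorted data: interpolate at index h = (n - 1) * q.
///     let at = |q: f32| {
///         let h = (sorted.len() - 1) as f32 * q;
///         let lo = h.floor() as usize;
///         let hi = (lo + 1).min(sorted.len() - 1);
///         sorted[lo] + (h - lo as f32) * (sorted[hi] - sorted[lo])
///     };
///     Some([at(0.0), at(0.25), at(0.5), at(0.75), at(1.0)])
/// }
///
/// // Tukey's rule (an assumption here): values outside the returned
/// // (lower, upper) fences are flagged as outliers.
/// fn outlier_fences(summary: &[f32; 5]) -> (f32, f32) {
///     let iqr = summary[3] - summary[1];
///     (summary[1] - 1.5 * iqr, summary[3] + 1.5 * iqr)
/// }
/// ```
///
/// For `[1.0, 2.0, 3.0, 4.0, 100.0]` the quartiles are 2.0 and 4.0, so the
/// fences are -1.0 and 7.0 and the value 100.0 falls outside them.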
/// Histogram representation with bin edges and counts.

/// Bin selection methods for histogram construction.
///
/// Different methods are optimal for different data distributions:
/// - `FreedmanDiaconis`: Default for unimodal distributions
/// - `Sturges`: Best for small datasets (n < 200)
/// - `Scott`: Best for smooth, normal-like data
/// - `SquareRoot`: Simple rule of thumb
/// - `Bayesian`: Best for multimodal/heavy-tailed distributions
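///
/// The four closed-form rules above can be sketched with their textbook
/// formulas. The function names are hypothetical, and the crate's exact
/// constants and rounding may differ. The `Bayesian` method is left out
/// because it has no comparable closed form.
///
/// ```rust
/// // Sturges: k = ceil(log2(n)) + 1 bins.
/// fn sturges_bins(n: usize) -> usize {
///     (n as f64).log2().ceil() as usize + 1
/// }
///
/// // Square root: k = ceil(sqrt(n)) bins.
/// fn sqrt_bins(n: usize) -> usize {
///     (n as f64).sqrt().ceil() as usize
/// }
///
/// // Scott: bin width h = 3.49 * sigma / n^(1/3).
/// fn scott_width(std_dev: f64, n: usize) -> f64 {
///     3.49 * std_dev / (n as f64).cbrt()
/// }
///
/// // Freedman-Diaconis: bin width h = 2 * IQR / n^(1/3).
/// fn freedman_diaconis_width(iqr: f64, n: usize) -> f64 {
///     2.0 * iqr / (n as f64).cbrt()
/// }
///
/// // Convert a bin width into a bin count over [min, max], at least one bin.
/// fn bins_from_width(min: f64, max: f64, width: f64) -> usize {
///     ((max - min) / width).ceil().max(1.0) as usize
/// }
/// ```
///
/// The width-based rules (Scott, Freedman-Diaconis) adapt to the spread of the
/// data, while Sturges and the square-root rule depend only on the sample
/// size `n`.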
include!;
include!;