use load_diabetes_raw_data;
use ndarray::{Array1, Array2};
use std::sync::OnceLock;
// Use `OnceLock` for thread-safe lazy initialization
static DIABETES_DATA: OnceLock<(Array1<&'static str>, Array2<f64>, Array1<f64>)> = OnceLock::new();
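// A minimal sketch of the lazy-initialization pattern named in the comment above,
// using only the standard library. `Dataset`, `DATA`, `build_dataset`, and `load`
// are hypothetical names chosen for illustration, and plain `Vec`s stand in for
// the ndarray types; this is not the crate's actual implementation.

```rust
use std::sync::OnceLock;

// Hypothetical stand-in for the parsed dataset: (headers, flat values, labels).
type Dataset = (Vec<&'static str>, Vec<f64>, Vec<f64>);

// The cell starts empty and is filled on first access.
static DATA: OnceLock<Dataset> = OnceLock::new();

// Hypothetical builder; a real loader would parse the raw dataset here.
fn build_dataset() -> Dataset {
    (vec!["a", "b", "label"], vec![1.0, 2.0, 3.0, 4.0], vec![0.0, 1.0])
}

// Returns a shared reference. `build_dataset` runs at most once,
// even if several threads call `load` concurrently.
fn load() -> &'static Dataset {
    DATA.get_or_init(build_dataset)
}
```

// Every call to `load` after the first returns a reference to the same stored
// value, so repeated read-only access costs no additional allocation.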
/// Internal function to load and process the raw diabetes dataset.
///
/// This function loads the raw diabetes dataset, parses the CSV-like format,
/// and converts it into structured ndarray arrays. It handles the parsing
/// of headers and data rows, extracting features and labels from the dataset.
///
/// # Returns
///
/// - `&'static Array1<&'static str>` - Static reference to the headers of the dataset.
/// - `&'static Array2<f64>` - Static reference to the feature matrix (768x8).
/// - `&'static Array1<f64>` - Static reference to the binary labels (0 or 1).
///
/// # Panics
///
/// This function will panic if:
/// - The raw data cannot be parsed as valid f64 values
/// - The dataset structure doesn't match the expected format (768 samples, 9 columns total)
/// - Memory allocation fails during array creation
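// The parsing step described above might look roughly like the sketch below.
// It assumes a comma-separated layout with a header row first and the label in
// the last column; `parse_csv_like` is a hypothetical helper, and nested `Vec`s
// stand in for the ndarray types. The real raw-data format may differ.

```rust
// Hypothetical parser: the first non-empty line is the header row, and every
// later line is a numeric row whose last column is the label.
fn parse_csv_like(raw: &'static str) -> (Vec<&'static str>, Vec<Vec<f64>>, Vec<f64>) {
    let mut lines = raw.lines().map(str::trim).filter(|l| !l.is_empty());

    let headers: Vec<&'static str> = lines
        .next()
        .expect("missing header row")
        .split(',')
        .map(str::trim)
        .collect();

    let mut features = Vec::new();
    let mut labels = Vec::new();
    for line in lines {
        // Panics on a malformed value, mirroring the documented panic conditions.
        let mut row: Vec<f64> = line
            .split(',')
            .map(|v| v.trim().parse::<f64>().expect("value is not a valid f64"))
            .collect();
        assert_eq!(row.len(), headers.len(), "row width does not match header");
        // Split the label off the end; the rest of the row is the feature vector.
        let label = row.pop().expect("row is empty");
        features.push(row);
        labels.push(label);
    }
    (headers, features, labels)
}
```

// With 9 headers, each parsed row yields 8 feature values and 1 label, which
// matches the 768x8 feature matrix and 768 labels documented above.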
/// Loads the diabetes dataset
///
/// # Returns
///
/// - `&'static Array1<&'static str>`: Static reference to the headers of the dataset
/// - `&'static Array2<f64>`: Static reference to the feature matrix where each row is a sample and each column is a feature
/// - `&'static Array1<f64>`: Static reference to the binary class labels (0.0 or 1.0)
///
/// # Examples
/// ```rust
/// use rustyml::dataset::diabetes::load_diabetes;
///
/// let (headers, features, classes) = load_diabetes();
/// assert_eq!(headers.len(), 9);
/// assert_eq!(features.shape(), &[768, 8]);
/// assert_eq!(classes.len(), 768);
/// ```
///
/// # Panics
///
/// This function will panic if:
/// - The raw data cannot be parsed as valid f64 values
/// - The dataset structure doesn't match the expected format (768 samples, 9 columns total)
/// - Memory allocation fails during array creation
/// Loads the diabetes dataset and returns owned copies
///
/// Use this function when you need owned data that can be modified.
/// For read-only access, prefer `load_diabetes()` which returns references.
///
/// # Returns
///
/// - `Array1<&'static str>`: Owned array of 9 column headers from the dataset: 8 feature names plus the target label name
/// - `Array2<f64>`: Owned feature matrix with shape (768, 8) where each row represents a patient sample and each column represents a feature (pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, diabetes pedigree function, age)
/// - `Array1<f64>`: Owned target labels array with shape (768,) containing binary classification outcomes (0.0 for non-diabetic, 1.0 for diabetic)
///
/// # Performance Notes
///
/// This function creates owned copies by cloning the static data, which incurs additional memory allocation.
/// If you only need read-only access to the data, use `load_diabetes()` instead for better performance.
///
/// # Examples
/// ```rust
/// use rustyml::dataset::diabetes::load_diabetes_owned;
///
/// let (mut headers, mut features, mut labels) = load_diabetes_owned();
///
/// // You can now modify the data since these are owned copies
/// assert_eq!(headers.len(), 9);
/// assert_eq!(features.shape(), &[768, 8]);
/// assert_eq!(labels.len(), 768);
///
/// // Example: Modify feature values (not possible with references)
/// features[[0, 0]] = 10.0;
/// labels[0] = 1.0;
/// ```
///
/// # Panics
///
/// This function will panic if:
/// - The raw data cannot be parsed as valid f64 values
/// - The dataset structure doesn't match the expected format (768 samples, 9 columns total)
/// - Memory allocation fails during array creation
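
// The trade-off between the borrowing and owning variants can be sketched with
// the standard library alone. `CACHED`, `load_ref`, and `load_owned` are
// hypothetical names, and `Vec`s stand in for the ndarray types; this only
// illustrates the clone-from-static idea and is not the crate's code.

```rust
use std::sync::OnceLock;

// Hypothetical cached data: (headers, values).
static CACHED: OnceLock<(Vec<&'static str>, Vec<f64>)> = OnceLock::new();

// Read-only access: returns a reference into the static cache, no copying.
fn load_ref() -> &'static (Vec<&'static str>, Vec<f64>) {
    CACHED.get_or_init(|| (vec!["x", "label"], vec![1.0, 2.0, 3.0]))
}

// Owned access: clones the cached data so the caller can mutate it freely.
// Each call allocates new buffers; the cache itself is never modified.
fn load_owned() -> (Vec<&'static str>, Vec<f64>) {
    let (headers, values) = load_ref();
    (headers.clone(), values.clone())
}
```

// Mutating the result of `load_owned` leaves the cached data untouched, so
// later calls to either function still see the original values.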