greeners 1.0.0

High-performance econometrics with R/Python formulas. Two-Way Clustering, Marginal Effects (AME/MEM), HC1-4, IV Predictions, Categorical C(var), Polynomial I(x^2), Interactions, Diagnostics. OLS, IV/2SLS, DiD, Logit/Probit, Panel (FE/RE), Time Series (VAR/VECM), Quantile!
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
# Greeners: High-Performance Econometrics in Rust

![Build Status](https://img.shields.io/badge/build-passing-brightgreen)
![Version](https://img.shields.io/badge/version-1.0.0-blue)
![License](https://img.shields.io/badge/license-GPLv3-green)
![Stability](https://img.shields.io/badge/stability-stable-green)

**Greeners** is a lightning-fast, type-safe econometrics library written in pure Rust. It provides a comprehensive suite of estimators for Cross-Sectional, Time-Series, and Panel Data analysis, leveraging linear algebra backends (LAPACK/BLAS) for maximum performance.

Designed for academic research, heavy simulations, and production-grade economic modeling.

## 🎉 v1.0.0 STABLE RELEASE: Specification Tests

Greeners reaches **production stability** with comprehensive **specification tests** for diagnosing regression assumptions!

### Specification Tests (NEW in v1.0.0)

Diagnose violations of classical regression assumptions and identify appropriate remedies:

```rust
use greeners::{OLS, SpecificationTests, Formula, DataFrame, CovarianceType};

// Estimate model
let model = OLS::from_formula(&Formula::parse("wage ~ education + experience")?, &df, CovarianceType::NonRobust)?;
let (y, x) = df.to_design_matrix(&formula)?;
let residuals = model.residuals(&y, &x);
let fitted = model.fitted_values(&x);

// 1. White Test for Heteroskedasticity
let (lm_stat, p_value, df) = SpecificationTests::white_test(&residuals, &x)?;
if p_value < 0.05 {
    println!("Heteroskedasticity detected! Use CovarianceType::HC3");
}

// 2. RESET Test for Functional Form Misspecification
let (f_stat, p_value, _, _) = SpecificationTests::reset_test(&y, &x, &fitted, 3)?;
if p_value < 0.05 {
    println!("Misspecification detected! Add polynomials or interactions");
}

// 3. Breusch-Godfrey Test for Autocorrelation
let (lm_stat, p_value, df) = SpecificationTests::breusch_godfrey_test(&residuals, &x, 1)?;
if p_value < 0.05 {
    println!("Autocorrelation detected! Use CovarianceType::NeweyWest(4)");
}

// 4. Goldfeld-Quandt Test for Heteroskedasticity
let (f_stat, p_value, _, _) = SpecificationTests::goldfeld_quandt_test(&residuals, 0.25)?;
```

**When to Use:**
- **White Test** → General heteroskedasticity test (any form)
- **RESET Test** → Detect omitted variables or wrong functional form
- **Breusch-Godfrey** → Detect autocorrelation in time series/panel data
- **Goldfeld-Quandt** → Test heteroskedasticity when you suspect specific ordering

**Remedies:**
- Heteroskedasticity → `CovarianceType::HC3` or `HC4`
- Autocorrelation → `CovarianceType::NeweyWest(lags)`
- Misspecification → Add `I(x^2)`, `x1*x2` interactions

**Stata/R/Python Equivalents:**
- **Stata**: `estat hettest`, `estat ovtest`, `estat bgodfrey`
- **R**: `lmtest::bptest()`, `lmtest::resettest()`, `lmtest::bgtest()`
- **Python**: `statsmodels.stats.diagnostic.het_white()`

📖 See `examples/specification_tests.rs` for comprehensive demonstration.

## ✨ NEW: R/Python-Style Formula API

Greeners now supports **R/Python-style formula syntax** (like `statsmodels` and `lm()`), making model specification intuitive and concise:

```rust
use greeners::{OLS, DataFrame, Formula, CovarianceType};

// Python equivalent: smf.ols('y ~ x1 + x2', data=df).fit(cov_type='HC1')
let formula = Formula::parse("y ~ x1 + x2")?;
let result = OLS::from_formula(&formula, &df, CovarianceType::HC1)?;
```

**All estimators support formulas:** OLS, WLS, DiD, IV/2SLS, Logit/Probit, Quantile Regression, Panel Data (FE/RE/Between), and more!

📖 See [FORMULA_API.md](FORMULA_API.md) for complete documentation and examples.

## 🚀 NEW in v0.9.0: Panel Diagnostics & Model Selection

Greeners now provides comprehensive tools for **panel data model selection** and **information criteria-based model comparison** - essential for rigorous empirical research!

### Model Selection & Comparison

Compare multiple models using **AIC/BIC** with automatic ranking and **Akaike weights** for model averaging:

```rust
use greeners::{OLS, ModelSelection, DataFrame, Formula, CovarianceType};

// Estimate competing models
let model1 = OLS::from_formula(&Formula::parse("y ~ x1 + x2 + x3")?, &df, CovarianceType::NonRobust)?;
let model2 = OLS::from_formula(&Formula::parse("y ~ x1 + x2")?, &df, CovarianceType::NonRobust)?;
let model3 = OLS::from_formula(&Formula::parse("y ~ x1")?, &df, CovarianceType::NonRobust)?;

// Compare models
let models = vec![
    ("Full Model", model1.log_likelihood, 4, n_obs),
    ("Restricted", model2.log_likelihood, 3, n_obs),
    ("Simple", model3.log_likelihood, 2, n_obs),
];
let comparison = ModelSelection::compare_models(models);
ModelSelection::print_comparison(&comparison);

// Calculate Akaike weights for model averaging
let aic_values: Vec<f64> = comparison.iter().map(|(_, aic, _, _, _)| *aic).collect();
let (delta_aic, weights) = ModelSelection::akaike_weights(&aic_values);
```

**Output:**
```
=============================== Model Comparison ===============================
Model                |          AIC |          BIC | Rank(AIC) | Rank(BIC)
--------------------------------------------------------------------------------
Full Model           |       183.83 |       191.48 |        1 |        1
Restricted           |       184.77 |       190.50 |        2 |        2
Simple               |       188.19 |       192.01 |        3 |        3

📊 AKAIKE WEIGHTS:
Δ_AIC < 2: Substantial support
Δ_AIC 4-7: Considerably less support
Δ_AIC > 10: Essentially no support
```

### Panel Diagnostics Tests

Test whether pooled OLS is appropriate or if panel data methods (Fixed/Random Effects) are needed:

#### Breusch-Pagan LM Test for Random Effects

```rust
use greeners::{PanelDiagnostics, OLS, Formula};

// Estimate pooled OLS
let model_pooled = OLS::from_formula(&formula, &df, CovarianceType::NonRobust)?;
let (y, x) = df.to_design_matrix(&formula)?;
let residuals = model_pooled.residuals(&y, &x);

// Test for random effects
let (lm_stat, p_value) = PanelDiagnostics::breusch_pagan_lm(&residuals, &firm_ids)?;

// Interpretation:
// H₀: σ²_u = 0 (no panel effects, pooled OLS adequate)
// H₁: σ²_u > 0 (random effects needed)
// If p < 0.05 → Use Random Effects or Fixed Effects
```

#### F-Test for Fixed Effects

```rust
// Test if firm fixed effects are significant
let (f_stat, p_value) = PanelDiagnostics::f_test_fixed_effects(
    ssr_pooled,
    ssr_fe,
    n_obs,
    n_firms,
    k_params,
)?;

// Interpretation:
// H₀: All firm effects are zero (pooled OLS adequate)
// H₁: Firm effects exist (use fixed effects)
// If p < 0.05 → Use Fixed Effects model
```

### Summary Statistics

Quick descriptive statistics for initial data exploration:

```rust
use greeners::SummaryStats;

let stats = SummaryStats::describe(&data);
// Returns: (mean, std, min, Q25, median, Q75, max, n_obs)

// Pretty-print summary table
let summary_data = vec![
    ("investment", stats_inv),
    ("profit", stats_profit),
    ("cash_flow", stats_cf),
];
SummaryStats::print_summary(&summary_data);
```

**Stata/R/Python Equivalents:**
- **Stata**: `estat ic` (AIC/BIC), `xttest0` (BP LM), `testparm` (F-test)
- **R**: `AIC()`, `BIC()`, `plm::plmtest()`, `plm::pFtest()`
- **Python**: `statsmodels` information criteria, `linearmodels.panel` diagnostics

📖 See `examples/panel_model_selection.rs` for comprehensive demonstration with panel data workflow.

## 🌟 NEW in v0.5.0: Marginal Effects for Binary Choice Models

After estimating Logit/Probit models, **coefficients alone are hard to interpret** (they're on log-odds/z-score scale). **Marginal effects** translate these to **probability changes** - essential for policy analysis and substantive interpretation!

### Average Marginal Effects (AME) - RECOMMENDED

```rust
use greeners::{Logit, Formula, DataFrame};

// Estimate Logit model
let formula = Formula::parse("admitted ~ gpa + sat + legacy")?;
let result = Logit::from_formula(&formula, &df)?;

// Get design matrix
let (_, x) = df.to_design_matrix(&formula)?;

// Calculate Average Marginal Effects (AME)
let ame = result.average_marginal_effects(&x)?;

// Interpretation: AME[gpa] = 0.15 means:
// "A 1-point increase in GPA increases admission probability by 15 percentage points"
// (averaged across all students in the sample)
```

**Why AME?**
- ✅ Accounts for heterogeneity across observations
- ✅ More robust to non-linearities
- ✅ Standard in modern econometrics (Stata, R, Python)
- ✅ Easy to interpret: probability changes, not log-odds

### Marginal Effects at Means (MEM)

```rust
// Calculate Marginal Effects at Means (MEM)
let mem = result.marginal_effects_at_means(&x)?;

// Interpretation: Effect for "average" student
// ⚠️ Less robust than AME - can evaluate at impossible values (e.g., average of dummies)
```

### Predictions

```rust
// Predict admission probabilities for new students
let probs = result.predict_proba(&x_new);

// Example: probs[0] = 0.85 → 85% chance of admission
```

### Logit vs Probit Comparison

```rust
// Both models give similar marginal effects
let logit_result = Logit::from_formula(&formula, &df)?;
let probit_result = Probit::from_formula(&formula, &df)?;

let ame_logit = logit_result.average_marginal_effects(&x)?;
let ame_probit = probit_result.average_marginal_effects(&x)?;

// Typically: ame_logit ≈ ame_probit (differences < 1-2 percentage points)
```

**Stata/R/Python Equivalents:**
- **Stata**: `margins, dydx(*)` (AME) or `margins, dydx(*) atmeans` (MEM)
- **R**: `mfx::logitmfx()` or `margins::margins()`
- **Python**: `statsmodels.discrete.discrete_model.Logit(...).get_margeff()`

📖 See `examples/marginal_effects.rs` for comprehensive demonstration with college admission data.

### Two-Way Clustered Standard Errors

For **panel data** with clustering along **two dimensions** (e.g., firms × time):

```rust
// Panel data: 4 firms × 6 time periods
let firm_ids = vec![0,0,0,0,0,0, 1,1,1,1,1,1, 2,2,2,2,2,2, 3,3,3,3,3,3];
let time_ids = vec![0,1,2,3,4,5, 0,1,2,3,4,5, 0,1,2,3,4,5, 0,1,2,3,4,5];

// Two-way clustering (Cameron-Gelbach-Miller, 2011)
let result = OLS::from_formula(
    &formula,
    &df,
    CovarianceType::ClusteredTwoWay(firm_ids, time_ids)
)?;

// Formula: V = V_firm + V_time - V_intersection
// Accounts for BOTH within-firm AND within-time correlation
```

**When to use:**
- ✅ Panel data (firms/countries over time)
- ✅ Correlation within entities AND within time periods
- ✅ More robust than one-way clustering
- ✅ Standard in modern panel data econometrics

**Stata equivalent:** `reghdfe y x, vce(cluster firm_id time_id)`

📖 See `examples/two_way_clustering.rs` for complete comparison of non-robust vs one-way vs two-way clustering.

## 🎊 NEW in v0.4.0: Categorical Variables & Polynomial Terms

### Categorical Variable Encoding
Automatic dummy variable creation with R/Python syntax:

```rust
// Categorical variable: creates dummies, drops first level
let formula = Formula::parse("sales ~ advertising + C(region)")?;
let result = OLS::from_formula(&formula, &df, CovarianceType::HC3)?;

// If region has values [0, 1, 2, 3] → creates 3 dummies (drops 0 as reference)
```

**How it works:**
- `C(var)` detects unique values in the variable
- Creates K-1 dummy variables (drops first category as reference)
- Essential for categorical data (regions, industries, treatment groups)

### Polynomial Terms
Non-linear relationships made easy:

```rust
// Quadratic model: captures diminishing returns
let formula = Formula::parse("output ~ input + I(input^2)")?;

// Cubic model: more flexible
let formula = Formula::parse("y ~ x + I(x^2) + I(x^3)")?;

// Alternative syntax (Python-style)
let formula = Formula::parse("y ~ x + I(x**2)")?;
```

**Use cases:**
- Production functions (diminishing returns)
- Wage curves (experience effects)
- Growth models (non-linear dynamics)

**Combine with interactions:**
```rust
// Region-specific quadratic effects
let formula = Formula::parse("sales ~ C(region) * I(advertising^2)")?;
```

## 🆕 NEW in v0.2.0: Clustered Standard Errors & Advanced Diagnostics

### Clustered Standard Errors
Critical for panel data and hierarchical structures where observations are grouped:

```rust
// Panel data: firms over time
let cluster_ids = vec![0,0,0, 1,1,1, 2,2,2]; // Firm IDs
let result = OLS::from_formula(&formula, &df, CovarianceType::Clustered(cluster_ids))?;
```

**Use clustered SE when:**
- Panel data (repeated observations per entity)
- Hierarchical data (students in schools, patients in hospitals)
- Experimental data with treatment clusters
- Geographic clustering (observations in regions/countries)

### Advanced Diagnostics
New diagnostic tools for model validation:

```rust
use greeners::Diagnostics;

// Multicollinearity detection
let vif = Diagnostics::vif(&x)?;              // Variance Inflation Factor
let cond_num = Diagnostics::condition_number(&x)?;  // Condition Number

// Influential observations
let leverage = Diagnostics::leverage(&x)?;    // Hat values
let cooks_d = Diagnostics::cooks_distance(&residuals, &x, mse)?;  // Cook's Distance

// Assumption testing (already available)
let (jb_stat, jb_p) = Diagnostics::jarque_bera(&residuals)?;  // Normality
let (bp_stat, bp_p) = Diagnostics::breusch_pagan(&residuals, &x)?;  // Heteroskedasticity
let dw_stat = Diagnostics::durbin_watson(&residuals);  // Autocorrelation
```

## 🎉 NEW in v0.3.0: Interactions, HC2/HC3, and Predictions

### Interaction Terms
Model interaction effects with R/Python syntax:

```rust
// Full interaction: x1 * x2 expands to x1 + x2 + x1:x2
let formula = Formula::parse("wage ~ education * female")?;
let result = OLS::from_formula(&formula, &df, CovarianceType::HC3)?;

// Interaction only: just the product term
let formula2 = Formula::parse("wage ~ education + female + education:female")?;
```

**Use cases:**
- Differential effects by groups (e.g., education returns by gender)
- Treatment effect heterogeneity
- Testing moderation/mediation hypotheses

### Enhanced Robust Standard Errors

```rust
// HC2: Leverage-adjusted (more efficient with small samples)
let result_hc2 = OLS::from_formula(&formula, &df, CovarianceType::HC2)?;

// HC3: Jackknife (most robust - RECOMMENDED for small samples)
let result_hc3 = OLS::from_formula(&formula, &df, CovarianceType::HC3)?;
```

**Comparison:**
- **HC1**: White (1980), uses n/(n-k) correction
- **HC2**: Adjusts for leverage: σ²/(1-h_i)
- **HC3**: Jackknife: σ²/(1-h_i)² - Most conservative & robust

### Post-Estimation Predictions

```rust
// Out-of-sample predictions
let x_new = Array2::from_shape_vec((3, 2), vec![1.0, 12.0, 1.0, 16.0, 1.0, 20.0])?;
let predictions = result.predict(&x_new);

// In-sample fitted values
let fitted = result.fitted_values(&x);

// Residuals
let resid = result.residuals(&y, &x);
```

## 🚀 Features

### Cross-Sectional & General
- **OLS & GLS:** Robust standard errors (White, Newey-West).
- **IV / 2SLS:** Instrumental Variables for endogeneity correction.
- **Quantile Regression:** Robust estimation via Iteratively Reweighted Least Squares (IRLS).
- **Discrete Choice:** Logit and Probit models (Newton-Raphson MLE).
- **Diagnostics:** R-squared, F-Test, T-Test, Confidence Intervals.

### Time Series (Macroeconometrics)
- **Unit Root Tests:** Augmented Dickey-Fuller (ADF).
- **VAR (Vector Autoregression):** Multivariate modeling with Information Criteria (AIC/BIC).
- **VARMA:** Hannan-Rissanen algorithm for ARMA structures.
- **VECM (Cointegration):** Johansen Procedure (Eigenvalue decomposition) for I(1) systems.
- **Impulse Response Functions (IRF):** Orthogonalized structural shocks.

### Panel Data
- **Fixed Effects (Within):** Absorbs individual heterogeneity.
- **Random Effects:** Swamy-Arora GLS estimator.
- **Between Estimator:** Long-run cross-sectional relationships.
- **Dynamic Panel:** Arellano-Bond (Difference GMM) to solve Nickell Bias.
- **Panel Threshold:** Hansen (1999) non-linear regime switching models.
- **Testing:** Hausman Test for FE vs RE.

### Systems of Equations
- **SUR:** Seemingly Unrelated Regressions (Zellner).
- **3SLS:** Three-Stage Least Squares (System IV).

## System Requirements (Pre-requisites)

### Debian / Ubuntu / Pop!_OS:

```bash
sudo apt-get update

sudo apt-get install gfortran libopenblas-dev liblapack-dev pkg-config build-essential
```

### Fedora / RHEL / CentOS:

```bash
sudo dnf install gcc-gfortran openblas-devel lapack-devel pkg-config
```

### Arch Linux / Manjaro:

```bash
sudo pacman -S gcc-fortran openblas lapack base-devel
```

### macOS:

```bash
brew install openblas lapack
```

## 📦 Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
greeners = "0.1.1"
ndarray = "0.15"
# Note: You must have a BLAS/LAPACK provider installed on your system
ndarray-linalg = { version = "0.14", features = ["openblas"] }
```

## 🎯 Quick Start

### Loading Data from CSV (NEW!)

```rust
use greeners::{DataFrame, Formula, OLS, CovarianceType};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load data from CSV file with headers (just like pandas!)
    let df = DataFrame::from_csv("data.csv")?;

    // Specify model using formula
    let formula = Formula::parse("y ~ x1 + x2")?;

    // Estimate with robust standard errors
    let result = OLS::from_formula(&formula, &df, CovarianceType::HC1)?;

    println!("{}", result);
    Ok(())
}
```

### Using Formula API (R/Python Style)

```rust
use greeners::{OLS, DataFrame, Formula, CovarianceType};
use ndarray::Array1;
use std::collections::HashMap;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create data manually (like a pandas DataFrame)
    let mut data = HashMap::new();
    data.insert("y".to_string(), Array1::from(vec![1.0, 2.0, 3.0, 4.0, 5.0]));
    data.insert("x1".to_string(), Array1::from(vec![1.0, 2.0, 3.0, 4.0, 5.0]));
    data.insert("x2".to_string(), Array1::from(vec![2.0, 2.5, 3.0, 3.5, 4.0]));

    let df = DataFrame::new(data)?;

    // Specify model using formula (just like Python/R!)
    let formula = Formula::parse("y ~ x1 + x2")?;

    // Estimate with robust standard errors
    let result = OLS::from_formula(&formula, &df, CovarianceType::HC1)?;

    println!("{}", result);
    Ok(())
}
```

### Traditional Matrix API

```rust
use greeners::{OLS, CovarianceType};
use ndarray::{Array1, Array2};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let y = Array1::from(vec![1.0, 2.0, 3.0, 4.0, 5.0]);
    let x = Array2::from_shape_vec((5, 2), vec![
        1.0, 2.0,
        2.0, 2.5,
        3.0, 3.0,
        4.0, 3.5,
        5.0, 4.0,
    ])?;

    let result = OLS::fit(&y, &x, CovarianceType::HC1)?;
    println!("{}", result);
    Ok(())
}
```

## 📚 Formula API Examples

### Difference-in-Differences

```rust
use greeners::{DiffInDiff, DataFrame, Formula, CovarianceType};

// Python: smf.ols('outcome ~ treated + post + treated:post', data=df).fit(cov_type='HC1')
let formula = Formula::parse("outcome ~ treated + post")?;
let result = DiffInDiff::from_formula(&formula, &df, "treated", "post", CovarianceType::HC1)?;
```

### Instrumental Variables (2SLS)

```rust
use greeners::{IV, Formula, CovarianceType};

// Endogenous equation: y ~ x1 + x_endog
// Instruments: z1, z2
let endog_formula = Formula::parse("y ~ x1 + x_endog")?;
let instrument_formula = Formula::parse("~ z1 + z2")?;
let result = IV::from_formula(&endog_formula, &instrument_formula, &df, CovarianceType::HC1)?;
```

### Logit/Probit

```rust
use greeners::{Logit, Probit, Formula};

// Binary choice models
let formula = Formula::parse("binary_outcome ~ x1 + x2 + x3")?;
let logit_result = Logit::from_formula(&formula, &df)?;
let probit_result = Probit::from_formula(&formula, &df)?;
```

### Panel Data (Fixed Effects)

```rust
use greeners::{FixedEffects, Formula};

let formula = Formula::parse("y ~ x1 + x2")?;
let result = FixedEffects::from_formula(&formula, &df, &entity_ids)?;
```

### Quantile Regression

```rust
use greeners::{QuantileReg, Formula};

// Median regression
let formula = Formula::parse("y ~ x1 + x2")?;
let result = QuantileReg::from_formula(&formula, &df, 0.5, 200)?;
```

## 🔧 Formula Syntax

- **Basic:** `y ~ x1 + x2 + x3` (with intercept)
- **No intercept:** `y ~ x1 + x2 - 1` or `y ~ 0 + x1 + x2`
- **Intercept only:** `y ~ 1`

All formulas follow R/Python syntax for familiarity and ease of use.

## 📖 Documentation

- **[FORMULA_API.md]FORMULA_API.md** - Complete formula API guide with Python/R equivalents
- **[examples/]examples/** - Working examples for all estimators
  - `csv_formula_example.rs` - **Load CSV files and run regressions**
  - `formula_example.rs` - General formula API demonstration
  - `did_formula_example.rs` - Difference-in-Differences with formulas
  - `quickstart_formula.rs` - Quick start example

Run examples:
```bash
cargo run --example csv_formula_example
cargo run --example formula_example
cargo run --example did_formula_example
```

## 🎯 Why Greeners?

1. **Familiar Syntax:** R/Python-style formulas make transition seamless
2. **Type Safety:** Rust's type system catches errors at compile time
3. **Performance:** Native speed with BLAS/LAPACK backends
4. **Comprehensive:** Full suite of econometric estimators
5. **Production Ready:** Memory safe, no garbage collection pauses