1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
// SPDX-FileCopyrightText: 2025-2026 Carlson Büth <code@cbueth.de>
//
// SPDX-License-Identifier: MIT OR Apache-2.0
//! # Transfer Entropy $T_{X \to Y}$
//! Transfer entropy (TE) from the source process $X$ to the target process $Y$ is the amount of
//! uncertainty reduced in the future values of target $Y$ by knowing the past values of source $X$,
//! after considering the past values of the target $Y$.
//! It is the model-free and directional measure between two processes.
//! ## Definition
//! Let $X(x_n)$ and $Y(y_n)$ be two time series processes as source and target variables.
//! Then $T_{X \to Y}$ from source to target is written as:
//! $$T_{x \to y}(k, l) = -\sum_{y_{n+1}, \mathbf{y}_n^{(l)}, \mathbf{x}_n^{(k)}}
//! p(y_{n+1}, \mathbf{y}_n^{(l)}, \mathbf{x}_n^{(k)})
//! \log \left( \frac{p(y_{n+1} \mid \mathbf{y}_n^{(l)}, \mathbf{x}_n^{(k)})}
//! {p(y_{n+1} \mid \mathbf{y}_n^{(l)})} \right)$$
//! where:
//! - $y_{n+1}$ is the next state of $Y$ at time $n$,
//! - $\mathbf{y}_n^{(l)} = \{y_n, \dots, y_{n-l+1}\}$ is the embedding vector of $Y$ with history length $l$,
//! - $\mathbf{x}_n^{(k)} = \{x_n, \dots, x_{n-k+1}\}$ is the embedding vector of $X$ with history length $k$.
//! ## Transfer Entropy with Lag
//! One can add a source-to-destination time propagation or lag $u$:
//! $$T_{x \to y}(k, l, u) = -\sum_{y_{n+1+u}, \mathbf{y}_n^{(l)}, \mathbf{x}_n^{(k)}}
//! p(y_{n+1+u}, \mathbf{y}_n^{(l)}, \mathbf{x}_n^{(k)})
//! \log \left( \frac{p(y_{n+1+u} \mid \mathbf{y}_n^{(l)}, \mathbf{x}_n^{(k)})}
//! {p(y_{n+1+u} \mid \mathbf{y}_n^{(l)})} \right)$$
//! ## Local Transfer Entropy
//! Similar to local entropy and MI, we can extract the **local or point-wise transfer entropy**.
//! It is the amount of information transfer attributed to the specific realisation
//! $(x_{n+1}, \mathbf{X}_n^{(k)}, \mathbf{Y}_n^{(l)})$ at time step $n+1$:
//! $$t_{X \to Y}(n+1, k, l) = -\log \left(\frac{p(y_{n+1} \mid \mathbf{y}_n^{(l)}, \mathbf{x}_n^{(k)})}
//! {p(y_{n+1} \mid \mathbf{y}_n^{(l)})} \right)$$
//! The TE can be written as the global average of the local TE:
//! $$T_{X \to Y}(k, l, u) = \langle t_{X \to Y}(n + 1, k, l) \rangle$$
//! Local TE values can be negative, unlike its global counterpart; this means the source
//! is misleading about the prediction of target's next step.
//! In this crate, local TE can be accessed via the [`LocalValues`](crate::estimators::traits::LocalValues) trait
//! on TE estimators that support it.
//! ## Effective Transfer Entropy (eTE)
//! Real world time series data are usually biased due to finite size effects.
//! Effective TE is defined as the difference between the original TE and TE calculated on
//! surrogate (randomly shuffled) data:
//! $$\\operatorname{eTE} = \\operatorname{T}_{X \\to Y} - \\operatorname{T}_{X_{\\text{shuffled}} \\to Y}$$
//! This helps correct for bias from finite sample sizes.
//! ## Implemented in This Crate
//! ### Continuous TE: Kraskov-Stögbauer-Grassberger (KSG)
//! The KSG method [Kraskov et al., 2004](super::references#ksg2004) can be adapted for transfer entropy estimation
//! [Kaiser & Schreiber, 2002](super::references#kaiser2002). It uses the same neighbor-counting logic as KSG MI but
//! applied to the specific embedding vectors:
//! $$\mathrm{TE}(X \to Y) = \psi(k) + \langle \psi(n_{Y_\mathrm{past}} + 1) - \psi(n_{Y_\mathrm{future}, Y_\mathrm{past}} + 1) - \psi(n_{Y_\mathrm{past}, X_\mathrm{past}} + 1) \rangle$$
//! See the [KSG Approach Module](crate::estimators::approaches::expfam::ksg) for technical details.
//! ```rust
//! use infomeasure::estimators::transfer_entropy::TransferEntropy;
//! use infomeasure::estimators::traits::GlobalValue;
//! use ndarray::array;
//! let x = array![0.1, 0.2, 0.3, 0.4, 0.5];
//! let y = array![0.15, 0.25, 0.35, 0.45, 0.55];
//! let te = TransferEntropy::new_ksg(&x, &y, 1, 1, 1, 2, 1e-10).global_value();
//! assert!(te >= -1.0);
//! ```
//! ### Other Estimators
//! TE is available through the [`TransferEntropy`](crate::estimators::transfer_entropy::TransferEntropy) facade type:
//! - [`TransferEntropy::new_discrete_mle`](crate::estimators::transfer_entropy::TransferEntropy::new_discrete_mle)
//! - [`TransferEntropy::new_discrete_miller_madow`](crate::estimators::transfer_entropy::TransferEntropy::new_discrete_miller_madow)
//! - [`TransferEntropy::new_discrete_shrink`](crate::estimators::transfer_entropy::TransferEntropy::new_discrete_shrink)
//! - [`TransferEntropy::new_discrete_grassberger`](crate::estimators::transfer_entropy::TransferEntropy::new_discrete_grassberger)
//! - [`TransferEntropy::new_discrete_nsb`](crate::estimators::transfer_entropy::TransferEntropy::new_discrete_nsb)
//! - [`TransferEntropy::new_kernel`](crate::estimators::transfer_entropy::TransferEntropy::new_kernel)
//! - [`TransferEntropy::new_ksg`](crate::estimators::transfer_entropy::TransferEntropy::new_ksg)
//! - [`TransferEntropy::new_renyi`](crate::estimators::transfer_entropy::TransferEntropy::new_renyi)
//! - [`TransferEntropy::new_ordinal`](crate::estimators::transfer_entropy::TransferEntropy::new_ordinal)
//! ## Transfer Entropy as Conditional Mutual Information
//! Transfer entropy is conceptually equivalent to **conditional mutual information** where
//! the conditioning is on the target's own past. This provides a useful perspective:
//! $$T_{X \to Y}(k, l) = I(X^{(k)}; Y_{t+1} \mid Y^{(l)})$$
//! This means:
//! - We measure MI between the source's history ($X^{(k)}$) and the target's future ($Y_{t+1}$)
//! - But we condition on the target's own history ($Y^{(l)}$) to remove spurious correlations
//! from the target's temporal autocorrelation
//!
//! This perspective connects TE to:
//! - [Conditional MI](super::cond_mi): TE is a special case of CMI
//! - [Mutual Information](super::mutual_information): Without conditioning, it's just lagged MI
//! ## Choosing History Lengths
//! The parameters $k$ (source history) and $l$ (target history) control the memory of the analysis:
//! | Parameter | Description | Practical Guidance |
//! |-----------|-------------|-------------------|
//! | $k$ | Source history length | Start small (1-2). Too large wastes data and adds noise. |
//! | $l$ | Target history length | Should capture the target's autocorrelation. For Markov processes, $l=1$ may suffice. |
//! | $u$ | Lag (propagation delay) | Set to expected delay in the causal mechanism. $u=1$ is immediate next-step. |
//! ### Practical Tips
//! - **Start with $k=l=1$**: This is the simplest case and often sufficient for many systems.
//! - **Avoid overfitting**: Longer histories require more data to estimate reliably.
//! - **Check stationarity**: TE assumes stationary processes. Transform or detrend non-stationary data first.
//! - **Consider effective TE**: Use shuffling to correct for finite-sample bias (see eTE section).
//! ## Practical Example: Detecting Causality
//! Transfer entropy is commonly used to detect directed coupling between time series.
//! Here's an example with synthetic data where $X$ causally influences $Y$:
//! ```rust
//! use infomeasure::estimators::transfer_entropy::TransferEntropy;
//! use infomeasure::estimators::traits::GlobalValue;
//! use ndarray::array;
//! // Synthetic causal system: X -> Y
//! // Y(t) = X(t-1)
//! let x = array![0, 1, 0, 1, 0, 1, 0, 1, 0, 1];
//! let y = array![0, 0, 1, 0, 1, 0, 1, 0, 1, 0];
//! // TE from X to Y should be positive (causal influence)
//! let te_xy = TransferEntropy::new_discrete_mle(
//! &x, &y,
//! 1, // source history
//! 1, // target history
//! 1 // step
//! ).global_value();
//! assert!(te_xy > 0.0); // Positive TE for causal influence
//! // TE from Y to X should be lower (no reverse causation)
//! let te_yx = TransferEntropy::new_discrete_mle(
//! &y, &x,
//! 1, // source history
//! 1, // target history
//! 1 // step
//! ).global_value();
//! assert!(te_yx >= 0.0);
//! // For this deterministic system, TE(X->Y) > TE(Y->X)
//! assert!(te_xy > te_yx);
//! ```
//! ## Example
//! ```rust
//! use infomeasure::estimators::transfer_entropy::TransferEntropy;
//! use infomeasure::estimators::traits::GlobalValue;
//! use ndarray::array;
//! let source = array![0, 1, 0, 1, 0, 1, 0, 1];
//! let dest = array![0, 0, 1, 0, 1, 0, 1, 0];
//! let te = TransferEntropy::new_discrete_mle(
//! &source, &dest,
//! 1, // source history length
//! 1, // dest history length
//! 1 // step size
//! ).global_value();
//! assert!(te >= 0.0); // TE is always non-negative
//! ```
//! ## See Also
//! - [Mutual Information](super::mutual_information) — TE is CMI on histories
//! - [Conditional MI](super::cond_mi) — General CMI
//! - [Estimator Usage Guide](super::estimator_usage) — Detailed usage examples
//! - [Conditional TE Guide](super::cond_te) — TE with conditioning
//! - [Macros Guide](super::macros) — Convenience macros for TE estimators