# Module quantiles::greenwald_khanna
[−]
[src]

Greenwald Khanna calculates epsilon-approximate quantiles.
If the desired quantile is phi, the epsilon-approximate
quantile is any element in the range of elements that rank
between `lbound((phi-epsilon) x N)`

and `lbound((phi+epsilon) x N)`

terminology from the paper:

- S: set of observations
- n: number of observations in S
- v[i]: observation i in S
- r: rank of observation in S from 1 to n.
`r_min(v[i])`

: lower bound on rank r of v[i]`r_max(v[i])`

: upper bound on rank r of v[i]`g[i] = r_min(v[i]) - r_min(v[i - 1])`

`delta[i] = r_max(v[i]) - r_min(v[i])`

`t[i] = tuple(v[i], g[i], delta[i])`

- phi: quantile as a real number in the range [0,1]
- r: ubound(phi * n)

identities:

`r_min(v[i]) = forall j<=i sum of g[j]`

`r_max(v[i]) = ( forall j<=i sum of g[j] ) + delta[i]`

- g[i] + delta[i] - 1 is an upper bound on the total number of observations
- between v[i] and v[i-1]
- sum of g[i] = n

results:

`max_i(g[i] + delta[i]) <= 2 * epsilon * n`

- a tuple is full if g[i] + delta[i] = floor(2 * epsilon * n)

`@inproceedings{Greenwald:2001:SOC:375663.375670, author = {Greenwald, Michael and Khanna, Sanjeev}, title = {Space-efficient Online Computation of Quantile Summaries}, booktitle = {Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data}, series = {SIGMOD '01}, year = {2001}, isbn = {1-58113-332-4}, location = {Santa Barbara, California, USA}, pages = {58--66}, numpages = {9}, url = {http://doi.acm.org/10.1145/375663.375670}, doi = {10.1145/375663.375670}, acmid = {375670}, publisher = {ACM}, address = {New York, NY, USA}, }`

# Examples

use quantiles::greenwald_khanna::*; let epsilon = 0.01; let mut stream = Stream::new(epsilon); let n = 1001; for i in 1..n { stream.insert(i); } let in_range = |phi: f64, value: u32| { let lower = ((phi - epsilon) * (n as f64)) as u32; let upper = ((phi + epsilon) * (n as f64)) as u32; (epsilon > phi || lower <= value) && value <= upper }; assert!(in_range(0f64, *stream.quantile(0f64))); assert!(in_range(0.1f64, *stream.quantile(0.1f64))); assert!(in_range(0.2f64, *stream.quantile(0.2f64))); assert!(in_range(0.3f64, *stream.quantile(0.3f64))); assert!(in_range(0.4f64, *stream.quantile(0.4f64))); assert!(in_range(1f64, *stream.quantile(1f64)));

## Structs

Stream |
The summary S of the observations seen so far. |

Tuple |
3-tuple of a value v[i], g[i] and delta[i]. |

## Functions

find_insert_pos |
Locates the proper position of v in a vector vs such that when v is inserted at position i, it is less then the element at i+1 if any, and greater than or equal to the element at i-1 if any. |

find_insert_pos_linear |
Locates the proper position of v in a vector vs such that when v is inserted at position i, it is less then the element at i+1 if any, and greater than or equal to the element at i-1 if any. Works by scanning the slice from start to end. |