Trait StatArray

pub trait StatArray<X> {
    // Required methods
    fn min(&self) -> X;
    fn max(&self) -> X;
    fn ranks(&self) -> Vec<f64>;
    fn mean(&self) -> X;
    fn median(&self) -> X;
    fn cumsum(&self) -> Vec<X>;
    fn cumprod(&self) -> Vec<X>;
    fn cummax(&self) -> Vec<X>;
    fn cummin(&self) -> Vec<X>;
    fn mad(&self, deviation_type: DeviationType) -> X;
    fn wmad(&self, w: &Self) -> X;
}

Required Methods§

fn min(&self) -> X

Maxima and Minima

§Description:

Returns the (regular or parallel) maxima and minima of the input values.

‘pmax*()’ and ‘pmin*()’ take one or more vectors as arguments, recycle them to common length and return a single vector giving the ‘parallel’ maxima (or minima) of the argument vectors.

§Usage:

max(…, na.rm = FALSE) min(…, na.rm = FALSE)

pmax(…, na.rm = FALSE) pmin(…, na.rm = FALSE)

pmax.int(…, na.rm = FALSE) pmin.int(…, na.rm = FALSE)

§Arguments:
  • …: numeric or character arguments (see Note).
  • na.rm: a logical indicating whether missing values should be removed.
§Details:

‘max’ and ‘min’ return the maximum or minimum of all the values present in their arguments, as ‘integer’ if all are ‘logical’ or ‘integer’, as ‘double’ if all are numeric, and character otherwise.

If ‘na.rm’ is ‘FALSE’ an ‘NA’ value in any of the arguments will cause a value of ‘NA’ to be returned, otherwise ‘NA’ values are ignored.

The minimum and maximum of a numeric empty set are ‘+Inf’ and ‘-Inf’ (in this order!) which ensures transitivity, e.g., ‘min(x1, min(x2)) == min(x1, x2)’. For numeric ‘x’, ‘max(x) == -Inf’ and ‘min(x) == +Inf’ whenever ‘length(x) == 0’ (after removing missing values if requested). However, ‘pmax’ and ‘pmin’ return ‘NA’ if all the parallel elements are ‘NA’ even for ‘na.rm = TRUE’.
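The empty-set convention can be illustrated with a small Rust sketch. The helper names `slice_min` and `slice_max` are hypothetical and are not this trait's implementation; they assume `f64` data and propagate `NaN` explicitly, since `f64::min` and `f64::max` would otherwise skip it.

```rust
/// Minimum of a slice; the fold starts from +Inf, so an empty slice yields +Inf.
fn slice_min(xs: &[f64]) -> f64 {
    xs.iter().fold(f64::INFINITY, |acc, &x| {
        // f64::min would silently skip NaN, so propagate it explicitly.
        if acc.is_nan() || x.is_nan() { f64::NAN } else { acc.min(x) }
    })
}

/// Maximum of a slice; the fold starts from -Inf, so an empty slice yields -Inf.
fn slice_max(xs: &[f64]) -> f64 {
    xs.iter().fold(f64::NEG_INFINITY, |acc, &x| {
        if acc.is_nan() || x.is_nan() { f64::NAN } else { acc.max(x) }
    })
}

fn main() {
    let x1 = [3.0, 1.0, 4.0];
    let x2: [f64; 0] = [];

    // Transitivity: min(x1, min(x2)) == min(x1, x2), even though x2 is empty.
    let nested = slice_min(&[slice_min(&x1), slice_min(&x2)]);
    let joined: Vec<f64> = x1.iter().chain(x2.iter()).copied().collect();
    assert_eq!(nested, slice_min(&joined));

    println!("min of empty = {}, max of empty = {}", slice_min(&x2), slice_max(&x2));
}
```

Using the infinities as fold identities is what makes the nested and the joined computation agree.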

‘pmax’ and ‘pmin’ take one or more vectors (or matrices) as arguments and return a single vector giving the ‘parallel’ maxima (or minima) of the vectors. The first element of the result is the maximum (minimum) of the first elements of all the arguments, the second element of the result is the maximum (minimum) of the second elements of all the arguments and so on. Shorter inputs (of non-zero length) are recycled if necessary. Attributes (see ‘attributes’: such as ‘names’ or ‘dim’) are copied from the first argument (if applicable, e.g., not for an ‘S4’ object).
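As a rough Rust illustration of the parallel maxima and minima described above, the sketch below recycles the shorter of two slices. The functions `parallel`, `pmax` and `pmin` are hypothetical helpers, not part of this trait; they handle only two `f64` inputs and ignore missing-value and attribute handling.

```rust
/// Apply `op` elementwise, recycling the shorter input.
/// Returns an empty vector if either input is empty.
fn parallel(a: &[f64], b: &[f64], op: fn(f64, f64) -> f64) -> Vec<f64> {
    if a.is_empty() || b.is_empty() {
        return Vec::new();
    }
    let n = a.len().max(b.len());
    (0..n).map(|i| op(a[i % a.len()], b[i % b.len()])).collect()
}

/// Elementwise maximum of two slices.
fn pmax(a: &[f64], b: &[f64]) -> Vec<f64> {
    parallel(a, b, f64::max)
}

/// Elementwise minimum of two slices.
fn pmin(a: &[f64], b: &[f64]) -> Vec<f64> {
    parallel(a, b, f64::min)
}

fn main() {
    // Clamp each value to [-c_h, c_h], as in the Huber-style R example.
    let c_h = 1.35;
    let x = [-2.0, 0.5, 3.0];
    let clamped = pmin(&[c_h], &pmax(&[-c_h], &x));
    println!("{:?}", clamped);
}
```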

‘pmax.int’ and ‘pmin.int’ are faster internal versions only used when all arguments are atomic vectors and there are no classes: they drop all attributes. (Note that all versions fail for raw and complex vectors since these have no ordering.)

‘max’ and ‘min’ are generic functions: methods can be defined for them individually or via the ‘Summary’ group generic. For this to work properly, the arguments ‘…’ should be unnamed, and dispatch is on the first argument.

By definition the min/max of a numeric vector containing an ‘NaN’ is ‘NaN’, except that the min/max of any vector containing an ‘NA’ is ‘NA’ even if it also contains an ‘NaN’. Note that ‘max(NA, Inf) == NA’ even though the maximum would be ‘Inf’ whatever the missing value actually is.

Character versions are sorted lexicographically, and this depends on the collating sequence of the locale in use: the help for ‘Comparison’ gives details. The max/min of an empty character vector is defined to be character ‘NA’. (One could argue that as ‘“”’ is the smallest character element, the maximum should be ‘“”’, but there is no obvious candidate for the minimum.)

§Value:

For ‘min’ or ‘max’, a length-one vector. For ‘pmin’ or ‘pmax’, a vector of length the longest of the input vectors, or length zero if one of the inputs had zero length.

The type of the result will be that of the highest of the inputs in the hierarchy integer < double < character.

For ‘min’ and ‘max’ if there are only numeric inputs and all are empty (after possible removal of ‘NA’s), the result is double (‘Inf’ or ‘-Inf’).

§S4 methods:

‘max’ and ‘min’ are part of the S4 ‘Summary’ group generic. Methods for them must use the signature ‘x, …, na.rm’.

§Note:

‘Numeric’ arguments are vectors of type integer and numeric, and logical (coerced to integer). For historical reasons, ‘NULL’ is accepted as equivalent to ‘integer(0)’.

‘pmax’ and ‘pmin’ will also work on classed S3 or S4 objects with appropriate methods for comparison, ‘is.na’ and ‘rep’ (if recycling of arguments is needed).

§References:

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

§See Also:

‘range’ (both min and max) and ‘which.min’ (‘which.max’) for the arg min, i.e., the location where an extreme value occurs.

‘plotmath’ for the use of ‘min’ in plot annotation.

§Examples:
require(stats); require(graphics)
 min(5:1, pi) #-> one number
pmin(5:1, pi) #->  5  numbers

x <- sort(rnorm(100));  cH <- 1.35
pmin(cH, quantile(x)) # no names
pmin(quantile(x), cH) # has names
plot(x, pmin(cH, pmax(-cH, x)), type = "b", main =  "Huber's function")

cut01 <- function(x) pmax(pmin(x, 1), 0)
curve( x^2 - 1/4, -1.4, 1.5, col = 2)
curve(cut01(x^2 - 1/4), col = "blue", add = TRUE, n = 500)
## pmax(), pmin() preserve attributes of *first* argument
D <- diag(x = (3:1)/4) ; n0 <- numeric()
stopifnot(identical(D,  cut01(D) ),
identical(n0, cut01(n0)),
identical(n0, cut01(NULL)),
identical(n0, pmax(3:1, n0, 2)),
identical(n0, pmax(n0, 4)))

fn max(&self) -> X

Maxima and Minima

§Description:

Returns the (regular or parallel) maxima and minima of the input values.

‘pmax*()’ and ‘pmin*()’ take one or more vectors as arguments, recycle them to common length and return a single vector giving the ‘parallel’ maxima (or minima) of the argument vectors.

§Usage:

max(…, na.rm = FALSE) min(…, na.rm = FALSE)

pmax(…, na.rm = FALSE) pmin(…, na.rm = FALSE)

pmax.int(…, na.rm = FALSE) pmin.int(…, na.rm = FALSE)

§Arguments:
  • …: numeric or character arguments (see Note).
  • na.rm: a logical indicating whether missing values should be removed.
§Details:

‘max’ and ‘min’ return the maximum or minimum of all the values present in their arguments, as ‘integer’ if all are ‘logical’ or ‘integer’, as ‘double’ if all are numeric, and character otherwise.

If ‘na.rm’ is ‘FALSE’ an ‘NA’ value in any of the arguments will cause a value of ‘NA’ to be returned, otherwise ‘NA’ values are ignored.

The minimum and maximum of a numeric empty set are ‘+Inf’ and ‘-Inf’ (in this order!) which ensures transitivity, e.g., ‘min(x1, min(x2)) == min(x1, x2)’. For numeric ‘x’, ‘max(x) == -Inf’ and ‘min(x) == +Inf’ whenever ‘length(x) == 0’ (after removing missing values if requested). However, ‘pmax’ and ‘pmin’ return ‘NA’ if all the parallel elements are ‘NA’ even for ‘na.rm = TRUE’.

‘pmax’ and ‘pmin’ take one or more vectors (or matrices) as arguments and return a single vector giving the ‘parallel’ maxima (or minima) of the vectors. The first element of the result is the maximum (minimum) of the first elements of all the arguments, the second element of the result is the maximum (minimum) of the second elements of all the arguments and so on. Shorter inputs (of non-zero length) are recycled if necessary. Attributes (see ‘attributes’: such as ‘names’ or ‘dim’) are copied from the first argument (if applicable, e.g., not for an ‘S4’ object).

‘pmax.int’ and ‘pmin.int’ are faster internal versions only used when all arguments are atomic vectors and there are no classes: they drop all attributes. (Note that all versions fail for raw and complex vectors since these have no ordering.)

‘max’ and ‘min’ are generic functions: methods can be defined for them individually or via the ‘Summary’ group generic. For this to work properly, the arguments ‘…’ should be unnamed, and dispatch is on the first argument.

By definition the min/max of a numeric vector containing an ‘NaN’ is ‘NaN’, except that the min/max of any vector containing an ‘NA’ is ‘NA’ even if it also contains an ‘NaN’. Note that ‘max(NA, Inf) == NA’ even though the maximum would be ‘Inf’ whatever the missing value actually is.

Character versions are sorted lexicographically, and this depends on the collating sequence of the locale in use: the help for ‘Comparison’ gives details. The max/min of an empty character vector is defined to be character ‘NA’. (One could argue that as ‘“”’ is the smallest character element, the maximum should be ‘“”’, but there is no obvious candidate for the minimum.)

§Value:

For ‘min’ or ‘max’, a length-one vector. For ‘pmin’ or ‘pmax’, a vector of length the longest of the input vectors, or length zero if one of the inputs had zero length.

The type of the result will be that of the highest of the inputs in the hierarchy integer < double < character.

For ‘min’ and ‘max’ if there are only numeric inputs and all are empty (after possible removal of ‘NA’s), the result is double (‘Inf’ or ‘-Inf’).

§S4 methods:

‘max’ and ‘min’ are part of the S4 ‘Summary’ group generic. Methods for them must use the signature ‘x, …, na.rm’.

§Note:

‘Numeric’ arguments are vectors of type integer and numeric, and logical (coerced to integer). For historical reasons, ‘NULL’ is accepted as equivalent to ‘integer(0)’.

‘pmax’ and ‘pmin’ will also work on classed S3 or S4 objects with appropriate methods for comparison, ‘is.na’ and ‘rep’ (if recycling of arguments is needed).

§References:

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

§See Also:

‘range’ (both min and max) and ‘which.min’ (‘which.max’) for the arg min, i.e., the location where an extreme value occurs.

‘plotmath’ for the use of ‘min’ in plot annotation.

§Examples:
require(stats); require(graphics)
 min(5:1, pi) #-> one number
pmin(5:1, pi) #->  5  numbers

x <- sort(rnorm(100));  cH <- 1.35
pmin(cH, quantile(x)) # no names
pmin(quantile(x), cH) # has names
plot(x, pmin(cH, pmax(-cH, x)), type = "b", main =  "Huber's function")

cut01 <- function(x) pmax(pmin(x, 1), 0)
curve( x^2 - 1/4, -1.4, 1.5, col = 2)
curve(cut01(x^2 - 1/4), col = "blue", add = TRUE, n = 500)
## pmax(), pmin() preserve attributes of *first* argument
D <- diag(x = (3:1)/4) ; n0 <- numeric()
stopifnot(identical(D,  cut01(D) ),
identical(n0, cut01(n0)),
identical(n0, cut01(NULL)),
identical(n0, pmax(3:1, n0, 2)),
identical(n0, pmax(n0, 4)))

fn ranks(&self) -> Vec<f64>

Sample Ranks

§Description:

Returns the sample ranks of the values in a vector. Ties (i.e., equal values) and missing values can be handled in several ways.

§Usage:

rank(x, na.last = TRUE, ties.method = c(“average”, “first”, “last”, “random”, “max”, “min”))

§Arguments:
  • x: a numeric, complex, character or logical vector.
  • na.last: a logical or character string controlling the treatment of ‘NA’s. If ‘TRUE’, missing values in the data are put last; if ‘FALSE’, they are put first; if ‘NA’, they are removed; if ‘“keep”’ they are kept with rank ‘NA’.
  • ties.method: a character string specifying how ties are treated, see ‘Details’; can be abbreviated.
§Details:

If all components are different (and no ‘NA’s), the ranks are well defined, with values in ‘seq_along(x)’. With some values equal (called ‘ties’), the argument ‘ties.method’ determines the result at the corresponding indices. The ‘“first”’ method results in a permutation with increasing values at each index set of ties, and analogously ‘“last”’ with decreasing values. The ‘“random”’ method puts these in random order, whereas the default, ‘“average”’, replaces them by their mean, and ‘“max”’ and ‘“min”’ replace them by their maximum and minimum respectively, the latter being the typical sports ranking.
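The ‘“average”’ tie rule can be sketched in Rust as follows. The function `average_ranks` is a hypothetical helper, not this trait's `ranks` implementation (which may use a different tie rule); it assumes `f64` input without `NaN`.

```rust
/// Ranks (1-based) with ties replaced by the mean of the positions they occupy.
/// Panics if the input contains NaN.
fn average_ranks(xs: &[f64]) -> Vec<f64> {
    let n = xs.len();
    // Indices of `xs`, sorted by the values they point at.
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by(|&a, &b| xs[a].partial_cmp(&xs[b]).expect("NaN not supported"));

    let mut ranks = vec![0.0; n];
    let mut i = 0;
    while i < n {
        // Find the run [i, j) of equal values in sorted order.
        let mut j = i + 1;
        while j < n && xs[order[j]] == xs[order[i]] {
            j += 1;
        }
        // The run occupies 1-based positions i+1 ..= j; their mean is (i + 1 + j) / 2.
        let avg = (i + 1 + j) as f64 / 2.0;
        for &idx in &order[i..j] {
            ranks[idx] = avg;
        }
        i = j;
    }
    ranks
}

fn main() {
    // Same data as `x2` in the R examples of this section.
    let x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0];
    println!("{:?}", average_ranks(&x2));
}
```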

‘NA’ values are never considered to be equal: for ‘na.last = TRUE’ and ‘na.last = FALSE’ they are given distinct ranks in the order in which they occur in ‘x’.

NB: ‘rank’ is not itself generic but ‘xtfrm’ is, and ‘rank(xtfrm(x), ….)’ will have the desired result if there is a ‘xtfrm’ method. Otherwise, ‘rank’ will make use of ‘==’, ‘>’, ‘is.na’ and extraction methods for classed objects, possibly rather slowly.

§Value:

A numeric vector of the same length as ‘x’ with names copied from ‘x’ (unless ‘na.last = NA’, when missing values are removed). The vector is of integer type unless ‘x’ is a long vector or ‘ties.method = “average”’ when it is of double type (whether or not there are any ties).

§References:

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

§See Also:

‘order’ and ‘sort’; ‘xtfrm’, see above.

§Examples:
(r1 <- rank(x1 <- c(3, 1, 4, 15, 92)))
x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)
names(x2) <- letters[1:11]
(r2 <- rank(x2)) # ties are averaged

## rank() is "idempotent": rank(rank(x)) == rank(x) :
stopifnot(rank(r1) == r1, rank(r2) == r2)

## ranks without averaging
rank(x2, ties.method= "first")  # first occurrence wins
rank(x2, ties.method= "last")   #  last occurrence wins
rank(x2, ties.method= "random") # ties broken at random
rank(x2, ties.method= "random") # and again

## keep ties ties, no average
(rma <- rank(x2, ties.method= "max"))  # as used classically
(rmi <- rank(x2, ties.method= "min"))  # as in Sports
stopifnot(rma + rmi == round(r2 + r2))

## Comparing all tie.methods:
tMeth <- eval(formals(rank)$ties.method)
rx2 <- sapply(tMeth, function(M) rank(x2, ties.method=M))
cbind(x2, rx2)
## ties.method's does not matter w/o ties:
x <- sample(47)
rx <- sapply(tMeth, function(MM) rank(x, ties.method=MM))
stopifnot(all(rx[,1] == rx))

fn mean(&self) -> X

Arithmetic Mean

§Description:

Generic function for the (trimmed) arithmetic mean.

§Usage:

mean(x, …)

§Default S3 method:

mean(x, trim = 0, na.rm = FALSE, …)

§Arguments:
  • x: An R object. Currently there are methods for numeric/logical vectors and date, date-time and time interval objects. Complex vectors are allowed for ‘trim = 0’, only.
  • trim: the fraction (0 to 0.5) of observations to be trimmed from each end of ‘x’ before the mean is computed. Values of trim outside that range are taken as the nearest endpoint.
  • na.rm: a logical evaluating to ‘TRUE’ or ‘FALSE’ indicating whether ‘NA’ values should be stripped before the computation proceeds.
  • …: further arguments passed to or from other methods.
§Value:

If ‘trim’ is zero (the default), the arithmetic mean of the values in ‘x’ is computed, as a numeric or complex vector of length one. If ‘x’ is not logical (coerced to numeric), numeric (including integer) or complex, ‘NA_real_’ is returned, with a warning.

If ‘trim’ is non-zero, a symmetrically trimmed mean is computed with a fraction of ‘trim’ observations deleted from each end before the mean is computed.
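A symmetric trimmed mean might look like the Rust sketch below. The helper `trimmed_mean` is hypothetical and is not this trait's `mean`, which takes no `trim` parameter. Two choices here are assumptions: `floor(n * trim)` observations are dropped from each end, and an empty input yields `NaN`.

```rust
/// Mean of `xs` after dropping floor(n * trim) values from each end of the sorted data.
/// `trim` is clamped to [0, 0.5]. Panics if the input contains NaN.
fn trimmed_mean(xs: &[f64], trim: f64) -> f64 {
    if xs.is_empty() {
        return f64::NAN; // assumption: no values means no defined mean
    }
    let trim = trim.clamp(0.0, 0.5);
    let mut sorted = xs.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("NaN not supported"));

    let n = sorted.len();
    // Never trim so much that nothing is left: keep at least the middle one or two values.
    let k = ((n as f64 * trim).floor() as usize).min((n - 1) / 2);
    let kept = &sorted[k..n - k];
    kept.iter().sum::<f64>() / kept.len() as f64
}

fn main() {
    // Same data as the R example: c(0:10, 50).
    let mut x: Vec<f64> = (0..=10).map(f64::from).collect();
    x.push(50.0);
    println!("{} {}", trimmed_mean(&x, 0.0), trimmed_mean(&x, 0.10));
}
```

With `trim = 0.5` the sketch keeps only the middle one or two values, so it reduces to the median.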

§References:

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

§See Also:

‘weighted.mean’, ‘mean.POSIXct’, ‘colMeans’ for row and column means.

§Examples:
x <- c(0:10, 50)
xm <- mean(x)
c(xm, mean(x, trim = 0.10))

fn median(&self) -> X

Median Value

§Description:

Compute the sample median.

§Usage:

median(x, na.rm = FALSE, …)

§Arguments:
  • x: an object for which a method has been defined, or a numeric vector containing the values whose median is to be computed.
  • na.rm: a logical value indicating whether ‘NA’ values should be stripped before the computation proceeds.
  • …: potentially further arguments for methods; not used in the default method.
§Details:

This is a generic function for which methods can be written. However, the default method makes use of ‘is.na’, ‘sort’ and ‘mean’ from package ‘base’ all of which are generic, and so the default method will work for most classes (e.g., ‘“Date”’) for which a median is a reasonable concept.

§Value:

The default method returns a length-one object of the same type as ‘x’, except when ‘x’ is logical or integer of even length, when the result will be double.

If there are no values or if ‘na.rm = FALSE’ and there are ‘NA’ values the result is ‘NA’ of the same type as ‘x’ (or more generally the result of ‘x[NA_integer_]’).
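For comparison, a minimal Rust sketch of the sample median is shown below. The free function `median` is a hypothetical helper rather than this trait's method; it assumes `f64` data without `NaN` and uses `None` as a stand-in for the ‘NA’ result on empty input.

```rust
/// Sample median: the middle value for odd length, the mean of the two
/// middle values for even length, `None` for empty input.
fn median(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        return None;
    }
    let mut sorted = xs.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("NaN not supported"));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    })
}

fn main() {
    println!("{:?}", median(&[1.0, 2.0, 3.0, 4.0]));
    println!("{:?}", median(&[1.0, 2.0, 3.0, 100.0, 1000.0]));
}
```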

§References:

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

§See Also:

‘quantile’ for general quantiles.

§Examples:
median(1:4) # = 2.5 [even number]
median(c(1:3, 100, 1000))  # = 3 [odd, robust]

fn cumsum(&self) -> Vec<X>

Cumulative Sums, Products, and Extremes

§Description:

Returns a vector whose elements are the cumulative sums, products, minima or maxima of the elements of the argument.

§Usage:

cumsum(x) cumprod(x) cummax(x) cummin(x)

§Arguments:
  • x: a numeric or complex (not ‘cummin’ or ‘cummax’) object, or an object that can be coerced to one of these.
§Details:

These are generic functions: methods can be defined for them individually or via the ‘Math’ group generic.

§Value:

A vector of the same length and type as ‘x’ (after coercion), except that ‘cumprod’ returns a numeric vector for integer input (for consistency with ‘*’). Names are preserved.

An ‘NA’ value in ‘x’ causes the corresponding and following elements of the return value to be ‘NA’, as does integer overflow in ‘cumsum’ (with a warning).
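The four cumulative operations share one running-accumulator pattern, sketched here in Rust. The helper `cumulate` and the wrappers around it are hypothetical and assume `f64` data. A `NaN` propagates through the sums and products much like ‘NA’ does, but `f64::max` and `f64::min` skip `NaN`, so the extremes in this sketch do not reproduce the ‘NA’ rule.

```rust
/// Running application of `op`: out[0] = xs[0], out[i] = op(out[i - 1], xs[i]).
fn cumulate(xs: &[f64], op: impl Fn(f64, f64) -> f64) -> Vec<f64> {
    let mut out = Vec::with_capacity(xs.len());
    let mut acc: Option<f64> = None;
    for &x in xs {
        let next = match acc {
            Some(a) => op(a, x),
            None => x,
        };
        out.push(next);
        acc = Some(next);
    }
    out
}

fn cumsum(xs: &[f64]) -> Vec<f64> { cumulate(xs, |a, b| a + b) }
fn cumprod(xs: &[f64]) -> Vec<f64> { cumulate(xs, |a, b| a * b) }
fn cummax(xs: &[f64]) -> Vec<f64> { cumulate(xs, f64::max) }
fn cummin(xs: &[f64]) -> Vec<f64> { cumulate(xs, f64::min) }

fn main() {
    // Same data as the R example c(3:1, 2:0, 4:2).
    let x = [3.0, 2.0, 1.0, 2.0, 1.0, 0.0, 4.0, 3.0, 2.0];
    println!("{:?}", cumsum(&x));
    println!("{:?}", cumprod(&x));
    println!("{:?}", cummin(&x));
    println!("{:?}", cummax(&x));
}
```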

§S4 methods:

‘cumsum’ and ‘cumprod’ are S4 generic functions: methods can be defined for them individually or via the ‘Math’ group generic. ‘cummax’ and ‘cummin’ are individually S4 generic functions.

§References:

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole. (‘cumsum’ only.)

§Examples:
cumsum(1:10)
cumprod(1:10)
cummin(c(3:1, 2:0, 4:2))
cummax(c(3:1, 2:0, 4:2))

fn cumprod(&self) -> Vec<X>

fn cummax(&self) -> Vec<X>

fn cummin(&self) -> Vec<X>

fn mad(&self, deviation_type: DeviationType) -> X

Median Absolute Deviation

§Description:

Compute the median absolute deviation, i.e., the (lo-/hi-) median of the absolute deviations from the median, and (by default) adjust by a factor for asymptotically normal consistency.

§Usage:

mad(x, center = median(x), constant = 1.4826, na.rm = FALSE, low = FALSE, high = FALSE)

§Arguments:
  • x: a numeric vector.
  • center: Optionally, the centre: defaults to the median.
  • constant: scale factor.
  • na.rm: if ‘TRUE’ then ‘NA’ values are stripped from ‘x’ before computation takes place.
  • low: if ‘TRUE’, compute the ‘lo-median’, i.e., for even sample size, do not average the two middle values, but take the smaller one.
  • high: if ‘TRUE’, compute the ‘hi-median’, i.e., take the larger of the two middle values for even sample size.
§Details:

The actual value calculated is ‘constant * cMedian(abs(x - center))’ with the default value of ‘center’ being ‘median(x)’, and ‘cMedian’ being the usual, the ‘low’ or ‘high’ median, see the arguments description for ‘low’ and ‘high’ above.

In the case of n = 1 non-missing values and default ‘center’, the result is ‘0’, consistent with “no deviation from the center”.

The default ‘constant = 1.4826’ (approximately $1/\Phi^{-1}(3/4)$ = ‘1/qnorm(3/4)’) ensures consistency, i.e.,

$E[\operatorname{mad}(X_1, \dots, X_n)] = \sigma$

for $X_i$ distributed as $N(\mu, \sigma^2)$ and large $n$.

If ‘na.rm’ is ‘TRUE’ then ‘NA’ values are stripped from ‘x’ before computation takes place. If this is not done then an ‘NA’ value in ‘x’ will cause ‘mad’ to return ‘NA’.
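The formula ‘constant * cMedian(abs(x - center))’ with the usual median and the default centre can be sketched in Rust as below. The functions `median` and `mad` are hypothetical helpers assuming `f64` data without `NaN`. The sketch takes the scale factor directly and does not model the trait's `deviation_type` parameter, whose variants are not documented in this section.

```rust
/// Usual sample median; returns NaN for empty input (an assumption of this sketch).
fn median(xs: &[f64]) -> f64 {
    if xs.is_empty() {
        return f64::NAN;
    }
    let mut sorted = xs.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("NaN not supported"));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    }
}

/// Median absolute deviation about the median, scaled by `constant`.
fn mad(xs: &[f64], constant: f64) -> f64 {
    let center = median(xs);
    let deviations: Vec<f64> = xs.iter().map(|x| (x - center).abs()).collect();
    constant * median(&deviations)
}

fn main() {
    let x: Vec<f64> = (1..=9).map(f64::from).collect();
    // Unscaled MAD, then the version scaled for consistency under normality.
    println!("{} {}", mad(&x, 1.0), mad(&x, 1.4826));
}
```

Replacing the last value of `1:9` by `100` leaves the unscaled result unchanged, which is the robustness property the R example checks.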

§See Also:

‘IQR’ which is simpler but less robust, ‘median’, ‘var’.

§Examples:
mad(c(1:9))
print(mad(c(1:9),constant = 1)) ==
 mad(c(1:8, 100), constant = 1)  # = 2 ; TRUE
x <- c(1,2,3,5,7,8)
sort(abs(x - median(x)))
c(mad(x, constant = 1),
  mad(x, constant = 1, low = TRUE),
  mad(x, constant = 1, high = TRUE))

fn wmad(&self, w: &Self) -> X

Dyn Compatibility§

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.
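Judging from the signatures listed above, one likely cause is `wmad`, which takes `w: &Self`: a method that mentions `Self` outside the receiver cannot be called through `dyn`. The sketch below uses a hypothetical trait `Weighted` (not part of this crate) to show the usual alternative, static dispatch through a generic bound.

```rust
/// Hypothetical stand-in for a trait with a `&Self` argument, like `wmad` above.
trait Weighted {
    fn wsum(&self, w: &Self) -> f64;
}

impl Weighted for Vec<f64> {
    /// Weighted sum of the elements, pairing each value with its weight.
    fn wsum(&self, w: &Self) -> f64 {
        self.iter().zip(w.iter()).map(|(x, wi)| x * wi).sum()
    }
}

// `Box<dyn Weighted>` would be rejected by the compiler because `wsum`
// mentions `Self` in an argument; a generic bound works instead.
fn apply<T: Weighted>(x: &T, w: &T) -> f64 {
    x.wsum(w)
}

fn main() {
    let x = vec![1.0_f64, 2.0, 3.0];
    let w = vec![0.5_f64, 0.5, 1.0];
    println!("{}", apply(&x, &w));
}
```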

Implementations on Foreign Types§

impl<X> StatArray<X> for &[X]

fn min(&self) -> X

fn max(&self) -> X

fn ranks(&self) -> Vec<f64>

fn mean(&self) -> X

fn median(&self) -> X

fn cumsum(&self) -> Vec<X>

fn cumprod(&self) -> Vec<X>

fn cummax(&self) -> Vec<X>

fn cummin(&self) -> Vec<X>

fn mad(&self, deviation_type: DeviationType) -> X

fn wmad(&self, w: &&[X]) -> X

impl<X> StatArray<X> for Vec<X>

fn min(&self) -> X

fn max(&self) -> X

fn ranks(&self) -> Vec<f64>

fn mean(&self) -> X

fn median(&self) -> X

fn cumsum(&self) -> Vec<X>

fn cumprod(&self) -> Vec<X>

fn cummax(&self) -> Vec<X>

fn cummin(&self) -> Vec<X>

fn mad(&self, deviation_type: DeviationType) -> X

fn wmad(&self, w: &Vec<X>) -> X

Implementors§