Indxvec
Vecs indexing, ranking, sorting, merging, searching, reversing, intersecting, printing, etc.
The following will import everything:
use ;
Description
This crate is lightweight and has no dependencies. The methods of all four traits can be functionally chained together to achieve numerous manipulations of vectors and their indices in compact form.
The facilities provided are:
- ranking, sorting (merge sort and hash sort), merging, binary searching, indexing, selecting, partitioning
- many useful operations on generic vectors and their indices
- set operations
- serialising generic slices and slices of vectors to Strings: to_plainstr()
- printing generic slices and slices of vectors: pvec()
- writing generic slices and slices of vectors to files: wvec(&mut f)
- coloured pretty printing (ANSI terminal output, mainly for testing)
- macro here!() for more informative error reporting
It is highly recommended to read and run tests/tests.rs
to learn from examples of usage. Use a single thread to run them. It may be a bit slower but it will write the results in the right order. It is also necessary to run the timing benchmark sorts()
on its own for meaningful results.
Glossary
- Sort Index - is obtained by stable merge sort sort_indexed or by hashsort_indexed. The original data is immutable (unchanged). The sort index produced is a list of subscripts to the data, such that the first subscript identifies the smallest item in the data, and so on (in ascending order). Suitable for bulky data that are not easily moved. It answers the question: 'what data item occupies a given sort position?'.
- Reversing an index - a sort index can be reversed by the generic reversal operation revs(), or mutrevs(). This has the effect of changing between ascending/descending sort orders without re-sorting or even reversing the (possibly bulky) actual data.
- Rank Index - corresponds to the given data order, listing the sort positions (ranks) for the data items, e.g. the third entry in the rank index gives the rank of the third data item. Some statistical measures require ranks of data. It answers the question: 'what is the sort position of a given data item?'.
- Inverting an index - sort index and rank index are mutually inverse. Thus they can be easily switched by invindex(). This is usually the easiest way to obtain a rank index. They will both be equal to 0..n for data that is already in ascending order.
- Complement of an index - beware that the standard reversal will not convert directly between ascending and descending ranks. This purpose is served by complindex(). Alternatively, descending ranks can be reconstructed by applying invindex() to a descending sort index.
- Unindexing - given a sort index and some data, unindex() will pick the data in the new order defined by the sort index. It can be used to efficiently transform lots of data vectors into the same (fixed) order. For example: Suppose we have vectors: keys and data_1,..data_n, not explicitly joined together in some bulky Struct elements. The sort index obtained by: let indx = keys.sort_indexed() can then be efficiently applied to sort the data vectors individually, e.g. indx.unindex(data_n,true) (false to obtain a descending order at no extra cost).
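The relationships between these indices can be illustrated with a minimal sketch that uses only the standard library. The function names below (sort_index, invert_index, complement_index, apply_index) are hypothetical stand-ins chosen for this illustration; they are not the crate's methods and make no claim about how the crate implements them.

```rust
/// Sort index: subscripts into `data`, listed in ascending order of the items.
/// Assumes a total order (`Ord`); floats would need e.g. `f64::total_cmp`.
fn sort_index<T: Ord>(data: &[T]) -> Vec<usize> {
    let mut index: Vec<usize> = (0..data.len()).collect();
    // `sort_by` is stable, so equal items keep their original relative order.
    index.sort_by(|&a, &b| data[a].cmp(&data[b]));
    index
}

/// Inverts an index (a permutation of 0..n), turning a sort index
/// into a rank index and vice versa.
fn invert_index(index: &[usize]) -> Vec<usize> {
    let mut inverse = vec![0; index.len()];
    for (position, &subscript) in index.iter().enumerate() {
        inverse[subscript] = position;
    }
    inverse
}

/// Complement of an index: maps each entry x to n-1-x,
/// which converts ascending ranks into descending ranks.
fn complement_index(index: &[usize]) -> Vec<usize> {
    let n = index.len();
    index.iter().map(|&x| n - 1 - x).collect()
}

/// Picks items of `data` in the order given by `index`;
/// `ascending == false` walks the index backwards instead.
fn apply_index<T: Clone>(index: &[usize], data: &[T], ascending: bool) -> Vec<T> {
    if ascending {
        index.iter().map(|&i| data[i].clone()).collect()
    } else {
        index.iter().rev().map(|&i| data[i].clone()).collect()
    }
}
```

For keys [30, 10, 20], the sort index is [1, 2, 0] and its inverse, the rank index, is [2, 0, 1]. The complement of the ranks, [0, 2, 1], equals the inverse of the reversed sort index, which is the equivalence described under 'Complement of an index'.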
Struct and Utility Functions
use ;
- pub struct Minmax holds minimum and maximum values of a Vec and their indices.
- binary_find is a general purpose binary search/solver.
- here!() is a macro giving the filename, line number and function name of the place from where it was invoked. It can be interpolated into any error/tracing messages and reports.
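To show what a general purpose binary search/solver can look like, here is a minimal sketch driven by a closure rather than by a slice. The name binary_search_by_probe and its signature are assumptions made for this example; they are not the signature of binary_find.

```rust
use std::cmp::Ordering;
use std::ops::Range;

/// Searches `range` for a subscript at which `probe` reports `Equal`.
/// `probe(i)` must say how the value at `i` compares to the target, and
/// must be monotone over the range (Less, then Equal, then Greater).
/// Returns Ok(subscript) on a hit, or Err(insertion point) on a miss.
fn binary_search_by_probe(
    range: Range<usize>,
    mut probe: impl FnMut(usize) -> Ordering,
) -> Result<usize, usize> {
    let (mut lo, mut hi) = (range.start, range.end);
    while lo < hi {
        let mid = lo + (hi - lo) / 2; // avoids overflow of lo + hi
        match probe(mid) {
            Ordering::Less => lo = mid + 1, // value at mid is below the target
            Ordering::Greater => hi = mid,  // value at mid is above the target
            Ordering::Equal => return Ok(mid),
        }
    }
    Err(lo)
}
```

Because the probe is a closure, the same routine can search sorted data (probe: |i| v[i].cmp(&target)) or solve a monotone equation over integers (probe: |i| (i * i).cmp(&49)).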
Trait Indices
use ;
The methods of this trait are implemented for slices of subscripts, i.e. they take the type &[usize]
as input (self) and produce new index Vec<usize>
, new data vector Vec<T>
or Vec<f64>
, or other results, as appropriate. Please see the Glossary above for descriptions of the indices and operations on them.
/// Methods to manipulate indices of `Vec<usize>` type.
Trait Vecops
use ;
The methods of this trait are applicable to all generic slices &[T]
(the data). Thus they will work on all Rust primitive numeric end types, such as f64. They can also work on slices holding any arbitrarily complex end type T
, as long as the required traits, PartialOrd
and/or Clone
, are implemented for T
.
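The role of the trait bounds can be seen in a small generic function. The sketch below finds the minimum and maximum of any slice whose items implement PartialOrd and Clone, together with their subscripts. The struct Extremes and the function extremes are hypothetical names for this illustration and are not the crate's Minmax or its methods.

```rust
/// Minimum and maximum items of a slice, with their subscripts.
#[derive(Debug, PartialEq)]
struct Extremes<T> {
    min: T,
    min_index: usize,
    max: T,
    max_index: usize,
}

/// Works for any end type T that can be compared and cloned.
/// Returns None for an empty slice. On ties, the first occurrence wins.
fn extremes<T: PartialOrd + Clone>(data: &[T]) -> Option<Extremes<T>> {
    let first = data.first()?;
    let mut found = Extremes {
        min: first.clone(),
        min_index: 0,
        max: first.clone(),
        max_index: 0,
    };
    for (i, item) in data.iter().enumerate().skip(1) {
        if *item < found.min {
            found.min = item.clone();
            found.min_index = i;
        } else if *item > found.max {
            found.max = item.clone();
            found.max_index = i;
        }
    }
    Some(found)
}
```

The same function accepts &[f64], &[&str] or a slice of any user type with these two traits implemented, which is the kind of genericity described above.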
Trait Mutops
use ;
This trait contains muthashsort
, which overwrites self
with sorted data. When we do not need to keep the original order, this is the most efficient way to sort.
Nota bene: muthashsort
really wins on longer Vecs. For about one thousand items upwards, it is on average about 25%-30% faster than the default Rust (Quicksort) sort_unstable
.
/// Mutable Operators on `&mut[T]`
Trait Printing
use Printing; // the trait methods
use *; // the colour constants
This trait provides utility methods to 'stringify' (serialise) generic slices and slices of Vec
s. Also, methods for writing or printing them. Optionally, it enables printing them in bold ANSI terminal colours for adding emphasis. See tests/tests.rs
for examples of usage.
The methods of this trait are implemented for generic individual items T
, for slices &[T]
, for slices of slices &[&[T]]
and for slices of Vecs &[Vec<T>]
. Note that these types are normally unprintable in Rust (do not have Display
implemented).
The following methods: .to_plainstr()
, .to_str()
, .gr()
, .rd()
, .yl()
, .bl()
, .mg()
, .cy()
convert all these types to printable strings. The colouring methods just add the relevant colouring to the formatted output of .to_str()
.
fn wvec(self,f:&mut File) -> Result<(), io::Error> where Self: Sized;
writes plain space separated values (.ssv
) to files, possibly raising io::Error(s).
fn pvec(self) where Self: Sized;
prints to stdout.
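The idea of writing space separated values can be sketched with the standard library alone. The function write_ssv below is a hypothetical free function, not the crate's wvec; it accepts any io::Write, so the same code can target a File, stdout or an in-memory buffer.

```rust
use std::fmt::Display;
use std::io::{self, Write};

/// Writes the items on one line, separated by single spaces,
/// followed by a newline. Any io::Error is passed to the caller.
fn write_ssv<T: Display, W: Write>(items: &[T], out: &mut W) -> io::Result<()> {
    let line = items
        .iter()
        .map(|item| item.to_string())
        .collect::<Vec<String>>()
        .join(" ");
    writeln!(out, "{line}")
}
```

Passing &mut std::io::stdout() as the second argument prints the same line to the terminal, and a mutable File reference writes it to a file.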
For finer control of the colouring, import the colour constants from module printing
and use them in any formatting strings manually. For example,
switching colours:
use *; // ANSI colours constants
println!;
Note that all of these methods and interpolations set their own new colour regardless of the previous settings. Interpolating {UN}
resets the terminal to its default foreground rendering.
UN
is automatically appended at the end of strings produced by the colouring methods rd()..cy()
. Be careful to always close with one of these, or explicit {UN}
, otherwise all the following output will continue with the last selected colour foreground rendering.
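The mechanism behind such colouring is the standard ANSI escape sequence. The following minimal sketch defines its own constants (RED, GREEN and RESET are hypothetical names, and their values are assumed for illustration rather than copied from the crate) and shows why a closing reset matters.

```rust
// Standard ANSI SGR sequences: 1 selects bold, 31/32 select the red/green
// foreground, 0 resets all attributes. The names are assumptions for this sketch.
const RED: &str = "\x1b[1;31m";
const GREEN: &str = "\x1b[1;32m";
const RESET: &str = "\x1b[0m";

/// Wraps `text` in the given colour and always closes with a reset,
/// so that following output is not affected.
fn paint(colour: &str, text: &str) -> String {
    format!("{colour}{text}{RESET}")
}

/// Switches colours inside one string; each sequence overrides the
/// previous one, and only the final reset is needed.
fn two_tone(first: &str, second: &str) -> String {
    format!("{GREEN}{first} {RED}{second}{RESET}")
}
```

Printing paint(RED, "error") on an ANSI terminal shows the word in bold red and leaves later output in the default rendering.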
Example from tests/tests.rs
:
println!;
memsearch
returns None
, when midval
is not found in vm
. Here, None
will be printed in red, while any found item will be printed in green. This is also an example of how to process Option
s without the long-winded match
statements.
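The same pattern can be written with the standard Option combinators. This is a minimal sketch with hypothetical names (find_position, describe) and locally defined colour constants whose values are assumed; it does not reproduce the line from tests/tests.rs.

```rust
// ANSI sequences assumed for this sketch: bold red, bold green, reset.
const RED: &str = "\x1b[1;31m";
const GREEN: &str = "\x1b[1;32m";
const RESET: &str = "\x1b[0m";

/// Plain linear search, standing in for any search that returns an Option.
fn find_position<T: PartialEq>(data: &[T], target: &T) -> Option<usize> {
    data.iter().position(|item| item == target)
}

/// Turns the Option into a coloured string without a `match`:
/// the first closure handles None, the second handles Some(item).
fn describe<T: std::fmt::Display>(found: Option<T>) -> String {
    found.map_or_else(
        || format!("{RED}None{RESET}"),
        |item| format!("{GREEN}{item}{RESET}"),
    )
}
```

For example, describe(find_position(&[4, 5, 6], &5)) yields a green "1", while searching for an absent value yields a red "None".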
Release Notes (Latest First)
Version 1.3.11 - Added module search.rs
. Improved general binary_any
and binary_all
search algorithms are now within it.