Crate agnes
Data management library for Rust. It provides data structs and utilities for data loading, preprocessing, aggregation, manipulation, viewing, and serialization.
For a more complete description of agnes, along with a feature list, usage information, example code, and more, see the repository README. For a guide on how to get started with agnes, click here.
Primary Structures
agnes works with heterogeneously-typed labeled tabular data: a group of fields (columns), each with a label and its own data type, where every field has the same number of rows.
Labels in agnes are unit-like marker structs which exist only to uniquely identify a field at the type level (i.e. at compile time). The tablespace macro exists to define these labels.
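To make the idea of a type-level label concrete, here is a minimal sketch in plain Rust. It does not use agnes or the tablespace macro; the names `Name`, `Age`, `LabelName`, `Labeled`, and `oldest` are hypothetical and chosen only for illustration.

```rust
use std::marker::PhantomData;

// Hypothetical unit-like marker structs: no data, they exist only as types.
#[derive(Debug, Clone, Copy)]
pub struct Name;
#[derive(Debug, Clone, Copy)]
pub struct Age;

// Hypothetical trait giving each label a printable name.
pub trait LabelName {
    const NAME: &'static str;
}
impl LabelName for Name {
    const NAME: &'static str = "Name";
}
impl LabelName for Age {
    const NAME: &'static str = "Age";
}

// A column of values tagged with a label type. The tag takes no space at run time.
pub struct Labeled<L, T> {
    _label: PhantomData<L>,
    pub values: Vec<T>,
}

impl<L: LabelName, T> Labeled<L, T> {
    pub fn new(values: Vec<T>) -> Self {
        Labeled { _label: PhantomData, values }
    }

    pub fn label_name(&self) -> &'static str {
        L::NAME
    }
}

// Accepts only the column labeled `Age`; passing a `Labeled<Name, _>` is a compile error.
pub fn oldest(ages: &Labeled<Age, u32>) -> Option<u32> {
    ages.values.iter().copied().max()
}
```

Because the label is part of the column's type, mixing up two fields is caught by the compiler rather than at run time. For example, `oldest(&Labeled::<Age, u32>::new(vec![31, 47]))` type-checks, while the same call with a `Name` column does not.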
The primary data storage structure in agnes is the DataStore, which is a list of FieldData objects, each of which contains the data for a single field. A DataStore contains all of the data loaded from a single source into a program, and once data is added to a DataStore it is immutable.
The primary data structure used by the end user of this library is the DataView, which references one or more DataFrame objects, each of which holds a reference to, and provides access to, a single DataStore. The DataView struct provides a way of selecting fields (columns) across one or more data sources, with the DataFrame struct providing a way to select specific rows from those data sources (after, for example, a filtering or join operation).
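The relationship between an immutable store and a row-selecting frame can be sketched in plain Rust as follows. This is an illustration of the general pattern only, under the assumption of a single `i64` column and reference counting via `Rc`; `Store`, `Frame`, `filter`, and `to_vec` are hypothetical names and not the agnes API.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a data store: a single column, never mutated after creation.
pub struct Store {
    values: Vec<i64>,
}

impl Store {
    // Returns the store behind an `Rc` so that several frames can share it.
    pub fn new(values: Vec<i64>) -> Rc<Store> {
        Rc::new(Store { values })
    }
}

// Hypothetical stand-in for a frame: a shared store plus the rows currently selected.
#[derive(Clone)]
pub struct Frame {
    store: Rc<Store>,
    rows: Vec<usize>,
}

impl Frame {
    // A fresh frame selects every row of the store, in order.
    pub fn new(store: Rc<Store>) -> Frame {
        let rows = (0..store.values.len()).collect();
        Frame { store, rows }
    }

    // Filtering builds a new row selection; the underlying store is untouched.
    pub fn filter<F: Fn(i64) -> bool>(&self, keep: F) -> Frame {
        let rows = self
            .rows
            .iter()
            .copied()
            .filter(|&r| keep(self.store.values[r]))
            .collect();
        Frame { store: Rc::clone(&self.store), rows }
    }

    // Materializes the selected rows as a plain vector.
    pub fn to_vec(&self) -> Vec<i64> {
        self.rows.iter().map(|&r| self.store.values[r]).collect()
    }
}
```

In this sketch, filtering never copies or edits the stored data; it only records which row indices remain visible, so the original frame and the filtered frame can coexist over the same store.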
The FieldSelect and SelectFieldByLabel traits provide methods to access a single field from a DataView. They return a type that implements DataIndex, which provides accessor methods to the data of that field: an index-based method get_datum and an iterator provided by iter.
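The two access styles described above (index-based and iterator-based) can be sketched with a small trait in plain Rust. The trait `Indexed`, the iterator `IndexedIter`, and the type `Column` are hypothetical; in particular, the signatures below are assumptions made for this sketch and are not the actual signatures of DataIndex.

```rust
// Hypothetical trait offering both index-based and iterator-based access.
pub trait Indexed {
    type Item;

    // Index-based access; `None` when `idx` is out of bounds.
    fn get_datum(&self, idx: usize) -> Option<&Self::Item>;

    // Iterator-based access, built on top of `get_datum`.
    fn iter(&self) -> IndexedIter<'_, Self>
    where
        Self: Sized,
    {
        IndexedIter { data: self, pos: 0 }
    }
}

// Iterator that walks an `Indexed` value from the first datum to the last.
pub struct IndexedIter<'a, D: Indexed> {
    data: &'a D,
    pos: usize,
}

impl<'a, D: Indexed> Iterator for IndexedIter<'a, D> {
    type Item = &'a D::Item;

    fn next(&mut self) -> Option<Self::Item> {
        let item = self.data.get_datum(self.pos)?;
        self.pos += 1;
        Some(item)
    }
}

// A simple column of floats implementing the trait.
pub struct Column(pub Vec<f64>);

impl Indexed for Column {
    type Item = f64;

    fn get_datum(&self, idx: usize) -> Option<&f64> {
        self.0.get(idx)
    }
}
```

With this in place, `Column(vec![1.5, 2.5]).get_datum(0)` fetches a single value, while `iter()` can feed standard iterator adapters such as `sum` or `map`.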
Heterogeneous Lists
agnes makes extensive use of heterogeneous cons-lists to provide data structures that can hold data of varying types (as long as the types are known to the user of the library at compile time). Much of this framework was originally inspired by the frunk Rust library and the HList Haskell library.
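For readers unfamiliar with heterogeneous cons-lists, the following is a minimal sketch of the general technique in plain Rust. It is not the implementation found in the cons module; `Nil`, `Cons`, `Len`, `len_of`, `Columns`, and `sample_columns` are hypothetical names used only here.

```rust
// The empty list.
#[derive(Debug, PartialEq)]
pub struct Nil;

// A list node: one element of type `H` followed by the rest of the list `T`.
#[derive(Debug, PartialEq)]
pub struct Cons<H, T> {
    pub head: H,
    pub tail: T,
}

// Length computed at compile time from the list's type alone.
pub trait Len {
    const LEN: usize;
}
impl Len for Nil {
    const LEN: usize = 0;
}
impl<H, T: Len> Len for Cons<H, T> {
    const LEN: usize = 1 + T::LEN;
}

// Convenience accessor so the length can be read from a value.
pub fn len_of<L: Len>(_list: &L) -> usize {
    L::LEN
}

// Three columns with three different element types held in one list.
pub type Columns = Cons<Vec<&'static str>, Cons<Vec<u32>, Cons<Vec<f64>, Nil>>>;

pub fn sample_columns() -> Columns {
    Cons {
        head: vec!["Ada", "Grace"],
        tail: Cons {
            head: vec![36, 85],
            tail: Cons { head: vec![1.5, 2.5], tail: Nil },
        },
    }
}
```

Each element keeps its own static type, so `sample_columns().tail.head` is known to be a `Vec<u32>` without any casting or run-time type checks.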
In the DataStore struct, a cons-list is used to hold a list of the FieldData objects (each type-parameterized on a potentially different type). The DataView struct has a cons-list of labels referenced by that DataView, along with another cons-list of DataFrames, one for each data source it references.
The basic cons-list implementation can be found in the cons module. Additional functionality for labeling cons-list elements and retrieving elements based on labels can be found in the label module.
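Retrieving a cons-list element by its label can be done entirely at compile time. The sketch below uses the index-type technique (`Here` / `There`) found in the frunk library, which is a swapped-in approach for illustration and not necessarily how the label module does it. All names (`Labeled`, `Lookup`, `field`, `People`, and so on) are hypothetical.

```rust
use std::marker::PhantomData;

// Minimal cons-list, repeated here so the sketch is self-contained.
pub struct Nil;
pub struct Cons<H, T> {
    pub head: H,
    pub tail: T,
}

// A value tagged with a zero-sized label type.
pub struct Labeled<L, V> {
    _label: PhantomData<L>,
    pub value: V,
}

pub fn labeled<L, V>(value: V) -> Labeled<L, V> {
    Labeled { _label: PhantomData, value }
}

// Index types recording where in the list a label was found.
pub struct Here;
pub struct There<I>(PhantomData<I>);

// `I` is inferred by the compiler; it keeps the two impls below from overlapping.
pub trait Lookup<L, I> {
    type Value;
    fn lookup(&self) -> &Self::Value;
}

// Case 1: the head of the list carries the requested label.
impl<L, V, T> Lookup<L, Here> for Cons<Labeled<L, V>, T> {
    type Value = V;
    fn lookup(&self) -> &V {
        &self.head.value
    }
}

// Case 2: the requested label is somewhere in the tail.
impl<L, H, T, I> Lookup<L, There<I>> for Cons<H, T>
where
    T: Lookup<L, I>,
{
    type Value = <T as Lookup<L, I>>::Value;
    fn lookup(&self) -> &Self::Value {
        <T as Lookup<L, I>>::lookup(&self.tail)
    }
}

impl<H, T> Cons<H, T> {
    // Ergonomic entry point: `list.field::<Label, _>()`.
    pub fn field<L, I>(&self) -> &<Self as Lookup<L, I>>::Value
    where
        Self: Lookup<L, I>,
    {
        <Self as Lookup<L, I>>::lookup(self)
    }
}

// Two labels and a two-field table built from them.
pub struct Name;
pub struct Age;

pub type People = Cons<Labeled<Name, Vec<&'static str>>, Cons<Labeled<Age, Vec<u32>>, Nil>>;

pub fn people() -> People {
    Cons {
        head: labeled(vec!["Ada", "Grace"]),
        tail: Cons { head: labeled(vec![36, 85]), tail: Nil },
    }
}
```

A call such as `people().field::<Age, _>()` resolves to the `Vec<u32>` column at compile time; asking for a label that is not in the list is a type error. This inference relies on each label appearing at most once in the list.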
Re-exports
pub extern crate typenum; |
Modules
access | Traits for accessing data within agnes data structures. |
cons | Basic heterogeneous list (cons-list) implementation. |
error | General error struct and helpful conversions. |
field | Data structures and implementations for fields, including FieldData. |
fieldlist | Type aliases and macro for handling specifications of fields in a data source. |
frame | Structs and implementation for row-selecting data structure. |
join | Traits and implementations to handle joining or merging two |
label | Traits, structs, and type aliases for handling cons-list element labels and associated logic. |
partial | Framework for partial function handling (where some functionality is implemented for some but not all of the data types of fields in a data structure). |
permute | Structures, traits, and implementations for handling data permutations. |
select | Traits for selecting a field from a data structure. |
source | Data sources. |
stats | Useful statistics-calculating traits for fields with numeric data. |
store | Data storage struct and implementation. |
test_utils | Functions for generating sample data tables for tests. |
view | Main |
view_stats | Functions for displaying statistics about a |
Macros
Fields | Create a FieldCons cons-list based on a list of provided labels and data types. Used by tablespace macro. |
Labels | Create a LabelCons cons-list based on a list of provided labels. Used to specify a list of field labels to operate over. |
declare_fields | Macro for declaring field labels. Used by tablespace macro. |
first_label | Macro for handling creation of the first label in a table. Used by declare_fields. |
length | Utility macro to determine the length of a cons-list. |
nat_label | Macro for defining a single label and its backing natural. Used by next_label and first_label macros. |
next_label | Macro for handling creation of the subsequent (non-initial) labels in a table. Used by declare_fields. |
schema | Macro for creating a source specification structure used to specify how to extract fields from a data source. It correlates labels (defined using the tablespace macro) to field / column names or indices in a data source. This source specification structure is implemented as a |
table | Creates a data table with supplied data. |
tablespace | Declares a set of data tables that all occupy the same tablespace (i.e. can be merged or joined together). This macro should be used at the beginning of any |
valref | Small utility macro to construct a Value enum with a reference to an existing value. Typically only used for tests. |