Crate agnes

Data management library for Rust. It provides data structs and utilities for data loading, preprocessing, aggregation, manipulation, viewing, and serialization.

For a more complete description of agnes along with a feature list, usage information, example code, and more, see the repository README. For a guide on how to get started with agnes, click here.

Primary Structures

agnes works with heterogeneously-typed labeled tabular data -- a group of fields (columns) each with a label and distinct data type, where each field has the same number of rows.

Labels in agnes are unit-like marker structs which only exist to uniquely identify a field at the type level (i.e. compile time). The tablespace macro exists to define these labels.
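To illustrate what "unit-like marker struct" means here, the following is a minimal standalone sketch, not agnes code. The names EmployeeId, Salary, Labeled, and total_salary are hypothetical and chosen only for illustration; in agnes itself the labels would be generated by the tablespace macro rather than written by hand.

```rust
use std::marker::PhantomData;

// Hypothetical labels: unit-like structs that carry no data and exist
// only so the compiler can tell one field from another.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EmployeeId;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Salary;

/// Hypothetical container: a field's values tagged with a label type `L`.
/// The label occupies no space at run time (it lives in `PhantomData`).
pub struct Labeled<L, T> {
    _label: PhantomData<L>,
    pub values: Vec<T>,
}

impl<L, T> Labeled<L, T> {
    pub fn new(values: Vec<T>) -> Self {
        Labeled { _label: PhantomData, values }
    }
}

/// Accepts only a field labeled `Salary`. Passing a
/// `Labeled<EmployeeId, f64>` is rejected at compile time,
/// even though the underlying data type is the same.
pub fn total_salary(field: &Labeled<Salary, f64>) -> f64 {
    field.values.iter().sum()
}
```

Because the label is part of the type, mixing up two fields with the same data type becomes a type error instead of a run-time bug.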

The primary data storage structure in agnes is the DataStore, a list of FieldData objects, each of which contains the data for a single field. A DataStore contains all of the data loaded from a single source into a program; once data is added to a DataStore, it is immutable.

The primary data structure used by the end user of this library is the DataView, which references one or more DataFrame objects, each of which holds a reference and provides access to a single DataStore. The DataView struct provides a way of selecting fields (columns) across one or more data sources, with the DataFrame struct providing a way to select specific rows from those data sources (after, for example, a filtering or join operation).
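The layering described above (an immutable store, with frames that share it and select rows) can be sketched in a few lines. This is a simplified single-column illustration under assumed names (Store, Frame, filter); it is not the agnes implementation, and the real DataFrame and DataStore types are generic over heterogeneous field lists.

```rust
use std::rc::Rc;

/// Hypothetical immutable store holding one column of data.
pub struct Store {
    values: Vec<i64>,
}

impl Store {
    pub fn new(values: Vec<i64>) -> Self {
        Store { values }
    }
}

/// Hypothetical frame: shares a store and records which rows are visible.
pub struct Frame {
    store: Rc<Store>,
    rows: Vec<usize>,
}

impl Frame {
    /// A frame over every row of the store, in original order.
    pub fn new(store: Rc<Store>) -> Self {
        let rows: Vec<usize> = (0..store.values.len()).collect();
        Frame { store, rows }
    }

    /// Returns a new frame over the same store that keeps only rows whose
    /// value satisfies `pred`. No data is copied; only row indices change.
    pub fn filter(&self, pred: impl Fn(i64) -> bool) -> Frame {
        let rows: Vec<usize> = self
            .rows
            .iter()
            .copied()
            .filter(|&r| pred(self.store.values[r]))
            .collect();
        Frame { store: Rc::clone(&self.store), rows }
    }

    /// Iterates over the visible values.
    pub fn iter(&self) -> impl Iterator<Item = i64> + '_ {
        self.rows.iter().map(move |&r| self.store.values[r])
    }
}
```

In this sketch a filter produces a second frame that points at the same store, which is one plausible reason for keeping stored data immutable: many frames can share it safely.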

The FieldSelect and SelectFieldByLabel traits provide methods to access a single field from a DataView. They return a type that implements DataIndex, which provides accessor methods to the data of that field: an index-based method get_datum and an iterator provided by iter.

Heterogeneous Lists

agnes makes extensive use of heterogeneous cons-lists to provide data structures that can hold data of varying types (as long as the types are known to the user of the library at compile time). Much of this framework was originally inspired by the frunk Rust library and the HList Haskell library.

In the DataStore struct, a cons-list is used to hold a list of the FieldData objects (each type-parameterized on a potentially different type). The DataView struct has a cons-list of labels referenced by that DataView along with another cons-list of DataFrames for each data source it references.

The basic cons-list implementation can be found in the cons module. Additional functionality for labeling cons-list elements and retrieving elements based on labels can be found in the label module.
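For readers unfamiliar with the technique, here is a minimal standalone sketch of a heterogeneous cons-list with label-based lookup and a compile-time length. It follows the general approach popularized by frunk (type-level Here / There indices) and is not the code in the cons or label modules; every name in it (Nil, Cons, Labeled, Lookup, get, len_of, Id, Name) is an assumption made for illustration.

```rust
use std::marker::PhantomData;

/// Empty list terminator.
pub struct Nil;

/// A list cell holding a head of type `H` and a tail list `T`.
pub struct Cons<H, T> {
    pub head: H,
    pub tail: T,
}

/// Compile-time length of a cons-list.
pub trait Len {
    const LEN: usize;
}
impl Len for Nil {
    const LEN: usize = 0;
}
impl<H, T: Len> Len for Cons<H, T> {
    const LEN: usize = 1 + T::LEN;
}

/// Convenience wrapper so the length can be read from a value.
pub fn len_of<C: Len>(_list: &C) -> usize {
    C::LEN
}

/// A value tagged with a label type `L`.
pub struct Labeled<L, V> {
    pub value: V,
    _label: PhantomData<L>,
}

pub fn labeled<L, V>(value: V) -> Labeled<L, V> {
    Labeled { value, _label: PhantomData }
}

/// Type-level positions; they keep the two `Lookup` impls from overlapping.
pub struct Here;
pub struct There<I>(PhantomData<I>);

/// Finds the element labeled `L`. The index `I` is inferred by the compiler.
pub trait Lookup<L, I> {
    type Value;
    fn lookup(&self) -> &Self::Value;
}

// Case 1: the head carries the requested label.
impl<L, V, T> Lookup<L, Here> for Cons<Labeled<L, V>, T> {
    type Value = V;
    fn lookup(&self) -> &V {
        &self.head.value
    }
}

// Case 2: the label is somewhere in the tail.
impl<L, I, H, T> Lookup<L, There<I>> for Cons<H, T>
where
    T: Lookup<L, I>,
{
    type Value = <T as Lookup<L, I>>::Value;
    fn lookup(&self) -> &Self::Value {
        <T as Lookup<L, I>>::lookup(&self.tail)
    }
}

/// Free-function form of the lookup, e.g. `get::<Name, _, _>(&list)`.
pub fn get<L, I, C: Lookup<L, I>>(list: &C) -> &<C as Lookup<L, I>>::Value {
    list.lookup()
}

// Hypothetical labels for the usage shown below.
pub struct Id;
pub struct Name;
```

With these definitions, a list such as `Cons { head: labeled::<Id, _>(7u32), tail: Cons { head: labeled::<Name, _>("ada"), tail: Nil } }` stores a u32 and a &str side by side, and `get::<Name, _, _>(&list)` resolves to the &str element entirely at compile time. Requesting a label that is not in the list is a compile error rather than a run-time failure.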

Re-exports

pub extern crate typenum;

Modules

access

Traits for accessing data within agnes data structures.

cons

Basic heterogeneous list (cons-list) implementation.

error

General error struct and helpful conversions.

field

Data structures and implementations for fields.

fieldlist

Type aliases and macro for handling specifications of fields in a data source.

frame

Structs and implementations for the row-selecting data structure.

join

Traits and implementations to handle joining or merging two DataViews.

label

Traits, structs, and type aliases for handling cons-list element labels and associated logic.

partial

Framework for partial function handling (where some functionality is implemented for some but not all of the data types of fields in a data structure).

permute

Structures, traits, and implementations for handling data permutations.

select

Traits for selecting a field from a data structure.

source

Data sources.

stats

Useful statistics-calculating traits for fields with numeric data.

store

Data storage struct and implementation.

test_utils

Functions for generating sample data tables for tests.

view

Main DataView struct and associated implementations.

view_stats

Functions for displaying statistics about a DataView.

Macros

Fields

Create a FieldCons cons-list based on a list of provided labels and data types. Used by tablespace macro.

Labels

Create a LabelCons cons-list based on a list of provided labels. Used to specify a list of field labels to operate over.

declare_fields

Macro for declaring field labels. Used by tablespace macro.

first_label

Macro for handling creation of the first label in a table. Used by declare_fields.

length

Utility macro to determine the length of a cons-list.

nat_label

Macro for defining a single label and its backing natural. Used by next_label and first_label macros.

next_label

Macro for handling creation of the subsequent (non-initial) labels in a table. Used by declare_fields.

schema

Macro for creating a source specification structure used to specify how to extract fields from a data source. It correlates labels (defined using the tablespace macro) to field / column names or indices in a data source. This source specification structure is implemented as a

table

Creates a data table with supplied data.

tablespace

Declares a set of data tables that all occupy the same tablespace (i.e. can be merged or joined together). This macro should be used at the beginning of any agnes-using code, to declare the various source and constructed table field labels.

valref

Small utility macro to construct a Value enum with a reference to an existing value. Typically only used for tests.