forester 0.0.3

A crate for implementing various flavors of random forests and decision trees.
# forester
A Rust crate for tailoring random forests and decision trees to *your*
data set.

The aim of this project is to provide generic functionality for working
with random forests. It is currently at a very early stage of
development. Key elements of the API are starting to stabilize, so if
you have feedback, now is a good time to open an issue
[here][issues].

Don't forget to check out the [examples in the repository][ex-repo].

## Overview

This implementation of random forests is heavily inspired by (1). In
particular, models for classification, regression, and density
estimation will be provided in a unified framework based on traits.

Conceptually, the crate provides two main parts:

1. A generic framework consisting of
    - Functionality for fitting and predicting trees and forests
    - Traits that allow these functions to understand arbitrary user data
2. Common building blocks for plugging into the framework
    - Split/performance criteria (RMSE, Gini, ...)
    - Split-finding strategies (best random, CART, ...)
    - Ensemble combiners (aggregating, boosting - to be done)
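
To make the building blocks above more concrete, here is a minimal,
dependency-free sketch of a Gini split criterion combined with a
"best random" style split finder. It is not taken from the crate: the
function names `gini`, `split_impurity`, and `best_of_candidates` are
hypothetical, and the candidate thresholds are assumed to have been drawn
at random by the caller.

```rust
/// Gini impurity of a set of class labels: 1 - sum_k p_k^2.
/// Assumes every label is smaller than `n_classes`.
fn gini(labels: &[usize], n_classes: usize) -> f64 {
    if labels.is_empty() {
        return 0.0;
    }
    let mut counts = vec![0usize; n_classes];
    for &label in labels {
        counts[label] += 1;
    }
    let n = labels.len() as f64;
    let sum_sq: f64 = counts
        .iter()
        .map(|&c| {
            let p = c as f64 / n;
            p * p
        })
        .sum();
    1.0 - sum_sq
}

/// Size-weighted Gini impurity of the two sides obtained by sending
/// samples with `x <= threshold` to the left and all others to the right.
fn split_impurity(xs: &[f64], ys: &[usize], threshold: f64, n_classes: usize) -> f64 {
    if ys.is_empty() {
        return 0.0;
    }
    let (mut left, mut right) = (Vec::new(), Vec::new());
    for (&x, &y) in xs.iter().zip(ys) {
        if x <= threshold {
            left.push(y);
        } else {
            right.push(y);
        }
    }
    let n = ys.len() as f64;
    (left.len() as f64 * gini(&left, n_classes)
        + right.len() as f64 * gini(&right, n_classes))
        / n
}

/// Evaluate each candidate threshold and keep the one with the lowest
/// impurity. Returns `(threshold, impurity)`, or `None` without candidates.
fn best_of_candidates(
    xs: &[f64],
    ys: &[usize],
    candidates: &[f64],
    n_classes: usize,
) -> Option<(f64, f64)> {
    candidates
        .iter()
        .map(|&t| (t, split_impurity(xs, ys, t, n_classes)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
}
```

Swapping `gini` for a regression criterion such as RMSE, or replacing the
candidate list with an exhaustive scan over sorted feature values, changes
the criterion or the strategy without touching the other part. That
separation is the idea behind the list above.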

## Usage

Most implementations of random forests work on tabular data, more or
less randomly selecting which feature columns to try for a particular
split. This works only with a finite set of predefined features.
However, as described in (1), random forests can work with
infinite-dimensional feature spaces. In other words, the parameter that
identifies a feature can be a continuous value rather than a discrete
column index.

An example of an infinite-dimensional feature space is a feature that is
formed as the linear combination of two columns (see the
`rotational_classifier` example). Which features to use and how to
interpret them strongly depends on the data, so it hardly makes sense to
provide a few arbitrary feature extraction methods. Instead, the work
of reasoning about the data is deferred to the users of the crate, who
need to implement the [`SampleDescription`][SampleDescription] and
[`TrainingData`][TrainingData] traits. These traits define how features
are parameterized and extracted from the data, how the final prediction
in tree leaves is made, and how splits are evaluated, among other things.
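
As an illustration of a continuously parameterized feature, the sketch
below projects a two-column sample onto a direction given by an angle.
It does not use the crate's traits: `Rotation`, `ExtractFeature`, and
`split` are hypothetical names chosen for this example, and the real
[`SampleDescription`][SampleDescription] and
[`TrainingData`][TrainingData] traits may look quite different.

```rust
/// Hypothetical feature parameter: an angle that selects a direction in
/// the plane. Any real value is valid, so there are infinitely many
/// features to choose from.
#[derive(Debug, Clone, Copy)]
struct Rotation {
    theta: f64,
}

/// Hypothetical stand-in for "how to read a feature from a sample".
/// This is NOT the forester API, only an illustration of the idea.
trait ExtractFeature<S> {
    fn extract(&self, sample: &S) -> f64;
}

impl ExtractFeature<[f64; 2]> for Rotation {
    /// Linear combination of the two columns:
    /// cos(theta) * x0 + sin(theta) * x1.
    fn extract(&self, sample: &[f64; 2]) -> f64 {
        self.theta.cos() * sample[0] + self.theta.sin() * sample[1]
    }
}

/// Partition samples by thresholding the projected feature value.
/// Samples with a value `<= threshold` go left, all others go right.
fn split<'a>(
    samples: &'a [[f64; 2]],
    feature: &Rotation,
    threshold: f64,
) -> (Vec<&'a [f64; 2]>, Vec<&'a [f64; 2]>) {
    samples
        .iter()
        .partition(|s| feature.extract(*s) <= threshold)
}
```

With `theta = 0` the feature reduces to the first column, and with
`theta = π/2` it is (up to rounding) the second column, so ordinary
axis-aligned splits are special cases of this parameterization.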

## Examples

Examples can be found in the [repository][repo].


## Literature

1. A. Criminisi, J. Shotton and E. Konukoglu, "*Decision Forests for
   Classification, Regression, Density Estimation, Manifold Learning and
   Semi-Supervised Learning*", Microsoft Research technical report
   TR-2011-114 ([PDF][1])


[1]: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/decisionForests_MSR_TR_2011_114.pdf

[repo]: https://github.com/mbillingr/forester
[ex-repo]: https://github.com/mbillingr/forester/tree/master/forester-examples
[issues]: https://github.com/mbillingr/forester/issues

[SampleDescription]: https://docs.rs/forester/0.0.2/forester/data/trait.SampleDescription.html
[TrainingData]: https://docs.rs/forester/0.0.2/forester/data/trait.TrainingData.html