CLAM: Clustering, Learning and Approximation with Manifolds (v0.29.0)
The Rust implementation of CLAM.
At the time of writing, the project is still in a pre-1.0 state: the API is not yet stable and breaking changes may occur frequently.
Usage
CLAM is a library crate, so you can add it to your crate by running cargo add abd_clam@0.29.0.
Cakes: Nearest Neighbor Search
use ;
use *;
/// The distance function with which to perform clustering and search.
///
/// We use the `distances` crate for the distance function.
/// Generate some random data. You can use your own data here.
///
/// CLAM can handle arbitrarily large datasets. We use a small one here for
/// demonstration.
///
/// We use the `symagen` crate for generating interesting datasets for examples
/// and tests.
let seed = 42;
let mut rng = seed_from_u64;
let = ;
let = ;
let data: = random_tabular;
// We will generate some random labels for each point.
let labels: = data.iter.map.collect;
// We will use the origin as our query.
let query: = vec!;
// RNN search will use a radius of 0.05.
let radius: f32 = 0.05;
// KNN search will find the 10 nearest neighbors.
let k = 10;
// The name of the dataset.
let name = "demo".to_string();
// We will assume that our distance function is cheap to compute.
let is_metric_expensive = false;
// We create the dataset from the data and distance function.
let dataset = new;
// At this point, `dataset` has taken ownership of the `data`.
// The default metadata is the indices of the points in the dataset. We will,
// however, use our random labels as metadata.
let dataset = dataset
.assign_metadata
.unwrap_or_else;
// At this point, `dataset` has also taken ownership of `labels`.
// We will use the default partition criteria for this example. This will partition
// the data until each Cluster contains a single unique point.
let criteria = default;
// The Cakes struct provides the functionality described in the paper.
// We use a single shard here because the demo data is small.
let model = new;
// This line performs a non-trivial amount of work. #understatement
// At this point, the dataset has been reordered to improve search performance.
// We can now perform RNN search on the model.
let rnn_results: = model.rnn_search;
// We can also perform KNN search on the model.
let knn_results: = model.knn_search;
// Both results are a Vec of 2-tuples where the first element is the index of
// the point in the dataset and the second element is the distance from the
// query point.
// We can borrow the reordered labels from the model.
let labels: & = model.shards.metadata;
// We can use the results to get the labels of the points that are within the
// radius of the query point.
let rnn_labels: = rnn_results.iter.map.collect;
// We can use the results to get the labels of the points that are the k nearest
// neighbors of the query point.
let knn_labels: = knn_results.iter.map.collect;
// TODO: Add snippets for saving/loading models.
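To make the shape of the search results concrete, here is a minimal, standard-library-only sketch of what RNN and KNN search return and how the returned indices can be used to look up labels. It does not use the abd_clam API: the functions `euclidean`, `linear_rnn` and `linear_knn`, the four sample points and the string labels are all hypothetical and exist only for this illustration. Both searches are done by exhaustive linear scan, which is the baseline that Cakes is meant to improve upon, not the method Cakes uses.

```rust
/// Euclidean distance between two points of equal dimensionality.
/// (Hand-written stand-in; the example above takes this from the `distances` crate.)
fn euclidean(x: &[f32], y: &[f32]) -> f32 {
    x.iter()
        .zip(y.iter())
        .map(|(a, b)| (a - b) * (a - b))
        .sum::<f32>()
        .sqrt()
}

/// Ranged search by linear scan: every point within `radius` of `query`,
/// returned as (index, distance) pairs in dataset order.
fn linear_rnn(data: &[Vec<f32>], query: &[f32], radius: f32) -> Vec<(usize, f32)> {
    data.iter()
        .enumerate()
        .map(|(i, p)| (i, euclidean(p, query)))
        .filter(|&(_, d)| d <= radius)
        .collect()
}

/// k-nearest-neighbor search by linear scan: the `k` closest points to
/// `query`, returned as (index, distance) pairs sorted by increasing distance.
fn linear_knn(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut hits: Vec<(usize, f32)> = data
        .iter()
        .enumerate()
        .map(|(i, p)| (i, euclidean(p, query)))
        .collect();
    hits.sort_by(|a, b| a.1.total_cmp(&b.1));
    hits.truncate(k);
    hits
}

fn main() {
    // Hypothetical toy data: four 2-dimensional points with one label each.
    let data = vec![
        vec![0.0, 0.03],
        vec![0.5, 0.5],
        vec![0.04, 0.0],
        vec![-1.0, 1.0],
    ];
    let labels = vec!["a", "b", "c", "d"];

    // As in the example above, the origin is the query.
    let query = vec![0.0, 0.0];

    let rnn_results = linear_rnn(&data, &query, 0.05);
    let knn_results = linear_knn(&data, &query, 3);

    // The first element of each tuple indexes into the labels.
    let rnn_labels: Vec<&str> = rnn_results.iter().map(|&(i, _)| labels[i]).collect();
    let knn_labels: Vec<&str> = knn_results.iter().map(|&(i, _)| labels[i]).collect();

    println!("RNN labels: {rnn_labels:?}"); // prints: RNN labels: ["a", "c"]
    println!("KNN labels: {knn_labels:?}"); // prints: KNN labels: ["a", "c", "b"]
}
```

In this sketch the labels stay in their original order because the data is never reordered. With Cakes, the dataset is reordered during model construction, so the indices in the results refer to the reordered dataset and should be used with the reordered labels borrowed from the model.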
Chaoda: Anomaly Detection
TODO ...
License
- MIT