Struct cogset::Dbscan [] [src]

pub struct Dbscan<P: RegionQuery + ListPoints> where
    P::Point: Hash + Eq + Clone
{ /* fields omitted */ }

Clustering via the DBSCAN algorithm[1].

[DBSCAN] is a density-based clustering algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away).wikipedia

An instance of Dbscan is an iterator over clusters of P. Points classified as noise once all clusters are found are available via noise_points.

This uses the P::Point yielded by the iterators provided by ListPoints and RegionQuery as a unique identifier for each point. The algorithm will behave strangely if the identifier is not unique or not stable within a given execution of DBSCAN. The identifier is cloned several times in the course of execution, so it should be cheap to duplicate (e.g. a usize index, or a &T reference).

[1]: Ester, Martin; Kriegel, Hans-Peter; Sander, Jörg; Xu, Xiaowei (1996). Simoudis, Evangelos; Han, Jiawei; Fayyad, Usama M., eds. A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press. pp. 226–231.

Examples

A basic example:

use cogset::{Dbscan, BruteScan, Euclid};

let points = [Euclid([0.1]), Euclid([0.2]), Euclid([1.0])];

let scanner = BruteScan::new(&points);
let mut dbscan = Dbscan::new(scanner, 0.2, 2);

// get the clusters themselves
let clusters = dbscan.by_ref().collect::<Vec<_>>();
// the first two points are the only cluster
assert_eq!(clusters, &[&[0, 1]]);

// now the noise
let noise = dbscan.noise_points();
// which is just the last point
assert_eq!(noise.iter().cloned().collect::<Vec<_>>(),
           &[2]);

A more complicated example that renders the output nicely:

use std::str;
use cogset::{Dbscan, BruteScan, Euclid};

fn write_points<I>(output: &mut [u8; 76], byte: u8, it: I)
    where I: Iterator<Item = Euclid<[f64; 1]>>
{
    for p in it { output[(p.0[0] * 30.0) as usize] = byte; }
}

// the points we're going to cluster, considered as points in ℝ
// with the conventional distance.
let points = [Euclid([0.25]), Euclid([0.9]), Euclid([2.0]), Euclid([1.2]),
              Euclid([1.9]), Euclid([1.1]),  Euclid([1.35]), Euclid([1.85]),
              Euclid([1.05]), Euclid([0.1]), Euclid([2.5]), Euclid([0.05]),
              Euclid([0.6]), Euclid([0.55]), Euclid([1.6])];

// print the points before clustering
let mut original = [b' '; 76];
write_points(&mut original, b'x', points.iter().cloned());
println!("{}", str::from_utf8(&original).unwrap());

// set-up the data structure that will manage the queries that
// Dbscan needs to do.
let scanner = BruteScan::new(&points);

// create the clusterer: we need 3 points to consider a group a
// cluster, and we're only looking at points 0.2 units apart.
let min_points = 3;
let epsilon = 0.2;
let mut dbscan = Dbscan::new(scanner, epsilon, min_points);

let mut clustered = [b' '; 76];

// run over all the clusters, writing each to the output
for (i, cluster) in dbscan.by_ref().enumerate() {
    // since we used `BruteScan`, `cluster` is a vector of indices
    // into `points`, not the points themselves, so lets map back
    // to the points.
    let actual_points = cluster.iter().map(|idx| points[*idx]);

    write_points(&mut clustered, b'0' + i as u8,
                 actual_points)
}
// now run over the noise points, i.e. points that aren't close
// enough to others to be in a cluster.
let noise = dbscan.noise_points();
write_points(&mut clustered, b'.',
             noise.iter().map(|idx| points[*idx]));

// print the numbered clusters
println!("{}", str::from_utf8(&clustered).unwrap());

Output:

 x x   x        x x        x   x x  x   x       x      x x  x              x
 0 0   0        . .        2   2 2  2   2       .      1 1  1              .

Methods

impl<P: RegionQuery + ListPoints> Dbscan<P> where
    P::Point: Hash + Eq + Clone
[src]

Create a new DBSCAN instance, with the given eps and min_points.

eps is the maximum distance between points when creating neighbours to construct clusters. min_points is the minimum of points for a cluster.

This does not perform any significant computation immediately; clusters are found on the fly via the Iterator instance.

Points that have been classified as noise once the algorithm finishes.

This only makes sense to call once the iterator is exhausted, and will give unspecified nonsense if called earlier.

Trait Implementations

impl<P: RegionQuery + ListPoints> Iterator for Dbscan<P> where
    P::Point: Hash + Eq + Clone
[src]

The type of the elements being iterated over.

Advances the iterator and returns the next value. Read more

Returns the bounds on the remaining length of the iterator. Read more

Consumes the iterator, counting the number of iterations and returning it. Read more

Consumes the iterator, returning the last element. Read more

Returns the nth element of the iterator. Read more

🔬 This is a nightly-only experimental API. (iterator_step_by)

unstable replacement of Range::step_by

Creates an iterator starting at the same point, but stepping by the given amount at each iteration. Read more

Takes two iterators and creates a new iterator over both in sequence. Read more

'Zips up' two iterators into a single iterator of pairs. Read more

Takes a closure and creates an iterator which calls that closure on each element. Read more

Creates an iterator which uses a closure to determine if an element should be yielded. Read more

Creates an iterator that both filters and maps. Read more

Creates an iterator which gives the current iteration count as well as the next value. Read more

Creates an iterator which can use peek to look at the next element of the iterator without consuming it. Read more

Creates an iterator that [skip]s elements based on a predicate. Read more

Creates an iterator that yields elements based on a predicate. Read more

Creates an iterator that skips the first n elements. Read more

Creates an iterator that yields its first n elements. Read more

An iterator adaptor similar to [fold] that holds internal state and produces a new iterator. Read more

Creates an iterator that works like map, but flattens nested structure. Read more

Creates an iterator which ends after the first [None]. Read more

Do something with each element of an iterator, passing the value on. Read more

Borrows an iterator, rather than consuming it. Read more

Transforms an iterator into a collection. Read more

Consumes an iterator, creating two collections from it. Read more

An iterator adaptor that applies a function, producing a single, final value. Read more

Tests if every element of the iterator matches a predicate. Read more

Tests if any element of the iterator matches a predicate. Read more

Searches for an element of an iterator that satisfies a predicate. Read more

Searches for an element in an iterator, returning its index. Read more

Searches for an element in an iterator from the right, returning its index. Read more

Returns the maximum element of an iterator. Read more

Returns the minimum element of an iterator. Read more

Returns the element that gives the maximum value from the specified function. Read more

Returns the element that gives the maximum value with respect to the specified comparison function. Read more

Returns the element that gives the minimum value from the specified function. Read more

Returns the element that gives the minimum value with respect to the specified comparison function. Read more

Reverses an iterator's direction. Read more

Converts an iterator of pairs into a pair of containers. Read more

Creates an iterator which [clone]s all of its elements. Read more

Repeats an iterator endlessly. Read more

Sums the elements of an iterator. Read more

Iterates over the entire iterator, multiplying all the elements Read more

Lexicographically compares the elements of this Iterator with those of another. Read more

Lexicographically compares the elements of this Iterator with those of another. Read more

Determines if the elements of this Iterator are equal to those of another. Read more

Determines if the elements of this Iterator are unequal to those of another. Read more

Determines if the elements of this Iterator are lexicographically less than those of another. Read more

Determines if the elements of this Iterator are lexicographically less or equal to those of another. Read more

Determines if the elements of this Iterator are lexicographically greater than those of another. Read more

Determines if the elements of this Iterator are lexicographically greater than or equal to those of another. Read more