Struct cogset::Dbscan
[−]
[src]
pub struct Dbscan<P: RegionQuery + ListPoints> where
P::Point: Hash + Eq + Clone, { /* fields omitted */ }
Clustering via the DBSCAN algorithm[1].
[DBSCAN] is a density-based clustering algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away).wikipedia
An instance of Dbscan
is an iterator over clusters of
P
. Points classified as noise once all clusters are found are
available via noise_points
.
This uses the P::Point
yielded by the iterators provided by
ListPoints
and RegionQuery
as a unique identifier for each
point. The algorithm will behave strangely if the identifier is
not unique or not stable within a given execution of DBSCAN. The
identifier is cloned several times in the course of execution, so
it should be cheap to duplicate (e.g. a usize
index, or a &T
reference).
[1]: Ester, Martin; Kriegel, Hans-Peter; Sander, Jörg; Xu, Xiaowei (1996). Simoudis, Evangelos; Han, Jiawei; Fayyad, Usama M., eds. A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press. pp. 226–231.
Examples
A basic example:
use cogset::{Dbscan, BruteScan, Euclid}; let points = [Euclid([0.1]), Euclid([0.2]), Euclid([1.0])]; let scanner = BruteScan::new(&points); let mut dbscan = Dbscan::new(scanner, 0.2, 2); // get the clusters themselves let clusters = dbscan.by_ref().collect::<Vec<_>>(); // the first two points are the only cluster assert_eq!(clusters, &[&[0, 1]]); // now the noise let noise = dbscan.noise_points(); // which is just the last point assert_eq!(noise.iter().cloned().collect::<Vec<_>>(), &[2]);
A more complicated example that renders the output nicely:
use std::str; use cogset::{Dbscan, BruteScan, Euclid}; fn write_points<I>(output: &mut [u8; 76], byte: u8, it: I) where I: Iterator<Item = Euclid<[f64; 1]>> { for p in it { output[(p.0[0] * 30.0) as usize] = byte; } } // the points we're going to cluster, considered as points in ℝ // with the conventional distance. let points = [Euclid([0.25]), Euclid([0.9]), Euclid([2.0]), Euclid([1.2]), Euclid([1.9]), Euclid([1.1]), Euclid([1.35]), Euclid([1.85]), Euclid([1.05]), Euclid([0.1]), Euclid([2.5]), Euclid([0.05]), Euclid([0.6]), Euclid([0.55]), Euclid([1.6])]; // print the points before clustering let mut original = [b' '; 76]; write_points(&mut original, b'x', points.iter().cloned()); println!("{}", str::from_utf8(&original).unwrap()); // set-up the data structure that will manage the queries that // Dbscan needs to do. let scanner = BruteScan::new(&points); // create the clusterer: we need 3 points to consider a group a // cluster, and we're only looking at points 0.2 units apart. let min_points = 3; let epsilon = 0.2; let mut dbscan = Dbscan::new(scanner, epsilon, min_points); let mut clustered = [b' '; 76]; // run over all the clusters, writing each to the output for (i, cluster) in dbscan.by_ref().enumerate() { // since we used `BruteScan`, `cluster` is a vector of indices // into `points`, not the points themselves, so lets map back // to the points. let actual_points = cluster.iter().map(|idx| points[*idx]); write_points(&mut clustered, b'0' + i as u8, actual_points) } // now run over the noise points, i.e. points that aren't close // enough to others to be in a cluster. let noise = dbscan.noise_points(); write_points(&mut clustered, b'.', noise.iter().map(|idx| points[*idx])); // print the numbered clusters println!("{}", str::from_utf8(&clustered).unwrap());
Output:
x x x x x x x x x x x x x x x
0 0 0 . . 2 2 2 2 2 . 1 1 1 .
Methods
impl<P: RegionQuery + ListPoints> Dbscan<P> where
P::Point: Hash + Eq + Clone,
[src]
P::Point: Hash + Eq + Clone,
fn new(points: P, eps: f64, min_points: usize) -> Dbscan<P>
Create a new DBSCAN instance, with the given eps
and
min_points
.
eps
is the maximum distance between points when creating
neighbours to construct clusters. min_points
is the minimum
of points for a cluster.
This does not perform any significant computation immediately;
clusters are found on the fly via the Iterator
instance.
fn noise_points(&self) -> &HashSet<P::Point>
Points that have been classified as noise once the algorithm finishes.
This only makes sense to call once the iterator is exhausted, and will give unspecified nonsense if called earlier.
Trait Implementations
impl<P: RegionQuery + ListPoints> Iterator for Dbscan<P> where
P::Point: Hash + Eq + Clone,
[src]
P::Point: Hash + Eq + Clone,
type Item = Vec<P::Point>
The type of the elements being iterated over.
fn next(&mut self) -> Option<Vec<P::Point>>
Advances the iterator and returns the next value. Read more
fn size_hint(&self) -> (usize, Option<usize>)
1.0.0
Returns the bounds on the remaining length of the iterator. Read more
fn count(self) -> usize
1.0.0
Consumes the iterator, counting the number of iterations and returning it. Read more
fn last(self) -> Option<Self::Item>
1.0.0
Consumes the iterator, returning the last element. Read more
fn nth(&mut self, n: usize) -> Option<Self::Item>
1.0.0
Returns the n
th element of the iterator. Read more
fn step_by(self, step: usize) -> StepBy<Self>
🔬 This is a nightly-only experimental API. (iterator_step_by
)
unstable replacement of Range::step_by
Creates an iterator starting at the same point, but stepping by the given amount at each iteration. Read more
fn chain<U>(self, other: U) -> Chain<Self, <U as IntoIterator>::IntoIter> where
U: IntoIterator<Item = Self::Item>,
1.0.0
U: IntoIterator<Item = Self::Item>,
Takes two iterators and creates a new iterator over both in sequence. Read more
fn zip<U>(self, other: U) -> Zip<Self, <U as IntoIterator>::IntoIter> where
U: IntoIterator,
1.0.0
U: IntoIterator,
'Zips up' two iterators into a single iterator of pairs. Read more
fn map<B, F>(self, f: F) -> Map<Self, F> where
F: FnMut(Self::Item) -> B,
1.0.0
F: FnMut(Self::Item) -> B,
Takes a closure and creates an iterator which calls that closure on each element. Read more
fn filter<P>(self, predicate: P) -> Filter<Self, P> where
P: FnMut(&Self::Item) -> bool,
1.0.0
P: FnMut(&Self::Item) -> bool,
Creates an iterator which uses a closure to determine if an element should be yielded. Read more
fn filter_map<B, F>(self, f: F) -> FilterMap<Self, F> where
F: FnMut(Self::Item) -> Option<B>,
1.0.0
F: FnMut(Self::Item) -> Option<B>,
Creates an iterator that both filters and maps. Read more
fn enumerate(self) -> Enumerate<Self>
1.0.0
Creates an iterator which gives the current iteration count as well as the next value. Read more
fn peekable(self) -> Peekable<Self>
1.0.0
Creates an iterator which can use peek
to look at the next element of the iterator without consuming it. Read more
fn skip_while<P>(self, predicate: P) -> SkipWhile<Self, P> where
P: FnMut(&Self::Item) -> bool,
1.0.0
P: FnMut(&Self::Item) -> bool,
Creates an iterator that [skip
]s elements based on a predicate. Read more
fn take_while<P>(self, predicate: P) -> TakeWhile<Self, P> where
P: FnMut(&Self::Item) -> bool,
1.0.0
P: FnMut(&Self::Item) -> bool,
Creates an iterator that yields elements based on a predicate. Read more
fn skip(self, n: usize) -> Skip<Self>
1.0.0
Creates an iterator that skips the first n
elements. Read more
fn take(self, n: usize) -> Take<Self>
1.0.0
Creates an iterator that yields its first n
elements. Read more
fn scan<St, B, F>(self, initial_state: St, f: F) -> Scan<Self, St, F> where
F: FnMut(&mut St, Self::Item) -> Option<B>,
1.0.0
F: FnMut(&mut St, Self::Item) -> Option<B>,
An iterator adaptor similar to [fold
] that holds internal state and produces a new iterator. Read more
fn flat_map<U, F>(self, f: F) -> FlatMap<Self, U, F> where
F: FnMut(Self::Item) -> U,
U: IntoIterator,
1.0.0
F: FnMut(Self::Item) -> U,
U: IntoIterator,
Creates an iterator that works like map, but flattens nested structure. Read more
fn fuse(self) -> Fuse<Self>
1.0.0
Creates an iterator which ends after the first [None
]. Read more
fn inspect<F>(self, f: F) -> Inspect<Self, F> where
F: FnMut(&Self::Item) -> (),
1.0.0
F: FnMut(&Self::Item) -> (),
Do something with each element of an iterator, passing the value on. Read more
fn by_ref(&mut self) -> &mut Self
1.0.0
Borrows an iterator, rather than consuming it. Read more
fn collect<B>(self) -> B where
B: FromIterator<Self::Item>,
1.0.0
B: FromIterator<Self::Item>,
Transforms an iterator into a collection. Read more
fn partition<B, F>(self, f: F) -> (B, B) where
B: Default + Extend<Self::Item>,
F: FnMut(&Self::Item) -> bool,
1.0.0
B: Default + Extend<Self::Item>,
F: FnMut(&Self::Item) -> bool,
Consumes an iterator, creating two collections from it. Read more
fn fold<B, F>(self, init: B, f: F) -> B where
F: FnMut(B, Self::Item) -> B,
1.0.0
F: FnMut(B, Self::Item) -> B,
An iterator adaptor that applies a function, producing a single, final value. Read more
fn all<F>(&mut self, f: F) -> bool where
F: FnMut(Self::Item) -> bool,
1.0.0
F: FnMut(Self::Item) -> bool,
Tests if every element of the iterator matches a predicate. Read more
fn any<F>(&mut self, f: F) -> bool where
F: FnMut(Self::Item) -> bool,
1.0.0
F: FnMut(Self::Item) -> bool,
Tests if any element of the iterator matches a predicate. Read more
fn find<P>(&mut self, predicate: P) -> Option<Self::Item> where
P: FnMut(&Self::Item) -> bool,
1.0.0
P: FnMut(&Self::Item) -> bool,
Searches for an element of an iterator that satisfies a predicate. Read more
fn position<P>(&mut self, predicate: P) -> Option<usize> where
P: FnMut(Self::Item) -> bool,
1.0.0
P: FnMut(Self::Item) -> bool,
Searches for an element in an iterator, returning its index. Read more
fn rposition<P>(&mut self, predicate: P) -> Option<usize> where
P: FnMut(Self::Item) -> bool,
Self: ExactSizeIterator + DoubleEndedIterator,
1.0.0
P: FnMut(Self::Item) -> bool,
Self: ExactSizeIterator + DoubleEndedIterator,
Searches for an element in an iterator from the right, returning its index. Read more
fn max(self) -> Option<Self::Item> where
Self::Item: Ord,
1.0.0
Self::Item: Ord,
Returns the maximum element of an iterator. Read more
fn min(self) -> Option<Self::Item> where
Self::Item: Ord,
1.0.0
Self::Item: Ord,
Returns the minimum element of an iterator. Read more
fn max_by_key<B, F>(self, f: F) -> Option<Self::Item> where
B: Ord,
F: FnMut(&Self::Item) -> B,
1.6.0
B: Ord,
F: FnMut(&Self::Item) -> B,
Returns the element that gives the maximum value from the specified function. Read more
fn max_by<F>(self, compare: F) -> Option<Self::Item> where
F: FnMut(&Self::Item, &Self::Item) -> Ordering,
1.15.0
F: FnMut(&Self::Item, &Self::Item) -> Ordering,
Returns the element that gives the maximum value with respect to the specified comparison function. Read more
fn min_by_key<B, F>(self, f: F) -> Option<Self::Item> where
B: Ord,
F: FnMut(&Self::Item) -> B,
1.6.0
B: Ord,
F: FnMut(&Self::Item) -> B,
Returns the element that gives the minimum value from the specified function. Read more
fn min_by<F>(self, compare: F) -> Option<Self::Item> where
F: FnMut(&Self::Item, &Self::Item) -> Ordering,
1.15.0
F: FnMut(&Self::Item, &Self::Item) -> Ordering,
Returns the element that gives the minimum value with respect to the specified comparison function. Read more
fn rev(self) -> Rev<Self> where
Self: DoubleEndedIterator,
1.0.0
Self: DoubleEndedIterator,
Reverses an iterator's direction. Read more
fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB) where
FromA: Default + Extend<A>,
FromB: Default + Extend<B>,
Self: Iterator<Item = (A, B)>,
1.0.0
FromA: Default + Extend<A>,
FromB: Default + Extend<B>,
Self: Iterator<Item = (A, B)>,
Converts an iterator of pairs into a pair of containers. Read more
fn cloned<'a, T>(self) -> Cloned<Self> where
Self: Iterator<Item = &'a T>,
T: 'a + Clone,
1.0.0
Self: Iterator<Item = &'a T>,
T: 'a + Clone,
Creates an iterator which [clone
]s all of its elements. Read more
fn cycle(self) -> Cycle<Self> where
Self: Clone,
1.0.0
Self: Clone,
Repeats an iterator endlessly. Read more
fn sum<S>(self) -> S where
S: Sum<Self::Item>,
1.11.0
S: Sum<Self::Item>,
Sums the elements of an iterator. Read more
fn product<P>(self) -> P where
P: Product<Self::Item>,
1.11.0
P: Product<Self::Item>,
Iterates over the entire iterator, multiplying all the elements Read more
fn cmp<I>(self, other: I) -> Ordering where
I: IntoIterator<Item = Self::Item>,
Self::Item: Ord,
1.5.0
I: IntoIterator<Item = Self::Item>,
Self::Item: Ord,
Lexicographically compares the elements of this Iterator
with those of another. Read more
fn partial_cmp<I>(self, other: I) -> Option<Ordering> where
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
1.5.0
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
Lexicographically compares the elements of this Iterator
with those of another. Read more
fn eq<I>(self, other: I) -> bool where
I: IntoIterator,
Self::Item: PartialEq<<I as IntoIterator>::Item>,
1.5.0
I: IntoIterator,
Self::Item: PartialEq<<I as IntoIterator>::Item>,
Determines if the elements of this Iterator
are equal to those of another. Read more
fn ne<I>(self, other: I) -> bool where
I: IntoIterator,
Self::Item: PartialEq<<I as IntoIterator>::Item>,
1.5.0
I: IntoIterator,
Self::Item: PartialEq<<I as IntoIterator>::Item>,
Determines if the elements of this Iterator
are unequal to those of another. Read more
fn lt<I>(self, other: I) -> bool where
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
1.5.0
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
Determines if the elements of this Iterator
are lexicographically less than those of another. Read more
fn le<I>(self, other: I) -> bool where
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
1.5.0
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
Determines if the elements of this Iterator
are lexicographically less or equal to those of another. Read more
fn gt<I>(self, other: I) -> bool where
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
1.5.0
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
Determines if the elements of this Iterator
are lexicographically greater than those of another. Read more
fn ge<I>(self, other: I) -> bool where
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
1.5.0
I: IntoIterator,
Self::Item: PartialOrd<<I as IntoIterator>::Item>,
Determines if the elements of this Iterator
are lexicographically greater than or equal to those of another. Read more