§bhtsne
bhtsne contains implementations of both a parallel exact version of the t-SNE algorithm and a parallel approximate version leveraging the Barnes-Hut algorithm. The implementation supports custom data types and user-defined metrics; see tSNE for more details.
This crate also includes load_csv, a convenience function that parses data, record by record, from a CSV file.
§Example
use bhtsne;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    const N: usize = 150;         // Number of vectors to embed.
    const D: usize = 4;           // Dimensionality of the original space.
    const THETA: f32 = 0.5;       // Barnes-Hut parameter: smaller values improve
                                  // accuracy but increase complexity.
    const PERPLEXITY: f32 = 10.0; // Perplexity of the conditional distribution.
    const EPOCHS: usize = 2000;   // Number of fitting iterations.
    const NO_DIMS: u8 = 2;        // Dimensionality of the embedded space.

    // Loads the data from a CSV file, skipping the first row (treated as headers)
    // and the fifth column (treated as the class label).
    // Note that you can also switch to f64 for higher precision.
    let data: Vec<f32> =
        bhtsne::load_csv("iris.csv", true, Some(&[4]), |float| float.parse().unwrap())?;
    let samples: Vec<&[f32]> = data.chunks(D).collect();

    // Executes the Barnes-Hut approximation of the algorithm and writes the
    // embedding to the specified CSV file.
    bhtsne::tSNE::new(&samples)
        .embedding_dim(NO_DIMS)
        .perplexity(PERPLEXITY)
        .epochs(EPOCHS)
        .barnes_hut(THETA, |sample_a, sample_b| {
            // Euclidean distance between two samples.
            sample_a
                .iter()
                .zip(sample_b.iter())
                .map(|(a, b)| (a - b).powi(2))
                .sum::<f32>()
                .sqrt()
        })
        .write_csv("iris_embedding.csv")?;

    Ok(())
}
§Structs
- tSNE: t-distributed stochastic neighbor embedding. Provides a parallel implementation of both the exact version of the algorithm and the tree-accelerated one leveraging space-partitioning trees.
§Functions
- load_csv: Loads data from a CSV file.