sklears-feature-extraction

Latest release: 0.1.0-beta.1 (January 1, 2026). See the workspace release notes for highlights and upgrade guidance.

Overview

sklears-feature-extraction contains text, signal, and image feature transformers designed to mirror scikit-learn’s feature extraction API with Rust-first performance.

Key Features

Text Processing: CountVectorizer, TfidfVectorizer, HashingVectorizer, N-gram analyzers, character models.
Image Features: Patch extraction, HOG descriptors, SIFT-like outlines, and GPU pipelines.
Signal Features: Windowed statistics, spectrograms, wavelet transforms, and FFT-based descriptors.
Pipeline Support: Integrates with sklears preprocessing, selection, and model selection crates.
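
To make the n-gram and hashing features above concrete, here is a minimal, std-only sketch of the hashing trick applied to word n-grams. It is an illustration of the general technique, not the crate's implementation: the helper names (`tokenize`, `word_ngrams`, `hashed_counts`), the whitespace tokenizer, and the use of `DefaultHasher` are all assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Lowercase a document and split it on whitespace (hypothetical tokenizer).
fn tokenize(doc: &str) -> Vec<String> {
    doc.split_whitespace().map(|t| t.to_lowercase()).collect()
}

/// Build every word n-gram whose length lies in `min_n..=max_n`.
fn word_ngrams(tokens: &[String], min_n: usize, max_n: usize) -> Vec<String> {
    let mut grams = Vec::new();
    for n in min_n..=max_n {
        // `windows(0)` panics, and windows longer than the input are empty.
        if n == 0 || n > tokens.len() {
            continue;
        }
        for window in tokens.windows(n) {
            grams.push(window.join(" "));
        }
    }
    grams
}

/// Hash each n-gram into one of `n_features` buckets and count occurrences.
/// No vocabulary is stored, so memory stays fixed regardless of corpus size;
/// the cost is that distinct n-grams may collide in the same bucket.
fn hashed_counts(doc: &str, n_features: usize, min_n: usize, max_n: usize) -> Vec<u32> {
    assert!(n_features > 0, "n_features must be positive");
    let mut counts = vec![0u32; n_features];
    for gram in word_ngrams(&tokenize(doc), min_n, max_n) {
        let mut hasher = DefaultHasher::new();
        gram.hash(&mut hasher);
        let bucket = (hasher.finish() % n_features as u64) as usize;
        counts[bucket] += 1;
    }
    counts
}
```

Calling `hashed_counts("Rust brings fearless concurrency", 16, 1, 2)` returns a 16-element vector whose entries sum to 7 (four unigrams plus three bigrams). Which bucket each n-gram lands in depends on the hasher, and `DefaultHasher` output is not guaranteed to be stable across Rust releases, so a production vectorizer would normally pin a specific hash function.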

Quick Start

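use sklears_feature_extraction::text::TfidfVectorizer;

// Wrapped in `main` so that `?` has a Result to propagate into.
// Assumption: the crate's error type implements `std::error::Error`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let docs = vec![
        "Rust brings fearless concurrency",
        "Machine learning in Rust is fast",
    ];

    // Unigrams and bigrams, keep terms seen in at least one document,
    // and cap the vocabulary at 4096 features.
    let vectorizer = TfidfVectorizer::builder()
        .ngram_range((1, 2))
        .min_df(1)
        .max_features(Some(4096))
        .build();

    // Learn the vocabulary and IDF weights, then vectorize the same documents.
    let tfidf = vectorizer.fit_transform(&docs)?;

    Ok(())
}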
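
For readers who want to see what a TF-IDF transform computes, the following std-only sketch works through the arithmetic on unigrams. It assumes the smoothed weighting that scikit-learn uses by default, idf(t) = ln((1 + n) / (1 + df(t))) + 1, followed by L2 normalization of each row; whether sklears-feature-extraction uses exactly these defaults is an assumption, and the function names `tokens` and `tfidf` are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Lowercased whitespace tokens of a document (hypothetical tokenizer).
fn tokens(doc: &str) -> Vec<String> {
    doc.split_whitespace().map(|t| t.to_lowercase()).collect()
}

/// Fit a sorted vocabulary and return one L2-normalized TF-IDF row per document.
fn tfidf(docs: &[&str]) -> (Vec<String>, Vec<Vec<f64>>) {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokens(d)).collect();

    // Document frequency: the number of documents that contain each term.
    let mut df: BTreeMap<String, usize> = BTreeMap::new();
    for doc in &tokenized {
        let unique: BTreeSet<&String> = doc.iter().collect();
        for term in unique {
            *df.entry(term.clone()).or_insert(0) += 1;
        }
    }

    // BTreeMap iterates in key order, so `vocab` is sorted and aligned with `idf`.
    let vocab: Vec<String> = df.keys().cloned().collect();
    let n_docs = docs.len() as f64;
    let idf: Vec<f64> = df
        .values()
        .map(|&d| ((1.0 + n_docs) / (1.0 + d as f64)).ln() + 1.0)
        .collect();

    let rows: Vec<Vec<f64>> = tokenized
        .iter()
        .map(|doc| {
            // Raw term counts, indexed by position in the sorted vocabulary.
            let mut row = vec![0.0_f64; vocab.len()];
            for term in doc {
                if let Ok(i) = vocab.binary_search(term) {
                    row[i] += 1.0;
                }
            }
            // Weight each count by its IDF, then scale the row to unit L2 norm.
            for (x, w) in row.iter_mut().zip(&idf) {
                *x *= *w;
            }
            let norm = row.iter().map(|x| x * x).sum::<f64>().sqrt();
            if norm > 0.0 {
                for x in row.iter_mut() {
                    *x /= norm;
                }
            }
            row
        })
        .collect();

    (vocab, rows)
}
```

On the two Quick Start documents this yields a nine-term vocabulary. The term "rust" occurs in both documents, so its IDF is ln(3/3) + 1 = 1, and it ends up with a lower weight in each row than terms that occur in only one document. The sketch covers unigrams only; the `ngram_range((1, 2))` setting in the Quick Start would also add bigram terms to the vocabulary.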

Status

Extensively tested via the 11,292 passing workspace suites shipped in 0.1.0-beta.1.
Offers >99% parity with scikit-learn’s feature extraction module, plus GPU paths.
Additional work (streaming text ingestion, audio-specific transforms) documented in TODO.md.
