Crate sklears_core

§sklears-core - Core Traits and Utilities

This crate provides the foundational traits, types, and utilities that the rest of the sklears machine learning ecosystem builds on.

§Overview

sklears-core defines the essential building blocks for machine learning in Rust:

Core Traits: Estimator, Fit, Predict, Transform, Score
Type System: Type-safe state machines (Untrained/Trained)
Error Handling: Comprehensive error types with context
Validation: Input validation and consistency checks
Utilities: Common helper functions and types
Parallel Processing: Abstractions for parallel algorithms
Dataset Handling: Data loading, splitting, and manipulation

§Core Traits

§Estimator

The base trait for all machine learning models:

pub trait Estimator {
    type Config;
    type Error;
}

§Fit

Training an estimator on data:

pub trait Fit<X, Y>: Estimator {
    type Fitted;
    // Consumes the estimator and returns the fitted model on success
    fn fit(self, x: &X, y: &Y) -> Result<Self::Fitted, Self::Error>;
}

§Predict

Making predictions with a trained model:

pub trait Predict<X, Y>: Estimator {
    // Borrows the model, so it can be reused for further predictions
    fn predict(&self, x: &X) -> Result<Y, Self::Error>;
}
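
To see how these traits fit together, here is a minimal sketch of a regressor that always predicts the mean of its training targets. The traits are redeclared locally, with `Estimator` as a supertrait (an assumption made so that `Self::Error` resolves), rather than imported from the crate, since the snippets above are simplified. `MeanRegressor`, `FittedMean`, and `FitError` are hypothetical names, and `Vec<f64>` stands in for the crate's array types.

```rust
// Local stand-ins mirroring the simplified trait snippets; not the crate's own definitions.
pub trait Estimator {
    type Config;
    type Error;
}

pub trait Fit<X, Y>: Estimator {
    type Fitted;
    fn fit(self, x: &X, y: &Y) -> Result<Self::Fitted, Self::Error>;
}

pub trait Predict<X, Y>: Estimator {
    fn predict(&self, x: &X) -> Result<Y, Self::Error>;
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum FitError {
    EmptyInput,
    LengthMismatch,
}

/// Hypothetical untrained estimator: learns the mean of the training targets.
pub struct MeanRegressor;

/// Hypothetical fitted model holding the learned mean.
pub struct FittedMean {
    mean: f64,
}

impl Estimator for MeanRegressor {
    type Config = ();
    type Error = FitError;
}

impl Estimator for FittedMean {
    type Config = ();
    type Error = FitError;
}

impl Fit<Vec<f64>, Vec<f64>> for MeanRegressor {
    type Fitted = FittedMean;

    // Takes `self` by value: fitting turns the estimator into a different type.
    fn fit(self, x: &Vec<f64>, y: &Vec<f64>) -> Result<FittedMean, FitError> {
        if y.is_empty() {
            return Err(FitError::EmptyInput);
        }
        if x.len() != y.len() {
            return Err(FitError::LengthMismatch);
        }
        let mean = y.iter().sum::<f64>() / y.len() as f64;
        Ok(FittedMean { mean })
    }
}

impl Predict<Vec<f64>, Vec<f64>> for FittedMean {
    // Returns one prediction per input sample, all equal to the learned mean.
    fn predict(&self, x: &Vec<f64>) -> Result<Vec<f64>, FitError> {
        Ok(vec![self.mean; x.len()])
    }
}
```

Because `predict` is implemented only for `FittedMean`, the only way to obtain a value that can predict is to call `fit` first.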

§Transform

Transforming data (for preprocessing and dimensionality reduction):

pub trait Transform<X>: Estimator {
    // Returns a transformed copy; the input is left unchanged
    fn transform(&self, x: &X) -> Result<X, Self::Error>;
}
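
As an illustration of the `Transform` shape, the sketch below centers data by subtracting a stored mean. The traits are local stand-ins that mirror the simplified snippets (with `Estimator` assumed as a supertrait so that `Self::Error` resolves), and `Centerer` with its `from_data` constructor is a hypothetical type, not part of the crate.

```rust
use std::convert::Infallible;

// Local stand-ins mirroring the simplified trait snippets; not the crate's own definitions.
pub trait Estimator {
    type Config;
    type Error;
}

pub trait Transform<X>: Estimator {
    fn transform(&self, x: &X) -> Result<X, Self::Error>;
}

/// Hypothetical fitted transformer that subtracts a stored mean from every value.
pub struct Centerer {
    mean: f64,
}

impl Centerer {
    /// Learns the mean from the data; returns `None` for empty input.
    pub fn from_data(x: &[f64]) -> Option<Self> {
        if x.is_empty() {
            return None;
        }
        Some(Centerer {
            mean: x.iter().sum::<f64>() / x.len() as f64,
        })
    }
}

impl Estimator for Centerer {
    type Config = ();
    // Centering cannot fail once the mean is known.
    type Error = Infallible;
}

impl Transform<Vec<f64>> for Centerer {
    // Input and output share the same type, as in the trait signature.
    fn transform(&self, x: &Vec<f64>) -> Result<Vec<f64>, Self::Error> {
        Ok(x.iter().map(|v| v - self.mean).collect())
    }
}
```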

§Type-Safe State Machines

Models use phantom types to track training state at compile time:

use std::marker::PhantomData;

pub struct Untrained;
pub struct Trained;

pub struct Model<State = Untrained> {
    config: ModelConfig,
    state: PhantomData<State>,
    weights: Option<Weights>, // Only Some in Trained state
}

This ensures:

✅ Can’t predict with an untrained model (compile error)
✅ Can’t accidentally re-train a trained model
✅ Type system enforces correct usage patterns
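
The guarantees above can be sketched with a self-contained typestate example. This is an illustration of the pattern under simplifying assumptions, not the crate's implementation: the model is a hypothetical one-parameter line through the origin, and a single `slope` field replaces `ModelConfig` and `Weights`.

```rust
use std::marker::PhantomData;

pub struct Untrained;
pub struct Trained;

/// Hypothetical one-parameter model: y = slope * x.
pub struct Model<State = Untrained> {
    slope: Option<f64>, // Only Some in the Trained state
    state: PhantomData<State>,
}

impl Model<Untrained> {
    pub fn new() -> Self {
        Model { slope: None, state: PhantomData }
    }

    /// Consumes the untrained model, so the same value cannot be fitted twice.
    /// Assumes `x` contains at least one non-zero value.
    pub fn fit(self, x: &[f64], y: &[f64]) -> Model<Trained> {
        // Least-squares slope through the origin: sum(x * y) / sum(x * x).
        let xy: f64 = x.iter().zip(y).map(|(a, b)| a * b).sum();
        let xx: f64 = x.iter().map(|a| a * a).sum();
        Model { slope: Some(xy / xx), state: PhantomData }
    }
}

impl Model<Trained> {
    /// Defined only for `Model<Trained>`; calling it on `Model<Untrained>` is a compile error.
    pub fn predict(&self, x: f64) -> f64 {
        self.slope.expect("a trained model always holds a slope") * x
    }
}
```

Since `fit` exists only on `Model<Untrained>` and `predict` only on `Model<Trained>`, an expression such as `Model::<Untrained>::new().predict(1.0)` is rejected by the compiler rather than failing at run time.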

§Error Handling

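Failures are reported through the SklearsError enum, whose variants carry context such as the expected and actual shapes or the number of iterations run: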

pub enum SklearsError {
    InvalidInput(String),
    ShapeMismatch { expected: Shape, got: Shape },
    NotFitted,
    ConvergenceError { iterations: usize },
    // ... and many more
}
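
A minimal sketch of how such an enum might be used follows. It redeclares only the four variants shown above, treats `Shape` as a hypothetical `(rows, columns)` tuple, and introduces a hypothetical `check_shape` helper; the crate's real type may differ in all three respects.

```rust
use std::fmt;

/// Hypothetical stand-in for the crate's shape type: (rows, columns).
pub type Shape = (usize, usize);

// Local copy of the variants shown above; not the crate's full enum.
#[derive(Debug, PartialEq)]
pub enum SklearsError {
    InvalidInput(String),
    ShapeMismatch { expected: Shape, got: Shape },
    NotFitted,
    ConvergenceError { iterations: usize },
}

impl fmt::Display for SklearsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SklearsError::InvalidInput(msg) => write!(f, "invalid input: {}", msg),
            SklearsError::ShapeMismatch { expected, got } => {
                write!(f, "shape mismatch: expected {:?}, got {:?}", expected, got)
            }
            SklearsError::NotFitted => write!(f, "model is not fitted"),
            SklearsError::ConvergenceError { iterations } => {
                write!(f, "did not converge after {} iterations", iterations)
            }
        }
    }
}

// Lets the enum be used with `Box<dyn Error>` and the `?` operator.
impl std::error::Error for SklearsError {}

/// Hypothetical helper: checks that the column counts of two shapes agree.
pub fn check_shape(expected: Shape, got: Shape) -> Result<(), SklearsError> {
    if expected.1 != got.1 {
        return Err(SklearsError::ShapeMismatch { expected, got });
    }
    Ok(())
}
```

Carrying `expected` and `got` inside the variant lets callers match on the failure and also produce a readable message without re-deriving the shapes.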

§Validation

Input validation utilities ensure data consistency:

use sklears_core::validation;

// Check that X and y have compatible shapes
validation::check_consistent_length(x, y)?;

// Check for NaN/Inf values
validation::check_array(x)?;

// Validate classification targets
validation::check_classification_targets(y)?;
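
To make the kind of check involved concrete, the sketch below implements two comparable validations over plain slices. The functions are hypothetical stand-ins written for this example: they borrow one name from the snippet above but not its signature, and they report errors as `String` rather than the crate's error type.

```rust
/// Hypothetical check: features and targets must have the same number of samples.
pub fn check_consistent_length(x: &[Vec<f64>], y: &[f64]) -> Result<(), String> {
    if x.len() != y.len() {
        return Err(format!("x has {} samples but y has {}", x.len(), y.len()));
    }
    Ok(())
}

/// Hypothetical check: rejects NaN and infinite values, reporting the first one found.
pub fn check_finite(x: &[Vec<f64>]) -> Result<(), String> {
    for (i, row) in x.iter().enumerate() {
        for (j, v) in row.iter().enumerate() {
            if !v.is_finite() {
                return Err(format!("non-finite value at row {}, column {}", i, j));
            }
        }
    }
    Ok(())
}

/// Runs both checks, stopping at the first failure via `?`.
pub fn validate_inputs(x: &[Vec<f64>], y: &[f64]) -> Result<(), String> {
    check_consistent_length(x, y)?;
    check_finite(x)?;
    Ok(())
}
```

Running the cheap length check before the element-wise scan means mismatched inputs are rejected without touching the data.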

§Parallel Processing

Abstractions for parallel algorithm execution:

use sklears_core::parallel::ParallelConfig;
use rayon::prelude::*;

let config = ParallelConfig::new().n_jobs(-1); // Use all cores

// Process every sample in parallel and collect the results in input order
let results: Vec<_> = data
    .par_iter()
    .map(|sample| process(sample))
    .collect();
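
For readers who want to see the underlying idea without extra dependencies, here is a sketch of an order-preserving parallel map. It deliberately swaps rayon for the standard library's `std::thread::scope`, so it shows the data-parallel pattern only and says nothing about how `ParallelConfig` works. `process` and `parallel_map` are hypothetical names.

```rust
use std::thread;

/// Hypothetical per-sample work: squared Euclidean norm of one sample.
fn process(sample: &[f64]) -> f64 {
    sample.iter().map(|v| v * v).sum()
}

/// Maps `process` over `data` using up to `n_threads` scoped threads, preserving input order.
pub fn parallel_map(data: &[Vec<f64>], n_threads: usize) -> Vec<f64> {
    let n_threads = n_threads.max(1);
    // Ceiling division so every sample lands in a chunk; at least 1 because `chunks(0)` panics.
    let chunk_size = ((data.len() + n_threads - 1) / n_threads).max(1);

    thread::scope(|scope| {
        // Spawn one worker per chunk; collecting the handles starts all workers before any join.
        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|s| process(s)).collect::<Vec<f64>>())
            })
            .collect();

        // Joining in spawn order keeps the results aligned with the input.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

A caller wanting the "use all cores" behaviour could pass the value reported by `std::thread::available_parallelism()` as `n_threads`. A work-stealing pool such as rayon's balances uneven workloads better than the fixed chunks used here.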

§Feature Flags

simd - Enable SIMD optimizations
gpu_support - GPU acceleration support
arrow - Apache Arrow interoperability
binary - Binary serialization support

§Examples

See individual module documentation for detailed examples.

§Integration

This crate is re-exported by the main sklears crate, so you typically don’t need to depend on it directly unless you’re building custom estimators.

Modules§

advanced_array_ops
advanced_benchmarking: Advanced Benchmarking Suite with Performance Regression Detection
algorithm_markers
api_analyzers: API Analysis Engines and Validation Components
api_data_structures: Core Data Structures for API Reference Generation
api_formatters: Output Formatters and Document Generators
api_generator_config: API Generator Configuration Module
async_traits
auto_benchmark_generation: Automatic Benchmark Generation System
autodiff
benchmarking
code_coverage
compatibility
compile_time_macros: Compile-Time Model Verification and Macro System
compile_time_validation
contribution
dataset
dependency_audit
dependent_types: Dependent Type Experiments for sklears-core
derive_macros
distributed
distributed_algorithms: Distributed Machine Learning Algorithms
dsl_impl: Domain-Specific Language (DSL) implementation for machine learning pipelines
effect_types
ensemble_improvements
error
exhaustive_error_handling
exotic_hardware: Auto-generated module structure
exotic_hardware_impls: Concrete Exotic Hardware Implementations
fallback_strategies
features
formal_verification: Formal Verification System for Machine Learning Algorithms
format_io
formatting
input_sanitization
interactive_api_reference: Interactive API Reference Generator
interactive_playground: Auto-generated module structure
macros
memory_safety
mock_objects
parallel
performance_profiling: Advanced Performance Profiling and Optimization Framework
performance_reporting
plugin: Plugin System Module
plugin_marketplace_impl: Concrete Plugin Marketplace Implementation
prelude
public
refinement_types: Refinement Types System for sklears-core
search_engines
streaming_lifetimes
trait_explorer: Trait Explorer Module
traits
tutorial_examples: Concrete Tutorial Examples and Learning Paths
tutorial_system
types
unsafe_audit
utils
validation
validation_examples
wasm_playground_impl: WebAssembly Playground Implementation

Macros§

auto_benchmark: Macro to automatically generate benchmarks for a type
benchmark_suite: Creates comprehensive benchmarking suite for ML algorithms
cfg_feature: Macro for conditional compilation based on feature flags
cfg_impl: Macro for feature-gated function implementations
cfg_type: Macro for conditional type definitions based on features
define_algorithm_category: Macro for defining algorithm categories with compile-time checking
define_estimator: Advanced macro for creating ML estimators with builder pattern and validation
define_ml_algorithm: Creates a complete ML algorithm with all necessary boilerplate
define_ml_float_bounds: Helper macro for creating trait bound combinations commonly used in ML
destructure: Macro for easy destructuring of complex types
error_context: Macro for adding location context automatically
estimator_test_suite: Creates a test suite for an estimator implementation
impl_algorithm_markers: Macro for implementing multiple marker traits at once
impl_default_config: Helper macro for creating default trait implementations
impl_ml_traits: Implements standard machine learning traits for an estimator
io_effect
parameter_map: Creates a simple parameter mapping for algorithm configurations
pattern_guard: Macro for creating pattern guards with custom validation logic
pure_effect: Convenience macros for effect creation
quick_dataset: Advanced macro definitions for sklears-core
random_effect
refinement_predicate: Macro to create a custom refinement predicate
simd_operations: Creates SIMD-optimized operation implementations
validate: Convenience macro for validation
validate_performance: Macro for performance validation
validated_param: Macro for creating compile-time validated parameters
validation_rules: Macro for creating type-safe validation rules
verify_dimensions: Macro for dimension verification
verify_model: Macro for model verification
with_fallback: Convenience macro for executing operations with fallback