single_algebra 0.6.0

# single-algebra 🧮

A powerful linear algebra and machine learning utilities library for Rust, providing efficient matrix operations, dimensionality reduction, and statistical analysis tools.

## Features 🚀

- **Efficient Matrix Operations**: Support for both dense and sparse matrices (CSR/CSC formats)
- **Dimensionality Reduction**: PCA implementations for both dense and sparse matrices
- **SVD Implementations**: Multiple SVD backends including LAPACK and Faer
- **Statistical Analysis**: Comprehensive statistical operations with batch processing support
- **Similarity Measures**: Collection of distance/similarity metrics for high-dimensional data
- **Masking Support**: Selective data processing with boolean masks
- **Parallel Processing**: Efficient multi-threaded implementations using Rayon
- **Configurable**: Optional backends and integrations are enabled through Cargo feature flags

## Matrix Operations 📊

- **SVD**: Choose between parallel, LAPACK, or Faer implementations
- **Sparse Matrix Support**: Comprehensive operations for CSR and CSC sparse matrix formats
- **Masked Operations**: Selective data processing with boolean masks
- **Batch Processing**: Statistical operations grouped by batch identifiers
- **Normalization**: Row and column normalization with customizable targets
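
To make the normalization bullet concrete, here is a minimal, standard-library-only sketch of row normalization on CSR storage. It does not use this crate's API: the function name `normalize_rows` and the raw `indptr`/`data` slices are assumptions chosen for illustration.

```rust
/// Scale every row of a CSR matrix in place so that its stored values
/// sum to `target`. Hypothetical helper, not part of single-algebra.
///
/// `indptr` has one entry per row plus one: row `i` owns the values in
/// `data[indptr[i]..indptr[i + 1]]`. Column indices are not needed here
/// because scaling a row never moves a value to another column.
fn normalize_rows(indptr: &[usize], data: &mut [f64], target: f64) {
    for row in 0..indptr.len().saturating_sub(1) {
        let (start, end) = (indptr[row], indptr[row + 1]);
        let sum: f64 = data[start..end].iter().sum();
        // Rows that sum to zero (including empty rows) are left untouched.
        if sum != 0.0 {
            let factor = target / sum;
            for value in &mut data[start..end] {
                *value *= factor;
            }
        }
    }
}
```

Because only the stored values are visited, the cost is proportional to the number of non-zero entries rather than to the full matrix size.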

## Dimensionality Reduction ⬇️

- **PCA Framework**: Flexible implementation with customizable SVD backends
- **Dense Matrix PCA**: Optimized implementation for dense matrices
- **Sparse Matrix PCA**: Memory-efficient PCA for sparse matrices
- **Masked Sparse PCA**: Apply PCA on selected features only
- **Incremental Processing**: Support for large datasets that don't fit in memory
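
As a rough illustration of what PCA computes, the sketch below centers the data, builds the sample covariance matrix, and finds the first principal component. It deliberately swaps in power iteration for the SVD backends listed above, uses only the standard library, and every function name in it is hypothetical rather than part of this crate.

```rust
/// Subtract each column's mean from every row and return the centered copy.
fn center_columns(data: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = data.len() as f64;
    let mut means = vec![0.0; data[0].len()];
    for row in data {
        for (m, x) in means.iter_mut().zip(row) {
            *m += x / n;
        }
    }
    data.iter()
        .map(|row| row.iter().zip(&means).map(|(x, m)| x - m).collect())
        .collect()
}

/// Sample covariance matrix (divides by n - 1) of already-centered data.
fn covariance(centered: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let d = centered[0].len();
    let denom = (centered.len() - 1) as f64;
    let mut cov = vec![vec![0.0; d]; d];
    for row in centered {
        for i in 0..d {
            for j in 0..d {
                cov[i][j] += row[i] * row[j] / denom;
            }
        }
    }
    cov
}

/// Leading eigenvector and eigenvalue of a covariance matrix by power
/// iteration. The uniform start vector is an arbitrary choice; it fails
/// if it happens to be orthogonal to the leading eigenvector.
fn leading_component(cov: &[Vec<f64>], iterations: usize) -> (Vec<f64>, f64) {
    let d = cov.len();
    let mut v = vec![1.0 / (d as f64).sqrt(); d];
    let mut eigenvalue = 0.0;
    for _ in 0..iterations {
        // w = cov * v
        let w: Vec<f64> = cov
            .iter()
            .map(|row| row.iter().zip(&v).map(|(a, b)| a * b).sum())
            .collect();
        let norm = w.iter().map(|x| x * x).sum::<f64>().sqrt();
        if norm == 0.0 {
            break;
        }
        v = w.iter().map(|x| x / norm).collect();
        eigenvalue = norm;
    }
    (v, eigenvalue)
}
```

The explained variance ratio of the component is its eigenvalue divided by the trace of the covariance matrix. A production implementation would use an SVD instead, which yields all components at once and is numerically more robust.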

## Similarity Measures 📏

- **Cosine Similarity**: Measure similarity based on the cosine of the angle between vectors
- **Euclidean Similarity**: Similarity based on Euclidean distance
- **Pearson Similarity**: Measure linear correlation between vectors
- **Manhattan Similarity**: Similarity based on Manhattan distance
- **Jaccard Similarity**: Measure similarity as intersection over union
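
The formulas behind three of these measures can be sketched in plain Rust as follows. This is not the crate's implementation. In particular, the mapping from Euclidean distance to a similarity, `1 / (1 + d)`, is an assumption made for this example, and the crate may use a different convention.

```rust
/// Dot product of two equally long slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine of the angle between `a` and `b`; 0.0 if either vector is zero.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let denom = dot(a, a).sqrt() * dot(b, b).sqrt();
    if denom == 0.0 {
        0.0
    } else {
        dot(a, b) / denom
    }
}

/// Pearson correlation: the cosine similarity of the mean-centered vectors.
fn pearson_similarity(a: &[f64], b: &[f64]) -> f64 {
    let mean = |v: &[f64]| v.iter().sum::<f64>() / v.len() as f64;
    let (mean_a, mean_b) = (mean(a), mean(b));
    let centered_a: Vec<f64> = a.iter().map(|x| x - mean_a).collect();
    let centered_b: Vec<f64> = b.iter().map(|x| x - mean_b).collect();
    cosine_similarity(&centered_a, &centered_b)
}

/// Euclidean distance mapped into (0, 1] via 1 / (1 + d).
/// The mapping is an assumption for illustration only.
fn euclidean_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dist = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f64>()
        .sqrt();
    1.0 / (1.0 + dist)
}
```

Cosine and Pearson similarity ignore vector magnitude, so they suit data where only the direction or the trend matters, while distance-based measures are sensitive to scale.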

## Statistical Analysis 📈

- **Basic Statistics**: Mean, variance, sum, min/max operations
- **Batch Statistics**: Compute statistics grouped by batch identifiers
- **Matrix Variance**: Efficient variance calculations for matrices
- **Nonzero Counting**: Count non-zero elements in sparse matrices
- **Masked Statistics**: Compute statistics on selected rows/columns only
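
A masked statistic restricts the computation to the rows (or columns) that a boolean mask selects. The sketch below shows the idea for per-column means on a dense matrix stored as rows. The function `masked_col_mean` is hypothetical and only illustrates the concept, not the crate's masked API.

```rust
/// Per-column mean over the rows where `row_mask` is `true`.
/// Returns `None` if the mask length does not match the number of rows
/// or if the mask selects no rows at all.
fn masked_col_mean(data: &[Vec<f64>], row_mask: &[bool]) -> Option<Vec<f64>> {
    if data.len() != row_mask.len() {
        return None;
    }
    // Keep only the rows whose mask entry is true.
    let selected: Vec<&Vec<f64>> = data
        .iter()
        .zip(row_mask)
        .filter(|(_, keep)| **keep)
        .map(|(row, _)| row)
        .collect();
    if selected.is_empty() {
        return None;
    }
    let n = selected.len() as f64;
    let mut means = vec![0.0; selected[0].len()];
    for row in &selected {
        for (m, x) in means.iter_mut().zip(row.iter()) {
            *m += x / n;
        }
    }
    Some(means)
}
```

With rows `[1, 10]`, `[2, 20]`, `[3, 30]` and the mask `[true, false, true]`, only the first and third rows contribute to the result.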

## Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
single-algebra = "0.6.0"
```

### Feature Flags

Enable optional features based on your needs:

```toml
[dependencies]
single-algebra = { version = "0.6.0", features = ["lapack", "faer"] }
```

Available features:
- `smartcore`: Enable integration with the SmartCore machine learning library
- `lapack`: Use the LAPACK backend for linear algebra operations
- `faer`: Use the Faer backend for linear algebra operations
- `simba`: Enable SIMD optimizations via simba

## Usage Examples

### Basic PCA with LAPACK Backend

```rust
use ndarray::array;
use single_algebra::dimred::pca::dense::{PCABuilder, LapackSVD};

// Create a sample matrix
let data = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];

// Build PCA with LAPACK backend
let mut pca = PCABuilder::new(LapackSVD)
    .n_components(2)
    .center(true)
    .scale(false)
    .build();

// Fit and transform data
pca.fit(data.view()).unwrap();
let transformed = pca.transform(data.view()).unwrap();

// Access results
let components = pca.components().unwrap();
let explained_variance = pca.explained_variance_ratio().unwrap();
```

### Sparse Matrix Operations

```rust
use nalgebra_sparse::{CooMatrix, CsrMatrix};
use single_algebra::sparse::MatrixSum;

// Create a sparse matrix
let mut coo = CooMatrix::new(3, 3);
coo.push(0, 0, 1.0);
coo.push(1, 1, 2.0);
coo.push(2, 2, 3.0);
let csr: CsrMatrix<f64> = (&coo).into();

// Calculate column sums
let col_sums: Vec<f64> = csr.sum_col().unwrap();
```
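
For readers new to CSR, the following standard-library sketch shows what a column sum does on the underlying arrays. The function `csr_col_sums` and its raw-slice signature are illustrative assumptions, not how `sum_col` is implemented.

```rust
/// Column sums of a CSR matrix given as raw arrays (hypothetical helper).
///
/// Row `i` owns the entries `indptr[i]..indptr[i + 1]`; `indices[k]` is the
/// column of the stored value `data[k]`.
fn csr_col_sums(n_cols: usize, indptr: &[usize], indices: &[usize], data: &[f64]) -> Vec<f64> {
    let mut sums = vec![0.0; n_cols];
    for row in 0..indptr.len().saturating_sub(1) {
        for k in indptr[row]..indptr[row + 1] {
            // Accumulate each stored value into its column's total.
            sums[indices[k]] += data[k];
        }
    }
    sums
}
```

For the 3×3 diagonal matrix built above, the arrays are `indptr = [0, 1, 2, 3]`, `indices = [0, 1, 2]`, and `data = [1.0, 2.0, 3.0]`, so each column sum equals its single diagonal entry.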

### Batch Processing

```rust
use nalgebra_sparse::CsrMatrix;
use single_algebra::sparse::BatchMatrixMean;

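// Sample data with batch identifiers.
// `create_sparse_matrix` is a placeholder for your own code; it is assumed
// to return a `CsrMatrix<f64>` with one row per batch label (five here).
let matrix = create_sparse_matrix();
let batches = vec!["batch1", "batch1", "batch2", "batch2", "batch3"];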

// Calculate mean per batch
let batch_means = matrix.mean_batch_col(&batches).unwrap();

// Access results for a specific batch
let batch1_means = batch_means.get("batch1").unwrap();
```
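
Conceptually, a batch mean groups rows by their label and averages each column within the group. The dense, standard-library sketch below illustrates that grouping. The function `batch_col_means` is hypothetical, and its return type is only assumed to resemble what the crate returns.

```rust
use std::collections::HashMap;

/// Per-column means of `data`, grouped by the batch label of each row.
/// Returns `None` if the number of labels differs from the number of rows.
fn batch_col_means(data: &[Vec<f64>], batches: &[&str]) -> Option<HashMap<String, Vec<f64>>> {
    if data.len() != batches.len() {
        return None;
    }
    // For every batch, keep running column sums and a row count.
    let mut sums: HashMap<String, (Vec<f64>, usize)> = HashMap::new();
    for (row, batch) in data.iter().zip(batches) {
        let entry = sums
            .entry(batch.to_string())
            .or_insert_with(|| (vec![0.0; row.len()], 0));
        for (s, x) in entry.0.iter_mut().zip(row) {
            *s += x;
        }
        entry.1 += 1;
    }
    // Turn the sums into means.
    Some(
        sums.into_iter()
            .map(|(batch, (s, n))| (batch, s.into_iter().map(|v| v / n as f64).collect()))
            .collect(),
    )
}
```

A single pass over the rows is enough because the sums for all batches are accumulated at the same time.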

### Similarity Measures

```rust
use ndarray::Array1;
use single_algebra::similarity::{SimilarityMeasure, CosineSimilarity};

let a = Array1::from_vec(vec![1.0, 2.0, 3.0]);
let b = Array1::from_vec(vec![4.0, 5.0, 6.0]);

let cosine = CosineSimilarity;
let similarity = cosine.calculate(a.view(), b.view());
```

## Performance Considerations

- For large matrices, consider using sparse representations (CSR/CSC)
- Enable the appropriate backend (`lapack` or `faer`) based on your needs
- Use masked operations when working with subsets of data
- Batch processing can significantly improve performance for grouped operations
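
To judge when a sparse representation pays off, a back-of-the-envelope memory estimate can help. The sketch below compares dense `f64` storage with CSR storage, assuming `usize` indices and ignoring allocator and struct overhead. Both function names are hypothetical.

```rust
use std::mem::size_of;

/// Bytes needed to store every entry of a dense `f64` matrix.
fn dense_bytes(n_rows: usize, n_cols: usize) -> usize {
    n_rows * n_cols * size_of::<f64>()
}

/// Approximate bytes for CSR storage: one value and one column index per
/// non-zero entry, plus `n_rows + 1` row offsets.
fn csr_bytes(n_rows: usize, nnz: usize) -> usize {
    nnz * (size_of::<f64>() + size_of::<usize>()) + (n_rows + 1) * size_of::<usize>()
}
```

On a 64-bit target, a 1000 × 1000 matrix with 1% non-zero entries takes 8,000,000 bytes dense and 168,008 bytes in CSR under these assumptions. Once roughly half of the entries are non-zero, CSR no longer saves memory.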

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the BSD 3-Clause License - see the LICENSE.md file for details.

## Acknowledgments

- The LAPACK integration is built upon the `nalgebra-lapack` crate
- Some components are inspired by scikit-learn's implementations
- The Faer backend leverages the high-performance `faer` crate