avila-parallel 0.2.0

Zero-dependency parallel computation library with advanced operations: sorting, zipping, chunking
# avila-parallel

[![Crates.io](https://img.shields.io/crates/v/avila-parallel.svg)](https://crates.io/crates/avila-parallel)
[![Documentation](https://docs.rs/avila-parallel/badge.svg)](https://docs.rs/avila-parallel)
[![License](https://img.shields.io/crates/l/avila-parallel.svg)](LICENSE)
[![Rust](https://img.shields.io/badge/rust-1.70%2B-blue.svg)](https://www.rust-lang.org)
[![CI](https://img.shields.io/badge/CI-passing-brightgreen.svg)]()

A **zero-dependency** parallel computation library for Rust with **true parallel execution**.

## 📚 Documentation

- **[Quick Start](#-quick-start)** - Get started in 5 minutes
- **[API Documentation](https://docs.rs/avila-parallel)** - Full API reference
- **[Optimization Guide](OPTIMIZATION_GUIDE.md)** - Performance tuning tips
- **[Contributing](CONTRIBUTING.md)** - How to contribute
- **[Changelog](CHANGELOG.md)** - Version history

## ✨ Features

- **🚀 True Parallel Execution**: Real multi-threaded processing using `std::thread::scope`
- **📦 Zero Dependencies**: Only uses Rust standard library (`std::thread`, `std::sync`)
- **🔒 Thread Safe**: All operations use proper synchronization primitives
- **📊 Order Preservation**: Results maintain original element order
- **⚡ Smart Optimization**: Automatically falls back to sequential for small datasets
- **🎯 Rich API**: Familiar iterator-style methods

## 📋 Quick Start

Add to your `Cargo.toml`:

```toml
[dependencies]
avila-parallel = "0.2.0"
```

### Basic Usage

```rust
use avila_parallel::prelude::*;

fn main() {
    // Parallel iteration
    let data = vec![1, 2, 3, 4, 5];
    let sum: i32 = data.par_iter()
        .map(|x| x * 2)
        .sum();
    println!("Sum: {}", sum); // Sum: 30

    // High-performance par_vec API
    let results: Vec<i32> = data.par_vec()
        .map(|&x| x * x)
        .collect();
    println!("{:?}", results); // [1, 4, 9, 16, 25]
}
```

## 🎯 Available Operations

### Transformation
- `map` - Transform each element
- `filter` - Keep elements matching predicate
- `cloned` - Clone elements (for reference iterators)

### Aggregation
- `sum` - Sum all elements
- `reduce` - Reduce with custom operation
- `fold` - Fold with identity and operation
- `count` - Count elements matching predicate
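A parallel `reduce` or `fold` generally works by reducing each chunk independently and then combining the partial results, which is only correct when the operation is associative and the identity value is a true identity. The following is a minimal std-only sketch of that idea; `chunked_reduce` is a hypothetical helper written for illustration, not the crate's implementation or API.

```rust
use std::thread;

/// Hypothetical helper (not part of avila-parallel): reduces each chunk on
/// its own scoped thread, then combines the partial results.
/// `op` must be associative and `identity` must be neutral for `op`,
/// because `identity` is used once per chunk and once in the final combine.
fn chunked_reduce<T, F>(data: &[T], chunk_size: usize, identity: T, op: F) -> T
where
    T: Copy + Send + Sync,
    F: Fn(T, T) -> T + Sync,
{
    thread::scope(|s| {
        // Spawn every chunk first so the threads actually run concurrently.
        let handles: Vec<_> = data
            .chunks(chunk_size.max(1)) // `chunks` panics on 0, so clamp to 1
            .map(|chunk| {
                let op = &op;
                s.spawn(move || chunk.iter().copied().fold(identity, op))
            })
            .collect();
        // Combine the per-chunk results in chunk order.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .fold(identity, &op)
    })
}

fn main() {
    let data: Vec<i64> = (1..=1_000).collect();
    let sum = chunked_reduce(&data, 128, 0, |a, b| a + b);
    println!("{}", sum); // 500500
}
```

Subtraction or floating-point averaging would give chunking-dependent results here, which is why the associativity requirement matters.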

### Search
- `find_any` - Find any element matching predicate
- `all` - Check if all elements match
- `any` - Check if any element matches
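The "any" in `find_any` suggests that the match returned is not necessarily the first one in input order, which lets workers stop as soon as one of them succeeds. The sketch below illustrates that behaviour with the standard library only; `find_any_sketch` and its early-exit flag are assumptions made for illustration, not the crate's actual implementation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Hypothetical helper (not the library's code): returns some element
/// matching `pred`, not necessarily the first one in slice order.
fn find_any_sketch<T, F>(data: &[T], chunk_size: usize, pred: F) -> Option<&T>
where
    T: Sync,
    F: Fn(&T) -> bool + Sync,
{
    // Shared flag that lets the other threads stop scanning early.
    let found = AtomicBool::new(false);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_size.max(1))
            .map(|chunk| {
                let (found, pred) = (&found, &pred);
                s.spawn(move || {
                    for item in chunk {
                        // Another thread already reported a match: give up.
                        if found.load(Ordering::Relaxed) {
                            return None;
                        }
                        if pred(item) {
                            found.store(true, Ordering::Relaxed);
                            return Some(item);
                        }
                    }
                    None
                })
            })
            .collect();
        // Take the first match reported by any joined thread; threads not
        // joined here are joined automatically when the scope ends.
        handles.into_iter().filter_map(|h| h.join().unwrap()).next()
    })
}

fn main() {
    let data: Vec<i32> = (0..10_000).collect();
    let hit = find_any_sketch(&data, 256, |x| *x % 1000 == 999);
    println!("{:?}", hit); // some multiple-of-1000-minus-1, which one may vary
}
```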

### Other
- `partition` - Split into two vectors based on predicate
- `for_each` - Execute function on each element
- `collect` - Collect results into a collection

## 📊 Performance

The library automatically:
- Detects CPU core count
- Distributes work efficiently across threads
- Falls back to sequential execution for small datasets (fewer than `MIN_CHUNK_SIZE` elements per chunk, 1024 by default)
- Maintains result order

### Benchmark Results (10M elements)

| Operation | Sequential | Parallel | Speedup |
|-----------|------------|----------|---------|
| Simple Sum | 7.1ms | 14.2ms | 0.50x* |
| Complex Computation | 229.5ms | 236.1ms | 0.97x |
| Filter | 67.5ms | 92.4ms | 0.73x |

*For simple operations, thread overhead outweighs the parallel gain. Use parallel execution for CPU-bound work (>100 µs per element).
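To check whether a workload is heavy enough to benefit, it helps to time both variants on your own machine. The harness below is a rough std-only sketch: `timed` and `scoped_sum` are hypothetical helpers standing in for the sequential and parallel code paths, and the speedup is computed as sequential time divided by parallel time, matching the table above. It is not the benchmark used to produce those numbers.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result with the elapsed wall-clock time.
fn timed<R>(f: impl FnOnce() -> R) -> (R, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Hypothetical stand-in for a parallel sum, built on scoped threads.
fn scoped_sum(data: &[u64], threads: usize) -> u64 {
    let threads = threads.max(1);
    // Ceiling division: one chunk per thread, never a zero-sized chunk.
    let chunk = ((data.len() + threads - 1) / threads).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (0..10_000_000).collect();
    let threads = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);

    let (seq, t_seq) = timed(|| data.iter().sum::<u64>());
    let (par, t_par) = timed(|| scoped_sum(&data, threads));
    assert_eq!(seq, par);

    // Speedup below 1.0 means the parallel version was slower.
    let speedup = t_seq.as_secs_f64() / t_par.as_secs_f64();
    println!("sequential {:?}, parallel {:?}, speedup {:.2}x", t_seq, t_par, speedup);
}
```

Run it with `cargo run --release`; debug builds distort the comparison, and a single run is noisy, so repeat it a few times before drawing conclusions.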

## 🔧 Advanced Usage

### Using Executor Functions Directly

```rust
use avila_parallel::executor::*;

let data = vec![1, 2, 3, 4, 5];

// Parallel map
let results = parallel_map(&data, |x| x * 2);

// Parallel filter
let evens = parallel_filter(&data, |x| *x % 2 == 0);

// Parallel reduce
let sum = parallel_reduce(&data, |a, b| a + b);

// Parallel partition
let (evens, odds) = parallel_partition(&data, |x| *x % 2 == 0);

// Find first matching
let found = parallel_find(&data, |x| *x > 3);

// Count matching
let count = parallel_count(&data, |x| *x % 2 == 0);
```

### Mutable Iteration

```rust
use avila_parallel::prelude::*;

let mut data = vec![1, 2, 3, 4, 5];
data.par_iter_mut()
    .for_each(|x| *x *= 2);
println!("{:?}", data); // [2, 4, 6, 8, 10]
```
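In-place parallel mutation is safe when every thread receives an exclusive, non-overlapping part of the slice. The standard library expresses this with `chunks_mut`; the sketch below shows the idea with a hypothetical `for_each_mut_sketch` helper and should not be read as how `par_iter_mut` is implemented in this crate.

```rust
use std::thread;

/// Hypothetical helper: applies `f` to every element in place, giving each
/// scoped thread its own disjoint mutable chunk of the slice.
fn for_each_mut_sketch<T, F>(data: &mut [T], chunk_size: usize, f: F)
where
    T: Send,
    F: Fn(&mut T) + Sync,
{
    let f = &f;
    thread::scope(|s| {
        // `chunks_mut` hands out non-overlapping `&mut [T]` slices, so no
        // locking is needed; the scope joins all threads before returning.
        for chunk in data.chunks_mut(chunk_size.max(1)) {
            s.spawn(move || chunk.iter_mut().for_each(f));
        }
    });
}

fn main() {
    let mut data = vec![1, 2, 3, 4, 5];
    for_each_mut_sketch(&mut data, 2, |x| *x *= 2);
    println!("{:?}", data); // [2, 4, 6, 8, 10]
}
```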

## ๐Ÿ—๏ธ Architecture

### Thread Management
- Uses `std::thread::scope` for lifetime-safe thread spawning
- Automatic CPU detection via `std::thread::available_parallelism()`
- Chunk-based work distribution with adaptive sizing
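The three points above can be sketched with the standard library alone. `scoped_map` below is a hypothetical function written for this illustration (one chunk per available core, no adaptive sizing); the crate's real code may differ.

```rust
use std::thread;

/// Hypothetical sketch: maps `f` over `data` using one scoped thread per
/// chunk and returns the results in input order.
fn scoped_map<T, U, F>(data: &[T], f: F) -> Vec<U>
where
    T: Sync,
    U: Send,
    F: Fn(&T) -> U + Sync,
{
    // Detect the available parallelism, falling back to a single thread.
    let threads = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    // Ceiling division so the chunks cover the whole input.
    let chunk_size = ((data.len() + threads - 1) / threads).max(1);
    let f = &f;

    // Scoped threads may borrow `data` and `f` because the scope joins
    // every thread before it returns.
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().map(f).collect::<Vec<U>>()))
            .collect();
        // Joining in spawn order keeps the output in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect::<Vec<U>>()
    })
}

fn main() {
    let data: Vec<i32> = (1..=5).collect();
    println!("{:?}", scoped_map(&data, |x| x * 2)); // [2, 4, 6, 8, 10]
}
```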

### Synchronization
- `Arc<Mutex<>>` for safe result collection
- No unsafe code in public API
- Order preservation through indexed chunks
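One way to combine a shared `Arc<Mutex<...>>` collector with indexed chunks is to tag each chunk's output with its index and sort once all threads are done. The sketch below assumes that design; `collect_indexed` is a hypothetical name, and with scoped threads a plain borrowed `Mutex` would work just as well as the `Arc` shown here.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical sketch: threads push `(chunk_index, results)` into a shared
/// vector in whatever order they finish; sorting by index restores order.
fn collect_indexed<T, U, F>(data: &[T], chunk_size: usize, f: F) -> Vec<U>
where
    T: Sync,
    U: Send,
    F: Fn(&T) -> U + Sync,
{
    let results: Arc<Mutex<Vec<(usize, Vec<U>)>>> = Arc::new(Mutex::new(Vec::new()));
    let f = &f;

    thread::scope(|s| {
        for (index, chunk) in data.chunks(chunk_size.max(1)).enumerate() {
            let results = Arc::clone(&results);
            s.spawn(move || {
                // Do the work outside the lock; hold it only for the push.
                let mapped: Vec<U> = chunk.iter().map(f).collect();
                results.lock().unwrap().push((index, mapped));
            });
        }
    });

    // Every thread has been joined, so take the collected chunks out.
    let mut chunks = std::mem::take(&mut *results.lock().unwrap());
    chunks.sort_by_key(|(index, _)| *index);
    chunks.into_iter().flat_map(|(_, mapped)| mapped).collect()
}

fn main() {
    let data: Vec<i32> = (0..10).collect();
    println!("{:?}", collect_indexed(&data, 3, |x| x * x));
    // [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
}
```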

### Performance Tuning

**Default Configuration:**
```rust
const MIN_CHUNK_SIZE: usize = 1024;  // Optimized based on benchmarks
const MAX_CHUNKS_PER_THREAD: usize = 8;
```
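How these two constants interact is not spelled out here. One plausible reading, shown purely as an assumption, is that the input is split into at most `threads * MAX_CHUNKS_PER_THREAD` chunks, that no chunk is smaller than `MIN_CHUNK_SIZE`, and that inputs too small to fill two chunks run sequentially. `plan_chunk_size` is a hypothetical function illustrating that reading.

```rust
use std::thread;

// Values copied from the default configuration above.
const MIN_CHUNK_SIZE: usize = 1024;
const MAX_CHUNKS_PER_THREAD: usize = 8;

/// Hypothetical planner (an assumption, not the crate's logic): returns
/// `None` when the work should run sequentially, otherwise a chunk size.
fn plan_chunk_size(len: usize, threads: usize) -> Option<usize> {
    // Too small to fill two chunks, or nothing to run in parallel on.
    if len < MIN_CHUNK_SIZE * 2 || threads < 2 {
        return None;
    }
    let max_chunks = threads * MAX_CHUNKS_PER_THREAD;
    // Ceiling division: the chunk size that yields at most `max_chunks` chunks.
    let even_split = (len + max_chunks - 1) / max_chunks;
    // Never go below the minimum chunk size.
    Some(even_split.max(MIN_CHUNK_SIZE))
}

fn main() {
    let threads = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    for len in [1_000, 4_096, 10_000_000] {
        println!("len {len}: {:?}", plan_chunk_size(len, threads));
    }
}
```

Under this reading, 10 million elements on 8 threads would be split into 64 chunks of 156,250 elements, while 4,096 elements would use the 1,024-element minimum.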

**Environment Variables:**
```bash
# Customize minimum chunk size (useful for tuning specific workloads)
export AVILA_MIN_CHUNK_SIZE=2048

# Run your program
cargo run --release
```

**When to Adjust:**
- **Increase** (2048+): Very expensive operations (>1ms per element)
- **Decrease** (512): Light operations but large datasets
- **Keep default** (1024): Most use cases
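Reading such an override typically means parsing the environment variable and falling back to the default when it is absent or malformed. The sketch below shows one way to do that; `parse_min_chunk_size` is a hypothetical helper, and how the crate itself treats invalid or zero values is an assumption here.

```rust
use std::env;

const DEFAULT_MIN_CHUNK_SIZE: usize = 1024;

/// Hypothetical parser: uses the default when the value is missing,
/// not a number, or zero (assumed behaviour, not confirmed by the crate).
fn parse_min_chunk_size(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(DEFAULT_MIN_CHUNK_SIZE)
}

fn main() {
    // `env::var` fails if the variable is unset or not valid Unicode.
    let raw = env::var("AVILA_MIN_CHUNK_SIZE").ok();
    let min_chunk = parse_min_chunk_size(raw.as_deref());
    println!("min chunk size: {min_chunk}");
}
```

Keeping the parsing in a pure function makes the fallback rules easy to unit test without touching the process environment.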

## 🧪 Examples

### CPU-Intensive Computation
```rust
use avila_parallel::prelude::*;

let data: Vec<i32> = (0..10_000_000).collect();

// Perform expensive computation in parallel
let results: Vec<i32> = data.par_vec()
    .map(|&x| {
        // Simulate expensive operation
        let mut result = x;
        for _ in 0..100 {
            result = (result * 13 + 7) % 1_000_000;
        }
        result
    })
    .collect();
```

### Data Analysis
```rust
use avila_parallel::prelude::*;

let data: Vec<f64> = vec![1.0, 2.0, 3.0, 4.0, 5.0];

// Calculate statistics in parallel
let sum: f64 = data.par_iter().sum();
let count = data.len();
let mean = sum / count as f64;

let variance = data.par_vec()
    .map(|&x| (x - mean).powi(2))
    .into_iter()
    .sum::<f64>() / count as f64;
```

## ๐Ÿ” When to Use

### โœ… Good Use Cases
- CPU-bound operations (image processing, calculations, etc.)
- Large datasets (>10,000 elements)
- Independent computations per element
- Expensive operations (>100ยตs per element)

### โŒ Not Ideal For
- I/O-bound operations (use async instead)
- Very small datasets (<1,000 elements)
- Simple operations (<10ยตs per element)
- Operations requiring shared mutable state

## ๐Ÿ› ๏ธ Building from Source

```bash
git clone https://github.com/your-org/avila-parallel
cd avila-parallel
cargo build --release
cargo test
```

## ๐Ÿ“ License

MIT License - see [LICENSE](LICENSE) file for details

## ๐Ÿค Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 📚 Documentation

Full API documentation is available at [docs.rs/avila-parallel](https://docs.rs/avila-parallel).

## 🔗 Related Projects

- [Rayon](https://github.com/rayon-rs/rayon) - Full-featured data parallelism library
- [crossbeam](https://github.com/crossbeam-rs/crossbeam) - Concurrent programming tools

## โญ Star History

If you find this project useful, consider giving it a star!