# circulant-rs
A high-performance Rust library for block-circulant matrix operations and quantum walk simulations.
[]()
[]()
---
<table>
<tr>
<td width="33%" align="center">
### ⚡ 676× Faster
FFT-based O(N log N) vs naive O(N²)
*Scales to 50,000× at N=1M*
</td>
<td width="33%" align="center">
### 💾 99.9999% Less Memory
N=1M: Dense 7.5 TB → circulant-rs 8 MB
*From impossible to trivial*
</td>
<td width="33%" align="center">
### 🔬 1000× Larger Simulations
Quantum walks with N=1,000,000 positions
*Previously limited to N≈1,000*
</td>
</tr>
</table>
---
## Overview
`circulant-rs` exploits the mathematical structure of circulant matrices to achieve **O(N log N)** complexity instead of **O(N²)** for matrix-vector multiplication. This is accomplished through the FFT diagonalization property: every circulant matrix can be diagonalized by the Discrete Fourier Transform.
The library provides:
- **1D Circulant matrices** - For signal processing, convolution, and 1D quantum walks
- **2D Block Circulant (BCCB) matrices** - For image processing and 2D periodic systems
- **Quantum Walk simulation** - Discrete-time coined quantum walks with FFT-accelerated evolution
- **Full serialization support** - Save and restore quantum states and walk configurations
## Installation
Add to your `Cargo.toml`:
```toml
[dependencies]
circulant-rs = "0.2"
```
### Feature Flags
| `std` | Yes | Standard library support |
| `physics` | Yes | Quantum walk simulation module |
| `parallel` | Yes | Rayon-based parallelization |
| `serde` | Yes | Serialization with serde/bincode |
| `vision` | No | Image processing with BCCB filters |
| `visualize` | No | Visualization with plotters |
| `visualize-svg` | No | SVG output backend |
| `visualize-bitmap` | No | PNG/bitmap output backend |
| `python` | No | Python bindings via PyO3 |
## Quick Start
### 1D Circulant Matrix Multiplication
```rust
use circulant_rs::prelude::*;
use num_complex::Complex;
fn main() -> Result<()> {
// Create a circulant matrix from its first row
let generator = vec![1.0, 2.0, 3.0, 4.0];
let matrix = Circulant::from_real(generator)?;
// Multiply by a vector - O(N log N) via FFT!
let x = vec![
Complex::new(1.0, 0.0),
Complex::new(0.0, 0.0),
Complex::new(1.0, 0.0),
Complex::new(0.0, 0.0),
];
let result = matrix.mul_vec(&x)?;
println!("Result: {:?}", result);
Ok(())
}
```
### Quantum Walk Simulation
```rust
use circulant_rs::prelude::*;
fn main() -> Result<()> {
// Create a quantum walk on a 256-node ring with Hadamard coin
let walk = CoinedWalk1D::<f64>::new(256, Coin::Hadamard)?;
// Initialize walker at position 128
let initial_state = QuantumState::localized(128, 256, 2)?;
// Simulate 100 steps
let final_state = walk.simulate(initial_state, 100);
// Get probability distribution
let _probabilities = final_state.position_probabilities();
// Quantum walks show ballistic spreading (linear in time)
// compared to classical random walks (sqrt of time)
println!("Norm preserved: {}", final_state.norm_squared());
Ok(())
}
```
### 2D Block Circulant Operations
```rust
use circulant_rs::core::BlockCirculant;
use ndarray::Array2;
use num_complex::Complex;
fn main() {
// Create a 2D convolution kernel
let kernel = Array2::from_shape_vec((3, 3), vec![
Complex::new(1.0, 0.0), Complex::new(2.0, 0.0), Complex::new(1.0, 0.0),
Complex::new(2.0, 0.0), Complex::new(4.0, 0.0), Complex::new(2.0, 0.0),
Complex::new(1.0, 0.0), Complex::new(2.0, 0.0), Complex::new(1.0, 0.0),
]).unwrap();
// Create BCCB filter (zero-padded to output size)
let filter = BlockCirculant::from_kernel(kernel, 64, 64).unwrap();
// Apply to 2D data - O(N log N) via 2D FFT!
let input = Array2::zeros((64, 64));
let output = filter.mul_array(&input).unwrap();
}
```
## Performance
### Measured Benchmarks: FFT vs Naive O(N²)
| 64 | 573 ns | 18.8 µs | **33×** |
| 256 | 3.2 µs | 335 µs | **105×** |
| 1,024 | 12.3 µs | 4.76 ms | **387×** |
| 2,048 | 28.4 µs | 19.2 ms | **676×** |
| 65,536 | ~800 µs | ~68 min* | **~5,000×** |
| 1,048,576 | ~15 ms | ~17 days* | **~50,000×** |
*Extrapolated from measured scaling
### Quantum Walk Simulation (100 steps, Hadamard coin)
| 256 | 578 µs | 5.8 µs | 44M pos/sec |
| 1,024 | 2.44 ms | 24.4 µs | 42M pos/sec |
| 4,096 | 11.9 ms | 119 µs | 34M pos/sec |
| 16,384 | 71.3 ms | 713 µs | 23M pos/sec |
| 65,536 | 505 ms | 5.05 ms | 13M pos/sec |
### Memory Scaling
| 1,000 | 16 KB | 16 MB | 1,000× |
| 10,000 | 160 KB | 1.6 GB | 10,000× |
| 100,000 | 1.6 MB | 160 GB | 100,000× |
| 1,000,000 | 16 MB | 16 TB | **1,000,000×** |
## Use Cases
| **Quantum Computing** | Quantum walk simulations on rings | Simulate N=1M positions vs N=1K with dense matrices |
| **Signal Processing** | Real-time periodic FIR filtering | O(N log N) convolution at audio sample rates |
| **Image Processing** | 2D periodic convolution (blur, edge detection) | 4K image filtering in milliseconds |
| **Cryptography** | Lattice-based schemes (NTRU) | Efficient polynomial multiplication |
| **Machine Learning** | Circulant neural network layers | Reduced parameter count with structured weights |
See [docs/OVERVIEW.md](./docs/OVERVIEW.md) for detailed use case examples with code.
## Mathematical Background
### Circulant Matrix Structure
A circulant matrix is defined by its first row `[c₀, c₁, ..., cₙ₋₁]`:
```
| cₙ₋₂ cₙ₋₁ c₀ ... cₙ₋₃ |
| ... |
| c₁ c₂ c₃ ... c₀ |
```
### FFT Diagonalization
Every circulant matrix C can be diagonalized by the DFT matrix F:
```
C = F⁻¹ · D · F
```
Where D is diagonal with eigenvalues equal to the DFT of the generator. This enables O(N log N) multiplication:
```
y = C·x = F⁻¹ · D · F · x = IFFT(FFT(c) ⊙ FFT(x))
```
### Quantum Walk Evolution
The discrete-time coined quantum walk evolves as:
1. **Coin flip**: Apply coin operator C to internal degree of freedom
2. **Shift**: Move walker based on coin state
The shift operator is a circulant matrix, enabling FFT-accelerated simulation.
## Architecture
See [ARCHITECTURE.md](./docs/ARCHITECTURE.md) for detailed technical documentation including:
- Module structure and dependencies
- Type system and trait hierarchy
- FFT backend abstraction
- Quantum walk implementation details
- Mathematical derivations
## Modules
| `core` | `Circulant<T>` and `BlockCirculant<T>` types |
| `fft` | FFT backend trait and RustFFT implementation |
| `traits` | `Scalar`, `CirculantOps`, `BlockOps` traits |
| `physics` | Quantum state, coins, 1D/2D walks, Hamiltonian |
| `vision` | BCCB image filtering and kernels |
| `visualize` | Quantum state and heatmap plotting |
| `python` | PyO3 bindings for Python |
| `error` | Error types and Result alias |
| `prelude` | Convenient re-exports |
## Running Benchmarks
You can reproduce all benchmark results on your own machine.
### Prerequisites
```bash
# Rust 1.70+ required
rustc --version
# For Python comparison (optional)
pip install numpy scipy
```
### Rust Benchmarks (Criterion)
```bash
# Run all benchmarks (includes FFT vs naive, quantum walk, 2D BCCB)
cargo bench --bench scaling_benchmark --features "physics parallel serde"
# Run core FFT multiplication benchmark only
cargo bench --bench fft_multiply
# View results in browser (after running benchmarks)
open target/criterion/report/index.html # macOS
xdg-open target/criterion/report/index.html # Linux
```
**Benchmark output location:** `target/criterion/`
Each benchmark group generates:
- `report/index.html` - Interactive HTML report with graphs
- `<group>/report/index.html` - Detailed per-group analysis
- Raw data in JSON format for custom analysis
### Python Comparison Benchmarks
```bash
cd benchmarks/python
# Run all comparisons (NumPy FFT vs SciPy dense vs naive)
python compare_circulant.py
# Results saved to results_1d.json
```
**Output includes:**
- 1D multiplication timing across sizes (64 to 262,144)
- Accuracy verification (FFT vs naive)
- Memory usage comparison
- Quantum walk simulation timing
### Quick Validation
```bash
# Run test suite to verify correctness
cargo test --all-features
# Run example to see library in action
cargo run --example quantum_walk_1d --features physics
```
### Interpreting Results
- **Speedup** = Naive time / FFT time (higher is better)
- **Throughput** = Elements processed per second
- Criterion reports include statistical analysis (mean, std dev, outliers)
- Compare your results with the tables above; relative speedups should be consistent across hardware
For detailed methodology, see [docs/BENCHMARKS.md](./docs/BENCHMARKS.md).
## Roadmap
### v0.2.0 (Current)
- [x] 1D Circulant with FFT multiplication
- [x] 2D Block Circulant (BCCB)
- [x] Quantum walk simulation (1D coined walks)
- [x] Serde serialization
- [x] Rayon parallelization
- [x] Vision module (image filtering with BCCB)
- [x] 2D quantum walks on torus
- [x] Continuous-time walks (Hamiltonian module)
- [x] Visualization with plotters
- [x] Python bindings via PyO3
### v0.3.0 (Planned)
- [ ] GPU acceleration (wgpu/cuda)
- [ ] N-dimensional circulant tensors
- [ ] Circulant neural network layers
## License
Licensed under either of:
- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.