Tilesort
A sorting algorithm optimized for datasets with pre-sorted contiguous blocks (tiles).
Overview
Tilesort is a specialized sorting algorithm that achieves high performance when your data consists of non-overlapping, pre-sorted contiguous blocks. Instead of sorting individual elements, tilesort identifies these "tiles" and arranges them as discrete units.
When to Use Tilesort
Tilesort is particularly effective when:
- Data arrives in pre-sorted chunks or batches
- Data structures maintain sorted regions
- Distributed systems produce sorted shards that need merging
- Merging sorted log files or event streams
- Processing time-series data with sorted segments
- You have k tiles in a dataset of n elements where k << n
When NOT to Use Tilesort
Do not use tilesort if:
- Your data is randomly shuffled - Tilesort has O(n²) worst-case complexity when there are no pre-sorted tiles. Use standard sort algorithms instead.
- You don't know if your data has tiles - If your data doesn't have pre-sorted regions, tilesort will be significantly slower than standard sorting.
- The number of tiles approaches the number of elements (k ≈ n) - The overhead of tile detection and management provides no benefit when most elements are in their own tile.
Tilesort is a specialized algorithm for a specific data pattern. If you're unsure whether your data has pre-sorted tiles, use a standard sorting algorithm.
Performance
For a dataset of n elements partitioned into k tiles:
- Time Complexity: O(n + k²) where k << n
- Space Complexity: O(n) for the output buffer
- Best Case: O(n) when data is already sorted (k = 1)
- Typical Case: Significantly faster than O(n log n) when k is small
The performance is primarily determined by k (number of tiles) rather than n (total elements), making it highly efficient when the number of tiles is much smaller than the total number of elements.
Installation
Python
Requirements: Python 3.8-3.14
Rust
Add to your Cargo.toml:
[]
= "0.1.0"
Usage
Python
The Python API mirrors Python's built-in list.sort() and sorted() functions:
# Sort a list in place (like list.sort())
=
# [1, 2, 3, 4, 5, 6, 7, 8]
# Return a sorted copy (like sorted())
=
=
# [1, 2, 3, 4, 5, 6, 7, 8]
# [3, 4, 5, 1, 2, 6, 7, 8] (unchanged)
# Sort with a key function
=
# ["a", "cat", "dog", "bear", "elephant"]
# Sort in reverse order
=
# [9, 6, 5, 4, 3, 2, 1, 1]
# Combine key and reverse
=
# [-5, 4, -3, 2, -1]
# Sort custom objects
=
=
=
# Now sorted by age: Bob (25), Alice (30), Charlie (35)
Rust
use ;
How It Works
Tilesort operates in two phases:
- Scan Phase: Identifies contiguous sorted blocks (tiles) in the input data
- Restructure Phase: Rearranges tiles in sorted order to produce the final output
The algorithm automatically detects tile boundaries by scanning for order violations. When elements are out of order, a new tile begins. The tiles are then sorted based on their key ranges and concatenated to produce the final sorted sequence.
Example
Given input: [3, 4, 5, 1, 2, 6, 7, 8]
-
Scan identifies three tiles:
- Tile 0:
[3, 4, 5](range 3-5) - Tile 1:
[1, 2](range 1-2) - Tile 2:
[6, 7, 8](range 6-8)
- Tile 0:
-
Tiles are sorted by their ranges: Tile 1, Tile 0, Tile 2
-
Output:
[1, 2, 3, 4, 5, 6, 7, 8]
API Reference
Python
tilesort.sort(list, *, key=None, reverse=False)- Sort a list in placetilesort.sorted(list, *, key=None, reverse=False)- Return a sorted copy
Both functions support:
key: Optional function to extract comparison key from each elementreverse: IfTrue, sort in descending order
Rust
In-place sorting:
tilesort(data: &mut [T])- Sort in ascending ordertilesort_reverse(data: &mut [T])- Sort in descending ordertilesort_by_key(data: &mut [T], key_fn: F)- Sort by custom keytilesort_by_key_reverse(data: &mut [T], key_fn: F)- Sort by custom key, descending
Copying variants:
tilesorted(data: &[T]) -> Vec<T>- Return sorted copytilesorted_reverse(data: &[T]) -> Vec<T>- Return sorted copy, descendingtilesorted_by_key(data: &[T], key_fn: F) -> Vec<T>- Return sorted copy by keytilesorted_by_key_reverse(data: &[T], key_fn: F) -> Vec<T>- Return sorted copy by key, descending
All functions work with any type T that implements Ord + Clone. Key functions must return a type K that implements
Ord.
Development
Building from Source
Rust Library
# Run tests
# Build the library
# Generate documentation
Python Package
Requirements:
- Rust toolchain (1.71.1+)
- Python 3.8-3.14
- uv (recommended) or maturin
# Install development dependencies
# Build and install in development mode
# Run Python tests
# or: uv run --group dev pytest python/tests/
# Run type checking
# or: uv run --group dev mypy python/
# Run all tests (Rust + Python)
# Run linter
# Format code
Development Commands (Just)
This project uses Just as a command runner:
Benchmarks
Performance benchmarks compare tilesort against Rust's standard sort across different scenarios:
# Run all benchmarks
# Run specific benchmark group
Benchmark scenarios:
- uniform_tiles: All tiles have the same size (~1K elements)
- varied_tiles: Tiles of different sizes (100, 1K, 5K, 10K)
- hybrid_tiles: Mix of single elements and large blocks
- random_data: Completely random (worst case for tilesort)
- key_function: Structured data requiring key extraction
- realistic_workload: 1M elements with ~10K element tiles (mirrors real-world usage)
Results are saved to target/criterion/ with HTML reports.
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE or http://opensource.org/licenses/MIT)
at your option.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Areas for Contribution
- Performance benchmarks and optimizations
- Additional language bindings
- Documentation improvements
- Bug reports and fixes
Changelog
See CHANGELOG.md for a list of changes in each release.