# Chunk
[](https://crates.io/crates/tailcall-chunk)
[](https://docs.rs/tailcall-chunk)
[](https://github.com/tailcallhq/tailcall-chunk/actions)
[](LICENSE)
A Rust implementation of a persistent data structure that provides O(1) append and concatenation operations through structural sharing.
- [Chunk](#chunk)
- [Features](#features)
- [Theoretical Background](#theoretical-background)
- [Relationship to Finger Trees](#relationship-to-finger-trees)
- [Performance Trade-offs](#performance-trade-offs)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Detailed Usage](#detailed-usage)
- [Working with Custom Types](#working-with-custom-types)
- [Memory Efficiency](#memory-efficiency)
- [Performance Characteristics](#performance-characteristics)
- [Benchmark Comparison](#benchmark-comparison)
- [Implementation Details](#implementation-details)
- [Contributing](#contributing)
- [License](#license)
- [References](#references)
## Features
- **O(1) Append/Prepend Operations**: Add elements to your chunk in constant time
- **O(1) Concatenation**: Combine two chunks efficiently
- **Immutable/Persistent**: All operations create new versions while preserving the original
- **Memory Efficient**: Uses structural sharing via reference counting
- **Safe Rust**: Implemented using 100% safe Rust
## Theoretical Background
This implementation is inspired by the concepts presented in Hinze and Paterson's work on Finger Trees[^1], though simplified for our specific use case. While our implementation differs in structure, it shares similar performance goals and theoretical foundations.
### Relationship to Finger Trees
Finger Trees are a functional data structure that supports:
- Access to both ends in amortized constant time
- Concatenation in logarithmic time
- Persistence through structural sharing
Our `Chunk` implementation achieves similar goals through a simplified approach:
- We use `Append` nodes for constant-time additions
- The `Concat` variant enables efficient concatenation
- `Rc` (Reference Counting) provides persistence and structural sharing
Like Finger Trees, our structure can be viewed as an extension of Okasaki's implicit deques[^2], but optimized for our specific use cases. While Finger Trees offer a more general-purpose solution with additional capabilities, our implementation focuses on providing:
- Simpler implementation
- More straightforward mental model
- Specialized performance characteristics for append/concat operations
### Performance Trade-offs
While Finger Trees achieve logarithmic time for concatenation, our implementation optimizes for constant-time operations through lazy evaluation. This means:
- Append, Prepend and concatenation are always O(1)
- The cost is deferred to when we need to materialize the sequence (via `as_vec()`)
- Memory usage grows with the number of operations until materialization
This trade-off is particularly beneficial in scenarios where:
- Write operations need to be performed extensively
- Multiple transformations are chained
- Not all elements need to be materialized
- Structural sharing can be leveraged across operations
## Installation
Add this to your `Cargo.toml`:
```toml
[dependencies]
tailcall-chunk = "0.1.0"
```
## Quick Start
```rust
use chunk::Chunk;
// Create a new chunk and append some elements
let chunk1 = Chunk::default()
.append(1)
.append(2);
// Create another chunk
let chunk2 = Chunk::default()
.append(3)
.append(4);
// Concatenate chunks in O(1) time
let combined = chunk1.concat(chunk2);
// Convert to vector when needed
assert_eq!(combined.as_vec(), vec![1, 2, 3, 4]);
```
## Detailed Usage
### Working with Custom Types
```rust
use chunk::Chunk;
#[derive(Debug, PartialEq)]
struct Person {
name: String,
age: u32,
}
let people = Chunk::default()
.append(Person {
name: "Alice".to_string(),
age: 30
})
.append(Person {
name: "Bob".to_string(),
age: 25
});
// Access elements
let people_vec = people.as_vec();
assert_eq!(people_vec[0].name, "Alice");
assert_eq!(people_vec[1].name, "Bob");
```
### Memory Efficiency
The `Chunk` type uses structural sharing through reference counting (`Rc`), which means:
- Appending or concatenating chunks doesn't copy the existing elements
- Memory is automatically freed when no references remain
- Multiple versions of the data structure can coexist efficiently
```rust
use chunk::Chunk;
let original = Chunk::default().append(1).append(2);
let version1 = original.clone().append(3); // Efficient cloning
let version2 = original.clone().append(4); // Both versions share data
```
## Performance Characteristics
| `new()` | O(1) | O(1) |
| `append()` | O(1) | O(1) |
| `concat()` | O(1) | O(1) |
| `transform()` | O(1) | O(1) |
| `transform_flatten()` | O(1) | O(1) |
| `as_vec()` | O(n) | O(n) |
| `clone()` | O(1) | O(1) |
### Benchmark Comparison
The following table compares the actual performance of Chunk vs Vector operations based on [our benchmarks](benches/operations.rs) (lower is better):
| Append | 604.02 µs | 560.58 µs | Vec is `~1.08` times faster |
| Prepend | 1.63 ms | 21.95 ms | Chunk is `~13.47` times faster |
| Concat | 71.17 ns | 494.45 µs | Chunk is `~6,947` times faster |
| Clone | 4.16 ns | 1.10 µs | Chunk is `~264` times faster |
Note: These benchmarks represent specific test scenarios and actual performance may vary based on usage patterns. Chunk operations are optimized for bulk operations and scenarios where structural sharing provides benefits. View the complete benchmark code and results in our [operations.rs](benches/operations.rs) benchmark file.
## Implementation Details
The `Chunk<A>` type is implemented as an enum with four variants:
- `Empty`: Represents an empty chunk
- `Single`: Represents a chunk with a single element
- `Concat`: Represents the concatenation of two chunks
- `TransformFlatten`: Represents a lazy transformation and flattening of elements
The data structure achieves its performance characteristics through:
- Structural sharing using `Rc`
- Lazy evaluation of concatenation and transformations
- Immutable operations that preserve previous versions
## Contributing
We welcome contributions! Here's how you can help:
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -am 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
Please make sure to:
- Update documentation
- Add tests for new features
- Follow the existing code style
- Update the README.md if needed
## License
This project is licensed under either of
- Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
## References