A high-performance, extensible Content-Defined Chunking (CDC) library for Rust.
Overview
Clast is a modular library designed for Content-Defined Chunking (CDC). It splits data into variable-sized chunks based on content, rather than fixed offsets (Fixed-Size Chunking, FSC), making it a critical building block for data deduplication, backup systems, and efficient storage solutions.
Clast is architected to support multiple chunking algorithms, allowing developers to choose the best strategy for their specific use cases.
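To make the contrast with fixed-size chunking concrete, here is a toy sketch. It is not Clast's API or its actual algorithm: it assumes a simplified gear-style hash that is never reset and enforces no minimum or maximum chunk size, and the names `gear_table`, `content_defined_cuts`, and `fixed_size_cuts` are made up for illustration. Because each boundary depends only on the most recent bytes, inserting one byte at the front of the input shifts later boundaries by one position, whereas fixed offsets would change the content of every following chunk.

```rust
/// Builds a 256-entry table of pseudo-random values using SplitMix64.
/// Illustrative only: this is not the table used by Clast or the FastCDC paper.
fn gear_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    let mut state: u64 = 0;
    for entry in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^= z >> 31;
        *entry = (z >> 32) as u32;
    }
    table
}

/// Returns chunk end offsets chosen from the content itself.
/// A boundary is placed after a byte whenever `hash & mask == 0`.
/// With a 32-bit hash shifted left once per byte, the hash depends
/// only on the last 32 bytes seen.
fn content_defined_cuts(data: &[u8], mask: u32) -> Vec<usize> {
    let gear = gear_table();
    let mut hash: u32 = 0;
    let mut cuts = Vec::new();
    for (i, &byte) in data.iter().enumerate() {
        hash = (hash << 1).wrapping_add(gear[byte as usize]);
        if hash & mask == 0 {
            cuts.push(i + 1);
        }
    }
    // Close the final chunk at the end of the input.
    if !data.is_empty() && cuts.last() != Some(&data.len()) {
        cuts.push(data.len());
    }
    cuts
}

/// Returns chunk end offsets for fixed-size chunking, for comparison.
fn fixed_size_cuts(len: usize, chunk_size: usize) -> Vec<usize> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut cuts: Vec<usize> = (chunk_size..len).step_by(chunk_size).collect();
    if len > 0 {
        cuts.push(len);
    }
    cuts
}
```

In this sketch, prepending a byte to the input leaves every cut point past the first 32 bytes at the same place relative to the content, so those chunks would still deduplicate against the original.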
Key Features
- High Performance: Optimized for throughput and low CPU overhead.
- Modular Architecture: Designed to support various CDC algorithms.
- Async & Sync: Supports both synchronous (std::io) and asynchronous (tokio) I/O.
Supported Algorithms
FastCDC
An implementation of the FastCDC algorithm as described in "The Design of Fast Content-Defined Chunking for Data Deduplication Based Storage Systems."
It incorporates five key optimizations:
- Gear-based Rolling Hashing
- Optimized Hash Judgment
- Sub-minimum Chunk Cut-Point Skipping
- Normalized Chunking
- Rolling Two Bytes
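As a rough illustration of how three of these optimizations fit together (gear-based hashing, sub-minimum cut-point skipping, and normalized chunking), here is a minimal sketch. It is not Clast's implementation: the `Params` struct, the function names, the gear table, and any mask values you pass in are assumptions for illustration, and the "Rolling Two Bytes" optimization is left out to keep the loop readable.

```rust
/// Hypothetical tuning parameters, for illustration only.
struct Params {
    min_size: usize,    // bytes skipped before any boundary test
    normal_size: usize, // target size where the mask switches
    max_size: usize,    // hard upper bound on chunk length
    mask_strict: u64,   // more bits set: cuts are less likely before normal_size
    mask_loose: u64,    // fewer bits set: cuts are more likely after normal_size
}

/// Builds a 256-entry gear table with SplitMix64 (not the paper's table).
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0;
    for entry in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *entry = z ^ (z >> 31);
    }
    table
}

/// Returns the length of the next chunk at the start of `data`.
fn next_cut(data: &[u8], gear: &[u64; 256], p: &Params) -> usize {
    let mut n = data.len();
    if n <= p.min_size {
        return n; // too short to split: the remainder is one chunk
    }
    if n > p.max_size {
        n = p.max_size;
    }
    let normal = p.normal_size.min(n);
    let mut fp: u64 = 0;
    // Cut-point skipping: hashing starts at min_size, not at offset 0.
    for i in p.min_size..normal {
        fp = (fp << 1).wrapping_add(gear[data[i] as usize]);
        if fp & p.mask_strict == 0 {
            return i + 1; // this sketch includes the triggering byte
        }
    }
    // Normalized chunking: past the normal size, switch to the looser mask.
    for i in normal.max(p.min_size)..n {
        fp = (fp << 1).wrapping_add(gear[data[i] as usize]);
        if fp & p.mask_loose == 0 {
            return i + 1;
        }
    }
    n // no boundary found: cut at max_size or at the end of the input
}

/// Splits `data` into consecutive chunk lengths.
fn chunk_lengths(data: &[u8], gear: &[u64; 256], p: &Params) -> Vec<usize> {
    let mut lengths = Vec::new();
    let mut offset = 0;
    while offset < data.len() {
        let len = next_cut(&data[offset..], gear, p);
        lengths.push(len);
        offset += len;
    }
    lengths
}
```

Using two masks pulls the chunk-size distribution toward `normal_size`: boundaries are harder to hit while the chunk is still small and easier once it has grown past the target, which limits both tiny chunks and forced cuts at `max_size`.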
Installation
Use cargo to install the package:
Feature Flags
Clast uses feature flags to minimize the compiled binary size. You can selectively enable the features you need.
- fastcdc: Enables the FastCDC algorithm implementation. (Enabled by default)
- async: Enables asynchronous support using tokio.
Example of enabling only fastcdc (default behavior):
Example of enabling fastcdc and async support:
Or in your Cargo.toml:
[]
= { = "1.0.2", = ["async"] }
Usage
Please refer to the Tutorials for detailed usage examples.
Reference
- FastCDC: Wen Xia et al., "The Design of Fast Content-Defined Chunking for Data Deduplication Based Storage Systems," IEEE Transactions on Parallel and Distributed Systems, 2020.
Contributing
Contributions are welcome! Please feel free to open a Pull Request.
- Fork the repository.
- Create your feature branch (git checkout -b feature/new-feature).
- Commit your changes (git commit -m 'Add some feature').
- Push to the branch (git push origin feature/new-feature).
- Open a Pull Request.
Please make sure to run cargo fmt and cargo test before opening a Pull Request.
License
MIT © Arcadia Softs. See LICENSE for details.