<div align='center'>

# 🦛 ChonkieR 🦀✨
[crates.io](https://crates.io/crates/chonkie) •
[License](https://github.com/chonkie-inc/chonkie-rs/blob/main/LICENSE) •
[Discord](https://discord.gg/rYYp6DC4cv) •
[GitHub stars](https://github.com/chonkie-inc/chonkie-rs/stargazers)

_The no-nonsense, lightweight and fast chunking library that's ready to CHONK your text, in Rust 🦀!_
[Installation](#installation) •
[Usage](#usage) •
[Chunkers](#chunkers) •
[Acknowledgements](#acknowledgements) •
[Citation](#citation)
</div>

Chonkie just got low-leveled! 🦀 Your favorite Python chunking library is now in Rust, and it's faster, smaller, and more reliable than ever!
**🦀 Rusty & Reliable**: Built with Rust for memory safety and performance. <br>
**🚀 Feature-rich**: All the CHONKs you'd ever need <br>
**✨ Easy to use**: Add Crate, Use Crate, CHONK <br>
**⚡ Blazingly Fast**: CHONK at the speed of Rust! zooooom <br>
**🪶 Light-weight**: No bloat, just CHONK <br>
**🦛 Cute CHONK mascot**: psst it's a pygmy hippo btw <br>
**❤️ [Moto Moto](#acknowledgements)'s favorite Rust library** <br>
**ChonkieR** is a chunking library that "**just works**" ✨
# Installation
To add `ChonkieR` to your project, run:
```bash
cargo add chonkie # Or add it to your Cargo.toml
```
`ChonkieR` follows the rule of minimum dependencies: optional functionality is gated behind Cargo features, so you only compile what you enable.
Don't want to think about it? Simply enable the `all` feature (not recommended for production binaries unless you actually need everything):
```toml
# Cargo.toml
[dependencies]
chonkie = { version = "0.1.0", features = ["all"] } # Replace with desired version
```
# Usage
Here's a basic example to get you started:
```rust
use chonkie::CharacterTokenizer;
use chonkie::RecursiveChunker;
use chonkie::types::RecursiveRules;

fn main() {
    // Initialize the chunker with a tokenizer, a chunk size of 512 tokens, and the default rules
    let chunker = RecursiveChunker::new(CharacterTokenizer::new(), 512, RecursiveRules::default());

    // Chunk some text; the result is a collection of `RecursiveChunk` values
    let text = "ChonkieR is the goodest boi! My favorite chunking hippo hehe.";
    let chunks = chunker.chunk(text);

    // Access each chunk's text and token count
    for chunk in chunks {
        println!("Chunk: {}", chunk.text);
        println!("Tokens: {}", chunk.token_count);
    }
}
```
Check out more usage examples in the [examples](https://github.com/chonkie-inc/chonkie-rs/tree/main/examples) folder!
# Chunkers
ChonkieR currently supports the following chunkers:
- **TokenChunker**: Split text into fixed-size token chunks.
- **SentenceChunker**: Split text into chunks based on sentence boundaries.
- **RecursiveChunker**: Recursively split the text into chunks based on the rules provided.
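
To give a feel for how the three strategies differ, here is a rough, std-only sketch. It is **not** ChonkieR's implementation or API: the function names `token_chunks`, `sentence_chunks`, and `recursive_chunks` are made up for illustration, one character counts as one token, and sentence boundaries are assumed to be just `.`, `!`, and `?`. The recursive version also skips the step of merging small pieces back together.

```rust
/// Fixed-size chunking: every chunk holds at most `chunk_size` characters.
/// (Hypothetical helper; one character is treated as one token.)
fn token_chunks(text: &str, chunk_size: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(chunk_size)
        .map(|window| window.iter().collect())
        .collect()
}

/// Sentence chunking: pack whole sentences into chunks of at most
/// `chunk_size` characters. A single sentence longer than `chunk_size`
/// is kept whole and becomes its own oversized chunk.
fn sentence_chunks(text: &str, chunk_size: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    // `split_inclusive` keeps the terminator attached to its sentence.
    for sentence in text.split_inclusive(|c: char| matches!(c, '.' | '!' | '?')) {
        let fits = current.chars().count() + sentence.chars().count() <= chunk_size;
        if !current.is_empty() && !fits {
            // Start a new chunk instead of cutting the sentence in half.
            chunks.push(std::mem::take(&mut current));
        }
        current.push_str(sentence);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

/// Recursive chunking: split on the coarsest separator first and only
/// descend to finer separators for pieces that are still too long.
/// Unlike a full implementation, small pieces are not merged back together.
fn recursive_chunks(text: &str, chunk_size: usize, separators: &[&str]) -> Vec<String> {
    if text.chars().count() <= chunk_size {
        return if text.is_empty() {
            Vec::new()
        } else {
            vec![text.to_string()]
        };
    }
    match separators.split_first() {
        // No separators left: fall back to fixed-size chunks.
        None => token_chunks(text, chunk_size),
        Some((sep, finer)) => text
            .split_inclusive(*sep)
            .flat_map(|piece| recursive_chunks(piece, chunk_size, finer))
            .collect(),
    }
}
```

All three helpers keep separators and whitespace attached to the pieces, so concatenating the returned chunks gives back the original text. For the recursive variant, pass separators from coarsest to finest, for example `&["\n\n", " "]` for paragraphs and then words.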
# Acknowledgements
ChonkieR would like to CHONK its way through a special thanks to all the users and contributors who have helped make this library what it is today! Your feedback, issue reports, and improvements have helped make ChonkieR the CHONKIEST it can be.
And of course, special thanks to [Moto Moto](https://www.youtube.com/watch?v=I0zZC4wtqDQ&t=5s) for endorsing ChonkieR with his famous quote:
> "I like them big, I like them chonkieR." ~ Moto Moto (He really said this)
# Citation
If you use ChonkieR in your research, please cite it as follows:
```bibtex
@software{chonkie2025,
  author = {Minhas, Bhavnick AND Nigam, Shreyash},
  title = {Chonkie: A no-nonsense fast, lightweight, and efficient text chunking library},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/chonkie-inc/chonkie-rs}},
}
```