# base122-fast
[crates.io](https://crates.io/crates/base122-fast)
[docs.rs](https://docs.rs/base122-fast)
[License: MIT](https://opensource.org/licenses/MIT)
A high-performance [Base122](https://github.com/kevinAlbs/Base122) implementation in Rust.
Base122 is a binary-to-text encoding scheme designed to be significantly more space-efficient than Base64. It incurs only **~14% overhead** (compared to Base64's 33%) while remaining valid UTF-8.
## Performance
This crate is engineered for high throughput and relies on several low-level optimizations:
* **SWAR (SIMD Within A Register)**: Processes 64-bit words with bitwise masks to detect illegal characters in several bytes at once, minimizing per-byte overhead.
* **Branchless Fast-Paths**: Skips escape-character handling entirely for chunks that contain no illegal bytes.
* **Zero-Copy Strategy**: Uses direct pointer arithmetic and pre-allocated buffers to minimize heap allocations.
* **Unsafe Intrinsics**: Uses `unsafe` Rust for unchecked memory access and optimized bit manipulation.
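
The SWAR check described above can be sketched with the classic "has zero byte" bit trick. This is a minimal illustration, not the crate's actual code: the names `has_zero_byte` and `has_illegal_byte` are hypothetical, and it assumes eight output candidates have already been packed one per byte lane into a `u64`.

```rust
/// The six byte values that Base122 must not emit directly.
const ILLEGAL: [u8; 6] = [0x00, 0x0A, 0x0D, 0x22, 0x26, 0x5C];

const LO: u64 = 0x0101_0101_0101_0101; // 0x01 in every byte lane
const HI: u64 = 0x8080_8080_8080_8080; // 0x80 in every byte lane

/// Returns true if at least one byte lane of `v` is 0x00.
#[inline]
fn has_zero_byte(v: u64) -> bool {
    (v.wrapping_sub(LO) & !v & HI) != 0
}

/// Returns true if any of the eight bytes packed in `word` is illegal.
/// (Hypothetical helper; a real hot loop would likely unroll this.)
#[inline]
fn has_illegal_byte(word: u64) -> bool {
    ILLEGAL.iter().any(|&b| {
        // Broadcasting `b` to all lanes and XOR-ing turns every
        // matching lane into 0x00, which `has_zero_byte` then detects.
        has_zero_byte(word ^ (LO * b as u64))
    })
}
```

When `has_illegal_byte` returns `false`, all eight bytes can be written out without any per-byte escape handling; only words that trip the check need the slower path.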
### Benchmarks
The following throughput was measured on **uniform random binary data** using an **AMD Ryzen 5 5600** (single core).
| Input size | Encode (MB/s) | Decode (MB/s) |
|-----------:|--------------:|--------------:|
| 16 B | 331 | 318 |
| 64 B | 672 | 626 |
| 1 KiB | 1135 | 1070 |
| 64 KiB | 1089 | 871 |
| 1 MiB | 518 | 573 |
| 16 MiB | 533 | 597 |
For large payloads (≥1 MiB), the implementation sustains approximately **4.1 Gbps encoding** and **4.8 Gbps decoding**.
> **Note:** Run `cargo bench` to reproduce these results. The benchmarks evaluate encoding, decoding, and round-trip integrity on random byte streams.
## Quick Start
Add to `Cargo.toml`:
```toml
[dependencies]
base122-fast = "0.1"
```
### Encode
```rust
use base122_fast::encode;
let data = b"hello world";
let encoded = encode(data);
println!("{}", encoded);
```
### Decode
```rust
use base122_fast::{encode, decode};
let data = b"hello world";
let encoded = encode(data);
let decoded = decode(&encoded).expect("decoding failed");
assert_eq!(decoded, data);
```
## Implementation Details
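Base122 packs binary data into 7-bit groups and emits each group as a single-byte UTF-8 character where possible. Of the 128 possible 7-bit values, six (`\x00`, `\x0A`, `\x0D`, `\x22`, `\x26`, `\x5C`) are considered illegal, which leaves 122 values that can be written directly. The illegal values are handled via an escape mechanism.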
### Encoding Logic
1. Read 7-byte input chunks.
2. Split chunks into eight 7-bit groups.
3. Write each group directly as one byte if its value is "safe" (not one of the six illegal codes).
4. If a group collides with an illegal byte, emit a two-byte escape sequence in its place.
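
The steps above can be sketched as a plain scalar encoder. This is only an illustration under stated assumptions: it follows the bit layout of the reference Base122 scheme (escape bytes of the form `110sss1x 10xxxxxx`, where `sss` is the index of the illegal value and the `x` bits carry the next 7-bit group), which this README does not confirm the crate uses. The names `Groups` and `encode_scalar` are hypothetical and none of the crate's optimizations are shown.

```rust
const ILLEGAL: [u8; 6] = [0x00, 0x0A, 0x0D, 0x22, 0x26, 0x5C];
/// Tag used when an illegal group is the last one (no following group).
const SHORTENED: u8 = 0b111;

/// Yields the input as 7-bit groups, MSB first, zero-padding the last group.
struct Groups<'a> {
    data: &'a [u8],
    bit: usize, // absolute bit offset of the next group
}

impl Iterator for Groups<'_> {
    type Item = u8;
    fn next(&mut self) -> Option<u8> {
        if self.bit >= self.data.len() * 8 {
            return None;
        }
        let mut group = 0u8;
        for i in 0..7 {
            let pos = self.bit + i;
            let bit = match self.data.get(pos / 8) {
                Some(&byte) => (byte >> (7 - pos % 8)) & 1,
                None => 0, // past the end: pad with zero bits
            };
            group = (group << 1) | bit;
        }
        self.bit += 7;
        Some(group)
    }
}

/// Scalar encoder sketch (assumed reference layout, not the crate's code).
fn encode_scalar(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() * 8 / 7 + 2);
    let mut groups = Groups { data: input, bit: 0 };
    while let Some(group) = groups.next() {
        match ILLEGAL.iter().position(|&x| x == group) {
            // Safe value: emit it as a single ASCII byte.
            None => out.push(group),
            // Illegal value: fold it and the next group into two bytes.
            Some(idx) => {
                let (tag, payload) = match groups.next() {
                    Some(next) => (idx as u8, next),
                    None => (SHORTENED, group),
                };
                out.push(0b1100_0010 | (tag << 2) | (payload >> 6));
                out.push(0b1000_0000 | (payload & 0b0011_1111));
            }
        }
    }
    out
}
```

For example, `encode_scalar(&[0xFF])` splits the byte into the groups `0x7F` and `0x40`, both safe, and returns `[0x7F, 0x40]`. Because an escape sequence normally carries two groups in two bytes, escapes do not add overhead beyond the baseline 8/7 ratio except for a trailing illegal group.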
### Optimization Strategy
High throughput is achieved by processing 64-bit chunks. When a chunk contains no illegal bytes, a **branchless fast-path** is taken. By utilizing `unsafe` pointer arithmetic and pre-calculating output capacities, the hot loop avoids bounds checking and reallocations.
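
As a rough sketch of the capacity pre-calculation mentioned above: assuming the reference Base122 layout, `n` input bytes produce `ceil(8n / 7)` seven-bit groups, and only a trailing illegal group can cost one extra byte. The helper names `max_encoded_len` and `alloc_output` are hypothetical and not part of the crate's documented API.

```rust
/// Upper bound on the encoded length, in bytes, for `n` input bytes.
/// Assumes the reference Base122 layout; ignores overflow for huge `n`.
fn max_encoded_len(n: usize) -> usize {
    let groups = (n * 8 + 6) / 7; // ceil(8n / 7) seven-bit groups
    groups + 1 // +1 for a possible shortened escape at the very end
}

/// Allocates the output buffer once so the hot loop never reallocates.
fn alloc_output(input: &[u8]) -> Vec<u8> {
    Vec::with_capacity(max_encoded_len(input.len()))
}
```

With the buffer sized up front, writes inside the loop can skip capacity checks, which is what makes unchecked pointer writes sound as long as the bound is never exceeded.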
## License
This project is licensed under the MIT License.