base122-fast 0.1.0

High-performance Base122 encoding (4+ Gbps) with lower overhead (~14%) than Base64.
# base122-fast


[![Crates.io](https://img.shields.io/crates/v/base122-fast.svg)](https://crates.io/crates/base122-fast)
[![Documentation](https://docs.rs/base122-fast/badge.svg)](https://docs.rs/base122-fast)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A high-performance [Base122](https://github.com/kevinAlbs/Base122) implementation in Rust.

Base122 is a binary-to-text encoding scheme designed to be significantly more space-efficient than Base64. It incurs only **~14% overhead** (compared to Base64's 33%) while remaining valid UTF-8.
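
The overhead figures follow from the bits carried per output byte: Base64 stores 6 bits per byte (4 output bytes for every 3 input bytes), while Base122 stores 7 (8 output bytes for every 7 input bytes). The sketch below compares the two sizes. The function names are hypothetical and not part of this crate's API, and the Base122 figure is an estimate that assumes escapes cost no extra bytes, as in the original Base122 format, where a two-byte escape carries two 7-bit groups.

```rust
/// Length of standard padded Base64 output for `n` input bytes:
/// every 3 input bytes (rounded up) become 4 output bytes.
fn base64_len(n: usize) -> usize {
    (n + 2) / 3 * 4
}

/// Estimated Base122 output length for `n` input bytes:
/// one output byte per 7 input bits, rounded up.
/// Assumption: escape sequences add no extra bytes (see the text above).
fn base122_len_estimate(n: usize) -> usize {
    (n * 8 + 6) / 7
}
```

For 7000 input bytes this gives 9336 bytes of Base64 (about 33% overhead) and roughly 8000 bytes of Base122 (about 14% overhead).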

## Performance


This crate targets high throughput through several low-level optimizations:

*   **SWAR (SIMD Within A Register)**: Processes 64-bit words with bitwise masks to detect illegal bytes in several byte lanes at once, reducing per-byte overhead.
*   **Branchless Fast Paths**: Skips the escape-character logic entirely for chunks that contain no illegal bytes.
*   **Zero-Copy Strategy**: Writes through raw pointers into a pre-allocated output buffer, avoiding intermediate copies and repeated heap allocations.
*   **Unsafe Intrinsics**: Uses `unsafe` Rust for unchecked memory access and optimized bit manipulation.
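
To illustrate the SWAR idea, here is a minimal sketch of how illegal bytes could be detected in eight lanes of a `u64` at once. It is not taken from the crate's source: the function names are hypothetical, and it assumes every byte lane already holds a 7-bit value (below `0x80`), which makes the carry-free comparison trick exact.

```rust
const LO: u64 = 0x0101_0101_0101_0101;
const HI: u64 = 0x8080_8080_8080_8080;
const ILLEGAL: [u8; 6] = [0x00, 0x0A, 0x0D, 0x22, 0x26, 0x5C];

/// Returns a mask with 0x80 set in every byte lane of `word` equal to `byte`.
/// Assumes all lanes of `word`, and `byte` itself, are below 0x80.
fn lanes_equal(word: u64, byte: u8) -> u64 {
    // XOR with the broadcast byte: matching lanes become 0, others stay in 1..=0x7F.
    let diff = word ^ (LO * u64::from(byte));
    // Adding 0x7F sets bit 7 in every nonzero lane; the sum is at most 0xFE,
    // so no carry crosses into the neighboring lane.
    !(diff + 0x7F7F_7F7F_7F7F_7F7F) & HI
}

/// Returns a mask marking every lane of `word` that holds one of the six illegal values.
fn illegal_mask(word: u64) -> u64 {
    ILLEGAL
        .iter()
        .fold(0u64, |mask, &b| mask | lanes_equal(word, b))
}
```

A result of zero means all eight lanes are safe, so a caller can store the word without entering any escape handling.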

### Benchmarks


The following throughput was measured on **uniform random binary data** using an **AMD Ryzen 5 5600** (single core).

| Data Size | Encode (MiB/s) | Decode (MiB/s) |
| :--- | :--- | :--- |
| 16 B | 331 | 318 |
| 64 B | 672 | 626 |
| 1 KiB | 1135 | 1070 |
| 64 KiB | 1089 | 871 |
| 1 MiB | 518 | 573 |
| 16 MiB | 533 | 597 |

For large payloads (≥1 MiB), the implementation sustains at least **4.3 Gbps encoding** and **4.8 Gbps decoding** (518 to 533 MiB/s and 573 to 597 MiB/s in the table above; 1 MiB/s ≈ 8.39 Mbps).

> **Note:** Run `cargo bench` to reproduce these results. The benchmarks evaluate encoding, decoding, and round-trip integrity on random byte streams.

## Quick Start


Add to `Cargo.toml`:

```toml
[dependencies]
base122-fast = "0.1"
```

### Encode


```rust
use base122_fast::encode;

let data = b"hello world";
let encoded = encode(data);
println!("{}", encoded);
```

### Decode


```rust
use base122_fast::{encode, decode};

let data = b"hello world";
let encoded = encode(data);
let decoded = decode(&encoded).expect("decoding failed");
assert_eq!(decoded, data);
```

## Implementation Details


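Base122 splits the input into 7-bit groups and writes each group as a single ASCII byte, so the output remains valid UTF-8. Of the 128 possible 7-bit values, six (`\x00` null, `\x0A` line feed, `\x0D` carriage return, `\x22` double quote, `\x26` ampersand, `\x5C` backslash) are considered illegal and are handled via an escape mechanism, which leaves 122 values that can be written directly.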

### Encoding Logic

1.  Read 7-byte input chunks.
2.  Split chunks into eight 7-bit groups.
3.  Write each group directly as one byte if it is "safe."
4.  Emit a two-byte escape sequence for any group that collides with an illegal byte.
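
The steps above can be sketched as a plain scalar encoder and decoder. This is a reference sketch that follows the escape layout of the original Base122 format (a two-byte UTF-8 sequence `110sss1x 10xxxxxx`, where `sss` is the index of the illegal value and the `x` bits carry the next 7-bit group). It is not this crate's optimized code, the names `groups`, `encode_sketch` and `decode_sketch` are hypothetical, and whether the crate uses exactly this layout is an assumption.

```rust
const ILLEGAL: [u8; 6] = [0x00, 0x0A, 0x0D, 0x22, 0x26, 0x5C];
/// Tag used instead of an illegal index when the escaped group is the last one.
const SHORTENED: u8 = 0b111;

/// Yields successive 7-bit groups of `data`, most significant bit first.
/// The final group is zero-padded on the right.
fn groups(data: &[u8]) -> impl Iterator<Item = u8> + '_ {
    (0..data.len() * 8).step_by(7).map(move |bit| {
        let i = bit / 8;
        let hi = u16::from(data[i]);
        let lo = u16::from(*data.get(i + 1).unwrap_or(&0));
        let window = (hi << 8) | lo; // 16-bit window starting at byte i
        ((window >> (9 - bit % 8)) & 0x7F) as u8
    })
}

fn encode_sketch(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len() * 8 / 7 + 2);
    let mut it = groups(data);
    while let Some(g) = it.next() {
        match ILLEGAL.iter().position(|&b| b == g) {
            None => out.push(g), // safe group: one ASCII byte
            Some(idx) => {
                // Fold the following group into the escape; if there is none,
                // store the illegal group itself under the SHORTENED tag.
                let (tag, payload) = match it.next() {
                    Some(next) => (idx as u8, next),
                    None => (SHORTENED, g),
                };
                out.push(0b1100_0010 | (tag << 2) | (payload >> 6));
                out.push(0b1000_0000 | (payload & 0x3F));
            }
        }
    }
    out
}

/// Returns `None` on a truncated escape or an unknown tag.
/// Lead and continuation bytes are not validated further in this sketch.
fn decode_sketch(encoded: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(encoded.len() * 7 / 8);
    let (mut acc, mut nbits) = (0u32, 0u32);
    // Appends one 7-bit group and flushes a byte once 8 bits are available.
    let mut push7 = |g: u8, out: &mut Vec<u8>| {
        acc = (acc << 7) | u32::from(g);
        nbits += 7;
        if nbits >= 8 {
            nbits -= 8;
            out.push((acc >> nbits) as u8);
            acc &= (1 << nbits) - 1;
        }
    };
    let mut i = 0;
    while i < encoded.len() {
        let b = encoded[i];
        if b < 0x80 {
            push7(b, &mut out);
            i += 1;
        } else {
            let b2 = *encoded.get(i + 1)?;
            let tag = (b >> 2) & 0b111;
            if tag != SHORTENED {
                push7(*ILLEGAL.get(usize::from(tag))?, &mut out);
            }
            push7(((b & 1) << 6) | (b2 & 0x3F), &mut out);
            i += 2;
        }
    }
    Some(out) // leftover padding bits are dropped
}
```

Because a regular escape carries two groups in two bytes, escaping adds no size overhead except when the very last group is illegal.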

### Optimization Strategy

High throughput is achieved by processing 64-bit chunks. When a chunk contains no illegal bytes, a **branchless fast-path** is taken. By utilizing `unsafe` pointer arithmetic and pre-calculating output capacities, the hot loop avoids bounds checking and reallocations.
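
As a rough illustration of the fast-path structure, the sketch below spreads one 7-byte chunk into eight 7-bit values and appends them in a single call when none of them is illegal. It is written in safe Rust with a plain scalar check, so it shows the control flow only, not the crate's `unsafe` or SWAR code. `spread` and `try_fast_path` are hypothetical names.

```rust
const ILLEGAL: [u8; 6] = [0x00, 0x0A, 0x0D, 0x22, 0x26, 0x5C];

/// Spreads 7 input bytes (56 bits) into eight 7-bit values, most significant first.
fn spread(chunk: &[u8; 7]) -> [u8; 8] {
    let mut buf = [0u8; 8];
    buf[1..].copy_from_slice(chunk);
    let v = u64::from_be_bytes(buf); // the 56 payload bits sit in the low bits
    let mut out = [0u8; 8];
    for (k, o) in out.iter_mut().enumerate() {
        *o = ((v >> (49 - 7 * k)) & 0x7F) as u8;
    }
    out
}

/// Appends the encoded chunk to `out` and returns true if no group is illegal.
/// Otherwise writes nothing and returns false, so the caller can fall back
/// to a slower path that handles escapes.
fn try_fast_path(chunk: &[u8; 7], out: &mut Vec<u8>) -> bool {
    let groups = spread(chunk);
    if groups.iter().any(|g| ILLEGAL.contains(g)) {
        return false;
    }
    out.extend_from_slice(&groups);
    true
}
```

Reserving the output up front, for example with `Vec::with_capacity(input.len() / 7 * 8 + 9)` (an assumed upper bound for this sketch), keeps the loop free of reallocations.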

## License


This project is licensed under the MIT License.