fast-hex-lite
Ultra-fast hex encoding/decoding in Rust with zero allocations and #![no_std] support.
Why fast-hex-lite?
- Zero allocations (except optional encode_to_string)
- no_std by default
- Precise error reporting (byte index)
- Deterministic performance across input sizes
- Optional SIMD acceleration
- Stable Rust only (no nightly features)
Designed for performance-critical systems such as cryptography,
networking stacks, blockchain infrastructure, and embedded environments
where no_std and zero heap usage are mandatory.
Features
| Feature | Default | Description |
|---|---|---|
| (none) | yes | no_std, alloc-free scalar encoder/decoder |
| std | no | Implements std::error::Error for Error |
| simd | no | SIMD-accelerated decoder via architecture intrinsics (implies std) |
Feature interactions
- simd implies std
- Scalar path is always available
- encode_to_string requires std
- no_std builds exclude any allocation-based helpers
Installation
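# Default: no_std, scalar only
[dependencies]
fast-hex-lite = "0.1"
# With SIMD acceleration
[dependencies]
fast-hex-lite = { version = "0.1", features = ["simd"] }
# Explicit no_std (same as default)
[dependencies]
fast-hex-lite = { version = "0.1", default-features = false }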
Usage
All core APIs operate on caller-provided buffers and perform no heap allocation. The optional encode_to_string helper is the only exception.
Decode hex to bytes
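use fast_hex_lite::decode_to_slice;
let hex = b"deadbeef";
let mut buf = [0u8; 4];
let n = decode_to_slice(hex, &mut buf).unwrap();
assert_eq!(&buf[..n], &[0xde, 0xad, 0xbe, 0xef]);
// Uppercase and mixed-case are accepted
decode_to_slice(b"DEADBEEF", &mut buf).unwrap();
decode_to_slice(b"DeAdBeEf", &mut buf).unwrap();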
Decode in-place
Decodes ASCII hex in a mutable buffer into its own first half. No secondary buffer required.
use fast_hex_lite::decode_in_place;
let mut buf = *b"deadbeef";
let n = decode_in_place(&mut buf).unwrap();
assert_eq!(&buf[..n], &[0xde, 0xad, 0xbe, 0xef]);
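To illustrate why no secondary buffer is needed, here is a minimal standalone sketch of in-place decoding. It is not the crate's implementation: `nibble` and `decode_in_place_sketch` are hypothetical names, and the string error type is a simplification chosen for brevity.

```rust
/// Maps one ASCII hex digit to its 4-bit value, or None if it is not a hex digit.
fn nibble(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Hypothetical in-place decoder. On success the decoded bytes occupy
/// buf[..n] and n is returned. On failure buf is left untouched.
fn decode_in_place_sketch(buf: &mut [u8]) -> Result<usize, &'static str> {
    if buf.len() % 2 != 0 {
        return Err("odd length");
    }
    // Pass 1: validate every byte before writing anything.
    if buf.iter().any(|&b| nibble(b).is_none()) {
        return Err("invalid hex digit");
    }
    // Pass 2: output byte i is built from input bytes 2i and 2i + 1.
    // Because 2i >= i, a write never lands on input that is still unread.
    let n = buf.len() / 2;
    for i in 0..n {
        let hi = nibble(buf[2 * i]).unwrap();
        let lo = nibble(buf[2 * i + 1]).unwrap();
        buf[i] = (hi << 4) | lo;
    }
    Ok(n)
}
```

The key observation is the index relationship: the write cursor always trails the read cursor, so the first half of the buffer can be reused as output.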
Decode into a fixed-size array
use fast_hex_lite::decode_to_array;
let bytes: [u8; 4] = decode_to_array(b"deadbeef").unwrap();
assert_eq!(bytes, [0xde, 0xad, 0xbe, 0xef]);
Encode bytes to hex
use encode_to_slice;
let src = ;
let mut out = ;
encode_to_slice.unwrap; // lowercase
assert_eq!;
encode_to_slice.unwrap; // uppercase
assert_eq!;
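For readers curious how a table-driven scalar encoder can support both cases without allocating, the following standalone sketch may help. It is an illustration only: `encode_sketch`, `LOWER`, and `UPPER` are hypothetical names, and passing the alphabet as a parameter is an assumption made here, not the crate's signature.

```rust
const LOWER: &[u8; 16] = b"0123456789abcdef";
const UPPER: &[u8; 16] = b"0123456789ABCDEF";

/// Hypothetical encoder: writes two hex digits per input byte into dst.
/// Returns the number of bytes written, or None if dst is too small.
fn encode_sketch(src: &[u8], dst: &mut [u8], alphabet: &[u8; 16]) -> Option<usize> {
    let needed = src.len().checked_mul(2)?;
    if dst.len() < needed {
        return None; // checked up front, so nothing is written on failure
    }
    for (i, &b) in src.iter().enumerate() {
        dst[2 * i] = alphabet[(b >> 4) as usize]; // high nibble
        dst[2 * i + 1] = alphabet[(b & 0x0f) as usize]; // low nibble
    }
    Some(needed)
}
```

Selecting the case by swapping a 16-byte table keeps the inner loop identical for lowercase and uppercase output.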
Length helpers
use ;
assert_eq!; // 8 hex chars -> 4 bytes
assert_eq!; // 4 bytes -> 8 hex chars
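The arithmetic behind the two length helpers can be sketched as follows. The function names and the use of `Option` are assumptions for illustration and may not match the crate's actual helpers.

```rust
/// Hypothetical helper: number of bytes produced by decoding hex_len hex
/// characters. Odd lengths cannot be decoded, so they yield None.
fn decoded_len_sketch(hex_len: usize) -> Option<usize> {
    if hex_len % 2 == 0 {
        Some(hex_len / 2)
    } else {
        None
    }
}

/// Hypothetical helper: number of hex characters produced by encoding
/// byte_len bytes. Returns None if the doubled length would overflow usize.
fn encoded_len_sketch(byte_len: usize) -> Option<usize> {
    byte_len.checked_mul(2)
}
```

Sizing the output buffer with helpers like these before calling a decoder or encoder avoids the "output too small" error path entirely.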
Error handling
use ;
let mut buf = ;
// Odd-length input
assert_eq!;
// Output buffer too small
assert_eq!;
// Invalid character: exact byte index reported
let err = decode_to_slice.unwrap_err;
assert!;
All errors include precise context. InvalidByte reports the zero-based index of the
first invalid byte in the source slice.
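As a rough picture of what index-aware error reporting involves, here is a standalone validation sketch. `SketchError` and `validate_sketch` are hypothetical: the variant names other than InvalidByte, the fields, and the order of checks are assumptions, not the crate's definitions.

```rust
#[derive(Debug, PartialEq, Eq)]
enum SketchError {
    /// Input length is not a multiple of two.
    OddLength,
    /// A non-hex byte was found; index is zero-based into the source slice.
    InvalidByte { index: usize, byte: u8 },
}

/// Hypothetical validator: reports the first invalid byte, scanning left to right.
fn validate_sketch(src: &[u8]) -> Result<(), SketchError> {
    if src.len() % 2 != 0 {
        return Err(SketchError::OddLength);
    }
    match src.iter().position(|b| !b.is_ascii_hexdigit()) {
        Some(index) => Err(SketchError::InvalidByte { index, byte: src[index] }),
        None => Ok(()),
    }
}
```

Because the scan stops at the first failure, the reported index is the same no matter how many later bytes are also invalid.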
SIMD acceleration
Enable the simd feature to use a SIMD-accelerated decoder built on architecture intrinsics (SSE2 on x86_64, NEON on aarch64):
fast-hex-lite = { version = "0.1", features = ["simd"] }
The SIMD path processes 32 hex bytes per iteration. It is fully transparent: the public API,
error types, and error index semantics are identical to the scalar path. Any tail shorter than
32 bytes falls back to the scalar path automatically.
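The chunk-plus-tail structure described above can be sketched without any intrinsics. In this standalone example `check_chunk` is a scalar stand-in for a vector kernel, and all names are hypothetical. The point is only to show how a per-chunk offset becomes a global index that matches a plain scalar scan.

```rust
const CHUNK: usize = 32; // hex bytes handled per iteration

/// Stand-in for a vector kernel: returns the offset of the first non-hex
/// byte inside one full chunk, if any. A real SIMD path would do this with
/// intrinsics; plain iteration is used here so the sketch runs anywhere.
fn check_chunk(chunk: &[u8]) -> Option<usize> {
    chunk.iter().position(|b| !b.is_ascii_hexdigit())
}

/// Chunked scan: full chunks go through check_chunk, the tail is handled
/// byte by byte. Offsets are rebased so the result is a global index.
fn first_invalid_chunked(src: &[u8]) -> Option<usize> {
    let mut chunks = src.chunks_exact(CHUNK);
    let mut base = 0;
    for chunk in chunks.by_ref() {
        if let Some(off) = check_chunk(chunk) {
            return Some(base + off);
        }
        base += CHUNK;
    }
    // Fewer than CHUNK bytes remain: scalar fallback.
    chunks
        .remainder()
        .iter()
        .position(|b| !b.is_ascii_hexdigit())
        .map(|off| base + off)
}

/// Reference scalar scan used to check that both paths agree.
fn first_invalid_scalar(src: &[u8]) -> Option<usize> {
    src.iter().position(|b| !b.is_ascii_hexdigit())
}
```

Comparing the chunked result against the scalar reference on the same input is a simple way to test that error indices stay identical across paths.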
Safety
- Scalar path contains no unsafe
- SIMD paths use architecture intrinsics behind feature gates
- No panics on valid input
- All bounds are checked
- Error indices are deterministic and reproducible
Security & correctness philosophy
fast-hex-lite is designed with a conservative correctness-first mindset suitable for
cryptography-adjacent and infrastructure workloads.
Deterministic semantics
- All decoding paths (scalar and SIMD) share identical observable behavior.
- Error indices are guaranteed to point to the first invalid byte.
- Mixed-case input does not change control flow or error semantics.
- No whitespace normalization or implicit acceptance of non-hex characters.
No partial mutation guarantees
- decode_to_slice and decode_in_place never partially mutate the destination buffer on error.
- If an error is returned, the caller's output buffer remains unchanged.
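One straightforward way to obtain a no-partial-write guarantee is to validate the whole input before touching the output. The sketch below shows that approach under stated assumptions: `SketchError` and `decode_checked_first` are hypothetical names, and the crate may achieve the same guarantee differently.

```rust
#[derive(Debug, PartialEq, Eq)]
enum SketchError {
    OddLength,
    OutputTooSmall,
    InvalidByte { index: usize },
}

/// Maps one ASCII hex digit to its 4-bit value, or None if it is not a hex digit.
fn nibble(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Hypothetical two-pass decoder: every error is detected before the first
/// write, so dst is unchanged whenever Err is returned.
fn decode_checked_first(src: &[u8], dst: &mut [u8]) -> Result<usize, SketchError> {
    if src.len() % 2 != 0 {
        return Err(SketchError::OddLength);
    }
    let n = src.len() / 2;
    if dst.len() < n {
        return Err(SketchError::OutputTooSmall);
    }
    // Pass 1: read-only scan for the first invalid byte.
    if let Some(index) = src.iter().position(|&b| nibble(b).is_none()) {
        return Err(SketchError::InvalidByte { index });
    }
    // Pass 2: write. The unwraps cannot fail because pass 1 succeeded.
    for (out, pair) in dst.iter_mut().zip(src.chunks_exact(2)) {
        *out = (nibble(pair[0]).unwrap() << 4) | nibble(pair[1]).unwrap();
    }
    Ok(n)
}
```

A test for this property typically fills the destination with a sentinel value, feeds malformed input, and asserts that the sentinel is still intact afterwards.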
No hidden allocations
- No heap allocation occurs in the default configuration.
- All APIs operate on caller-provided memory.
- encode_to_string is explicitly opt-in and requires std.
SIMD is an optimization, not a different implementation
- SIMD is gated behind a feature flag.
- Scalar fallback is always available.
- All SIMD logic is covered by the same tests and error contracts.
- Tail handling is verified to match scalar semantics byte-for-byte.
Audit-friendly design
- Error types are explicit and structured.
- No UB-prone pointer arithmetic in scalar code.
- SIMD intrinsics are isolated and architecture-gated.
- High test coverage across scalar and SIMD paths (~99% line coverage).
The goal is predictable, verifiable behavior under all inputs — including malformed or adversarial data — rather than maximum theoretical throughput at the cost of clarity or guarantees.
Testing & Coverage
The crate is validated with:
- cargo test
- cargo test --features simd
- cargo clippy --all-targets --all-features -- -D warnings
Coverage is measured using llvm-cov.
Current coverage:
- Total line coverage: ~99%
- Functions: 100%
- Scalar and SIMD paths both tested
- All error variants covered
- No-partial-write guarantees validated
- Full 0x00–0xFF roundtrip tests
Benchmarks
Measured on Apple M3 Pro (macOS, cargo bench --features simd).
Numbers are median Criterion throughput values.
Throughput is measured over decoded output bytes for decode and decode_in_place, and over input bytes for encode and validate.
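To make the counting convention concrete, the following sketch converts a byte count and an elapsed time into GiB/s. The helper names are hypothetical and this is not the benchmark harness itself, only the arithmetic implied by the convention above.

```rust
use std::time::Duration;

const GIB: f64 = (1u64 << 30) as f64;

/// Bytes counted for a decode run: the decoded output, i.e. half the hex input.
fn counted_bytes_decode(hex_len: usize) -> usize {
    hex_len / 2
}

/// Bytes counted for an encode or validate run: the input itself.
fn counted_bytes_encode(input_len: usize) -> usize {
    input_len
}

/// Throughput in GiB/s for the given counted bytes and elapsed time.
fn throughput_gib_per_s(counted_bytes: usize, elapsed: Duration) -> f64 {
    counted_bytes as f64 / elapsed.as_secs_f64() / GIB
}
```

Under this convention a decoder reads twice as many bytes as it is credited for, which is worth keeping in mind when comparing decode and encode figures.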
Decode: scalar (hex to bytes)
| Input | fast-hex-lite lower | fast-hex-lite mixed | hex crate lower | hex crate mixed |
|---|---|---|---|---|
| 32 B | 1.67 GiB/s | 1.66 GiB/s | 663 MiB/s | 696 MiB/s |
| 256 B | 1.57 GiB/s | 1.58 GiB/s | 636 MiB/s | 700 MiB/s |
| 4 KB | 1.70 GiB/s | 1.70 GiB/s | 597 MiB/s | 621 MiB/s |
| 64 KB | 1.67 GiB/s | 1.68 GiB/s | 357 MiB/s | 370 MiB/s |
| 1 MB | 1.67 GiB/s | 1.71 GiB/s | 207 MiB/s | 215 MiB/s |
Decode: SIMD (hex to bytes)
| Input | fast-hex-lite lower | fast-hex-lite mixed | hex crate lower | hex crate mixed |
|---|---|---|---|---|
| 32 B | 5.51 GiB/s | 5.49 GiB/s | 628 MiB/s | 681 MiB/s |
| 256 B | 6.10 GiB/s | 6.09 GiB/s | 608 MiB/s | 659 MiB/s |
| 4 KB | 6.03 GiB/s | 6.04 GiB/s | 584 MiB/s | 617 MiB/s |
| 64 KB | 6.14 GiB/s | 6.15 GiB/s | 390 MiB/s | 391 MiB/s |
| 1 MB | 6.09 GiB/s | 6.15 GiB/s | 201 MiB/s | 202 MiB/s |
Encode (bytes to hex)
| Input | fast-hex-lite lower | fast-hex-lite upper | hex crate lower |
|---|---|---|---|
| 32 B | 2.50 GiB/s | 2.20 GiB/s | 2.03 GiB/s |
| 256 B | 2.50 GiB/s | 2.48 GiB/s | 2.01 GiB/s |
| 4 KB | 2.61 GiB/s | 2.59 GiB/s | 2.06 GiB/s |
| 64 KB | 2.60 GiB/s | 2.60 GiB/s | 2.09 GiB/s |
| 1 MB | 2.59 GiB/s | 2.59 GiB/s | 2.09 GiB/s |
decode_in_place
| Input | scalar | simd |
|---|---|---|
| 32 B | 655 MiB/s | 650 MiB/s |
| 256 B | 717 MiB/s | 709 MiB/s |
| 4 KB | 764 MiB/s | 775 MiB/s |
| 64 KB | 765 MiB/s | 770 MiB/s |
| 1 MB | 780 MiB/s | 785 MiB/s |
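Mixed-case input shows no measurable overhead versus lowercase. fast-hex-lite decode throughput is stable across all input sizes. For inputs of 4 KB and larger, the SIMD path delivers roughly 3.5-3.7x the scalar decode throughput. decode_in_place shows no comparable gain in these measurements: its scalar and SIMD figures are within about 2% of each other.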
no_std support
The crate is #![no_std] by default. No allocator is required. All APIs work on
caller-provided stack arrays or static buffers.
= { = "0.1", = false }
When to use
Use fast-hex-lite when:
- You need deterministic performance
- You run in no_std
- You process large volumes of hex (RPC, blockchain, hashing)
- You want explicit, index-aware error reporting
If you only need convenience APIs with heap allocation and minimal performance sensitivity, the hex crate may be sufficient.
Comparison
| Crate | no_std | Alloc-free | Precise error index | SIMD ARM | SIMD x86 | Notes |
|---|---|---|---|---|---|---|
| fast-hex-lite | ✅ | ✅ | ✅ | ✅ NEON | ✅ SSE2 | Deterministic perf, zero-alloc by default |
| hex | ❌ | ❌ | ❌ | ❌ | ❌ | Convenience-focused |
| faster-hex | ❌ | ⚠ partial | ❌ | ❌ | ✅ AVX/SSE | x86-focused SIMD |
| const-hex | ✅ | ✅ | ❌ | ❌ | ❌ | Optimized for const-eval |
Design goals
fast-hex-lite focuses on:
- Zero heap usage by default
- no_std compatibility
- Deterministic throughput across sizes
- Precise, reproducible error indices
- Cross-architecture SIMD (x86_64 + aarch64)
Unlike x86-only SIMD crates, both Apple Silicon and x86_64 are first-class targets.
Architecture support
| Architecture | Scalar | SIMD |
|---|---|---|
| x86_64 | ✅ | SSE2 |
| aarch64 | ✅ | NEON |
| others | ✅ | ❌ |
Code structure
src/
  lib.rs     -- public API, Error type, feature gates
  decode.rs  -- scalar decoder, 256-entry compile-time LUT, in-place decode
  encode.rs  -- scalar encoder
  simd.rs    -- SIMD decoder (compiled only with feature `simd`)
benches/
  bench.rs   -- Criterion benchmarks vs hex crate
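For context on the 256-entry compile-time LUT mentioned for decode.rs, here is a minimal sketch of how such a table can be built with a const fn on stable Rust. The names `build_lut`, `HEX_LUT`, `INVALID`, and `decode_pair` are hypothetical, and the sentinel encoding is an assumption rather than the crate's actual layout.

```rust
/// Sentinel for bytes that are not hex digits. Valid nibbles are 0..=15,
/// so 0xFF can never collide with a real value.
const INVALID: u8 = 0xFF;

/// Builds the lookup table at compile time: index = input byte,
/// value = nibble or INVALID.
const fn build_lut() -> [u8; 256] {
    let mut lut = [INVALID; 256];
    let mut i = 0;
    while i < 10 {
        lut[b'0' as usize + i] = i as u8;
        i += 1;
    }
    let mut j = 0;
    while j < 6 {
        lut[b'a' as usize + j] = 10 + j as u8;
        lut[b'A' as usize + j] = 10 + j as u8;
        j += 1;
    }
    lut
}

static HEX_LUT: [u8; 256] = build_lut();

/// Decodes one pair of hex digits with two table lookups and no branching
/// on the character class.
fn decode_pair(hi: u8, lo: u8) -> Option<u8> {
    let h = HEX_LUT[hi as usize];
    let l = HEX_LUT[lo as usize];
    if h == INVALID || l == INVALID {
        None
    } else {
        Some((h << 4) | l)
    }
}
```

Because upper and lower case map to the same table values, mixed-case input takes exactly the same code path as lowercase input.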
MSRV
Rust 1.88, edition 2021. Stable only, no nightly features required.