# base64-ng

`base64-ng` is a `no_std`-first Base64 crate focused on correctness, strict decoding, caller-owned buffers, and a security-heavy release process. The long-term goal is to provide modern hardware acceleration without making unsafe SIMD the foundation of trust.

The crate starts conservative: a small scalar implementation, strict RFC 4648 behavior, and a test/release system modeled after hardened Rust service projects. Streaming is available behind an explicit feature, fuzz harnesses are isolated from the published crate, and future SIMD and Kani work remains gated until it has supporting evidence.

## Current Status

The current public release is `0.7.0`.

Implemented now:

- `no_std` core with optional `alloc` and `std` features.
- Zero external runtime or development dependencies in `Cargo.toml`.
- Standard and URL-safe alphabets.
- Padded and unpadded encoding into caller-provided output buffers.
- Stable compile-time encoding into caller-sized arrays.
- Strict decoding into caller-provided output buffers.
- In-place encoding when the caller provides enough spare capacity.
- Optional `alloc` vector and string helpers.
- In-place decode API built on the same strict scalar decoder.
- Explicit legacy decode APIs that ignore ASCII transport whitespace while
  keeping alphabet and padding validation strict.
- Validation-only APIs for strict and legacy profiles when callers need to
  reject malformed input without materializing decoded bytes.
- Line-wrapped encoding for MIME/PEM-style output and caller-selected wrapping
  policies.
- Strict line-wrapped validation and decoding profiles for MIME/PEM-style
  input.
- Custom alphabet validation helpers for user-defined 64-byte alphabets.
- Named dependency-free profiles for MIME, PEM, bcrypt-style, and
  `crypt(3)`-style Base64.
- Stack-backed encoded output buffers for short values without `alloc`.
- Redacted secret owned buffers for sensitive encoded or decoded bytes when
  `alloc` is enabled.
- Separate `ct` scalar validation and decode module for sensitive payloads
  that avoids secret-indexed lookup tables during Base64 symbol mapping.
- `std::io` streaming encoders and decoders behind the `stream` feature.
- Focused unit and integration tests.
- Isolated `cargo-fuzz` harnesses for decode, in-place decode, and stream
  chunk-boundary behavior.
- Isolated dudect-style timing harness for the constant-time-oriented scalar
  decoder.
- Constant-time assembly evidence generation for reviewer inspection.
- Local check scripts, release gate, dependency policy, audit config, CI, SBOM script, and reproducible build check.

Planned behind admission evidence:

- Admitted AVX2, AVX-512, SSSE3/SSE4.1, ARM NEON, and wasm `simd128`
  fast paths after the SIMD admission evidence is complete. `v0.7` remains
  scalar-only.
- Async streaming wrappers only after the `tokio` feature passes the
  dependency and cancellation-safety admission bar in [docs/ASYNC.md](docs/ASYNC.md).
- Expanded Kani proof harnesses.
- Broader benchmark evidence against the established `base64` crate.

## Trust Dashboard

| Area | Status |
| --- | --- |
| License | `MIT OR Apache-2.0` |
| MSRV | Rust `1.95.0` |
| Runtime dependencies | Zero external crates |
| Unsafe policy | Scalar encode/decode remains safe Rust; audited unsafe is limited to volatile wiping and SIMD prototypes |
| Active backend | Scalar only |
| Strict decoding | Default, canonical, no whitespace |
| Legacy compatibility | Explicit opt-in APIs |
| Constant-time posture | Constant-time-oriented scalar validation/decode with isolated dudect-style timing evidence; no formal cryptographic guarantee |
| Cleanup posture | Best-effort initialized-byte cleanup and redacted secret wrappers |
| Release evidence | fmt, clippy, tests, docs, deny, audit, license, SBOM, reproducibility |

Full adoption details live in [docs/TRUST.md](docs/TRUST.md). Security-control
and CWE mapping lives in [docs/SECURITY_CONTROLS.md](docs/SECURITY_CONTROLS.md).

## Install

```toml
[dependencies]
base64-ng = "0.7.0"
```

The crate is dual-licensed:

```toml
license = "MIT OR Apache-2.0"
```

## Features

| Feature | Default | Purpose |
| --- | --- | --- |
| `alloc` | yes | `Vec` and encoded `String` convenience APIs. |
| `std` | yes | `std::error::Error` support and feature base for I/O. |
| `simd` | no | Future hardware acceleration. |
| `stream` | no | `std::io` streaming wrappers. |
| `tokio` | no | Reserved for future async streaming wrappers; currently inert and dependency-free. |
| `kani` | no | Reserved for verifier harnesses; normal builds do not require Kani. |
| `fuzzing` | no | Reserved for verifier and fuzz harness integration; published crate stays dependency-free. |

Disable defaults for embedded or freestanding use:

```toml
[dependencies]
base64-ng = { version = "0.7.0", default-features = false }
```

## Example

```rust
use base64_ng::{STANDARD, checked_encoded_len};

let input = b"hello";
const ENCODED_CAPACITY: usize = match checked_encoded_len(5, true) {
    Some(len) => len,
    None => panic!("encoded length overflow"),
};
let mut encoded = [0u8; ENCODED_CAPACITY];
let written = STANDARD.encode_slice(input, &mut encoded).unwrap();
assert_eq!(&encoded[..written], b"aGVsbG8=");

let mut decoded = [0u8; 5];
let written = STANDARD.decode_slice(&encoded, &mut decoded).unwrap();
assert_eq!(&decoded[..written], input);
```

In-place encoding:

```rust
use base64_ng::STANDARD;

let mut buffer = [0u8; 8];
buffer[..5].copy_from_slice(b"hello");
let encoded = STANDARD.encode_in_place(&mut buffer, 5).unwrap();
assert_eq!(encoded, b"aGVsbG8=");
```

For sensitive payloads, `encode_slice_clear_tail` and
`encode_in_place_clear_tail` clear unused bytes after the encoded prefix and
clear the caller-owned output buffer on encode error.

Compile-time encoding:

```rust
use base64_ng::{STANDARD, URL_SAFE_NO_PAD};

const HELLO: [u8; 8] = STANDARD.encode_array(b"hello");
const URL_BYTES: [u8; 3] = URL_SAFE_NO_PAD.encode_array(b"\xfb\xff");

assert_eq!(&HELLO, b"aGVsbG8=");
assert_eq!(&URL_BYTES, b"-_8");
```

Stable Rust cannot yet express the encoded length as the return array length
directly, so `encode_array` uses the destination array type supplied by the
caller. A wrong output length fails during const evaluation.
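
The const-evaluation failure described above can be illustrated with a standard-library-only sketch. The function name `sketch_encode_array` and the table-indexed lookup are assumptions made for illustration; this is not the crate's implementation, and the crate states that its own scalar encoder avoids input-derived table indexes.

```rust
// Hypothetical helper, not part of base64-ng: a const encoder whose output
// length is taken from the caller's array type and checked with `assert!`.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

pub const fn sketch_encode_array<const N: usize>(input: &[u8]) -> [u8; N] {
    // In a `const` item a failed assertion becomes a compile-time error;
    // at runtime it would be an ordinary panic.
    assert!(N == (input.len() + 2) / 3 * 4, "wrong output array length");
    let mut out = [b'='; N]; // pre-filled with padding
    let mut i = 0; // input index
    let mut o = 0; // output index
    while i < input.len() {
        let b0 = input[i] as u32;
        let b1 = if i + 1 < input.len() { input[i + 1] as u32 } else { 0 };
        let b2 = if i + 2 < input.len() { input[i + 2] as u32 } else { 0 };
        let n = (b0 << 16) | (b1 << 8) | b2;
        out[o] = ALPHABET[((n >> 18) & 63) as usize];
        out[o + 1] = ALPHABET[((n >> 12) & 63) as usize];
        // Positions without a backing input byte keep the '=' padding.
        if i + 1 < input.len() {
            out[o + 2] = ALPHABET[((n >> 6) & 63) as usize];
        }
        if i + 2 < input.len() {
            out[o + 3] = ALPHABET[(n & 63) as usize];
        }
        i += 3;
        o += 4;
    }
    out
}
```

Declaring `const HELLO: [u8; 8] = sketch_encode_array(b"hello");` compiles, while changing the array type to `[u8; 7]` is rejected at compile time because the assertion fails during const evaluation.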

For untrusted length metadata, use checked length calculation:

```rust
use base64_ng::{checked_encoded_len, decoded_len};

assert_eq!(checked_encoded_len(5, true), Some(8));
assert_eq!(decoded_len(b"aGVsbG8=", true).unwrap(), 5);
```
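
For readers who want to see the arithmetic behind an overflow-checked length calculation, the following is a minimal sketch using only the standard library. The name `sketch_checked_encoded_len` is hypothetical, and the crate's own `checked_encoded_len` may be implemented differently.

```rust
// Hypothetical helper, not part of base64-ng: encoded length with explicit
// overflow checks, returning `None` instead of panicking or wrapping.
pub fn sketch_checked_encoded_len(input_len: usize, padded: bool) -> Option<usize> {
    let full_groups = input_len / 3; // each 3-byte group becomes 4 symbols
    let remainder = input_len % 3;
    let base = full_groups.checked_mul(4)?;
    let tail = match (remainder, padded) {
        (0, _) => 0,     // no partial group
        (_, true) => 4,  // partial group padded out to 4 symbols
        (1, false) => 2, // 1 byte needs 2 symbols
        (_, false) => 3, // 2 bytes need 3 symbols
    };
    base.checked_add(tail)
}
```

With this sketch, 5 input bytes give `Some(8)` when padded and `Some(7)` when unpadded, and a length of `usize::MAX` gives `None`.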

## Validation Without Decoding

Use validation-only APIs when a protocol needs to sanitize input before storing,
routing, or accounting for it:

```rust
use base64_ng::{STANDARD, URL_SAFE_NO_PAD};

assert!(STANDARD.validate(b"aGVsbG8="));
assert!(!STANDARD.validate(b"aGVsbG8"));

STANDARD.validate_result(b"aGVsbG8=").unwrap();

assert!(URL_SAFE_NO_PAD.validate(b"-_8"));
assert!(!URL_SAFE_NO_PAD.validate(b"+/8"));
```
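
As a rough illustration of what strict validation has to check for the padded standard alphabet, here is a standard-library-only sketch. The names `sketch_symbol_value` and `sketch_validate_strict` are hypothetical, the code branches on input bytes, and it is not the crate's validator.

```rust
// Hypothetical helpers, not part of base64-ng.
fn sketch_symbol_value(byte: u8) -> Option<u8> {
    match byte {
        b'A'..=b'Z' => Some(byte - b'A'),
        b'a'..=b'z' => Some(byte - b'a' + 26),
        b'0'..=b'9' => Some(byte - b'0' + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

pub fn sketch_validate_strict(input: &[u8]) -> bool {
    // Padded input always has a length that is a multiple of 4.
    if input.len() % 4 != 0 {
        return false;
    }
    // Count trailing '=' bytes; at most two are allowed.
    let pad = input.iter().rev().take_while(|&&b| b == b'=').count();
    if pad > 2 {
        return false;
    }
    // Every remaining byte must be an alphabet symbol. This also rejects
    // whitespace and any '=' that appears before the end.
    let mut last = 0u8;
    for &byte in &input[..input.len() - pad] {
        match sketch_symbol_value(byte) {
            Some(value) => last = value,
            None => return false,
        }
    }
    // Canonical form: bits of the last symbol that do not map to an output
    // byte must be zero.
    match pad {
        1 => last & 0b0011 == 0,
        2 => last & 0b1111 == 0,
        _ => true,
    }
}
```

In this sketch `aGVsbG8=` passes, `aGVsbG8` fails the length check, and `aGVsbG9=` fails because its last symbol carries non-zero trailing bits.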

For line-wrapped or spaced legacy inputs, use the explicit legacy profile:

```rust
use base64_ng::STANDARD;

assert!(STANDARD.validate_legacy(b" aG\r\nVsbG8= "));
assert!(!STANDARD.validate_legacy(b" aG-V "));
```

## Line-Wrapped Encoding

Use `LineWrap` when a protocol needs MIME/PEM-style line lengths:

```rust
use base64_ng::{LineEnding, LineWrap, STANDARD};

let wrap = LineWrap::new(4, LineEnding::Lf);
let mut output = [0u8; 9];
let written = STANDARD
    .encode_slice_wrapped(b"hello", &mut output, wrap)
    .unwrap();

assert_eq!(&output[..written], b"aGVs\nbG8=");
```

Built-in policies include `LineWrap::MIME`, `LineWrap::PEM`, and
`LineWrap::PEM_CRLF`. Wrapping inserts line endings between encoded lines and
does not append a trailing line ending after the final line.
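
The "between lines, no trailing ending" rule determines the wrapped output size. A minimal sketch of that arithmetic follows; `sketch_wrapped_len` is a hypothetical name and is not a function the crate is known to expose.

```rust
// Hypothetical helper, not part of base64-ng: size of wrapped output when
// line endings are inserted only between lines.
pub fn sketch_wrapped_len(
    encoded_len: usize,
    line_len: usize,
    ending_len: usize,
) -> Option<usize> {
    if line_len == 0 {
        return None; // a zero-length line cannot hold any symbols
    }
    if encoded_len == 0 {
        return Some(0);
    }
    // Number of line breaks: one fewer than the number of lines.
    let breaks = (encoded_len - 1) / line_len;
    encoded_len.checked_add(breaks.checked_mul(ending_len)?)
}
```

For 8 encoded bytes wrapped at 4 with a 1-byte ending this gives 9, and for 80 encoded bytes wrapped at 76 with a 2-byte ending it gives 82, matching the buffer sizes used in the examples in this section.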

Named profiles carry the wrapping policy for common protocols:

```rust
use base64_ng::{MIME, PEM};

assert_eq!(MIME.line_wrap().unwrap().line_len, 76);
assert_eq!(PEM.line_wrap().unwrap().line_len, 64);

let mut encoded = [0u8; 82];
let written = MIME.encode_slice(&[0x5a; 58], &mut encoded).unwrap();
assert_eq!(&encoded[76..78], b"\r\n");
assert!(MIME.validate(&encoded[..written]));
```

The same policy can be used for strict wrapped decoding. Unlike legacy
whitespace decoding, this accepts only the configured line ending and requires
every non-final line to have the configured encoded length:

```rust
use base64_ng::{LineEnding, LineWrap, STANDARD};

let wrap = LineWrap::new(4, LineEnding::Lf);
let mut output = [0u8; 5];
let written = STANDARD
    .decode_slice_wrapped(b"aGVs\nbG8=", &mut output, wrap)
    .unwrap();

assert_eq!(&output[..written], b"hello");
```
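
The line-structure rule for strict wrapped input can be sketched with the standard library alone. The function `sketch_is_strictly_wrapped` is hypothetical; it checks only line layout, not alphabet or padding, and its rejection of empty input and of a trailing line ending is an assumption rather than documented crate behavior.

```rust
// Hypothetical helper, not part of base64-ng: checks that every non-final
// line has exactly `line_len` bytes and is followed by exactly `ending`.
pub fn sketch_is_strictly_wrapped(input: &[u8], line_len: usize, ending: &[u8]) -> bool {
    if line_len == 0 || ending.is_empty() {
        return false;
    }
    let has_line_break = |bytes: &[u8]| bytes.iter().any(|b| matches!(b, b'\r' | b'\n'));
    let mut rest = input;
    loop {
        if rest.len() <= line_len {
            // Final line: must be non-empty (assumption) and free of CR/LF.
            return !rest.is_empty() && !has_line_break(rest);
        }
        let (line, tail) = rest.split_at(line_len);
        if has_line_break(line) || !tail.starts_with(ending) {
            return false;
        }
        rest = &tail[ending.len()..];
    }
}
```

With a 4-byte line and `\n` ending, `aGVs\nbG8=` is accepted, while the CRLF variant and an input with a misplaced break are rejected.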

## Custom Alphabets

User-defined alphabets can be validated before use:

```rust
use base64_ng::{Alphabet, decode_alphabet_byte, validate_alphabet};

struct DotSlash;

impl Alphabet for DotSlash {
    const ENCODE: [u8; 64] =
        *b"./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    fn decode(byte: u8) -> Option<u8> {
        decode_alphabet_byte(byte, &Self::ENCODE)
    }
}

validate_alphabet(&DotSlash::ENCODE).unwrap();
assert_eq!(DotSlash::decode(b'.'), Some(0));
```

The default `Alphabet::encode` implementation is deliberately conservative for
custom alphabets: it performs a fixed 64-entry scan for every emitted Base64
byte to avoid secret-indexed table lookups. The built-in alphabets override this
with optimized arithmetic mappers. For very large payloads encoded with a
custom alphabet, benchmark this tradeoff before relying on it for untrusted
high-volume traffic.
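
The fixed-scan idea can be sketched as follows, using only the standard library. The name `sketch_scan_lookup` is hypothetical and this is not the crate's code; whether a given compiler keeps the branch-free shape is not guaranteed and would need generated-code review.

```rust
// Hypothetical helper, not part of base64-ng: select `alphabet[index]` by
// visiting all 64 entries instead of indexing with a secret-derived value.
pub fn sketch_scan_lookup(alphabet: &[u8; 64], index: u8) -> u8 {
    let mut out = 0u8;
    for (i, &symbol) in alphabet.iter().enumerate() {
        let diff = (i as u8) ^ index; // zero only when i == index
        // For any non-zero `diff`, `diff | -diff` has its top bit set.
        let nonzero = (diff | diff.wrapping_neg()) >> 7; // 1 if different
        let mask = nonzero.wrapping_sub(1); // 0xFF if equal, 0x00 otherwise
        out |= symbol & mask;
    }
    out // stays 0 if `index` is outside 0..64
}
```

Every call performs the same 64 iterations regardless of `index`, which is the cost the paragraph above suggests benchmarking.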

Built-in non-RFC alphabets are available for explicit interoperability:

```rust
use base64_ng::{BCRYPT, CRYPT};

let mut bcrypt = [0u8; 4];
let written = BCRYPT.encode_slice(&[0xff, 0xff, 0xff], &mut bcrypt).unwrap();
assert_eq!(&bcrypt[..written], b"9999");

let mut crypt = [0u8; 4];
let written = CRYPT.encode_slice(&[0xff, 0xff, 0xff], &mut crypt).unwrap();
assert_eq!(&crypt[..written], b"zzzz");
```

The bcrypt and `crypt(3)` profiles provide alphabets and no-padding behavior
only. They do not parse or verify complete password-hash strings.

## Legacy Whitespace Decoding

Strict decoding rejects whitespace. If an existing protocol allows line-wrapped
or spaced Base64, use the explicit legacy APIs:

```rust
use base64_ng::STANDARD;

let mut output = [0u8; 5];
let written = STANDARD
    .decode_slice_legacy(b" aG\r\nVs\tbG8= ", &mut output)
    .unwrap();

assert_eq!(&output[..written], b"hello");
```

Legacy decoding ignores only ASCII space, tab, carriage return, and line feed.
Alphabet selection, padding placement, trailing data after padding, and
non-canonical trailing bits remain strict.
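
The whitespace set is small enough to show directly. This sketch uses a hypothetical `sketch_strip_legacy_whitespace` helper and allocates a `Vec` for simplicity; the crate's legacy decoder works with caller-provided buffers and is not implemented this way as far as this README states.

```rust
// Hypothetical helper, not part of base64-ng: drop only the four ASCII
// transport-whitespace bytes and keep everything else for strict checking.
pub fn sketch_strip_legacy_whitespace(input: &[u8]) -> Vec<u8> {
    input
        .iter()
        .copied()
        .filter(|&b| !matches!(b, b' ' | b'\t' | b'\r' | b'\n'))
        .collect()
}
```

Other control bytes such as vertical tab (`0x0b`) or form feed (`0x0c`) are left in place by this filter, so a strict validator applied afterwards would still reject them.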

## Bounded Memory Use

For untrusted payloads, size buffers before decoding or encoding. The checked
helpers let callers reject impossible or oversized metadata before allocating:

```rust
use base64_ng::{STANDARD, checked_encoded_len, decoded_capacity};

let input = b"hello";
let encoded_len = checked_encoded_len(input.len(), true).unwrap();
assert_eq!(encoded_len, 8);

let mut encoded = vec![0u8; encoded_len];
let written = STANDARD.encode_slice(input, &mut encoded).unwrap();
encoded.truncate(written);

let max_decoded = decoded_capacity(encoded.len());
let mut decoded = vec![0u8; max_decoded];
let written = STANDARD.decode_slice(&encoded, &mut decoded).unwrap();
decoded.truncate(written);

assert_eq!(decoded, input);
```

`decode_vec` validates the complete input before allocating decoded output.
Use `decode_slice` or `decode_in_place` when the caller needs hard memory
limits and owns the output buffer.
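
A hard memory limit can be enforced before any buffer is created. The following sketch assumes a hypothetical `sketch_decode_budget` helper and a simple upper bound on decoded size; the exact value returned by the crate's `decoded_capacity` is not specified here and may differ.

```rust
// Hypothetical helper, not part of base64-ng: return an upper bound on the
// decoded size only if it fits within a caller-chosen limit.
pub fn sketch_decode_budget(encoded_len: usize, limit: usize) -> Option<usize> {
    // Each full group of 4 symbols yields at most 3 bytes; a partial group
    // of 2 or 3 symbols yields 1 or 2 bytes. This cannot overflow because
    // the result never exceeds `encoded_len`.
    let upper_bound = encoded_len / 4 * 3 + (encoded_len % 4) * 3 / 4;
    if upper_bound <= limit {
        Some(upper_bound)
    } else {
        None
    }
}
```

A caller could size its output buffer from the returned value and reject the request when the result is `None`, before decoding starts.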

For sensitive payloads, use `decode_slice_clear_tail` or
`decode_in_place_clear_tail` to clear unused bytes after the decoded prefix. On
decode error these variants clear the caller-owned output buffer before
returning the error. The legacy whitespace profile also provides
`decode_slice_legacy_clear_tail` and `decode_in_place_legacy_clear_tail`.
The `ct` module provides the same clear-tail decode variants for callers using
the constant-time-oriented scalar decoder.
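
What "clearing the tail" means in practice can be sketched with volatile writes from `core::ptr`. The function `sketch_clear_tail` is hypothetical and is not the crate's helper; like the crate's cleanup, it is best-effort and cannot remove copies that exist elsewhere.

```rust
use core::ptr;
use core::sync::atomic::{compiler_fence, Ordering};

// Hypothetical helper, not part of base64-ng: zero every byte after the
// first `used` bytes using writes the optimizer may not elide.
pub fn sketch_clear_tail(buffer: &mut [u8], used: usize) {
    let start = used.min(buffer.len()); // out-of-range `used` clears nothing
    for byte in &mut buffer[start..] {
        // SAFETY: `byte` is a valid, aligned, exclusive reference into
        // `buffer`, so writing one `u8` through it is sound.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Discourage the compiler from reordering later accesses before the wipe.
    compiler_fence(Ordering::SeqCst);
}
```

After decoding 5 bytes into an 8-byte buffer, calling `sketch_clear_tail(&mut buffer, 5)` leaves the decoded prefix intact and zeroes the remaining 3 bytes.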

For short values, `encode_buffer` returns a stack-backed `EncodedBuffer`
without requiring the `alloc` feature:

```rust
use base64_ng::{BCRYPT, STANDARD};

let encoded = STANDARD.encode_buffer::<8>(b"hello").unwrap();
assert_eq!(encoded.as_str(), "aGVsbG8=");

let bcrypt = BCRYPT.encode_buffer::<4>(&[0xff, 0xff, 0xff]).unwrap();
assert_eq!(bcrypt.as_bytes(), b"9999");
```

`EncodedBuffer` exposes bytes only through `as_bytes` and `as_str`, redacts the
payload from `Debug`, and clears its backing array when dropped as best-effort
data-retention reduction.

When an owned heap buffer is acceptable but accidental logging is not, use
`encode_secret` and `decode_secret`:

```rust
use base64_ng::STANDARD;

let encoded = STANDARD.encode_secret(b"hello").unwrap();
assert_eq!(encoded.expose_secret(), b"aGVsbG8=");
assert_eq!(format!("{encoded:?}"), r#"SecretBuffer { bytes: "<redacted>", len: 8 }"#);

let decoded = STANDARD.decode_secret(encoded.expose_secret()).unwrap();
assert_eq!(decoded.expose_secret(), b"hello");
assert_eq!(format!("{decoded}"), "<redacted>");
```

`SecretBuffer` clears initialized bytes when dropped, but it does not claim
formal zeroization and cannot clean historical copies outside the wrapper or
allocator spare capacity.
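
The redaction pattern itself is straightforward to sketch. The type `SketchSecret` below is hypothetical and only mirrors the behavior described above (redacted `Debug`, explicit exposure, best-effort wipe on drop); it is not the crate's `SecretBuffer`.

```rust
use std::fmt;

// Hypothetical type, not part of base64-ng.
pub struct SketchSecret(Vec<u8>);

impl SketchSecret {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self(bytes)
    }

    // Access requires an explicit, greppable call.
    pub fn expose(&self) -> &[u8] {
        &self.0
    }
}

impl fmt::Debug for SketchSecret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print the length but never the payload.
        f.debug_struct("SketchSecret")
            .field("bytes", &"<redacted>")
            .field("len", &self.0.len())
            .finish()
    }
}

impl Drop for SketchSecret {
    fn drop(&mut self) {
        for byte in self.0.iter_mut() {
            // SAFETY: `byte` is a valid exclusive reference to an
            // initialized element of the vector.
            unsafe { std::ptr::write_volatile(byte, 0) };
        }
    }
}
```

Only the initialized elements are wiped here; spare capacity and earlier reallocations are untouched, which is the same limitation noted for `SecretBuffer`.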

With the default `alloc` feature, vector and string helpers are available:

```rust
use base64_ng::STANDARD;

let encoded = STANDARD.encode_vec(b"hello").unwrap();
assert_eq!(encoded, b"aGVsbG8=");

let encoded_string = STANDARD.encode_string(b"hello").unwrap();
assert_eq!(encoded_string, "aGVsbG8=");

let decoded = STANDARD.decode_vec(&encoded).unwrap();
assert_eq!(decoded, b"hello");
```

With the `stream` feature, `std::io` encoders and decoders are available:


```rust
use std::io::{Read, Write};
use base64_ng::{STANDARD, stream::{Decoder, DecoderReader, Encoder, EncoderReader}};

let mut encoder = Encoder::new(Vec::new(), STANDARD);
encoder.write_all(b"he").unwrap();
encoder.write_all(b"llo").unwrap();
let encoded = encoder.finish().unwrap();
assert_eq!(encoded, b"aGVsbG8=");

let mut reader = EncoderReader::new(&b"hello"[..], STANDARD);
let mut encoded = String::new();
reader.read_to_string(&mut encoded).unwrap();
assert_eq!(encoded, "aGVsbG8=");

let mut decoder = Decoder::new(Vec::new(), STANDARD);
decoder.write_all(b"aGVs").unwrap();
decoder.write_all(b"bG8=").unwrap();
let decoded = decoder.finish().unwrap();
assert_eq!(decoded, b"hello");

let mut reader = DecoderReader::new(&b"aGVsbG8="[..], STANDARD);
let mut decoded = Vec::new();
reader.read_to_end(&mut decoded).unwrap();
assert_eq!(decoded, b"hello");
```

URL-safe, no-padding encoding:

```rust
use base64_ng::URL_SAFE_NO_PAD;

let mut encoded = [0u8; 7];
let written = URL_SAFE_NO_PAD.encode_slice(b"hello", &mut encoded).unwrap();
assert_eq!(&encoded[..written], b"aGVsbG8");
```

## Security Model

`base64-ng` treats Base64 as infrastructure code. Fast paths are never allowed to outrun evidence.

Security commitments:

- Stable Rust first. Current toolchain pin: Rust `1.95.0`.
- `no_std` core by default.
- Scalar encode/decode remains safe Rust.
- One audited unsafe helper in `src/lib.rs` performs volatile best-effort
  wiping so cleanup writes are not optimized away.
- Future unsafe SIMD remains isolated under `src/simd.rs`.
- Local checks verify that `allow(unsafe_code)` is confined to the volatile
  wipe helper and SIMD boundary, every unsafe function is inventoried, and
  every unsafe block has a nearby `SAFETY:` explanation. Architecture intrinsics,
  CPU feature detection, and target-feature gates are checked against the same
  boundary.
- [docs/UNSAFE.md](docs/UNSAFE.md) inventories every current unsafe site and
  its safety invariants.
- [docs/ASYNC.md](docs/ASYNC.md) defines the admission bar for any future
  async/Tokio API while the `tokio` feature remains inert.
- [docs/DEPENDENCIES.md](docs/DEPENDENCIES.md) defines the dependency
  admission bar for any future external crate.
- `runtime::backend_report()` exposes the active backend, detected candidate,
  SIMD feature status, and scalar-only security posture for audit logging.
- `runtime::require_backend_policy()` lets deployments assert scalar execution,
  disabled SIMD features, or no detected SIMD candidate.
- `BackendPolicy::HighAssuranceScalarOnly` combines the scalar/no-SIMD
  deployment checks into one assertion.
- Runtime backend, posture, and policy enums expose stable string identifiers
  for CI artifacts, audit logs, and deployment evidence.
- Runtime backend reports and policy failures use stable key/value display
  output for log ingestion.
- Strict decoding rejects malformed padding and trailing data.
- Runtime scalar APIs are expected to return `Result` or `Option` for malformed
  input and size errors instead of panicking.
- Public encoded-length overflow is recoverable through `Result` or `Option`;
  untrusted length metadata should never require a panic.
- Scalar encode avoids input-derived alphabet table indexes, and scalar decode
  uses branch-minimized arithmetic. A separate `ct` module provides a
  constant-time-oriented scalar validation and decode path for callers that
  need a narrower timing target. Its malformed-input errors are intentionally
  non-localized, clear-tail variants clear caller-owned buffers on error, and
  it is not documented as a formally verified cryptographic constant-time API.
- Clear-tail encode/decode variants are available for callers that want
  best-effort cleanup of unused caller-owned buffers without adding a runtime
  dependency.
- Streaming wrappers clear internal pending and queued byte buffers on drop and
  as buffered bytes are consumed, as best-effort retention reduction.
- Legacy compatibility must be opt-in.
- Release gates include formatting, clippy, tests, Miri when installed, docs, dependency policy, audit, license review, isolated fuzz/perf dependency checks, SBOM, and reproducible build checks.
- Future Kani proofs target in-place decoding bounds and scalar decoder invariants.
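
To make "avoids secret-indexed lookup tables" concrete, here is a sketch of mapping a standard-alphabet byte to its 6-bit value with range masks instead of a table. The names `sketch_in_range_mask` and `sketch_ct_symbol_value` are hypothetical, this is not the `ct` module's code, and the final conversion to `Option` is an ordinary branch that a real constant-time path would replace with an accumulated error mask.

```rust
// Hypothetical helpers, not part of base64-ng.

// Returns -1 (all bits set) when `lo <= b <= hi`, otherwise 0. Both
// differences are negative only inside the range, so only then does the
// arithmetic shift produce all ones.
fn sketch_in_range_mask(b: i16, lo: u8, hi: u8) -> i16 {
    ((lo as i16 - 1 - b) & (b - hi as i16 - 1)) >> 8
}

pub fn sketch_ct_symbol_value(byte: u8) -> Option<u8> {
    let b = byte as i16;
    let mut value: i16 = -1; // stays -1 if no range matches
    // Each term adds (symbol value + 1) when its range matches, else 0.
    value += sketch_in_range_mask(b, b'A', b'Z') & (b - b'A' as i16 + 1);
    value += sketch_in_range_mask(b, b'a', b'z') & (b - b'a' as i16 + 27);
    value += sketch_in_range_mask(b, b'0', b'9') & (b - b'0' as i16 + 53);
    value += sketch_in_range_mask(b, b'+', b'+') & 63;
    value += sketch_in_range_mask(b, b'/', b'/') & 64;
    // Simplification for the sketch: this check branches on the result.
    if value < 0 {
        None
    } else {
        Some(value as u8)
    }
}
```

All five range tests run for every input byte, so the work done does not depend on which symbol was supplied; confirming that property in compiled code still requires the timing and assembly evidence described elsewhere in this document.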

See [docs/PLAN.md](docs/PLAN.md), [SECURITY.md](SECURITY.md), and
[docs/RELEASE_EVIDENCE.md](docs/RELEASE_EVIDENCE.md). Topic-specific documents:

- Unsafe hardware acceleration gate: [docs/SIMD.md](docs/SIMD.md).
- Trust dashboard and CWE/security-control mapping:
  [docs/TRUST.md](docs/TRUST.md) and
  [docs/SECURITY_CONTROLS.md](docs/SECURITY_CONTROLS.md).
- Panic-free public API policy: [docs/PANIC_POLICY.md](docs/PANIC_POLICY.md).
- Constant-time-oriented decode verification requirements:
  [docs/CONSTANT_TIME.md](docs/CONSTANT_TIME.md).
- Dependency admission rules: [docs/DEPENDENCIES.md](docs/DEPENDENCIES.md).
- Adoption guidance from the established `base64` crate:
  [docs/MIGRATION.md](docs/MIGRATION.md).
- Performance evidence guidance: [docs/BENCHMARKS.md](docs/BENCHMARKS.md).
- Fuzz target and corpus policy: [docs/FUZZING.md](docs/FUZZING.md).

## Local Checks

Run the standard gate:

```sh
scripts/checks.sh
```

Check the zero-external-crate policy directly:

```sh
scripts/validate-dependencies.sh
```

Check reserved feature placeholders directly:

```sh
scripts/check_reserved_features.sh
```

Run the release gate:

```sh
scripts/stable_release_gate.sh
```

Install cross-compilation targets used by the local and CI target checks:

```sh
rustup target add aarch64-unknown-linux-gnu x86_64-unknown-freebsd wasm32-unknown-unknown thumbv7em-none-eabihf
```

Required security tools:

```sh
cargo install --locked cargo-audit
cargo install --locked cargo-license
cargo install --locked cargo-deny
cargo install --locked cargo-sbom --version 0.10.0
```

Optional deep tools:

```sh
cargo install --locked cargo-nextest
cargo install --locked cargo-fuzz
cargo install --locked kani-verifier
```

Verify optional tool installation:

```sh
cargo nextest --version
cargo fuzz --version
cargo kani --version
```

Compile fuzz targets without running a campaign:

```sh
scripts/check_fuzz.sh
```

Validate the committed fuzz corpus policy directly:

```sh
scripts/check_fuzz_corpus.sh
```

Compile and audit the isolated performance harness:

```sh
scripts/check_perf.sh
```

Run the scalar comparison benchmark:

```sh
cargo run --release --manifest-path perf/Cargo.toml
```

Run individual fuzz targets with `cargo-fuzz` on the nightly toolchain:

```sh
cargo +nightly fuzz run decode
cargo +nightly fuzz run in_place
cargo +nightly fuzz run stream_chunks
cargo +nightly fuzz run differential
```

Miri is installed as a nightly Rust component, not as a Cargo package:

```sh
rustup toolchain install nightly --component miri
cargo +nightly miri setup
scripts/check_miri.sh
```

Kani may need a one-time setup after installation:

```sh
cargo kani setup
```

On openSUSE Tumbleweed, install `rustup` first if it is not already present:

```sh
sudo zypper install rustup
```

The local release gate runs Miri automatically when `rustup run nightly cargo
miri` is available. `scripts/check_miri.sh` covers no-default-features scalar
APIs and all-features alloc/stream APIs. The large deterministic sweep tests are
ignored only under Miri because they are already covered by the normal release
gate and are too slow for an interpreter.

## Project Principles

- Keep external crates to the absolute minimum. The current dependency graph contains only `base64-ng` itself.
- Correctness first, speed second, unsafe last.
- The scalar implementation is the reference behavior.
- SIMD must prove equivalence to scalar behavior across fuzzed and deterministic inputs.
- Constant-time claims require empirical timing evidence, generated-code
  review, and explicit documented exclusions.
- Compatibility modes must be visible in the type/API surface.
- Release evidence belongs in the repository and CI, not in memory.

## Contributing And Releases

See [CONTRIBUTING.md](CONTRIBUTING.md) for contribution rules and [docs/RELEASE.md](docs/RELEASE.md) for the maintainer release checklist.