utf8-zero 0.8.1

# utf8-zero

Zero-copy, incremental UTF-8 decoding with error handling.

Unlike `std::str::from_utf8()`, which requires the entire input up front, this crate is designed
for streaming: bytes can arrive in arbitrary chunks (from a network socket, file reader, etc.)
and the decoder correctly handles multi-byte code points split across chunk boundaries.
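To make the chunk-boundary problem concrete, here is a minimal sketch using only the standard library (not this crate). It shows how `std::str::from_utf8()` reports a chunk that ends mid code point differently from a chunk that contains a truly invalid byte. The helper name `classify` is hypothetical and exists only for this illustration.

```rust
use std::str;

/// Hypothetical helper: on failure, returns the length of the valid prefix and
/// the error length (`Some(n)` for an invalid sequence of `n` bytes, `None`
/// when the input simply ended in the middle of a code point).
fn classify(chunk: &[u8]) -> Result<&str, (usize, Option<usize>)> {
    str::from_utf8(chunk).map_err(|e| (e.valid_up_to(), e.error_len()))
}

fn main() {
    // "héllo", where 'é' is the two bytes 0xC3 0xA9, split between two chunks.
    let first = b"h\xC3";
    let second = b"\xA9llo";

    // The first chunk ends mid code point: one valid byte, no error length.
    assert_eq!(classify(first), Err((1, None)));
    // Taken alone, the second chunk starts with a stray continuation byte.
    assert_eq!(classify(second), Err((0, Some(1))));

    // Joined, the same bytes are valid UTF-8.
    let joined = [&first[..], &second[..]].concat();
    assert_eq!(classify(&joined), Ok("héllo"));
}
```

Decoding each chunk independently would therefore reject or corrupt valid input, which is why a streaming decoder has to carry the unfinished bytes over to the next chunk.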

The crate provides three levels of API:

* **`utf8::decode()`** — low-level, single-shot decode of a byte slice. Returns the valid
  prefix and either an invalid sequence or an incomplete suffix that can be completed with
  more input.
* **`LossyDecoder`** — a push-based streaming decoder. Feed it chunks of bytes and it calls
  back with `&str` slices, replacing errors with U+FFFD.
* **`BufReadDecoder`** — a pull-based streaming decoder wrapping any `BufRead`, with both
  strict and lossy modes.
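As a rough illustration of the push-based idea behind `LossyDecoder`, the following sketch builds a small lossy streaming decoder on top of the standard library only. It is not this crate's implementation or API: `SketchLossyDecoder`, `feed`, `finish`, and `decode_chunks` are hypothetical names, and the sketch copies bytes into a buffer instead of being zero-copy.

```rust
use std::str;

/// Hypothetical std-only stand-in for a push-based lossy decoder.
struct SketchLossyDecoder {
    /// Bytes of a code point that was cut off at the end of the previous chunk.
    pending: Vec<u8>,
    /// Decoded output so far.
    out: String,
}

impl SketchLossyDecoder {
    fn new() -> Self {
        Self { pending: Vec::new(), out: String::new() }
    }

    /// Decodes one chunk, carrying an unfinished code point over to the next call.
    fn feed(&mut self, chunk: &[u8]) {
        // Prepend the carried-over bytes (this copies, unlike a zero-copy decoder).
        let mut buf = std::mem::take(&mut self.pending);
        buf.extend_from_slice(chunk);
        let mut rest: &[u8] = &buf;
        loop {
            match str::from_utf8(rest) {
                Ok(s) => {
                    self.out.push_str(s);
                    break;
                }
                Err(e) => {
                    let (valid, after) = rest.split_at(e.valid_up_to());
                    self.out.push_str(str::from_utf8(valid).expect("prefix is valid"));
                    match e.error_len() {
                        // Invalid sequence: replace it with U+FFFD and continue.
                        Some(n) => {
                            self.out.push('\u{FFFD}');
                            rest = &after[n..];
                        }
                        // Input ended mid code point: keep the bytes for later.
                        None => {
                            self.pending = after.to_vec();
                            break;
                        }
                    }
                }
            }
        }
    }

    /// Ends the stream; a code point that never completed becomes U+FFFD.
    fn finish(mut self) -> String {
        if !self.pending.is_empty() {
            self.out.push('\u{FFFD}');
        }
        self.out
    }
}

/// Feeds every chunk in order and returns the decoded text.
fn decode_chunks(chunks: &[&[u8]]) -> String {
    let mut decoder = SketchLossyDecoder::new();
    for chunk in chunks {
        decoder.feed(chunk);
    }
    decoder.finish()
}

fn main() {
    // 'é' (0xC3 0xA9) is split across the two chunks but decodes correctly.
    assert_eq!(decode_chunks(&[&b"h\xC3"[..], &b"\xA9llo"[..]]), "héllo");
    // An invalid byte is replaced with U+FFFD.
    assert_eq!(decode_chunks(&[&b"Hello\xC0World"[..]]), "Hello\u{FFFD}World");
}
```

For a complete stream, the result should agree with `String::from_utf8_lossy()` applied to the concatenated input, regardless of where the chunk boundaries fall.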

## Example

```rust
use utf8::{decode, DecodeError};

let bytes = b"Hello\xC0World";
match decode(bytes) {
    Ok(s) => println!("valid: {s}"),
    Err(DecodeError::Invalid { valid_prefix, invalid_sequence, remaining_input }) => {
        // valid_prefix = "Hello", invalid_sequence = [0xC0], remaining_input = b"World"
        println!("got {:?} before error", valid_prefix);
    }
    Err(DecodeError::Incomplete { valid_prefix, incomplete_suffix }) => {
        // Input ended mid-codepoint — feed more bytes via incomplete_suffix.try_complete()
        println!("need more input after {:?}", valid_prefix);
    }
}
```

## History

* Originally written by [Simon Sapin](https://github.com/SimonSapin) as
  [SimonSapin/rust-utf8](https://github.com/SimonSapin/rust-utf8), published
  as the [`utf-8`](https://crates.io/crates/utf-8) crate.
* The upstream repo was [archived](https://github.com/SimonSapin/rust-utf8/commit/218fea2b57b0e4c3de9fa17a376fcc4a4c0d08f3)
  and is no longer maintained.
* Used by [ureq](https://github.com/algesten/ureq) among others.
  Simon Sapin [suggested](https://github.com/servo/servo/issues/42853#issuecomment-3971787017)
  inlining the code into crates that need it rather than republishing.
* Forked here as a standalone repo (not a GitHub fork) to allow continued maintenance.
* Added fuzz testing.
* Modernized code: set Rust edition to 2021, ran `cargo fmt`, fixed lifetime syntax and clippy warnings.
* Added GitHub Actions CI (lint, clippy, tests, Miri on every push/PR; nightly fuzzing).
* Removed defunct bench setup (missing shared modules from upstream).
* Added `#![deny(missing_docs)]` and documented all public items.
* Added `no_std` support for all but `BufReadDecoder`.

## Fuzzing

Fuzz tests use [`cargo-fuzz`](https://github.com/rust-fuzz/cargo-fuzz) (libFuzzer). Three targets cover the main API surface:

* **`fuzz_decode`**: `utf8::decode()`, validated against `std::str::from_utf8()`
* **`fuzz_lossy_decoder`**: `LossyDecoder` with random chunk splits, validated against `String::from_utf8_lossy()`
* **`fuzz_bufread_decoder`**: `BufReadDecoder::read_to_string_lossy()`, validated against `String::from_utf8_lossy()`

To run locally:

```sh
cargo install cargo-fuzz
cargo +nightly fuzz run fuzz_decode
cargo +nightly fuzz run fuzz_lossy_decoder
cargo +nightly fuzz run fuzz_bufread_decoder
```

A GitHub Actions workflow runs all targets nightly.

## Miri

[Miri](https://github.com/rust-lang/miri) runs on every push/PR to validate the `unsafe` code
(three `str::from_utf8_unchecked()` calls). The test suite uses exhaustive input partitioning,
which is exponential, so inputs longer than 10 bytes are skipped under Miri to keep CI fast.

```sh
cargo +nightly miri test
```
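To show why exhaustive partitioning grows exponentially, here is a small std-only sketch. It assumes that "exhaustive input partitioning" means trying every split of the input into contiguous, non-empty chunks; the actual test helper in this repository may differ. The names `for_each_partition` and `count_partitions` are hypothetical.

```rust
/// Calls `f` with every way to split `input` into contiguous, non-empty chunks.
/// An input of n bytes has n - 1 gaps, and each gap is either cut or not,
/// which gives 2^(n-1) partitions.
fn for_each_partition(input: &[u8], mut f: impl FnMut(&[&[u8]])) {
    let n = input.len();
    if n == 0 {
        // The empty input has exactly one partition: no chunks at all.
        f(&[]);
        return;
    }
    assert!(n <= 20, "too many partitions to enumerate");
    for mask in 0usize..(1usize << (n - 1)) {
        let mut chunks: Vec<&[u8]> = Vec::new();
        let mut start = 0;
        // Bit `gap` of `mask` decides whether to cut after byte `gap`.
        for gap in 0..n - 1 {
            if mask & (1 << gap) != 0 {
                chunks.push(&input[start..=gap]);
                start = gap + 1;
            }
        }
        chunks.push(&input[start..]);
        f(&chunks);
    }
}

/// Counts the partitions and checks that each one reassembles to the input.
fn count_partitions(input: &[u8]) -> usize {
    let mut count = 0;
    for_each_partition(input, |chunks: &[&[u8]]| {
        assert_eq!(chunks.concat(), input);
        count += 1;
    });
    count
}

fn main() {
    // 3 bytes: "abc", "a|bc", "ab|c", "a|b|c".
    assert_eq!(count_partitions(b"abc"), 4);
    // 10 bytes already yield 2^9 = 512 partitions.
    assert_eq!(count_partitions(&[0u8; 10]), 512);
}
```

Each additional input byte doubles the number of partitions, so a length cap keeps an interpreter as slow as Miri within a reasonable CI budget.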

## License

MIT OR Apache-2.0