blake3_guts 0.0.0

low-level building blocks for the BLAKE3 hash function
# The BLAKE3 Guts API

## Introduction

This [`blake3_guts`](https://crates.io/crates/blake3_guts) sub-crate contains
low-level, high-performance, platform-specific implementations of the BLAKE3
compression function. This API is complicated and unsafe, and this crate will
never have a stable release. Most callers should instead use the
[`blake3`](https://crates.io/crates/blake3) crate, which will eventually depend
on this one internally.

The code you see here (as of January 2024) is an early stage of a large planned
refactor. The motivation for this refactor is a couple of missing features in
both the Rust and C implementations:

- The output side
  ([`OutputReader`](https://docs.rs/blake3/latest/blake3/struct.OutputReader.html)
  in Rust) doesn't take advantage of the most important SIMD optimizations that
  compute multiple blocks in parallel. This blocks any project that wants to
  use the BLAKE3 XOF as a stream cipher
  ([[1]](https://github.com/oconnor663/bessie),
  [[2]](https://github.com/oconnor663/blake3_aead)).
- Low-level callers like [Bao](https://github.com/oconnor663/bao) that need
  interior nodes of the tree also don't get those SIMD optimizations. They have
  to use a slow, minimalistic, unstable, doc-hidden module [(also called
  `guts`)](https://github.com/BLAKE3-team/BLAKE3/blob/master/src/guts.rs).

The difficulty with adding those features is that they require changes to all
of our optimized assembly and C intrinsics code. That's a couple dozen
different files that are large, platform-specific, difficult to understand, and
full of duplicated code. The higher-level Rust and C implementations of BLAKE3
both depend on these files and will need to coordinate changes.

At the same time, it won't be long before we add support for more platforms:

- RISC-V vector extensions
- ARM SVE
- WebAssembly SIMD

It's important to get this refactor done before new platforms make it even
harder to do.

## The private guts API

This is the API that each platform reimplements, so we want it to be as simple
as possible apart from the high-performance work it needs to do. It's
completely `unsafe`, and inputs and outputs are raw pointers that are allowed
to alias (this matters for `hash_parents`, see below).

- `degree`
- `compress`
    - The single compression function, for short inputs and odd-length tails.
- `hash_chunks`
- `hash_parents`
- `xof`
- `xof_xor`
    - As `xof` but XOR'ing the result into the output buffer.
- `universal_hash`
    - This is a new construction specifically to support
      [BLAKE3-AEAD](https://github.com/oconnor663/blake3_aead). Some
      implementations might just stub it out with portable code.
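
The relationship between `xof` and `xof_xor` can be illustrated with a toy
sketch. Everything below is hypothetical: `toy_xof` and `toy_xof_xor` are
stand-ins driven by a simple linear congruential generator, not BLAKE3 and not
this crate's actual signatures. The only point is that the XOR variant combines
the output stream with whatever the buffer already holds, which is what a
stream cipher needs.

```rust
/// One step of a toy generator. NOT cryptographic and NOT BLAKE3; it only
/// gives us a deterministic byte stream to demonstrate with.
fn toy_step(state: &mut u64) -> u8 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (*state >> 56) as u8
}

/// Hypothetical stand-in for `xof`: overwrite `out` with output bytes.
fn toy_xof(seed: u64, out: &mut [u8]) {
    let mut state = seed;
    for byte in out.iter_mut() {
        *byte = toy_step(&mut state);
    }
}

/// Hypothetical stand-in for `xof_xor`: XOR the same bytes into `out`.
fn toy_xof_xor(seed: u64, out: &mut [u8]) {
    let mut state = seed;
    for byte in out.iter_mut() {
        *byte ^= toy_step(&mut state);
    }
}

fn main() {
    // XOR'ing into an all-zero buffer gives the plain output stream.
    let mut stream = [0u8; 32];
    toy_xof(42, &mut stream);
    let mut xored = [0u8; 32];
    toy_xof_xor(42, &mut xored);
    assert_eq!(stream, xored);

    // XOR'ing the same stream twice restores the original bytes, so the
    // same call both encrypts and decrypts.
    let message = *b"attack at dawn";
    let mut buf = message;
    toy_xof_xor(42, &mut buf);
    toy_xof_xor(42, &mut buf);
    assert_eq!(buf, message);
}
```

A caller could get the same result by calling `xof` into a scratch buffer and
XOR'ing by hand; a fused variant avoids that extra buffer and extra pass.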

## The public guts API

This is the API that this crate exposes to callers, i.e. to the main `blake3`
crate. It's a thin, portable layer on top of the private API above. The Rust
version of this API is memory-safe.

- `degree`
- `compress`
- `hash_chunks`
- `hash_parents`
    - This handles most levels of the tree, where we keep hashing `SIMD_DEGREE`
      parents at a time.
- `reduce_parents`
    - This uses the same `hash_parents` private API, but it handles the top
      levels of the tree where we reduce in-place to the root parent node.
- `xof`
- `xof_xor`
- `universal_hash`
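
To make the split between `hash_parents` and `reduce_parents` more concrete,
here is a minimal sketch of one raw-pointer routine serving both roles. All
names and signatures are hypothetical, `toy_parent` is an arbitrary mixing
function rather than the BLAKE3 compression function, and carrying an odd
trailing value up a level unchanged is an assumption of this sketch. It shows
why the private layer wants pointers that may alias (safe Rust cannot pass the
same buffer as both `&[T]` and `&mut [T]`), and how a thin safe layer can wrap
it.

```rust
/// Toy stand-in for hashing two child values into a parent. NOT BLAKE3.
fn toy_parent(left: u64, right: u64) -> u64 {
    left.rotate_left(7) ^ right.wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// Hypothetical private-style routine: hash one level of the tree.
/// Returns the number of values written to `output`.
///
/// # Safety
/// `input` must be valid for reading `num_cvs` values and `output` for
/// writing `(num_cvs + 1) / 2` values. They may point to the same buffer.
unsafe fn toy_hash_parents(input: *const u64, num_cvs: usize, output: *mut u64) -> usize {
    let num_parents = num_cvs / 2;
    for i in 0..num_parents {
        // SAFETY: indices are in bounds per the contract above. Both children
        // are read before slot i is written, and later reads only touch slots
        // >= 2i + 2, so aliasing buffers never lose an unread child.
        unsafe {
            let left = input.add(2 * i).read();
            let right = input.add(2 * i + 1).read();
            output.add(i).write(toy_parent(left, right));
        }
    }
    if num_cvs % 2 == 1 {
        // Assumption of this sketch: an odd trailing value moves up unchanged.
        // SAFETY: slot num_cvs - 1 is readable, slot num_parents is writable.
        unsafe { output.add(num_parents).write(input.add(num_cvs - 1).read()) };
        num_parents + 1
    } else {
        num_parents
    }
}

/// Safe wrapper in the spirit of `hash_parents`: one level, separate buffers.
fn hash_parents_level(children: &[u64], parents: &mut [u64]) -> usize {
    assert!(parents.len() >= (children.len() + 1) / 2);
    // SAFETY: both slices are valid and the length check covers the output.
    unsafe { toy_hash_parents(children.as_ptr(), children.len(), parents.as_mut_ptr()) }
}

/// Safe wrapper in the spirit of `reduce_parents`: reduce in place to a root.
fn reduce_to_root(cvs: &mut [u64]) -> u64 {
    assert!(!cvs.is_empty());
    let mut n = cvs.len();
    let p = cvs.as_mut_ptr();
    while n > 1 {
        // SAFETY: `p` is valid for `n` reads and writes; aliasing is allowed.
        n = unsafe { toy_hash_parents(p, n, p) };
    }
    cvs[0]
}

fn main() {
    let mut cvs = [1u64, 2, 3, 4, 5];
    let expected = toy_parent(toy_parent(toy_parent(1, 2), toy_parent(3, 4)), 5);
    assert_eq!(reduce_to_root(&mut cvs), expected);
}
```

Both wrappers call the same unsafe routine. The only difference is whether the
output lands in a separate buffer or overwrites the front of the input.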