rehuman 0.1.2

Unicode-safe text cleaning & typographic normalization for Rust

rehuman

Unicode-safe text cleaning & normalization for Rust.

Strip invisible characters, normalize typography, and enforce consistent formatting for text sourced from web scraping, user input, or LLMs.

This crate is a Rust rewrite and expansion of humanize-ai-lib by Nordth.

Install

Add the Rust library crate:

[dependencies]
rehuman = "0.1.2" # replace with the latest published version

Install CLI binaries (rehuman, ishuman):

cargo install rehuman

For the latest version, clone this repo and install from the local checkout:

git clone https://github.com/pszemraj/rehuman.git
cd rehuman
cargo install --path .

Binaries will be installed to ~/.cargo/bin by default.[^1]

[^1]: You may need to add ~/.cargo/bin to your PATH if it is not already there; add export PATH="$HOME/.cargo/bin:$PATH" to your shell profile (.bashrc, .zshrc, etc.).

Quick Start

use rehuman::{clean, humanize};

let cleaned = clean("Hello\u{200B}there"); // -> "Hellothere"
let humanized = humanize("“Quote”—and…more"); // -> "\"Quote\"-and...more"

use rehuman::clean;

// Default behavior removes emoji
let cleaned = clean("Thanks 👍"); // -> "Thanks"

By default, keyboard-only mode emits ASCII-safe output: non-ASCII text is normalized or transliterated when feasible, and unmappable characters are removed. Tune this with --non-ascii-policy, --extended-keyboard, and --preserve-joiners.

For docs or source files where Unicode glyphs matter (for example, box-drawing diagrams), use the CLI with --preset code-safe (or --keyboard-only false).

For detailed semantics and option behavior, see docs/api.md and docs/cli.md.
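
To make the "transliterate when feasible, otherwise remove" policy concrete, here is a minimal standalone sketch that uses only the standard library. It is not rehuman's implementation or API: the function names ascii_replacement and to_keyboard_ascii are hypothetical, and the mapping table is a small assumed subset chosen for illustration.

```rust
// Illustrative sketch only: NOT rehuman's implementation or API.
// `ascii_replacement` and `to_keyboard_ascii` are hypothetical names.

/// Map a few common typographic characters to ASCII equivalents.
/// Returns `None` when this sketch has no mapping for the character.
fn ascii_replacement(c: char) -> Option<&'static str> {
    match c {
        '\u{201C}' | '\u{201D}' => Some("\""), // curly double quotes
        '\u{2018}' | '\u{2019}' => Some("'"),  // curly single quotes
        '\u{2013}' | '\u{2014}' => Some("-"),  // en dash, em dash
        '\u{2026}' => Some("..."),             // horizontal ellipsis
        '\u{00A0}' => Some(" "),               // no-break space
        _ => None,
    }
}

/// Produce ASCII-only output: keep ASCII as-is, transliterate mapped
/// characters, and drop everything else.
fn to_keyboard_ascii(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        if c.is_ascii() {
            out.push(c);
        } else if let Some(rep) = ascii_replacement(c) {
            out.push_str(rep);
        }
        // Unmapped non-ASCII characters (zero-width characters, emoji,
        // box-drawing glyphs, ...) fall through and are dropped.
    }
    out
}
```

With this sketch, to_keyboard_ascii("Hello\u{200B}there") returns "Hellothere", and to_keyboard_ascii("\u{201C}Quote\u{201D}\u{2014}and\u{2026}more") returns "\"Quote\"-and...more". Dropping every unmapped character is also why such a policy is a poor fit for box-drawing diagrams: those glyphs have no ASCII equivalent in the table and would simply vanish.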

Documentation

Primary docs by concern:

For CLI help at runtime: rehuman --help and ishuman --help.

License

MIT