Stylometry Analyzer
Minimal CLI tool that:
- Combines one or more .txt files, extracts user-authored text, and enforces a minimum size.
- Hash-embeds text chunks and queries a local vector DB to classify writing style.
- Prints results to stdout and writes results.json.
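To make "hash-embeds text chunks" concrete, here is a minimal sketch of one common approach: feature hashing of character trigrams into a fixed-width, L2-normalized vector. This is an illustration only and may not match what main.rs does. The width `DIM`, the trigram choice, the FNV-1a hash, and the function names `fnv1a`, `hash_embed`, and `cosine` are all assumptions.

```rust
/// Hypothetical embedding width; the tool's real dimension is not documented here.
const DIM: usize = 256;

/// FNV-1a 64-bit hash, used in this sketch because it gives the same value on every run.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Count each lowercase character trigram in one of DIM buckets, then L2-normalize.
fn hash_embed(text: &str) -> Vec<f32> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    let mut v = vec![0.0f32; DIM];
    for w in chars.windows(3) {
        let gram: String = w.iter().collect();
        let idx = (fnv1a(gram.as_bytes()) % DIM as u64) as usize;
        v[idx] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}

/// Dot product; equals cosine similarity when both inputs are unit-length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = hash_embed("the quick brown fox");
    let b = hash_embed("the quick brown dog");
    println!("dim = {}, cosine = {:.3}", a.len(), cosine(&a, &b));
}
```

Because the vectors are unit-length, a vector DB that ranks by dot product or cosine similarity would order them the same way.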
Prerequisites
- Rust toolchain (stable)
- yvdb running locally at http://127.0.0.1:8080
Usage
cargo run -- --file path\to\file1.txt path\to\file2.txt
Spaces in paths are fine if quoted.
Notes
- The stylometry analyzer requires at least 70 KB of text. Ideally I'd like to raise this to 750 KB.
- Multiple text files can be used. main.rs applies minimal delimiters to slice out the text so that only the user's own voice remains, which reduces noise.
- On first run, the tool seeds reference embeddings into yvdb; later runs skip seeding.
- Progress prints show the current step and pre-seed status.
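The minimum-size rule from the notes above can be sketched as follows. This is an illustration under assumptions, not the code in main.rs: it reads "70 KB" as 70 * 1024 bytes, joins files with a newline, and uses the hypothetical names `MIN_BYTES` and `combine_and_check`.

```rust
/// Assumed threshold: "70 KB" read as 70 * 1024 bytes.
const MIN_BYTES: usize = 70 * 1024;

/// Join the per-file texts with a newline and reject input smaller than `min_bytes`.
fn combine_and_check(parts: &[String], min_bytes: usize) -> Result<String, String> {
    let combined = parts.join("\n");
    if combined.len() < min_bytes {
        return Err(format!(
            "need at least {} bytes of text, got {}",
            min_bytes,
            combined.len()
        ));
    }
    Ok(combined)
}

fn main() {
    // Two 40 KiB inputs stand in for the contents of two .txt files.
    let parts = vec!["a".repeat(40 * 1024), "b".repeat(40 * 1024)];
    match combine_and_check(&parts, MIN_BYTES) {
        Ok(text) => println!("accepted {} bytes", text.len()),
        Err(e) => eprintln!("rejected: {e}"),
    }
}
```

Passing the threshold as a parameter keeps a later move to 750 KB a one-line change.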