sketchir 0.1.2

Sketching primitives for information retrieval (IR): MinHash, SimHash, and LSH-style signatures.
sketchir: sketching primitives for IR.

This crate is intended for index-only similarity sketches used in:

  • near-duplicate detection (MinHash / shingles)
  • text fingerprinting (SimHash)
  • approximate similarity search (LSH-style candidate generation)
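To make the first use case concrete, here is a minimal MinHash sketch over character shingles. It is illustrative only and is not sketchir's API: the names `seeded_hash`, `shingles`, `minhash`, and `estimate_jaccard` are hypothetical, and the seeded FNV-1a hash with a mixing step is an assumption chosen so that signatures are deterministic across runs and platforms.

```rust
// Illustrative sketch only; none of these names come from sketchir.

/// FNV-1a over the bytes, with the seed folded into the offset basis,
/// followed by a splitmix64-style finalizer to spread the bits.
/// Fully deterministic: no random state, no per-process keys.
fn seeded_hash(bytes: &[u8], seed: u64) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325 ^ seed;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h ^= h >> 30;
    h = h.wrapping_mul(0xbf58476d1ce4e5b9);
    h ^= h >> 27;
    h = h.wrapping_mul(0x94d049bb133111eb);
    h ^ (h >> 31)
}

/// Character k-shingles of `text`. Inputs shorter than `k` yield the
/// whole text as a single shingle.
fn shingles(text: &str, k: usize) -> Vec<String> {
    assert!(k > 0, "shingle size must be positive");
    let chars: Vec<char> = text.chars().collect();
    if chars.len() < k {
        return vec![text.to_string()];
    }
    chars.windows(k).map(|w| w.iter().collect()).collect()
}

/// MinHash signature: for each of `num_hashes` seeded hash functions,
/// keep the minimum hash value over all shingles.
fn minhash(text: &str, k: usize, num_hashes: usize) -> Vec<u64> {
    let sh = shingles(text, k);
    (0..num_hashes as u64)
        .map(|seed| {
            sh.iter()
                .map(|s| seeded_hash(s.as_bytes(), seed))
                .min()
                .unwrap_or(u64::MAX)
        })
        .collect()
}

/// Fraction of signature positions that agree, which estimates the
/// Jaccard similarity of the two shingle sets.
fn estimate_jaccard(a: &[u64], b: &[u64]) -> f64 {
    assert_eq!(a.len(), b.len(), "signatures must have equal length");
    if a.is_empty() {
        return 0.0;
    }
    let matches = a.iter().zip(b).filter(|(x, y)| x == y).count();
    matches as f64 / a.len() as f64
}
```

Calling `minhash(text, 3, 64)` on two documents and passing the results to `estimate_jaccard` gives a rough similarity score. More hash functions reduce the variance of the estimate at the cost of a longer signature.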

The scope is limited to primitives: signatures, basic indexing, and deterministic behavior. Higher-level workflows (crawl deduplication pipelines, content extraction, and so on) belong elsewhere.
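As a rough picture of what "basic indexing" over signatures can look like, the sketch below implements LSH-style banding: a signature is split into fixed-size bands, and two documents become candidates when at least one band matches exactly. This is an assumption about the general technique, not a description of sketchir's index; `BandIndex` and its methods are hypothetical names.

```rust
use std::collections::HashMap;

// Illustrative sketch only; `BandIndex` is not a sketchir type.

/// Maps (band number, band contents) to the ids of documents whose
/// signature contains exactly that band.
struct BandIndex {
    rows_per_band: usize,
    buckets: HashMap<(usize, Vec<u64>), Vec<u32>>,
}

impl BandIndex {
    fn new(rows_per_band: usize) -> Self {
        assert!(rows_per_band > 0, "rows_per_band must be positive");
        BandIndex { rows_per_band, buckets: HashMap::new() }
    }

    /// Split a signature into bands. `chunks_exact` drops any trailing
    /// values that do not fill a complete band.
    fn band_keys(&self, signature: &[u64]) -> Vec<(usize, Vec<u64>)> {
        signature
            .chunks_exact(self.rows_per_band)
            .enumerate()
            .map(|(i, band)| (i, band.to_vec()))
            .collect()
    }

    /// Register a document id under every band of its signature.
    fn insert(&mut self, id: u32, signature: &[u64]) {
        for key in self.band_keys(signature) {
            self.buckets.entry(key).or_default().push(id);
        }
    }

    /// Ids sharing at least one band with `signature`, sorted and
    /// deduplicated so the result order is deterministic.
    fn candidates(&self, signature: &[u64]) -> Vec<u32> {
        let mut out: Vec<u32> = self
            .band_keys(signature)
            .iter()
            .filter_map(|key| self.buckets.get(key))
            .flatten()
            .copied()
            .collect();
        out.sort_unstable();
        out.dedup();
        out
    }
}
```

Fewer rows per band make candidate generation more permissive (more candidates, more false positives), while more rows per band make it stricter. Candidates would typically be re-checked against the full signatures before being reported as matches.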