hermes-tool 0.1.0

Supplementary tools for Hermes - simhash, sorting, and data processing

Hermes Tool - Supplementary tools for data processing

Overview

This package provides command-line tools for preprocessing and debugging data before indexing with Hermes. Tools operate on JSON Lines (JSONL) streams.
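For readers unfamiliar with the format, a JSONL stream holds one JSON object per line. The Python sketch below shows the general shape of a filter over such a stream. It is an illustration only, not how hermes-tool is implemented; the function name `transform_jsonl` and the skipping of blank lines are assumptions made for the example.

```python
import json
from typing import Callable, Iterable, Iterator


def transform_jsonl(lines: Iterable[str], fn: Callable[[dict], dict]) -> Iterator[str]:
    """Apply fn to each JSON object in a JSONL stream, yielding one output line per object."""
    for line in lines:
        line = line.strip()
        if not line:  # assumption: blank lines are skipped rather than treated as errors
            continue
        yield json.dumps(fn(json.loads(line)))


# Hypothetical sample input: two objects separated by a blank line.
sample = ['{"title": "a"}', '', '{"title": "b"}']
result = list(transform_jsonl(sample, lambda doc: {**doc, "seen": True}))
```

Because each line is processed independently, a filter of this shape can run over a stream without loading the whole input into memory.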

Commands

  • simhash - Calculate SimHash for a text field and add it to each JSON object
  • sort - Sort JSON objects by a specified field
  • term-stats - Compute term statistics for WAND optimization
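The source does not say which statistics `term-stats` emits. WAND-style query processing generally relies on per-term upper bounds, and document frequency and maximum term frequency are typical inputs to such bounds. The Python sketch below computes those two values for one text field. The function name `term_stats`, the whitespace tokenizer, and the choice of statistics are all assumptions, not the tool's actual behavior.

```python
import json
from collections import Counter


def term_stats(lines, field):
    """Return {term: (document_frequency, max_term_frequency)} for one text field."""
    doc_freq = Counter()
    max_tf = {}
    for line in lines:
        if not line.strip():
            continue
        doc = json.loads(line)
        # Assumption: lowercase whitespace tokenization; the real tool may differ.
        counts = Counter(str(doc.get(field, "")).lower().split())
        for term, tf in counts.items():
            doc_freq[term] += 1  # one more document contains this term
            max_tf[term] = max(max_tf.get(term, 0), tf)
    return {term: (doc_freq[term], max_tf[term]) for term in doc_freq}


# Hypothetical sample input.
stats = term_stats(['{"title": "red red blue"}', '{"title": "blue"}'], "title")
```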

Examples

Calculate SimHash for the title field

cat docs.jsonl | hermes-tool simhash --field title --output title_hash
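To make the effect of this command concrete, here is a minimal Python sketch of the general SimHash technique: hash each token, tally a vote per bit position, and set the bits with a positive tally. The 64-bit width, the lowercase whitespace tokenizer, the BLAKE2b token hash, and the names `simhash64`, `add_simhash`, and `hamming` are assumptions chosen for illustration; hermes-tool's own tokenization, hash function, and output encoding may differ.

```python
import hashlib
import json


def simhash64(text: str) -> int:
    """64-bit SimHash over lowercase whitespace tokens (tokenizer and hash are assumptions)."""
    weights = [0] * 64
    for token in text.lower().split():
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for bit in range(64):
            # Each token votes +1 for bits set in its hash, -1 for bits that are clear.
            weights[bit] += 1 if (h >> bit) & 1 else -1
    fingerprint = 0
    for bit, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << bit
    return fingerprint


def add_simhash(line: str, field: str, output: str) -> str:
    """Parse one JSONL line, store the SimHash of `field` under `output`, and re-serialize."""
    doc = json.loads(line)
    doc[output] = simhash64(str(doc.get(field, "")))
    return json.dumps(doc)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical sample document.
hashed = json.loads(add_simhash('{"title": "Hello world"}', "title", "title_hash"))
```

Texts that share most of their tokens tend to produce fingerprints with a small Hamming distance, which is what makes SimHash useful for near-duplicate detection.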

Sort documents by a field

cat docs.jsonl | hermes-tool sort --field published_at
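As a rough model of what sorting a JSONL stream by a field involves, the Python sketch below parses every object, orders the objects by the chosen key, and writes them back one per line. It assumes every object contains the field and that the values are mutually comparable; how hermes-tool handles missing fields, mixed types, or inputs larger than memory is not stated in the source. The function name `sort_jsonl` is hypothetical.

```python
import json


def sort_jsonl(lines, field):
    """Return JSONL lines ordered by `field` (ascending, stable)."""
    docs = [json.loads(line) for line in lines if line.strip()]
    # Assumption: the field is present in every object and its values are comparable.
    docs.sort(key=lambda doc: doc[field])
    return [json.dumps(doc) for doc in docs]


# Hypothetical sample input with ISO 8601 dates.
sample = [
    '{"id": 1, "published_at": "2024-03-01"}',
    '{"id": 2, "published_at": "2023-11-15"}',
    '{"id": 3, "published_at": "2024-01-20"}',
]
ordered_ids = [json.loads(line)["id"] for line in sort_jsonl(sample, "published_at")]
```

ISO 8601 timestamps in a single consistent format and time zone sort chronologically when compared as strings, so a field like `published_at` needs no date parsing in that case.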

Pipeline example

zstdcat dump.zst | hermes-tool simhash -f title -o hash | hermes-tool sort -f hash > sorted.jsonl