tokmd-content
Content scanning helpers for tokmd analysis.
Overview
This is a Tier 2 utility crate for file content inspection. It provides functions for reading file contents, detecting text files, computing hashes, counting tags, and calculating entropy.
Installation
[dependencies]
tokmd-content = "1.3"
Usage
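use std::path::Path;
use tokmd_content::{entropy_bits_per_byte, hash_file, is_text_like, read_head};
// Hypothetical path; argument lists below are assumed (path plus byte limit)
let path = Path::new("src/lib.rs");
// Read first 4KB of file
let bytes = read_head(path, 4096)?;
// Check if content is text
if is_text_like(&bytes) {
    // Process as text
}
// Calculate entropy (for secret detection)
let entropy = entropy_bits_per_byte(&bytes);
// Hash file for duplicate detection (the cap value is an assumption)
let hash = hash_file(path, 1_000_000)?;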
Key Functions
Reading
- `read_head()` - Read first N bytes
- `read_head_tail()` - Read balanced head + tail
- `read_lines()` - Read lines with limits
- `read_text_capped()` - Read as text with byte limit
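As a rough illustration of what a capped head read involves, here is a minimal standard-library sketch. The name `read_head_sketch` and its `(path, max_bytes)` signature are assumptions made for this example, not the crate's actual code.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical stand-in for a "read first N bytes" helper.
/// Returns at most `max_bytes` bytes from the start of the file.
fn read_head_sketch(path: &Path, max_bytes: usize) -> io::Result<Vec<u8>> {
    let file = File::open(path)?;
    // Pre-allocate, but never more than 64 KiB up front.
    let mut buf = Vec::with_capacity(max_bytes.min(64 * 1024));
    // `take` caps the reader, so no more than `max_bytes` are ever read,
    // even when the file is much larger.
    file.take(max_bytes as u64).read_to_end(&mut buf)?;
    Ok(buf)
}
```

Capping the read this way keeps memory use bounded regardless of file size, which matters when scanning a large tree.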
Detection
- `is_text_like()` - Check for null bytes and valid UTF-8
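The check described above can be approximated in a few lines. This is a sketch under the assumption that "text-like" means "no NUL bytes and valid UTF-8"; `looks_like_text` is a hypothetical name, and the crate's real function may differ in detail.

```rust
/// Hypothetical sketch of a text check: content counts as text when it
/// contains no NUL bytes and decodes as valid UTF-8.
fn looks_like_text(bytes: &[u8]) -> bool {
    !bytes.contains(&0) && std::str::from_utf8(bytes).is_ok()
}
```

One caveat of this strict version: a buffer cut off in the middle of a multi-byte character fails UTF-8 validation, so a head read of a genuine text file can be rejected.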
Hashing
- `hash_bytes()` - BLAKE3 hash of bytes
- `hash_file()` - Hash file content (capped)
Analysis
- `count_tags()` - Case-insensitive tag counting
- `entropy_bits_per_byte()` - Shannon entropy calculation
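To make the two analysis helpers concrete, the following sketch shows one way to compute Shannon entropy per byte and to count tags case-insensitively. Both function names, their signatures, and the use of markers such as TODO or FIXME as "tags" are assumptions for illustration, not the crate's API.

```rust
/// Shannon entropy in bits per byte: H = -sum(p_i * log2(p_i)),
/// where p_i is the relative frequency of byte value i.
fn shannon_entropy_bits(bytes: &[u8]) -> f64 {
    if bytes.is_empty() {
        return 0.0;
    }
    // Histogram of the 256 possible byte values.
    let mut counts = [0usize; 256];
    for &b in bytes {
        counts[b as usize] += 1;
    }
    let len = bytes.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical tag counter: non-overlapping, case-insensitive
/// substring matches for each tag, returned in input order.
fn count_tags_sketch(text: &str, tags: &[&str]) -> Vec<(String, usize)> {
    let haystack = text.to_lowercase();
    tags.iter()
        .map(|tag| {
            let needle = tag.to_lowercase();
            // An empty needle would match at every position, so count it as 0.
            let n = if needle.is_empty() {
                0
            } else {
                haystack.matches(needle.as_str()).count()
            };
            (tag.to_string(), n)
        })
        .collect()
}
```

The entropy result ranges from 0.0 (a single repeated byte) to 8.0 (all 256 byte values equally frequent).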
Entropy Interpretation
| Entropy (bits per byte) | Interpretation |
|---|---|
| 0.0 | Empty or uniform |
| < 4.0 | Low (plain text) |
| 4.0-6.0 | Medium (source code) |
| 6.0-7.5 | High (compressed/encrypted) |
| > 7.5 | Suspicious (secrets, random) |
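The table can be turned into a small classifier. In this sketch the `EntropyBand` enum and `classify_entropy` function are hypothetical, and the boundary handling is an assumption: the table does not say which band owns 4.0 or 6.0, so here a boundary value goes to the higher band, while exactly 7.5 stays in "High" because the last row reads "> 7.5".

```rust
#[derive(Debug, PartialEq)]
enum EntropyBand {
    EmptyOrUniform,
    Low,
    Medium,
    High,
    Suspicious,
}

/// Map an entropy value (bits per byte) to a band from the table.
/// Boundary assignment is an assumption of this sketch.
fn classify_entropy(bits: f64) -> EntropyBand {
    if bits <= 0.0 {
        EntropyBand::EmptyOrUniform
    } else if bits < 4.0 {
        EntropyBand::Low
    } else if bits < 6.0 {
        EntropyBand::Medium
    } else if bits <= 7.5 {
        EntropyBand::High
    } else {
        EntropyBand::Suspicious
    }
}
```

Treat the bands as heuristics: short inputs give noisy entropy estimates, so a threshold alone is rarely enough to flag a secret.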
Dependencies
- `blake3` - Fast cryptographic hashing
- `anyhow` - Error handling
License
MIT OR Apache-2.0