Compute match thresholds for license detection rules.
Constants
- MIN_MATCH_HIGH_LENGTH - Minimum match length for high-value (legalese) token matching.
- MIN_MATCH_LENGTH - Minimum match length for token-based matching.
- SMALL_RULE - Rules shorter than this are considered “small” (exact match only).
- TINY_RULE - Rules shorter than this are considered “tiny” (very short, special handling).
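
The size cutoffs described above can be pictured with a minimal sketch. It is not the crate's implementation: the constant values, the `RuleSize` enum, and the `classify_rule` helper are all assumptions made for illustration, and the sketch reports "tiny" separately from "small" even though a tiny rule is also below the small cutoff.

```rust
// Illustrative values only; the crate's actual constants may differ.
const TINY_RULE: usize = 6;
const SMALL_RULE: usize = 15;

/// Hypothetical size classes for a rule, based on its token count.
#[derive(Debug, PartialEq)]
enum RuleSize {
    Tiny,
    Small,
    Regular,
}

/// Hypothetical helper: classify a rule by its length in tokens.
fn classify_rule(length: usize) -> RuleSize {
    if length < TINY_RULE {
        // Very short rule: would need special handling.
        RuleSize::Tiny
    } else if length < SMALL_RULE {
        // Short rule: would be matched exactly only.
        RuleSize::Small
    } else {
        RuleSize::Regular
    }
}

fn main() {
    for length in [3, 10, 40] {
        println!("{length} tokens -> {:?}", classify_rule(length));
    }
}
```

Both comparisons are strict, following the "shorter than this" wording: a rule whose length equals a cutoff falls into the next larger class.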
Functions
- compute_thresholds_occurrences - Compute thresholds considering the occurrence of all tokens.
- compute_thresholds_unique - Compute thresholds considering the occurrence of only unique tokens.