Module thresholds

Compute match thresholds for license detection rules.

Constants

MIN_MATCH_HIGH_LENGTH
Minimum match length for high-value (legalese) token matching.
MIN_MATCH_LENGTH
Minimum match length for token-based matching.
SMALL_RULE
Rules shorter than this are considered “small” (exact match only).
TINY_RULE
Rules shorter than this are considered “tiny” (very short, special handling).
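As a rough illustration of how the two size cutoffs could partition rules by token length, here is a minimal sketch. The constant values, the `RuleSize` enum, and the `classify_rule` helper are all hypothetical and are not taken from this module; the sketch also assumes the "tiny" cutoff is lower than the "small" cutoff and that "shorter than" means a strict comparison.

```rust
// Placeholder cutoffs for illustration only; the module's actual
// SMALL_RULE and TINY_RULE values are not shown in this page.
const EXAMPLE_TINY_RULE: usize = 6;
const EXAMPLE_SMALL_RULE: usize = 15;

/// Hypothetical size classes for a rule, based on its token length.
#[derive(Debug, PartialEq, Eq)]
enum RuleSize {
    /// Shorter than the tiny cutoff: very short, special handling.
    Tiny,
    /// Shorter than the small cutoff: exact match only.
    Small,
    /// Everything else.
    Regular,
}

/// Classify a rule by its length in tokens.
/// The tiny check runs first because tiny rules are also below the small cutoff.
fn classify_rule(length: usize) -> RuleSize {
    if length < EXAMPLE_TINY_RULE {
        RuleSize::Tiny
    } else if length < EXAMPLE_SMALL_RULE {
        RuleSize::Small
    } else {
        RuleSize::Regular
    }
}
```

With these placeholder cutoffs, a rule whose length equals a cutoff falls into the next larger class, since the comparison is strict.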

Functions

compute_thresholds_occurrences
Compute thresholds considering the occurrence of all tokens.
compute_thresholds_unique
Compute thresholds considering the occurrence of only unique tokens.
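The two functions differ in how tokens are counted. The sketch below shows only that counting distinction, not the threshold computation itself. The `TokenId` alias and both helper functions are hypothetical names introduced for illustration; the signatures of `compute_thresholds_occurrences` and `compute_thresholds_unique` are not shown on this page.

```rust
use std::collections::HashSet;

/// Hypothetical token identifier; the module's real token type is an assumption here.
type TokenId = u32;

/// Length of a rule when every occurrence of every token is counted.
fn occurrence_length(tokens: &[TokenId]) -> usize {
    tokens.len()
}

/// Length of a rule when each distinct token is counted once.
fn unique_length(tokens: &[TokenId]) -> usize {
    tokens.iter().collect::<HashSet<_>>().len()
}
```

For a token sequence such as `[7, 7, 3, 9, 3]`, the occurrence-based length is 5 and the unique-based length is 3, so thresholds derived from the unique count would presumably be no larger than those derived from all occurrences.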