Crate dupe_core

PolyDup Core - Cross-language duplicate code detection engine

This library provides the core functionality for detecting duplicate code across Node.js, Python, and Rust codebases using Tree-sitter parsing, Rabin-Karp/MinHash algorithms, and parallel processing.

Structs

Baseline
Baseline snapshot for comparing duplicate detection across runs
CloneMatch
Represents a detected duplicate code block
DuplicateMatch
Represents a detected duplicate code fragment
FunctionNode
Represents a parsed function node from source code
Report
Report containing scan results
RollingHash
Rabin-Karp rolling hash for efficient substring comparison
ScanConfig
Configuration used for scanning
ScanStats
Statistics from the scanning process
Scanner
Main scanner for detecting duplicates
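
To illustrate the Rabin-Karp idea behind RollingHash, here is a minimal sketch of a polynomial rolling hash over fixed-size windows of token IDs. It is not the crate's implementation: the function name `window_hashes`, the constants `BASE` and `MODULUS`, and the use of `u64` token IDs are all assumptions made for this example.

```rust
// Hypothetical sketch, NOT the crate's `RollingHash` API.
// BASE and MODULUS are assumed values chosen for illustration.
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Returns one hash per window of `window` consecutive token IDs.
/// Each hash after the first is derived from the previous one in O(1).
pub fn window_hashes(tokens: &[u64], window: usize) -> Vec<u64> {
    if window == 0 || tokens.len() < window {
        return Vec::new();
    }

    // BASE^(window - 1) mod MODULUS: the weight of the outgoing token.
    let mut high = 1u64;
    for _ in 1..window {
        high = high * BASE % MODULUS;
    }

    // Hash of the first window, computed directly.
    let mut hash = 0u64;
    for &t in &tokens[..window] {
        hash = (hash * BASE + t % MODULUS) % MODULUS;
    }

    let mut out = Vec::with_capacity(tokens.len() - window + 1);
    out.push(hash);

    // Slide the window: remove the outgoing token, shift, add the incoming one.
    for i in window..tokens.len() {
        let outgoing = tokens[i - window] % MODULUS * high % MODULUS;
        hash = (hash + MODULUS - outgoing) % MODULUS;
        hash = (hash * BASE + tokens[i] % MODULUS) % MODULUS;
        out.push(hash);
    }
    out
}
```

Windows with equal hashes are only candidate duplicates. Because distinct windows can collide, a detector built on this sketch would still need to compare the underlying tokens before reporting a match.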

Enums

CloneType
Clone type classification
Token
Normalized token representation

Functions

compute_rolling_hashes
Computes rolling hashes for a token stream
compute_token_edit_distance
Computes Levenshtein edit distance between two token sequences
compute_token_similarity
Computes token-level similarity between two token sequences using edit distance
detect_duplicates_with_extension
Detects duplicates using rolling hash with greedy extension
detect_type3_clones
Detects Type-3 clones (gap-tolerant) between two token sequences
extract_functions
Extracts all function definitions from the given source code
extract_javascript_functions
Convenience function to extract functions from JavaScript code
extract_python_functions
Convenience function to extract functions from Python code
extract_rust_functions
Convenience function to extract functions from Rust code
find_duplicates
Public API: Find duplicates in the given file paths
find_duplicates_with_config
Public API: Find duplicates in the given file paths using a custom configuration
normalize
Normalizes source code into a token stream for duplicate detection
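
The edit-distance and similarity entries above can be illustrated with a short sketch. This is not the crate's code: the names `edit_distance` and `similarity`, the generic `T: PartialEq` token type, and the normalization by the longer sequence's length are assumptions made for this example. The actual signatures of compute_token_edit_distance and compute_token_similarity may differ.

```rust
// Hypothetical sketch, NOT the crate's API: Levenshtein distance over
// arbitrary token slices, plus a similarity score derived from it.

/// Levenshtein edit distance between two token sequences,
/// using two rolling rows instead of a full matrix.
pub fn edit_distance<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] holds the distance between a[..i] and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0usize; b.len() + 1];

    for (i, x) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, y) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(x != y);
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            curr[j + 1] = substitute.min(delete).min(insert);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Similarity in [0.0, 1.0]. Assumed definition for this sketch:
/// 1 - distance / length of the longer sequence.
pub fn similarity<T: PartialEq>(a: &[T], b: &[T]) -> f64 {
    let longest = a.len().max(b.len());
    if longest == 0 {
        return 1.0; // two empty sequences are treated as identical
    }
    1.0 - edit_distance(a, b) as f64 / longest as f64
}
```

Under this assumed definition, two four-token sequences that differ in a single token have distance 1 and similarity 0.75. A gap-tolerant (Type-3) detector could accept a pair when the score exceeds a chosen threshold.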