PolyDup Core - Cross-language duplicate code detection engine
This library provides the core functionality for detecting duplicate code across Node.js, Python, and Rust codebases using Tree-sitter parsing, Rabin-Karp/MinHash algorithms, and parallel processing.
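As a rough illustration of the MinHash idea mentioned above, the sketch below estimates how similar two token streams are by comparing fixed-size signatures. It is not taken from this crate: `minhash_signature`, `estimated_similarity`, the shingle size, and the use of the standard library's `DefaultHasher` are all assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper (not the crate's API): builds a MinHash signature.
/// For each of `num_hashes` seeded hash functions, it keeps the smallest
/// hash value seen over all `shingle`-token windows of the input.
pub fn minhash_signature(tokens: &[&str], shingle: usize, num_hashes: usize) -> Vec<u64> {
    // Too little input to form a single shingle: return an empty signature.
    if shingle == 0 || tokens.len() < shingle {
        return Vec::new();
    }
    let mut signature = vec![u64::MAX; num_hashes];
    for window in tokens.windows(shingle) {
        for (seed, slot) in signature.iter_mut().enumerate() {
            // Mixing the seed in first simulates a family of hash functions.
            let mut hasher = DefaultHasher::new();
            seed.hash(&mut hasher);
            window.hash(&mut hasher);
            *slot = (*slot).min(hasher.finish());
        }
    }
    signature
}

/// Hypothetical helper: fraction of signature slots that agree, which
/// approximates the Jaccard similarity of the two shingle sets.
pub fn estimated_similarity(a: &[u64], b: &[u64]) -> f64 {
    if a.is_empty() || a.len() != b.len() {
        return 0.0;
    }
    let agreeing = a.iter().zip(b).filter(|(x, y)| x == y).count();
    agreeing as f64 / a.len() as f64
}
```

Two identical token streams produce identical signatures and therefore an estimate of 1.0, while streams with no shingle in common should agree in essentially no slot. More hash functions give a tighter estimate at the cost of more hashing per window.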
Structs§
- CloneMatch - Represents a detected duplicate code block
- DuplicateMatch - Represents a detected duplicate code fragment
- FunctionNode - Represents a parsed function node from source code
- Report - Report containing scan results
- RollingHash - Rabin-Karp rolling hash for efficient substring comparison
- ScanStats - Statistics from the scanning process
- Scanner - Main scanner for detecting duplicates
Enums§
Functions§
- compute_rolling_hashes - Computes rolling hashes for a token stream
- detect_duplicates_with_extension - Detects duplicates using rolling hash with greedy extension
- extract_functions - Extracts all function definitions from the given source code
- extract_javascript_functions - Convenience function to extract functions from JavaScript code
- extract_python_functions - Convenience function to extract functions from Python code
- extract_rust_functions - Convenience function to extract functions from Rust code
- find_duplicates - Public API: Find duplicates in the given file paths
- find_duplicates_with_config - Public API with custom configuration
- normalize - Normalizes source code into a token stream for duplicate detection
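The function list above suggests a pipeline of normalization, rolling hashes over the token stream, and greedy extension of matching windows. The following is a minimal, self-contained sketch of that general technique, not the crate's implementation: the signatures of the real functions are not shown on this page, so every name here (`normalize_sketch`, `window_hashes`, `candidate_pairs`, `extend_match`), the keyword list, and the hash constants are assumptions for illustration.

```rust
use std::collections::HashMap;

// Assumed constants for the polynomial hash; the crate may use different values.
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;
// Tiny illustrative keyword list; a real normalizer would work from a parse tree.
const KEYWORDS: &[&str] = &["fn", "let", "return", "if", "else", "for", "while", "def"];

/// Hypothetical normalizer: keeps keywords and punctuation, and replaces
/// identifiers with "$ID" and numbers with "$NUM" so that renamed copies match.
pub fn normalize_sketch(source: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    // The trailing space is a sentinel that flushes the final word.
    for ch in source.chars().chain(std::iter::once(' ')) {
        if ch.is_alphanumeric() || ch == '_' {
            word.push(ch);
            continue;
        }
        if !word.is_empty() {
            let starts_with_digit = word.chars().next().map_or(false, |c| c.is_ascii_digit());
            tokens.push(if KEYWORDS.contains(&word.as_str()) {
                word.clone()
            } else if starts_with_digit {
                "$NUM".to_string()
            } else {
                "$ID".to_string()
            });
            word.clear();
        }
        if !ch.is_whitespace() {
            tokens.push(ch.to_string());
        }
    }
    tokens
}

/// FNV-1a hash of a token, reduced below MODULUS so later products fit in u64.
fn token_id(token: &str) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for byte in token.bytes() {
        h ^= byte as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h % MODULUS
}

/// Rabin-Karp style hashes of every `window`-token slice, each derived from
/// the previous one in constant time.
pub fn window_hashes(tokens: &[String], window: usize) -> Vec<u64> {
    if window == 0 || tokens.len() < window {
        return Vec::new();
    }
    let ids: Vec<u64> = tokens.iter().map(|t| token_id(t)).collect();
    let mut high_pow = 1u64; // BASE^(window - 1) mod MODULUS
    for _ in 1..window {
        high_pow = high_pow * BASE % MODULUS;
    }
    let mut hash = 0u64;
    for &id in &ids[..window] {
        hash = (hash * BASE + id) % MODULUS;
    }
    let mut hashes = vec![hash];
    for i in window..ids.len() {
        // Remove the outgoing token's contribution, shift, add the incoming one.
        let outgoing = ids[i - window] * high_pow % MODULUS;
        hash = ((hash + MODULUS - outgoing) * BASE + ids[i]) % MODULUS;
        hashes.push(hash);
    }
    hashes
}

/// Start offsets (i in `a`, j in `b`) whose windows hash equal and, after
/// verification against hash collisions, really contain the same tokens.
pub fn candidate_pairs(a: &[String], b: &[String], window: usize) -> Vec<(usize, usize)> {
    let mut index: HashMap<u64, Vec<usize>> = HashMap::new();
    for (i, h) in window_hashes(a, window).into_iter().enumerate() {
        index.entry(h).or_default().push(i);
    }
    let mut pairs = Vec::new();
    for (j, h) in window_hashes(b, window).into_iter().enumerate() {
        for &i in index.get(&h).into_iter().flatten() {
            if a[i..i + window] == b[j..j + window] {
                pairs.push((i, j));
            }
        }
    }
    pairs
}

/// Greedy extension: grows a verified seed match forward while tokens agree
/// and returns the total matched length in tokens.
pub fn extend_match(a: &[String], b: &[String], i: usize, j: usize, window: usize) -> usize {
    let mut len = window;
    while i + len < a.len() && j + len < b.len() && a[i + len] == b[j + len] {
        len += 1;
    }
    len
}
```

In this sketch, `let total = price + tax;` and `let sum = cost + fee;` normalize to the same token stream, so a five-token seed found by `candidate_pairs` can be grown by `extend_match` to cover the whole statement. Verifying each hash hit against the actual tokens costs a little extra work but keeps collisions from being reported as duplicates.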