PolyDup
Cross-language duplicate code detector powered by Tree-sitter and Rust.
Architecture
Shared core architecture: the heavy lifting is done in Rust and exposed to other languages through FFI bindings.
- dupe-core: Pure Rust library with Tree-sitter parsing, hashing (Rabin-Karp/MinHash), and reporting
- dupe-cli: Standalone Rust CLI tool
- dupe-node: Node.js native addon via napi-rs
- dupe-py: Python extension module via PyO3
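To make the Rabin-Karp hashing mentioned for dupe-core more concrete, here is a minimal sketch of a rolling hash over token windows. It is not taken from the dupe-core source: the function names `rolling_hashes` and `duplicate_windows`, the constants `BASE` and `MODULUS`, and the choice of `u64` token ids as input are all assumptions made for illustration.

```rust
use std::collections::HashMap;

// Illustrative polynomial base and modulus (assumed values, not PolyDup's).
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Returns the rolling hash of every `window`-sized run of token ids.
fn rolling_hashes(tokens: &[u64], window: usize) -> Vec<u64> {
    if window == 0 || tokens.len() < window {
        return Vec::new();
    }
    // BASE^(window - 1) mod MODULUS, used to remove the outgoing token.
    let mut high = 1u64;
    for _ in 1..window {
        high = high * BASE % MODULUS;
    }
    // Hash of the first window, computed directly.
    let mut hash = 0u64;
    for &t in &tokens[..window] {
        hash = (hash * BASE + t % MODULUS) % MODULUS;
    }
    let mut out = vec![hash];
    // Slide the window: drop the oldest token, append the newest.
    for i in window..tokens.len() {
        let outgoing = tokens[i - window] % MODULUS * high % MODULUS;
        hash = (hash + MODULUS - outgoing) % MODULUS;
        hash = (hash * BASE + tokens[i] % MODULUS) % MODULUS;
        out.push(hash);
    }
    out
}

/// Returns pairs of window start positions whose tokens are identical.
/// Equal hashes only mark candidates, so each candidate is verified
/// by comparing the underlying tokens.
fn duplicate_windows(tokens: &[u64], window: usize) -> Vec<(usize, usize)> {
    let mut seen: HashMap<u64, Vec<usize>> = HashMap::new();
    let mut pairs = Vec::new();
    for (start, h) in rolling_hashes(tokens, window).into_iter().enumerate() {
        let starts = seen.entry(h).or_default();
        for &prev in starts.iter() {
            if tokens[prev..prev + window] == tokens[start..start + window] {
                pairs.push((prev, start));
            }
        }
        starts.push(start);
    }
    pairs
}
```

Each slide of the window costs constant time, so hashing every window in a file is linear in the number of tokens. A real detector would derive the token ids from Tree-sitter syntax nodes rather than take them as input.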
Installation
Rust CLI (Recommended)
Install the CLI tool from crates.io:
Or download pre-built binaries from GitHub Releases.
Node.js/npm
Install as a project dependency:
Or globally:
Usage in your project:
const = require;
const duplicates = ;
console.log;
Python/pip
Install from PyPI:
Usage in your project:
=
Building from Source
CLI
Node.js
Python
CLI Usage
Scan directories for duplicate code:
# Basic usage
# Custom threshold and output format
# Adjust block size for granularity
Output Formats
- Text (default): Human-readable colored output
- JSON: Machine-readable format with full details
Options
- --threshold: Similarity threshold (0.0-1.0, default: 0.9)
- --min-block-size: Minimum lines per block (default: 10)
- --format: Output format (text or json)
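As a rough illustration of how a similarity threshold and a minimum block size might interact, the sketch below scores two blocks with Jaccard similarity (the quantity that MinHash estimates) and reports a pair only when both limits are met. The metric PolyDup actually uses is not stated here, so `jaccard` and `is_reported` are hypothetical helpers, not part of the tool's API.

```rust
use std::collections::HashSet;

/// Jaccard similarity of two sets of line or token hashes, in 0.0..=1.0.
/// Two empty sets are treated as identical.
fn jaccard(a: &HashSet<u64>, b: &HashSet<u64>) -> f64 {
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    let intersection = a.intersection(b).count() as f64;
    let union = a.union(b).count() as f64;
    intersection / union
}

/// Hypothetical filter: both blocks must have at least `min_block_size`
/// lines, and their similarity must reach `threshold`.
fn is_reported(
    a_lines: usize,
    b_lines: usize,
    similarity: f64,
    threshold: f64,
    min_block_size: usize,
) -> bool {
    a_lines >= min_block_size && b_lines >= min_block_size && similarity >= threshold
}
```

Under this reading, raising the threshold toward 1.0 keeps only near-identical blocks, while lowering the minimum block size surfaces shorter and usually noisier matches.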
License
MIT OR Apache-2.0