PolyDup
Cross-language duplicate code detector powered by Tree-sitter and Rust.
Features
- Blazing Fast: Parallel processing with Rabin-Karp rolling hash algorithm
- Cross-Language: JavaScript, TypeScript, Python, Rust, Vue, Svelte (more coming)
- Accurate: Tree-sitter AST parsing for semantic-aware detection
- Multi-Platform: CLI, Node.js npm package, Python pip package, Rust library
- Configurable: Adjust thresholds and block sizes for your needs
- Efficient: Zero-copy FFI bindings for minimal overhead
Architecture
Shared Core Architecture: All duplicate detection logic lives in the Rust core. The CLI links against it directly, while the Node.js and Python packages reach it through FFI bindings.
┌─────────────────────────────────────────────┐
│ polydup-core (Rust) │
│ • Tree-sitter parsing │
│ • Rabin-Karp hashing │
│ • Parallel file scanning │
│ • Duplicate detection │
└─────────────────────────────────────────────┘
▲ ▲ ▲
│ │ │
┌─────┴───┐ ┌───┴────┐ ┌─┴─────┐
│ CLI │ │ Node.js│ │ Python│
│ (Rust) │ │(napi-rs)│ │(PyO3) │
└─────────┘ └────────┘ └───────┘
Crates:
- polydup-core: Pure Rust library with Tree-sitter parsing, hashing, and reporting
- polydup-cli: Standalone CLI tool (cargo install polydup-cli)
- polydup-node: Node.js native addon via napi-rs (npm install @polydup/core)
- polydup-py: Python extension module via PyO3 (pip install polydup)
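To make the hashing step in the core more concrete, here is a minimal sketch of Rabin-Karp rolling hashing over a token stream. It is not taken from polydup-core: the function names, the `BASE` and `MODULUS` constants, and the use of plain `u64` values as stand-ins for normalized AST tokens are all assumptions made for illustration.

```rust
use std::collections::HashMap;

// Illustrative constants only; the values used by polydup-core are not documented here.
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Returns the polynomial hash of every `window`-sized run of `tokens`.
/// Each hash after the first is derived from the previous one in O(1).
fn rolling_hashes(tokens: &[u64], window: usize) -> Vec<u64> {
    if window == 0 || tokens.len() < window {
        return Vec::new();
    }
    // BASE^(window - 1) mod MODULUS, the weight of the token leaving the window.
    let mut high = 1u64;
    for _ in 1..window {
        high = high * BASE % MODULUS;
    }
    // Hash of the first window, computed directly.
    let mut hash = 0u64;
    for &t in &tokens[..window] {
        hash = (hash * BASE + t % MODULUS) % MODULUS;
    }
    let mut out = vec![hash];
    for i in window..tokens.len() {
        // Drop the outgoing token, shift, then add the incoming token.
        let outgoing = tokens[i - window] % MODULUS * high % MODULUS;
        hash = (hash + MODULUS - outgoing) % MODULUS;
        hash = (hash * BASE + tokens[i] % MODULUS) % MODULUS;
        out.push(hash);
    }
    out
}

/// Returns pairs of window start positions whose tokens are identical.
fn duplicate_windows(tokens: &[u64], window: usize) -> Vec<(usize, usize)> {
    let mut seen: HashMap<u64, Vec<usize>> = HashMap::new();
    let mut pairs = Vec::new();
    for (start, h) in rolling_hashes(tokens, window).into_iter().enumerate() {
        let earlier = seen.entry(h).or_default();
        for &prev in earlier.iter() {
            // Different windows can share a hash, so confirm the match token by token.
            if tokens[prev..prev + window] == tokens[start..start + window] {
                pairs.push((prev, start));
            }
        }
        earlier.push(start);
    }
    pairs
}
```

Calling `duplicate_windows(&[1, 2, 3, 9, 1, 2, 3], 3)` reports the windows starting at positions 0 and 4 as a matching pair. The explicit comparison after a hash hit is what keeps collisions from producing false positives.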
Installation
Rust CLI (Recommended)
The fastest way to use PolyDup is via the CLI tool:
# Install from crates.io
cargo install polydup-cli
# Verify installation
# Scan for duplicates
System Requirements:
- Rust 1.70+ (if building from source)
- macOS, Linux, or Windows
Pre-built Binaries:
Download pre-compiled binaries from GitHub Releases:
# macOS (Apple Silicon)
# macOS (Intel)
# Linux (x86_64)
# Windows (x86_64): download from the releases page and add to PATH
Node.js/npm
Install as a project dependency or globally:
# Project dependency
npm install @polydup/core
# Global installation
npm install -g @polydup/core
Requirements: Node.js 16+ on macOS (Intel/ARM), Windows (x64), or Linux (x64)
Usage:
const = require;
const duplicates = ;
console.log;
duplicates.;
Python/pip
Install from PyPI:
# Using pip
pip install polydup
# Using uv (recommended for faster installs)
Requirements: Python 3.8-3.12 on macOS (Intel/ARM), Windows (x64), or Linux (x64)
Usage:
# Scan for duplicates
=
Rust Library
Use the core library in your Rust project:
[dependencies]
polydup-core = "0.1"
use ;
use std::path::PathBuf;
Building from Source
CLI
Node.js
Python
CLI Usage
Basic Commands
# Scan a directory
# Scan multiple directories
# Custom threshold (0.0-1.0, higher = stricter)
# Adjust minimum block size (lines)
# JSON output for scripting
Examples
Quick scan for severe duplicates:
Deep scan for similar code:
Scan specific file types:
# PolyDup auto-detects: .rs, .js, .ts, .jsx, .tsx, .py, .vue, .svelte
CI/CD integration:
# Exit with error if duplicates found
Output Formats
Text (default): Human-readable colored output with file paths, line numbers, and similarity scores
JSON: Machine-readable format for scripting and tooling integration
CLI Options
| Option | Type | Default | Description |
|---|---|---|---|
| --threshold | float | 0.9 | Similarity threshold (0.0-1.0) |
| --min-block-size | int | 10 | Minimum lines per code block |
| --format | text\|json | text | Output format |
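The interplay between --threshold and --min-block-size can be sketched as follows. This README does not specify how PolyDup scores similarity, so the example substitutes Jaccard similarity over sets of trimmed lines as a stand-in; `line_set`, `similarity`, and `is_duplicate` are hypothetical names, not part of any PolyDup API.

```rust
use std::collections::HashSet;

/// Collects the distinct non-empty lines of a block, ignoring indentation.
fn line_set(block: &str) -> HashSet<&str> {
    block
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .collect()
}

/// Jaccard similarity in [0.0, 1.0]: shared lines divided by all distinct lines.
/// This metric is an assumption for illustration, not PolyDup's actual scoring.
fn similarity(a: &str, b: &str) -> f64 {
    let (a, b) = (line_set(a), line_set(b));
    let union = a.union(&b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(&b).count() as f64 / union as f64
}

/// Reports a pair only if both blocks are long enough and similar enough.
fn is_duplicate(a: &str, b: &str, threshold: f64, min_block_size: usize) -> bool {
    let long_enough =
        a.lines().count() >= min_block_size && b.lines().count() >= min_block_size;
    long_enough && similarity(a, b) >= threshold
}
```

Under this stand-in metric, two four-line blocks that differ in one line score 0.6, so they pass a threshold of 0.5 but not the default of 0.9. Raising the minimum block size filters out short matches regardless of their score.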
Supported Languages
- JavaScript/TypeScript: .js, .jsx, .ts, .tsx
- Python: .py
- Rust: .rs
- Vue: .vue
- Svelte: .svelte
More languages coming soon (Java, Go, C/C++, Ruby, PHP)
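As a rough illustration of extension-based detection, the mapping above could be expressed like this. The `Language` enum and `language_for_path` function are hypothetical names chosen for the sketch; the actual types in polydup-core may differ, and matching here is case-sensitive by assumption.

```rust
use std::path::Path;

/// Languages from the supported list above (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Language {
    JavaScript,
    TypeScript,
    Python,
    Rust,
    Vue,
    Svelte,
}

/// Maps a file path to a language by its extension, or `None` if unsupported.
fn language_for_path(path: &Path) -> Option<Language> {
    // `extension()` returns None for paths without one, such as "Makefile".
    let ext = path.extension()?.to_str()?;
    match ext {
        "js" | "jsx" => Some(Language::JavaScript),
        "ts" | "tsx" => Some(Language::TypeScript),
        "py" => Some(Language::Python),
        "rs" => Some(Language::Rust),
        "vue" => Some(Language::Vue),
        "svelte" => Some(Language::Svelte),
        _ => None,
    }
}
```

A scanner built this way can skip files that return `None` before any parsing work is done.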
Development
Building from Source
Prerequisites:
- Rust 1.70+ (curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
- Node.js 16+ (for Node.js bindings)
- Python 3.8-3.12 (for Python bindings)
CLI:
Node.js bindings:
Python bindings:
Run tests:
# All tests
# Specific crate
# With coverage
Pre-commit Hooks
Install pre-commit hooks to automatically run linting and tests:
# Install pre-commit (if not already installed)
# Install the git hooks
# Run manually on all files
The hooks will automatically run:
- On commit: cargo fmt, cargo clippy, file checks (trailing whitespace, YAML/TOML validation)
- On push: Full test suite with cargo test
To skip hooks temporarily:
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Install pre-commit hooks (pre-commit install)
- Make your changes and ensure tests pass (cargo test --workspace)
- Run clippy (cargo clippy --workspace --all-targets -- -D warnings)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
License
MIT OR Apache-2.0