Native Rust API for jscpd-rs, a duplicate-code detector for local
development and CI/CD that benchmarks 50x+ faster than upstream jscpd.
jscpd-rs scans a codebase, finds copy-paste fragments across files, writes
console, JSON, SARIF, HTML, XML, CSV, Markdown, badge, and Xcode reports,
and can fail a build when duplication crosses a configured threshold.
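The threshold behavior described above can be pictured with a small, self-contained sketch. It does not use this crate's types: `DuplicationTotals`, `exit_code_for`, and the strict "greater than" comparison are all assumptions made for illustration, and the crate's own statistics and threshold handling may differ.

```rust
/// Hypothetical counters for one run; not the crate's statistics types.
struct DuplicationTotals {
    total_lines: usize,
    duplicated_lines: usize,
}

impl DuplicationTotals {
    /// Percentage of duplicated lines, or 0.0 when nothing was scanned.
    fn percentage(&self) -> f64 {
        if self.total_lines == 0 {
            0.0
        } else {
            self.duplicated_lines as f64 * 100.0 / self.total_lines as f64
        }
    }
}

/// Returns 1 when duplication is strictly above the threshold, else 0.
/// Assumption: the strict comparison is illustrative only.
fn exit_code_for(totals: &DuplicationTotals, threshold_percent: f64) -> i32 {
    if totals.percentage() > threshold_percent {
        1
    } else {
        0
    }
}

fn main() {
    let totals = DuplicationTotals {
        total_lines: 2_000,
        duplicated_lines: 150,
    };
    println!("duplication: {:.2}%", totals.percentage());
    println!("exit code at 10%: {}", exit_code_for(&totals, 10.0));
    println!("exit code at 5%: {}", exit_code_for(&totals, 5.0));
}
```

A CI job would typically pass such an exit code straight to the process exit, so the build fails only when the configured limit is crossed.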
It is a native Rust implementation of the common
jscpd command-line workflow:
upstream-style CLI flags, .jscpd.json and package.json#jscpd
configuration, report formats, exit-code behavior, Git blame, and server
snippet checks. The current public benchmark suite records 50x+ speedups on
pinned React, Next.js, and Prometheus cases while using a coverage-first
compatibility gate against upstream jscpd.
This crate exposes the same detector core used by the jscpd and
jscpd-server binaries: option parsing, file discovery, tokenization,
duplicate detection, statistics, and in-memory source checks.
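To make the tokenization and duplicate-detection stages concrete, here is a deliberately simplified sketch of window-based matching over whitespace-separated tokens. It is not the crate's algorithm: `Source`, `WindowMatch`, and `find_duplicate_windows` are hypothetical names, the tokenizer is just `split_whitespace`, and a real detector would also merge consecutive matching windows into a single clone.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory source; not the crate's `SourceFile`.
struct Source<'a> {
    id: &'a str,
    content: &'a str,
}

/// One token window that was seen in two places.
#[derive(Debug, PartialEq)]
struct WindowMatch {
    first: (String, usize),  // (source id, token offset) of the first sighting
    second: (String, usize), // (source id, token offset) of the repeat
}

/// Slides a fixed-size window over each source's tokens and reports every
/// window whose exact token sequence was already seen earlier.
fn find_duplicate_windows(sources: &[Source], window: usize) -> Vec<WindowMatch> {
    let mut seen: HashMap<Vec<&str>, (String, usize)> = HashMap::new();
    let mut matches = Vec::new();
    for source in sources {
        let tokens: Vec<&str> = source.content.split_whitespace().collect();
        if window == 0 || tokens.len() < window {
            continue;
        }
        for (offset, slice) in tokens.windows(window).enumerate() {
            let previous = seen.get(slice).cloned();
            match previous {
                Some(first) => matches.push(WindowMatch {
                    first,
                    second: (source.id.to_string(), offset),
                }),
                None => {
                    seen.insert(slice.to_vec(), (source.id.to_string(), offset));
                }
            }
        }
    }
    matches
}

fn main() {
    let sources = [
        Source { id: "a.js", content: "let total = price * quantity ;" },
        Source { id: "b.js", content: "let total = price * quantity ; print ( total ) ;" },
    ];
    for m in find_duplicate_windows(&sources, 5) {
        println!("{:?} repeats {:?}", m.second, m.first);
    }
}
```

The window size plays a role loosely comparable to a minimum-token setting: shorter windows report more, and noisier, matches.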
§Quick Start
Scan paths using the same option model as the CLI:
use std::path::PathBuf;
let mut options = jscpd_rs::get_default_options();
options.paths = vec![PathBuf::from("src")];
options.reporters.clear();
options.silent = true;
let result = jscpd_rs::detect_clones_and_statistics(&options)?;
println!("{} clones", result.clones.len());

Check prepared in-memory sources without touching the filesystem:
let mut options = jscpd_rs::get_default_options();
options.reporters.clear();
options.min_lines = 2;
options.min_tokens = 5;
let files = vec![
jscpd_rs::SourceFile {
source_id: "a.js".to_string(),
format: "javascript".to_string(),
content: "const a = 1;\nconst b = 2;\nconst c = a + b;\n".to_string(),
},
jscpd_rs::SourceFile {
source_id: "b.js".to_string(),
format: "javascript".to_string(),
content: "const a = 1;\nconst b = 2;\nconst c = a + b;\n".to_string(),
},
];
let result = jscpd_rs::detect_source_files(files, &options);
assert!(!result.clones.is_empty());

§Main Entry Points
- get_options_from_args parses upstream-style CLI arguments into Options.
- detect_clones and detect_clones_and_statistics run discovery, tokenization, duplicate detection, statistics, and optional Git blame.
- detect_source_files runs detection against caller-provided SourceFile values and is the best entry point for editors, servers, and tests.
- Tokenizer exposes the native token map generator used by the detector.
- Detector and MemoryStore provide Rust counterparts for the main upstream core classes.
- jscpd and jscpd_with_exit_callback provide an embeddable argv runner similar to upstream jscpd(argv, exitCallback?).
- serve and serve_with_working_directory start the native REST/MCP server used by the jscpd-server binary.
§Compatibility Model
The release gate is coverage-first: for the same inputs and options, this
crate must not miss duplicated source lines reported by upstream jscpd.
Extra Rust findings remain visible in compatibility reports while the
implementation converges on exact parity.
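The coverage-first rule amounts to a set comparison over duplicated source lines. The sketch below shows one way such a gate could be expressed. `LineRef`, `CoverageReport`, `compare_coverage`, and `lines` are hypothetical names, not part of this crate, and the real compatibility tooling may track more detail than a path and line number.

```rust
use std::collections::BTreeSet;
use std::ops::RangeInclusive;

/// Hypothetical key for one duplicated line: (source path, one-based line).
type LineRef = (String, usize);

/// Outcome of comparing two sets of duplicated lines.
struct CoverageReport {
    missed: Vec<LineRef>, // reported upstream but not natively: gate failure
    extra: Vec<LineRef>,  // reported natively only: visible, not a failure
}

impl CoverageReport {
    /// The gate passes when no upstream-reported line is missing.
    fn passes(&self) -> bool {
        self.missed.is_empty()
    }
}

fn compare_coverage(upstream: &BTreeSet<LineRef>, native: &BTreeSet<LineRef>) -> CoverageReport {
    CoverageReport {
        missed: upstream.difference(native).cloned().collect(),
        extra: native.difference(upstream).cloned().collect(),
    }
}

/// Helper that expands a line range in one file into individual line keys.
fn lines(path: &str, range: RangeInclusive<usize>) -> BTreeSet<LineRef> {
    range.map(|line| (path.to_string(), line)).collect()
}

fn main() {
    let upstream = lines("src/a.js", 10..=20);
    let native = lines("src/a.js", 8..=20);
    let report = compare_coverage(&upstream, &native);
    println!(
        "passes: {}, missed: {}, extra: {}",
        report.passes(),
        report.missed.len(),
        report.extra.len()
    );
}
```

Keeping `extra` separate from `missed` mirrors the asymmetry in the rule: additional native findings are reported for review, while any missed upstream line blocks the release.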
The current 0.x line intentionally keeps the detector native-only. Dynamic npm reporters, stores, listeners, and plugins are not loaded by this crate.
See the README and User Guide for CLI, configuration, reporter, server, and CI examples.
Structs§
- BlamedLine: Git blame information for one duplicated source line.
- CloneMatch: Pair of duplicated fragments reported as one clone.
- DetectionResult: Complete detector output.
- DetectionToken: Detection token after mode filtering and jscpd-compatible hashing.
- Detector: Incremental detector facade for native integrations.
- FormatMappings: Additional format mappings from extensions or exact filenames to formats.
- Fragment: One duplicated fragment in a source file.
- JscpdOutcome: Result of running the native CLI pipeline through the Rust API.
- Location: One-based source location used in tokens, fragments, and reports.
- MemoryStore: Simple namespace-aware in-memory store compatible with upstream concepts.
- MemoryStoreError: Error returned when a key is missing from a MemoryStore namespace.
- Options: Normalized detector options shared by the CLI, server, and Rust API.
- SkippedClone: Clone skipped from final output with compatibility/debug messages.
- SourceFile: Source content prepared by the caller for in-memory detection.
- SourceSummary: Summary of one analyzed source.
- SourceTokenMap: Token map associated with a source identifier and line count.
- Statistic: Mutable helper for accumulating upstream-style duplication statistics.
- StatisticRow: Aggregated duplication counters for a source, format, or whole run.
- Statistics: Duplication statistics for a full detection run.
- ThresholdExceeded: Error returned when the threshold reporter rejects a duplication result.
- TokenMap: Token map for a single detected format block.
- Tokenizer: Native tokenizer used by the detector.
Enums§
- ExitCode: Node-compatible exit-code value preserved from CLI/config input.
- Mode: Duplicate-detection token filtering mode.
Functions§
- detect_clones: Detect clones from files discovered through Options::paths.
- detect_clones_and_statistic: Upstream-named alias for detect_clones_and_statistics.
- detect_clones_and_statistics: Detect clones and return both clone matches and aggregate statistics.
- detect_source_files: Detect clones in prepared in-memory sources.
- get_default_options: Return the upstream-compatible default option set.
- get_format_by_file: Resolve a source format from a path using the built-in extension and filename registry.
- get_format_by_file_with_mappings: Resolve a source format from a path with caller-provided extension and filename mappings.
- get_options_from_args: Parse upstream-style command-line arguments into normalized Options.
- get_supported_formats: Return the names of all formats known to the synchronized format registry.
- jscpd: Run jscpd with upstream-style argv and return reported clone pairs.
- jscpd_with_exit_callback: Run jscpd with upstream-style argv and call back with the duplicate exit code.
- run_cli_args: Run the full CLI pipeline and return clones plus the process exit decision.
- run_current_process: Run the full CLI pipeline from the current process arguments.
- serve: Start the native REST/MCP server using the working directory derived from the first configured scan path.
- serve_with_working_directory: Start the native REST/MCP server with an explicit working directory.
- server_working_directory: Return the server working directory implied by CLI options.
- upstream_stdout_error: Convert selected internal errors to upstream-style stdout error messages.
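
The format-resolution functions listed above map a path to a format name, optionally consulting caller-provided extension and filename mappings. The following stand-alone sketch illustrates that kind of lookup. It does not call this crate: `Mappings`, `builtin_format`, and `resolve_format` are hypothetical, the built-in table is a tiny made-up subset, and the precedence order (exact filename, then custom extension, then built-in extension) is an assumption.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical mapping tables; the crate's `FormatMappings` may be shaped differently.
#[derive(Default)]
struct Mappings {
    by_filename: HashMap<String, String>,
    by_extension: HashMap<String, String>,
}

/// Tiny illustrative built-in registry keyed by lowercase extension.
fn builtin_format(extension: &str) -> Option<&'static str> {
    match extension {
        "js" | "mjs" | "cjs" => Some("javascript"),
        "ts" => Some("typescript"),
        "rs" => Some("rust"),
        "py" => Some("python"),
        _ => None,
    }
}

/// Assumed precedence: exact filename, custom extension, built-in extension.
fn resolve_format(path: &Path, mappings: &Mappings) -> Option<String> {
    let file_name = path.file_name()?.to_str()?;
    if let Some(format) = mappings.by_filename.get(file_name) {
        return Some(format.clone());
    }
    let extension = path.extension()?.to_str()?.to_ascii_lowercase();
    if let Some(format) = mappings.by_extension.get(&extension) {
        return Some(format.clone());
    }
    builtin_format(&extension).map(str::to_string)
}

fn main() {
    let mut mappings = Mappings::default();
    mappings.by_filename.insert("Jenkinsfile".into(), "groovy".into());
    mappings.by_extension.insert("es6".into(), "javascript".into());

    for path in ["ci/Jenkinsfile", "src/app.es6", "src/lib.rs", "notes.unknown"] {
        println!("{path}: {:?}", resolve_format(Path::new(path), &mappings));
    }
}
```

Returning `Option<String>` lets a caller skip files with no known format instead of treating them as errors.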
Type Aliases§
- BlamedLines: Git blame lines keyed by line number.