Native Rust API for jscpd-rs, a duplicate-code detector for local
development and CI/CD whose benchmarks record 50x+ speedups over upstream jscpd.
jscpd-rs scans a codebase, finds copy-paste fragments across files, writes
console, JSON, SARIF, HTML, XML, CSV, Markdown, badge, and Xcode reports,
and can fail a build when duplication crosses a configured threshold.
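The threshold behavior described above can be sketched as follows. This is a standalone illustration, not the crate's implementation: `duplication_percentage` and `exceeds_threshold` are hypothetical helpers, and the strict "greater than" comparison is an assumption about how "crosses" is interpreted.

```rust
/// Hypothetical helper: percentage of duplicated lines in a run.
/// Returns 0.0 for an empty run to avoid dividing by zero.
fn duplication_percentage(duplicated_lines: usize, total_lines: usize) -> f64 {
    if total_lines == 0 {
        return 0.0;
    }
    duplicated_lines as f64 * 100.0 / total_lines as f64
}

/// Hypothetical helper: true when the build should fail.
/// Assumption: only a percentage strictly above the limit fails,
/// and no configured threshold means the build never fails.
fn exceeds_threshold(percentage: f64, threshold: Option<f64>) -> bool {
    match threshold {
        Some(limit) => percentage > limit,
        None => false,
    }
}

fn main() {
    // 30 duplicated lines out of 1000 analyzed lines.
    let percentage = duplication_percentage(30, 1000);
    // A CI wrapper would pass this value to std::process::exit.
    let exit_code = if exceeds_threshold(percentage, Some(2.5)) { 1 } else { 0 };
    println!("duplication {percentage:.2}%, exit code {exit_code}");
}
```

A real integration would take the duplicated and total line counts from the statistics returned by a detection run rather than from literals.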
It is a native Rust implementation of the common
jscpd command-line workflow:
upstream-style CLI flags, .jscpd.json and package.json#jscpd
configuration, report formats, exit-code behavior, Git blame, and server
snippet checks. The current public benchmark suite records 50x+ speedups on
pinned React, Next.js, and Prometheus cases while using a coverage-first
compatibility gate against upstream jscpd.
This crate exposes the same detector core used by the jscpd and
jscpd-server binaries: option parsing, file discovery, tokenization,
duplicate detection, statistics, and in-memory source checks.
§Quick Start
Scan paths using the same option model as the CLI:
use std::path::PathBuf;
let mut options = jscpd_rs::get_default_options();
options.paths = vec![PathBuf::from("src")];
options.reporters.clear();
options.silent = true;
let result = jscpd_rs::detect_clones_and_statistics(&options)?;
println!("{} clones", result.clones.len());
Check prepared in-memory sources without touching the filesystem:
let mut options = jscpd_rs::get_default_options();
options.reporters.clear();
options.min_lines = 2;
options.min_tokens = 5;
let files = vec![
    jscpd_rs::SourceFile {
        source_id: "a.js".to_string(),
        format: "javascript".to_string(),
        content: "const a = 1;\nconst b = 2;\nconst c = a + b;\n".to_string(),
    },
    jscpd_rs::SourceFile {
        source_id: "b.js".to_string(),
        format: "javascript".to_string(),
        content: "const a = 1;\nconst b = 2;\nconst c = a + b;\n".to_string(),
    },
];
let result = jscpd_rs::detect_source_files(files, &options);
assert!(!result.clones.is_empty());
§Main Entry Points
- get_options_from_args parses upstream-style CLI arguments into Options.
- detect_clones and detect_clones_and_statistics run discovery, tokenization, duplicate detection, statistics, and optional Git blame.
- detect_source_files runs detection against caller-provided SourceFile values and is the best entry point for editors, servers, and tests.
- Tokenizer exposes the native token map generator used by the detector.
- Detector and MemoryStore provide Rust counterparts for the main upstream core classes.
- jscpd and jscpd_with_exit_callback provide an embeddable argv runner similar to upstream jscpd(argv, exitCallback?).
§Compatibility Model
The release gate is coverage-first: for the same inputs and options, this
crate must not miss duplicated source lines reported by upstream jscpd.
Extra Rust findings remain visible in compatibility reports while the
implementation converges on exact parity.
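The coverage-first rule amounts to a set comparison over duplicated lines. The sketch below is a minimal illustration under stated assumptions: `DupLine`, `missed_lines`, and `extra_lines` are hypothetical names, not part of this crate or of its compatibility tooling, and a duplicated line is assumed to be identified by file name plus one-based line number.

```rust
use std::collections::BTreeSet;

/// Hypothetical key for one duplicated source line: (file, one-based line).
type DupLine = (String, usize);

/// Lines upstream reported that the native run did not.
/// Under a coverage-first gate, any entry here is a failure.
fn missed_lines(upstream: &BTreeSet<DupLine>, native: &BTreeSet<DupLine>) -> Vec<DupLine> {
    upstream.difference(native).cloned().collect()
}

/// Lines only the native run reported.
/// These stay visible in a report but do not fail the gate.
fn extra_lines(upstream: &BTreeSet<DupLine>, native: &BTreeSet<DupLine>) -> Vec<DupLine> {
    native.difference(upstream).cloned().collect()
}

fn main() {
    let upstream: BTreeSet<DupLine> =
        [("a.js".to_string(), 1), ("a.js".to_string(), 2)].into_iter().collect();
    let native: BTreeSet<DupLine> =
        [("a.js".to_string(), 1), ("a.js".to_string(), 2), ("a.js".to_string(), 3)]
            .into_iter()
            .collect();

    let missed = missed_lines(&upstream, &native);
    let extra = extra_lines(&upstream, &native);
    println!("missed: {}, extra: {}", missed.len(), extra.len());

    // The gate passes only when nothing upstream reported is missing.
    assert!(missed.is_empty(), "coverage gate failed");
}
```

Comparing lines rather than whole clone pairs keeps the check tolerant of the two implementations splitting or merging fragments differently.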
The first release intentionally keeps the detector native-only. Dynamic npm reporters, stores, listeners, and plugins are not loaded by this crate.
See the README and User Guide for CLI, configuration, reporter, server, and CI examples.
Modules§
Structs§
- BlamedLine: Git blame information for one duplicated source line.
- Cli
- CloneMatch: Pair of duplicated fragments reported as one clone.
- DetectionResult: Complete detector output.
- DetectionToken: Detection token after mode filtering and jscpd-compatible hashing.
- Detector: Incremental detector facade for native integrations.
- FormatMappings: Additional format mappings from extensions or exact filenames to formats.
- Fragment: One duplicated fragment in a source file.
- JscpdOutcome
- Location: One-based source location used in tokens, fragments, and reports.
- MemoryStore
- MemoryStoreError
- Options: Normalized detector options shared by the CLI, server, and Rust API.
- SkippedClone: Clone skipped from final output with compatibility/debug messages.
- SourceFile: Source content prepared by the caller for in-memory detection.
- SourceSummary: Summary of one analyzed source.
- SourceTokenMap: Token map associated with a source identifier and line count.
- Statistic
- StatisticRow: Aggregated duplication counters for a source, format, or whole run.
- Statistics: Duplication statistics for a full detection run.
- ThresholdExceeded
- TokenMap: Token map for a single detected format block.
- Tokenizer: Native tokenizer used by the detector.
Enums§
Functions§
- detect_clones: Detect clones from files discovered through Options::paths.
- detect_clones_and_statistic: Upstream-named alias for detect_clones_and_statistics.
- detect_clones_and_statistics: Detect clones and return both clone matches and aggregate statistics.
- detect_source_files: Detect clones in prepared in-memory sources.
- get_default_options: Return the upstream-compatible default option set.
- get_format_by_file: Resolve a source format from a path using the built-in extension and filename registry.
- get_format_by_file_with_mappings: Resolve a source format from a path with caller-provided extension and filename mappings.
- get_options_from_args: Parse upstream-style command-line arguments into normalized Options.
- get_supported_formats: Return the names of all formats known to the synchronized format registry.
- jscpd
- jscpd_with_exit_callback
- run_cli_args
- run_current_process
- upstream_stdout_error
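Format resolution from a path, with caller-provided mappings layered over a built-in registry, can be pictured with the standalone sketch below. It does not call this crate: `builtin_format` and `resolve_format` are hypothetical, the tiny extension table is illustrative only, and the precedence (exact filename, then caller extension mapping, then built-in) is an assumption rather than documented behavior.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical built-in registry; a real table would be much larger.
fn builtin_format(extension: &str) -> Option<&'static str> {
    match extension {
        "js" | "mjs" | "cjs" => Some("javascript"),
        "ts" => Some("typescript"),
        "rs" => Some("rust"),
        "py" => Some("python"),
        _ => None,
    }
}

/// Resolve a format name for `path`.
/// Assumed precedence: exact filename mapping, caller extension mapping,
/// then the built-in registry. Returns None for unknown files.
fn resolve_format(
    path: &Path,
    by_filename: &HashMap<String, String>,
    by_extension: &HashMap<String, String>,
) -> Option<String> {
    let file_name = path.file_name()?.to_str()?;
    if let Some(format) = by_filename.get(file_name) {
        return Some(format.clone());
    }
    let extension = path.extension()?.to_str()?;
    by_extension
        .get(extension)
        .cloned()
        .or_else(|| builtin_format(extension).map(str::to_string))
}

fn main() {
    let mut by_filename = HashMap::new();
    by_filename.insert("Dockerfile".to_string(), "docker".to_string());
    let mut by_extension = HashMap::new();
    by_extension.insert("es6".to_string(), "javascript".to_string());

    for name in ["src/main.rs", "Dockerfile", "lib/util.es6", "notes.xyz"] {
        let format = resolve_format(Path::new(name), &by_filename, &by_extension);
        println!("{name}: {format:?}");
    }
}
```

Checking exact filenames first lets extensionless files such as `Dockerfile` resolve even though they have no extension to look up.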
Type Aliases§
- BlamedLines: Git blame lines keyed by line number.