Crate jscpd_rs

Native Rust API for jscpd-rs, a duplicate-code detector for local development and CI/CD that records 50x+ speedups in its public benchmark suite.

jscpd-rs scans a codebase and finds copy-paste fragments across files. It writes reports in console, JSON, SARIF, HTML, XML, CSV, Markdown, badge, and Xcode formats, and it can fail a build when duplication crosses a configured threshold.
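
The threshold check can be pictured with a small standalone sketch. It does not use this crate's types: `Totals`, `duplication_percentage`, and `exceeds_threshold` are hypothetical names, and the strict `>` comparison is an assumption rather than the crate's documented rule.

```rust
/// Hypothetical totals for one run; not the crate's `Statistics` type.
struct Totals {
    total_lines: usize,
    duplicated_lines: usize,
}

/// Percentage of duplicated lines, or 0.0 when nothing was scanned.
fn duplication_percentage(totals: &Totals) -> f64 {
    if totals.total_lines == 0 {
        return 0.0;
    }
    totals.duplicated_lines as f64 * 100.0 / totals.total_lines as f64
}

/// Assumption: the build fails only when the percentage is strictly
/// greater than the configured threshold.
fn exceeds_threshold(totals: &Totals, threshold_percent: f64) -> bool {
    duplication_percentage(totals) > threshold_percent
}

fn main() {
    let totals = Totals { total_lines: 200, duplicated_lines: 30 };
    println!("{:.1}% duplicated", duplication_percentage(&totals));
    if exceeds_threshold(&totals, 10.0) {
        println!("duplication is above the 10% threshold");
    }
}
```

In a CI job, a check of this kind would typically map to a non-zero process exit code.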

It is a native Rust implementation of the common jscpd command-line workflow: upstream-style CLI flags, .jscpd.json and package.json#jscpd configuration, report formats, exit-code behavior, Git blame, and server snippet checks. The current public benchmark suite records 50x+ speedups on pinned React, Next.js, and Prometheus cases while using a coverage-first compatibility gate against upstream jscpd.

This crate exposes the same detector core used by the jscpd and jscpd-server binaries: option parsing, file discovery, tokenization, duplicate detection, statistics, and in-memory source checks.

§Quick Start

Scan paths using the same option model as the CLI:

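use std::path::PathBuf;

// Start from the upstream-compatible defaults, then scan only `src`.
let mut options = jscpd_rs::get_default_options();
options.paths = vec![PathBuf::from("src")];
// Disable all reporters and console output for library use.
options.reporters.clear();
options.silent = true;

// `?` requires an enclosing function that returns a compatible `Result`.
let result = jscpd_rs::detect_clones_and_statistics(&options)?;
println!("{} clones", result.clones.len());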

Check prepared in-memory sources without touching the filesystem:

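let mut options = jscpd_rs::get_default_options();
options.reporters.clear();
// Lower the minimum clone size so the three-line samples below qualify.
options.min_lines = 2;
options.min_tokens = 5;

// Two sources with identical content, supplied directly by the caller.
let files = vec![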
    jscpd_rs::SourceFile {
        source_id: "a.js".to_string(),
        format: "javascript".to_string(),
        content: "const a = 1;\nconst b = 2;\nconst c = a + b;\n".to_string(),
    },
    jscpd_rs::SourceFile {
        source_id: "b.js".to_string(),
        format: "javascript".to_string(),
        content: "const a = 1;\nconst b = 2;\nconst c = a + b;\n".to_string(),
    },
];

let result = jscpd_rs::detect_source_files(files, &options);
assert!(!result.clones.is_empty());

§Main Entry Points

§Compatibility Model

The release gate is coverage-first: for the same inputs and options, this crate must not miss duplicated source lines reported by upstream jscpd. Extra Rust findings remain visible in compatibility reports while the implementation converges on exact parity.
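
The coverage-first rule above can be sketched as a set comparison over duplicated source lines. This is an illustration only, not the project's actual compatibility tooling: `Line`, `GateReport`, `coverage_gate`, and `lines` are hypothetical names, and identifying a line by file name plus one-based line number is an assumption.

```rust
use std::collections::BTreeSet;

/// A duplicated source line: (file, one-based line number).
type Line = (String, usize);

/// Result of comparing two runs over the same inputs and options.
struct GateReport {
    /// Lines upstream reported that the native run missed.
    missed: Vec<Line>,
    /// Lines only the native run reported; shown, but not a failure.
    extra: Vec<Line>,
}

impl GateReport {
    /// The gate passes when no upstream line is missed.
    fn passes(&self) -> bool {
        self.missed.is_empty()
    }
}

fn coverage_gate(upstream: &BTreeSet<Line>, native: &BTreeSet<Line>) -> GateReport {
    GateReport {
        missed: upstream.difference(native).cloned().collect(),
        extra: native.difference(upstream).cloned().collect(),
    }
}

/// Helper that builds a set of lines for one file.
fn lines(file: &str, range: std::ops::RangeInclusive<usize>) -> BTreeSet<Line> {
    range.map(|n| (file.to_string(), n)).collect()
}

fn main() {
    let upstream = lines("a.js", 1..=3);
    let native = lines("a.js", 1..=5);
    let report = coverage_gate(&upstream, &native);
    println!(
        "pass: {}, missed: {}, extra: {}",
        report.passes(),
        report.missed.len(),
        report.extra.len()
    );
}
```

Here the native run covers every upstream line and reports two more, so the gate passes while the extra findings stay visible.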

The current 0.x line intentionally keeps the detector native-only. Dynamic npm reporters, stores, listeners, and plugins are not loaded by this crate.

See the README and User Guide for CLI, configuration, reporter, server, and CI examples.

Structs§

BlamedLine
Git blame information for one duplicated source line.
CloneMatch
Pair of duplicated fragments reported as one clone.
DetectionResult
Complete detector output.
DetectionToken
Detection token after mode filtering and jscpd-compatible hashing.
Detector
Incremental detector facade for native integrations.
FormatMappings
Additional format mappings from extensions or exact filenames to formats.
Fragment
One duplicated fragment in a source file.
JscpdOutcome
Result of running the native CLI pipeline through the Rust API.
Location
One-based source location used in tokens, fragments, and reports.
MemoryStore
Simple namespace-aware in-memory store compatible with upstream concepts.
MemoryStoreError
Error returned when a key is missing from a MemoryStore namespace.
Options
Normalized detector options shared by the CLI, server, and Rust API.
SkippedClone
Clone skipped from final output with compatibility/debug messages.
SourceFile
Source content prepared by the caller for in-memory detection.
SourceSummary
Summary of one analyzed source.
SourceTokenMap
Token map associated with a source identifier and line count.
Statistic
Mutable helper for accumulating upstream-style duplication statistics.
StatisticRow
Aggregated duplication counters for a source, format, or whole run.
Statistics
Duplication statistics for a full detection run.
ThresholdExceeded
Error returned when the threshold reporter rejects a duplication result.
TokenMap
Token map for a single detected format block.
Tokenizer
Native tokenizer used by the detector.

Enums§

ExitCode
Node-compatible exit-code value preserved from CLI/config input.
Mode
Duplicate-detection token filtering mode.

Functions§

detect_clones
Detect clones from files discovered through Options::paths.
detect_clones_and_statistic
Upstream-named alias for detect_clones_and_statistics.
detect_clones_and_statistics
Detect clones and return both clone matches and aggregate statistics.
detect_source_files
Detect clones in prepared in-memory sources.
get_default_options
Return the upstream-compatible default option set.
get_format_by_file
Resolve a source format from a path using the built-in extension and filename registry.
get_format_by_file_with_mappings
Resolve a source format from a path with caller-provided extension and filename mappings.
get_options_from_args
Parse upstream-style command-line arguments into normalized Options.
get_supported_formats
Return the names of all formats known to the synchronized format registry.
jscpd
Run jscpd with upstream-style argv and return reported clone pairs.
jscpd_with_exit_callback
Run jscpd with upstream-style argv and call back with the duplicate exit code.
run_cli_args
Run the full CLI pipeline and return clones plus the process exit decision.
run_current_process
Run the full CLI pipeline from the current process arguments.
serve
Start the native REST/MCP server using the working directory derived from the first configured scan path.
serve_with_working_directory
Start the native REST/MCP server with an explicit working directory.
server_working_directory
Return the server working directory implied by CLI options.
upstream_stdout_error
Convert selected internal errors to upstream-style stdout error messages.
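
The idea behind get_format_by_file_with_mappings, resolving a format from exact filenames or extensions, can be sketched without the crate. Everything here is hypothetical: `Mappings` is a stand-in rather than the crate's `FormatMappings`, the format name `docker` is made up for illustration, and letting an exact filename win over an extension is an assumption.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical stand-in for caller-provided mappings.
#[derive(Default)]
struct Mappings {
    by_filename: HashMap<String, String>,
    by_extension: HashMap<String, String>,
}

/// Resolve a format name for a path.
/// Assumption: an exact filename match takes priority over the extension.
fn resolve_format(path: &Path, mappings: &Mappings) -> Option<String> {
    let name = path.file_name()?.to_str()?;
    if let Some(format) = mappings.by_filename.get(name) {
        return Some(format.clone());
    }
    let ext = path.extension()?.to_str()?;
    mappings.by_extension.get(ext).cloned()
}

fn main() {
    let mut mappings = Mappings::default();
    mappings.by_extension.insert("js".to_string(), "javascript".to_string());
    mappings.by_filename.insert("Dockerfile".to_string(), "docker".to_string());

    for p in ["src/app.js", "deploy/Dockerfile", "notes.xyz"] {
        println!("{} -> {:?}", p, resolve_format(Path::new(p), &mappings));
    }
}
```

Paths with no matching filename or extension resolve to `None`, which a caller could treat as "skip this file".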

Type Aliases§

BlamedLines
Git blame lines keyed by line number.