cpd-tokenizer 0.1.5
License: MIT
Owner: kucherenko
Dependencies (normal): cpd-core ^0.1.4, log ^0.4, oxc_allocator ^0.133, oxc_parser ^0.133, oxc_span ^0.133, regex ^1, xxhash-rust ^0.8
Dependencies (dev): serde_json ^1
31.58% of the crate is documented
Module cpd_tokenizer::tokenizer
Structs
TokenMap
A sub-format detection map produced by multi-format tokenizers.
TokenizeOptions
Options for the detection-path tokenizer.
Enums
Mode
Functions
code_ignore_ranges
Compute the byte ranges of all regex matches against the source text. Used to populate ignore_ranges from ignorePattern regexes before tokenization. This matches v4 semantics, where regex patterns match against regions of the source text (not individual token values).
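The idea of precomputing ignore ranges over the source text can be sketched as follows. This is a minimal illustration and not the crate's implementation: the names `literal_ignore_ranges` and `is_ignored` are hypothetical, literal substring patterns stand in for the regexes so that the sketch needs only the standard library, and the merging of overlapping ranges is an assumption about how such ranges might be normalised.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a regex-based helper: collects the byte range of
/// every non-overlapping occurrence of each literal pattern in `source`,
/// then merges ranges that overlap or touch.
fn literal_ignore_ranges(source: &str, patterns: &[&str]) -> Vec<Range<usize>> {
    let mut ranges: Vec<Range<usize>> = Vec::new();
    for &pattern in patterns {
        // An empty pattern would match at every position; skip it.
        if pattern.is_empty() {
            continue;
        }
        for (start, matched) in source.match_indices(pattern) {
            ranges.push(start..start + matched.len());
        }
    }

    // Sort by start (then end) so overlapping ranges become adjacent.
    ranges.sort_by_key(|r| (r.start, r.end));

    let mut merged: Vec<Range<usize>> = Vec::new();
    for r in ranges {
        if let Some(last) = merged.last_mut() {
            if r.start <= last.end {
                // Overlaps or touches the previous range: extend it.
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}

/// Returns true if the byte `offset` falls inside any ignore range.
/// A tokenizer could use a check like this to drop tokens that start
/// in an ignored region (assumed usage, not confirmed by the crate docs).
fn is_ignored(ranges: &[Range<usize>], offset: usize) -> bool {
    ranges.iter().any(|r| r.contains(&offset))
}
```

Because the ranges are computed once against the whole source, a pattern can cover a region spanning several tokens, which a per-token value check could not express.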
push_token
Push a token into the detection output if it passes all filters.
tokenize
Tokenize source code in the given format with the given mode. Returns a Vec. Never panics on empty input; returns an empty Vec instead.
tokenize_to_detection
Tokenize source code for the detection hot path.
tokenize_to_detection_maps
Tokenize source code into one or more format-specific detection maps.