Repository scanners. Deterministic surface classification by path and filename.
Walks a target directory, hashes every file with SHA-256, and assigns
each one a SourceClass taxonomy bucket (e.g. ProjectAdr,
EngineeringDoctrinePrinciple, ProjectAgentFile, BlockedSurface).
Classification is content-agnostic by design — it never reads prose
to decide what something is, only path and filename.
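A path-only classifier of this kind can be sketched in a few lines. This is a minimal illustration, not the crate's implementation: the variant names ProjectAdr, ProjectAgentFile and BlockedSurface appear in the description above, while the `Unclassified` variant, the function name `classify_sketch`, and every matching rule (the `docs/adr/` prefix, the `AGENTS.md` filename) are assumptions made for the example.

```rust
/// Hypothetical subset of the taxonomy. `Unclassified` is an assumption
/// added so the sketch has a fallback bucket.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SourceClass {
    ProjectAdr,
    ProjectAgentFile,
    BlockedSurface,
    Unclassified,
}

/// Classify a repo-relative path using only its text. No file is opened,
/// so the same input string always yields the same class.
fn classify_sketch(rel_path: &str) -> SourceClass {
    // Normalise separators so Windows-style paths hit the same rules.
    let path = rel_path.replace('\\', "/");
    let file_name = path.rsplit('/').next().unwrap_or("");

    // All rules below are illustrative assumptions.
    if path.split('/').any(|segment| segment == ".git") {
        SourceClass::BlockedSurface
    } else if path.starts_with("docs/adr/") && file_name.ends_with(".md") {
        SourceClass::ProjectAdr
    } else if file_name == "AGENTS.md" {
        SourceClass::ProjectAgentFile
    } else {
        SourceClass::Unclassified
    }
}
```

Because the function takes a string and performs no I/O, it can be unit-tested without a fixture directory: `classify_sketch("docs/adr/0001-use-rust.md")` lands in `ProjectAdr` under these assumed rules, and `classify_sketch("src/lib.rs")` falls through to `Unclassified`.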
Block rules cover runtime exhaust (.cordance/, .git/, .claude/cache/,
node_modules/, target/, …), secret/credential filenames (id_rsa,
.env, secrets.json, Credentials.*), and OS junk (.DS_Store,
Thumbs.db). Case folding keeps SECRET-FOO.TXT blocked alongside
secret-foo.txt on file systems that are case-insensitive by default,
such as NTFS and APFS.
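The block rules and the case folding can be illustrated with a small std-only sketch. The rule lists reuse the examples named above, with the elided entries left out. The function name `blocked_reason_sketch`, the reason strings, the `secret-` prefix rule, and the matching strategy itself (lowercase once, then compare) are assumptions, not the contents of the crate's `blocked` module.

```rust
/// Returns a reason when a repo-relative path should be blocked.
/// Rule lists come from the examples in the docs; how the real crate
/// matches them is an assumption.
fn blocked_reason_sketch(rel_path: &str) -> Option<&'static str> {
    const EXHAUST_DIRS: [&str; 5] =
        [".cordance/", ".git/", ".claude/cache/", "node_modules/", "target/"];
    const SECRET_NAMES: [&str; 3] = ["id_rsa", ".env", "secrets.json"];
    const OS_JUNK: [&str; 2] = [".ds_store", "thumbs.db"];

    // Fold case once so SECRET-FOO.TXT and secret-foo.txt hit the same rule.
    let folded = rel_path.replace('\\', "/").to_lowercase();
    // A leading slash lets directory rules match only on segment boundaries,
    // so ".github/" is not mistaken for ".git/".
    let anchored = format!("/{folded}");
    let file_name = folded.rsplit('/').next().unwrap_or("");

    if EXHAUST_DIRS.iter().any(|dir| anchored.contains(&format!("/{dir}"))) {
        Some("runtime exhaust")
    } else if SECRET_NAMES.iter().any(|name| *name == file_name)
        || file_name.starts_with("credentials.")
        || file_name.starts_with("secret-") // assumed pattern
    {
        Some("secret or credential filename")
    } else if OS_JUNK.iter().any(|name| *name == file_name) {
        Some("OS junk")
    } else {
        None
    }
}
```

Folding before matching means a rule written once in lowercase covers every casing of the same name, which matters when the file system itself would treat those names as the same file.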
§Golden path
use camino::Utf8PathBuf;

let target = Utf8PathBuf::from(".");
let sources = cordance_scan::scan_repo(&target).expect("scan succeeds");
for record in &sources {
    if record.blocked {
        eprintln!(
            "blocked: {} ({})",
            record.path,
            record.blocked_reason.as_deref().unwrap_or("?"),
        );
    } else {
        println!(
            "{:?}: {} ({} bytes)",
            record.class, record.path, record.size_bytes,
        );
    }
}
Modules§
- blocked
  Blocked-surface rules. Distinguishes runtime exhaust (always blocked)
  from repo-tracked agent material (allowed, classified as
  ProjectAgentFile).
- classifier
  Path → SourceClass. Pure function; no I/O.
- hasher
  Stable sha256 of a path’s bytes.
- walker
  Directory walker. Honours .gitignore via the ignore crate.
Enums§
Functions§
- classify_by_path
  Classify a path using only its repo-relative location. No content reads.
- scan_repo
  Top-level scan entrypoint. Delegates to walker::walk.