Crate candor_classify

candor-classify — the curated effect classifier (crate+path -> effect), extracted to a STABLE crate so both the nightly rustc_private lint AND a stable backend share ONE source of truth (no drift). Pure string logic; no rustc internals. The effect vocabulary lives in candor-report.

Modules§

policy
The canonical CANDOR_POLICY DSL parser (candor-spec SPEC §6.2), shared by the nightly gate and candor-query.

Constants§

CALIBRATED_CRATES
The exact third-party crates classify has effect rules for, and the crate-name PREFIXES it recognizes. This is the single source of truth for “what candor knows”: it is emitted beside the JSON report (<prefix>.calibrated.json) so the Claude Code receipt’s coverage check reads candor’s real coverage instead of a hand-copied list. Keep in lockstep with classify below — the db_crates_are_calibrated and calibrated_crates_are_live tests (in this crate’s tests module) enforce both directions.
CALIBRATED_PREFIXES
CALIBRATION_PROBE_TAILS
Representative path tails (each appended to a crate name) that the calibrated_crates_are_live liveness test probes: at least one must match for every CALIBRATED_CRATES entry, else the entry is dead. Exported as ONE source of truth because the nightly lint crate (src/lib.rs) runs the SAME liveness test — when the two probe lists were duplicated they drifted, and a rule keyed on a distinctive tail (pnet ::datalink::channel, ignore ::WalkBuilder::build_parallel, notify ::RecommendedWatcher::new) added to only one list silently broke the other crate’s cargo test.
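
The liveness rule described above can be sketched as follows. This is a minimal illustration, not the crate's test: `classify_sketch`, `CRATES_SKETCH`, `PROBE_TAILS_SKETCH` and `dead_entries` are hypothetical names, and the effects assigned to pnet and notify are assumptions made only so the example has something to match.

```rust
/// Stand-in classifier with two hypothetical rules (assumed effects).
fn classify_sketch(krate: &str, path: &str) -> Option<&'static str> {
    match krate {
        "pnet" if path.ends_with("::datalink::channel") => Some("Net"),
        "notify" if path.ends_with("::RecommendedWatcher::new") => Some("Fs"),
        _ => None,
    }
}

/// Hypothetical stand-ins for the exported constants.
const CRATES_SKETCH: &[&str] = &["pnet", "notify"];
const PROBE_TAILS_SKETCH: &[&str] = &["::datalink::channel", "::RecommendedWatcher::new"];

/// Returns every crate for which NO probe tail classifies, i.e. the dead entries.
fn dead_entries(crates: &[&'static str], tails: &[&str]) -> Vec<&'static str> {
    crates
        .iter()
        .copied()
        .filter(|&krate| {
            !tails.iter().any(|tail| {
                // Each tail is appended to the crate name to form a probe path.
                let path = format!("{krate}{tail}");
                classify_sketch(krate, &path).is_some()
            })
        })
        .collect()
}
```

Because both crates would read the same tail list, a rule keyed on a distinctive tail cannot be probed in one test and missed in the other.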
DB_CRATES
Database client crates whose execution verbs are I/O (see the DB branch in classify). Module-level so db_crates_are_calibrated can enforce DB_CRATES ⊆ CALIBRATED_CRATES.
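
The subset invariant DB_CRATES ⊆ CALIBRATED_CRATES amounts to a check like the one below. The lists and the helper `missing_from` are hypothetical; the real constants' contents are not shown on this page.

```rust
/// Hypothetical stand-ins for DB_CRATES and CALIBRATED_CRATES.
const DB_SKETCH: &[&str] = &["sqlx", "diesel"];
const CALIBRATED_SKETCH: &[&str] = &["diesel", "pnet", "sqlx"];

/// Names in `subset` that are absent from `superset`; empty means the invariant holds.
fn missing_from(subset: &[&'static str], superset: &[&str]) -> Vec<&'static str> {
    subset
        .iter()
        .copied()
        .filter(|name| !superset.iter().any(|s| *s == *name))
        .collect()
}
```

A test in this style would assert that `missing_from(DB_SKETCH, CALIBRATED_SKETCH)` is empty and print the offending names otherwise.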
PATH_CALIBRATED_CRATES
Crates classify matches by PATH prefix rather than crate-name equality (their effectful modules are recognised, e.g. tokio::net::/async_std::fs::/mio::net::), so they’re absent from CALIBRATED_CRATES (which the liveness test probes by crate name). The coverage check must still treat them as covered — otherwise it would mislabel the most common async crates as blind spots.
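
To illustrate the difference between name-keyed and path-keyed coverage, here is a small sketch. All names are hypothetical, and the effects attached to each prefix are assumptions; only the three path prefixes come from the description above.

```rust
/// Hypothetical stand-ins for CALIBRATED_CRATES and PATH_CALIBRATED_CRATES.
const NAME_CALIBRATED_SKETCH: &[&str] = &["sqlx", "pnet"];
const PATH_CALIBRATED_SKETCH: &[&str] = &["tokio", "async_std", "mio"];

/// Path-prefix rules (effects assumed for illustration).
const PATH_RULES_SKETCH: &[(&str, &str)] = &[
    ("tokio::net::", "Net"),
    ("async_std::fs::", "Fs"),
    ("mio::net::", "Net"),
];

/// Only the effectful modules of a path-calibrated crate classify.
fn classify_by_path(path: &str) -> Option<&'static str> {
    PATH_RULES_SKETCH
        .iter()
        .find(|(prefix, _)| path.starts_with(*prefix))
        .map(|(_, effect)| *effect)
}

/// The coverage check must consult BOTH lists, or tokio would look like a blind spot.
fn is_covered(krate: &str) -> bool {
    NAME_CALIBRATED_SKETCH.contains(&krate) || PATH_CALIBRATED_SKETCH.contains(&krate)
}
```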

Functions§

cap_from_name
capstd_cap
Map a cap-std capability type to the effect it authorises. Holding one of these (e.g. &Dir) is the real, unforgeable right to perform that effect — so candor treats it as a declared capability, exactly like its own &Fs token.
classify
Classify a resolved callee by the crate it belongs to and its full path.
classify_command_head
Refine the Exec cliff (spec §4 ⟨0.5⟩): the effects a literal, statically-known subprocess head implies, matched by basename (/usr/bin/curl → curl). The head’s effects are ADDED to a caller that already carries Exec (a subprocess is still spawned, so Exec is never dropped); an unrecognised or dynamically-built head returns &[] and keeps the bare cliff (never guess). A candor engine reads Fs/Env only: spec §7 item 12 (the analyzer self-boundary) guarantees that, so that case is spec-supplied, not curation. The rest is a small curated table under the same under-report rule as the crate classifier. INVARIANT: every head here is an external tool that does NOT run the analysed project’s own code (so make/npm/cargo are deliberately absent; they stay the cliff). The reference engines share this table so the Exec boundary, the one boundary every engine hits, refines identically (the §4-consistency argument).
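
A rough sketch of the refinement follows. The `Effect` enum, `head_effects_sketch` and `refined` are hypothetical stand-ins, and the two table entries (curl as Net, psql as Db) are assumptions; the curated table itself is not reproduced on this page.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Effect {
    Exec,
    Net,
    Db,
}

/// Hypothetical head table, matched by basename.
fn head_effects_sketch(head: &str) -> &'static [Effect] {
    // "/usr/bin/curl" and "curl" refer to the same head.
    let base = head.rsplit('/').next().unwrap_or(head);
    match base {
        "curl" => &[Effect::Net],
        "psql" => &[Effect::Db],
        // Unknown heads (and project-running tools like make) keep the bare cliff.
        _ => &[],
    }
}

/// Head effects are ADDED to Exec; Exec itself is never dropped.
/// `None` models a dynamically-built head, which is never guessed at.
fn refined(head_literal: Option<&str>) -> Vec<Effect> {
    let mut effects = vec![Effect::Exec];
    if let Some(head) = head_literal {
        effects.extend_from_slice(head_effects_sketch(head));
    }
    effects
}
```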
classify_extra
Project-supplied rules, consulted only when the built-in classify returns None.
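
The precedence can be pictured as a plain fallback. The rule format below (prefix and effect pairs) is purely an assumption for illustration, as are `builtin_sketch` and `classify_with_extras`; the page does not show how project rules are actually expressed.

```rust
/// Stand-in for the built-in classifier (one assumed rule).
fn builtin_sketch(path: &str) -> Option<String> {
    path.starts_with("tokio::net::").then(|| "Net".to_string())
}

/// Project rules are consulted ONLY when the built-in classifier returns None,
/// so they can fill gaps but cannot override a built-in result.
fn classify_with_extras(path: &str, extra_rules: &[(&str, &str)]) -> Option<String> {
    builtin_sketch(path).or_else(|| {
        extra_rules
            .iter()
            .find(|(prefix, _)| path.starts_with(*prefix))
            .map(|(_, effect)| effect.to_string())
    })
}
```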
is_cmd_builder_method
Whether a subprocess-builder method only MODIFIES the command (.arg, .env, .current_dir) rather than NAMING the program (Command::new, duct::cmd). A WHOLE-CRATE-Exec crate (portable_pty, duct, async_process) classifies every method as Exec, so the head-refinement must skip these: an arg or env-var-name literal that happened to match a head (.env("psql", …), .arg("curl")) would FABRICATE that effect — the §1 under-report rule. The method is the call path’s last segment.
is_cmd_naming_method
Whether a subprocess method NAMES the program (so its first string literal IS the command head to refine): Command::new("curl"), duct::cmd("curl", …). The head-refinement must fire ONLY here — an ALLOWLIST, not “any method except known modifiers”. A whole-crate-Exec crate classifies EVERY method as Exec, so a denylist leaked NON-naming methods that aren’t modifiers — a getter like CommandBuilder::get_env("psql") (reading back an env-var KEY, not a program) fed "psql" to the head classifier and FABRICATED Db (review find). Only new/cmd name a program; everything else (modifiers, getters get_*, custom builder methods) keeps the bare Exec cliff — under-refine (safe) rather than fabricate. std::process::Command is verb-precise so getters never fire Exec there anyway; the allowlist makes the whole-crate-Exec crates safe too.
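
The allowlist idea can be sketched in a few lines. `last_segment`, `names_program_sketch` and `head_to_refine` are hypothetical helpers, and the call paths in the usage note are illustrative strings rather than resolved paths from a real build.

```rust
/// The method is the call path's last `::` segment.
fn last_segment(path: &str) -> &str {
    path.rsplit("::").next().unwrap_or(path)
}

/// Allowlist: only `new` and `cmd` name a program.
fn names_program_sketch(call_path: &str) -> bool {
    matches!(last_segment(call_path), "new" | "cmd")
}

/// The first string literal is handed to head refinement only for naming methods;
/// modifiers, getters and anything else keep the bare Exec cliff.
fn head_to_refine<'a>(call_path: &str, first_literal: Option<&'a str>) -> Option<&'a str> {
    if names_program_sketch(call_path) {
        first_literal
    } else {
        None
    }
}
```

With this shape, `head_to_refine("portable_pty::CommandBuilder::get_env", Some("psql"))` yields `None`, so the env-var key is never mistaken for a program, while `head_to_refine("duct::cmd", Some("curl"))` yields `Some("curl")`.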
is_db_query_arg
The masking guard (AS-EFF-008), the Db analog of is_net_establishing: whether a Db-classified call takes the raw SQL QUERY as a string argument (so a missing literal leaves the table structurally INVISIBLE, i.e. a runtime-built query, and the surface is incomplete, fail-closed). An ALLOWLIST of the SQL-string-bearing execution/prepare verbs, the SAFE direction: a build-then-execute terminal that takes NO SQL string (sqlx/diesel/sea_orm fetch*/load*/first/all/one/stream, the document-store find*/insert*/…) and a non-query op (connect/open/acquire/begin/commit/ping/get_conn) are NOT here; their query is built structurally (never a maskable string literal), so a missing literal must not false-positive. Under-catching an unusual query verb is a missed mask (sound-with-disclosure), never a broken gate. The arg is the method leaf (the path’s last segment).
is_fs_path_arg
The masking guard (AS-EFF-008), the Fs analog of is_net_establishing: whether an Fs-classified call takes the filesystem PATH as a string argument (so a missing literal leaves the path structurally INVISIBLE — a runtime-built path — and the surface is incomplete, fail-closed). An ALLOWLIST of the path-NAMING free functions / constructors (fs::write/read/File::open/…), the SAFE direction: a path-stat METHOD whose path is the RECEIVER (p.metadata(), p.exists()) is invoked method-form and the caller gates on !is_method, so this never sees it; an op on an already-opened handle (file.write_all, mmap.flush, tempfile() — a random name, no path arg) is not here, so a missing literal there never false-positives. Under-catching an unusual path-naming fn is a missed mask (sound-with-disclosure), never a broken gate. The arg is the method/fn leaf (the path’s last segment).
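
How the caller-side `!is_method` gate combines with the allowlist might look like this. The three leaves are taken from the examples above (fs::write/read/File::open); the function names and the rest of the shape are assumptions.

```rust
/// Hypothetical allowlist of path-naming leaves.
fn names_path_sketch(leaf: &str) -> bool {
    matches!(leaf, "write" | "read" | "open")
}

/// True when an Fs call should name a path but no path literal was captured,
/// meaning the path is runtime-built and the surface is incomplete.
fn fs_path_masked(leaf: &str, is_method: bool, path_literal: Option<&str>) -> bool {
    // Method-form calls (p.metadata(), file.write_all(..)) never reach the allowlist.
    !is_method && names_path_sketch(leaf) && path_literal.is_none()
}
```

A free-function `write` with no captured literal is flagged, while `write_all` on an open handle is not, because the handle's path was fixed when it was opened.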
is_net_establishing
The masking guard (AS-EFF-008): a Net call whose method takes the HOST/URL as an argument is “establishing”. A classified Net call here with no captured host literal leaves the endpoint structurally INVISIBLE (a runtime-built host), so the surface is incomplete and the gate must fail closed (else a benign sibling literal masks the runtime endpoint). An ALLOWLIST of connection-establishing verbs, the SAFE direction: a USE-verb on an already-connected socket (stream.write/read/flush, socket.send/recv) is NOT here, so a missing literal there (the host was fixed at connect) never false-positives. Under-catching an unusual establishing verb is a missed mask (sound-with-disclosure), never a broken gate. The arg is the method (path’s last segment).
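
The fail-closed decision can be sketched as below. `Surface`, `is_establishing_sketch` and `net_surface` are hypothetical, and the allowlist holds a single assumed entry (`connect`); the real verb list is not shown here.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Surface {
    Complete,
    Incomplete,
}

/// Hypothetical allowlist of connection-establishing verbs.
fn is_establishing_sketch(method: &str) -> bool {
    matches!(method, "connect")
}

/// An establishing call without a captured host literal hides its endpoint,
/// so the surface is reported incomplete and the gate fails closed.
fn net_surface(method: &str, host_literal: Option<&str>) -> Surface {
    if is_establishing_sketch(method) && host_literal.is_none() {
        Surface::Incomplete
    } else {
        Surface::Complete
    }
}
```

A use-verb such as `write` never marks the surface incomplete, which is the allowlist's safe direction: a verb missing from the list costs a missed mask, not a false alarm.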
tables_in_sql
Table names a SQL string literal STATICALLY reaches — the Db analog of the Net host / Exec command / Fs path literal surface (feeds allow Db in <scope> <table>…, AS-EFF-008). Conservative by construction, because a wrong capture here would FABRICATE: the string must open with a SQL statement keyword, and only identifiers in table position are taken — FROM/JOIN anywhere, INTO anywhere, statement-leading UPDATE/TRUNCATE, and TABLE (create/drop/alter), skipping ONLY/IF NOT EXISTS. UPDATE mid-statement is deliberately ignored (FOR UPDATE SKIP LOCKED must not yield a table “skip”). A dynamically-built query yields nothing — the gate’s opaque case — never a guess. Output is lower-cased, quote/backtick-stripped, schema.table kept qualified, deduped. SPEC §2 pins this algorithm token-for-token across engines; the cross-impl vector battery (candor-spec conformance/tables/vectors.json, run.sh Part 4b) enforces the JVM/TS mirrors.
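
To make the capture rules concrete, here is a simplified sketch. It is NOT the SPEC §2 algorithm and makes no claim to match it token-for-token: the statement-keyword list, the tokenisation, and the treatment of single-quoted literals are all assumptions, and `tables_in_sql_sketch` is a hypothetical name.

```rust
/// Simplified table capture: conservative, lower-cased, deduped.
fn tables_in_sql_sketch(sql: &str) -> Vec<String> {
    // Assumed list of statement-opening keywords.
    const STATEMENT: &[&str] = &[
        "SELECT", "INSERT", "UPDATE", "DELETE", "TRUNCATE", "CREATE", "DROP", "ALTER", "WITH",
    ];
    // Skipped in table position; TABLE is included so `TRUNCATE TABLE t` yields `t`.
    const SKIPPED: &[&str] = &["ONLY", "IF", "NOT", "EXISTS", "TABLE"];

    // Isolate "(" so `FROM (SELECT ...)` never yields a table named "select".
    let spaced = sql.replace('(', " ( ");
    let mut in_literal = false;
    let mut tokens: Vec<String> = Vec::new();
    for raw in spaced.split(|c: char| c.is_whitespace() || matches!(c, ',' | ';' | ')')) {
        if raw.is_empty() {
            continue;
        }
        // Tokens inside a single-quoted SQL string literal are masked out.
        let quotes = raw.matches('\'').count();
        let masked = in_literal || quotes > 0;
        if quotes % 2 == 1 {
            in_literal = !in_literal;
        }
        tokens.push(if masked { "'".to_string() } else { raw.to_string() });
    }

    let upper: Vec<String> = tokens.iter().map(|t| t.to_ascii_uppercase()).collect();
    // The string must open with a statement keyword, else nothing is captured.
    match upper.first() {
        Some(first) if STATEMENT.contains(&first.as_str()) => {}
        _ => return Vec::new(),
    }

    let mut out: Vec<String> = Vec::new();
    for (i, kw) in upper.iter().enumerate() {
        let table_position = match kw.as_str() {
            "FROM" | "JOIN" | "INTO" | "TABLE" => true,
            "UPDATE" | "TRUNCATE" => i == 0, // statement-leading only
            _ => false,
        };
        if !table_position {
            continue;
        }
        let mut j = i + 1;
        while j < upper.len() && SKIPPED.contains(&upper[j].as_str()) {
            j += 1;
        }
        let Some(candidate) = tokens.get(j) else { continue };
        // Strip identifier quoting, keep schema.table qualified, lower-case.
        let name = candidate
            .chars()
            .filter(|&c| c != '"' && c != '`')
            .collect::<String>()
            .to_ascii_lowercase();
        let is_identifier = !name.is_empty()
            && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '.');
        if is_identifier && !out.contains(&name) {
            out.push(name);
        }
    }
    out
}
```

In this sketch a trailing `FOR UPDATE SKIP LOCKED` contributes nothing, because `UPDATE` only counts at index 0, and a string that does not open with a statement keyword returns an empty list. Corner cases such as `EXTRACT(YEAR FROM col)` or comma-separated table lists are not handled here.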