Crate candor_classify

candor-classify — the curated effect classifier (crate+path -> effect), extracted to a STABLE crate so both the nightly rustc_private lint AND a stable backend share ONE source of truth (no drift). Pure string logic; no rustc internals. The effect vocabulary lives in candor-report.

Modules§

policy: The canonical CANDOR_POLICY DSL parser (candor-spec SPEC §6.2), shared by the nightly gate and candor-query.

Constants§

CALIBRATED_CRATES: The exact third-party crates classify has effect rules for, and the crate-name PREFIXES it recognizes. This is the single source of truth for “what candor knows”: it is emitted beside the JSON report (<prefix>.calibrated.json) so the Claude Code receipt’s coverage check reads candor’s real coverage instead of a hand-copied list. Keep in lockstep with classify below — the db_crates_are_calibrated and calibrated_crates_are_live tests (in this crate’s tests module) enforce both directions.
CALIBRATED_PREFIXES
CALIBRATION_PROBE_TAILS: Representative path tails (each appended to a crate name) that the calibrated_crates_are_live liveness test probes: at least one must match for every CALIBRATED_CRATES entry, else the entry is dead. Exported as ONE source of truth because the nightly lint crate (src/lib.rs) runs the SAME liveness test — when the two probe lists were duplicated they drifted, and a rule keyed on a distinctive tail (pnet ::datalink::channel, ignore ::WalkBuilder::build_parallel, notify ::RecommendedWatcher::new) added to only one list silently broke the other crate’s cargo test.
DB_CRATES: Database client crates whose execution verbs are I/O (see the DB branch in classify). Module-level so db_crates_are_calibrated can enforce DB_CRATES ⊆ CALIBRATED_CRATES.
PATH_CALIBRATED_CRATES: Crates classify matches by PATH prefix rather than crate-name equality (their effectful modules are recognised, e.g. tokio::net::/async_std::fs::/mio::net::), so they’re absent from CALIBRATED_CRATES (which the liveness test probes by crate name). The coverage check must still treat them as covered — otherwise it would mislabel the most common async crates as blind spots.
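The two invariants described for these constants (DB_CRATES ⊆ CALIBRATED_CRATES, and every CALIBRATED_CRATES entry matching at least one probe tail) can be sketched as follows. This is a minimal illustration, not the crate's code: the table contents, the `Effect` enum, the `classify` signature, and the helper `dead_entries` are all hypothetical stand-ins, since this page does not show the real definitions.

```rust
// Hypothetical stand-in tables; the real contents are not listed on this page.
const CALIBRATED_CRATES: &[&str] = &["reqwest", "rusqlite"];
const DB_CRATES: &[&str] = &["rusqlite"];
const CALIBRATION_PROBE_TAILS: &[&str] = &["::get", "::Connection::execute"];

// Hypothetical effect vocabulary (the real one lives in candor-report).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Effect {
    Net,
    Db,
}

// Toy classifier with an assumed (crate, path) -> Option<Effect> shape.
fn classify(krate: &str, path: &str) -> Option<Effect> {
    match krate {
        "reqwest" if path.ends_with("::get") => Some(Effect::Net),
        "rusqlite" if path.ends_with("::execute") => Some(Effect::Db),
        _ => None,
    }
}

// Direction 1: every database crate must also be listed as calibrated.
fn db_crates_are_calibrated() -> bool {
    DB_CRATES.iter().all(|c| CALIBRATED_CRATES.contains(c))
}

// Direction 2: an entry is "dead" if no probe tail, appended to the crate
// name, produces a classification.
fn dead_entries() -> Vec<&'static str> {
    CALIBRATED_CRATES
        .iter()
        .copied()
        .filter(|krate| {
            !CALIBRATION_PROBE_TAILS.iter().any(|tail| {
                let path = format!("{krate}{tail}");
                classify(krate, &path).is_some()
            })
        })
        .collect()
}

fn main() {
    assert!(db_crates_are_calibrated());
    assert!(dead_entries().is_empty());
}
```

Sharing one probe list between both test suites is what keeps a rule keyed on a distinctive tail from passing in one crate and failing in the other.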

Functions§

cap_from_name
capstd_cap: Map a cap-std capability type to the effect it authorises. Holding one of these (e.g. &Dir) is the real, unforgeable right to perform that effect — so candor treats it as a declared capability, exactly like its own &Fs token.
classify: Classify a resolved callee by the crate it belongs to and its full path.
classify_command_head: Refine the Exec cliff (spec §4 ⟨0.5⟩): the effects a literal, statically-known subprocess head implies, matched by basename (/usr/bin/curl → curl). The head’s effects are ADDED to a caller that already carries Exec (a subprocess is still spawned — Exec is never dropped); an unrecognised or dynamically-built head returns &[] and keeps the bare cliff (never guess). A candor engine reads Fs/Env only — spec §7 item 12 (the analyzer self-boundary) guarantees that, so that case is spec-supplied, not curation. The rest is a small curated table under the same under-report rule as the crate classifier. INVARIANT: every head here is an external tool that does NOT run the analysed project’s own code (so make/npm/cargo are deliberately absent — they stay the cliff). The reference engines share this table so the Exec boundary — the one boundary every engine hits — refines identically (the §4-consistency argument).
classify_extra: Project-supplied rules, consulted only when the built-in classify returns None.
is_cmd_builder_method: Whether a subprocess-builder method only MODIFIES the command (.arg, .env, .current_dir) rather than NAMING the program (Command::new, duct::cmd). A WHOLE-CRATE-Exec crate (portable_pty, duct, async_process) classifies every method as Exec, so the head-refinement must skip these: an arg or env-var-name literal that happened to match a head (.env("psql", …), .arg("curl")) would FABRICATE that effect — the §1 under-report rule. The method is the call path’s last segment.
is_cmd_naming_method: Whether a subprocess method NAMES the program (so its first string literal IS the command head to refine): Command::new("curl"), duct::cmd("curl", …). The head-refinement must fire ONLY here — an ALLOWLIST, not “any method except known modifiers”. A whole-crate-Exec crate classifies EVERY method as Exec, so a denylist leaked NON-naming methods that aren’t modifiers — a getter like CommandBuilder::get_env("psql") (reading back an env-var KEY, not a program) fed "psql" to the head classifier and FABRICATED Db (review find). Only new/cmd name a program; everything else (modifiers, getters get_*, custom builder methods) keeps the bare Exec cliff — under-refine (safe) rather than fabricate. std::process::Command is verb-precise so getters never fire Exec there anyway; the allowlist makes the whole-crate-Exec crates safe too.
tables_in_sql: Table names a SQL string literal STATICALLY reaches — the Db analog of the Net host / Exec command / Fs path literal surface (feeds allow Db in <scope> <table>…, AS-EFF-008). Conservative by construction, because a wrong capture here would FABRICATE: the string must open with a SQL statement keyword, and only identifiers in table position are taken — FROM/JOIN anywhere, INTO anywhere, statement-leading UPDATE/TRUNCATE, and TABLE (create/drop/alter), skipping ONLY/IF NOT EXISTS. UPDATE mid-statement is deliberately ignored (FOR UPDATE SKIP LOCKED must not yield a table “skip”). A dynamically-built query yields nothing — the gate’s opaque case — never a guess. Output is lower-cased, quote/backtick-stripped, schema.table kept qualified, deduped. SPEC §2 pins this algorithm token-for-token across engines; the cross-impl vector battery (candor-spec conformance/tables/vectors.json, run.sh Part 4b) enforces the JVM/TS mirrors.
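The interplay between the naming-method allowlist and the basename-matched head table can be sketched as below. This is an illustration under stated assumptions, not the crate's implementation: the function names `is_naming_method`, `head_effects`, and `refine`, their signatures, the `Effect` enum, and the `curl` → Net entry are hypothetical (the page only states that `psql` maps to Db and that `make`/`npm`/`cargo` are deliberately absent).

```rust
// Hypothetical effect vocabulary for the sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Effect {
    Net,
    Db,
}

// Allowlist: only a call path whose last segment is `new` or `cmd`
// names a program. Modifiers and getters are rejected by default.
fn is_naming_method(call_path: &str) -> bool {
    let last = call_path.rsplit("::").next().unwrap_or(call_path);
    matches!(last, "new" | "cmd")
}

// Match a literal head by basename; unknown heads return an empty slice
// so the caller keeps the bare Exec cliff instead of guessing.
fn head_effects(head: &str) -> &'static [Effect] {
    let base = head.rsplit('/').next().unwrap_or(head);
    match base {
        "curl" => &[Effect::Net], // assumed table entry
        "psql" => &[Effect::Db],
        _ => &[],
    }
}

// Effects to ADD on top of Exec. A dynamically built head is modelled
// as `None` and never refines.
fn refine(call_path: &str, first_literal: Option<&str>) -> &'static [Effect] {
    match first_literal {
        Some(head) if is_naming_method(call_path) => head_effects(head),
        _ => &[],
    }
}

fn main() {
    // Naming call with a literal head: refined by basename.
    assert_eq!(
        refine("std::process::Command::new", Some("/usr/bin/curl")).to_vec(),
        vec![Effect::Net]
    );
    // Getter on a whole-crate-Exec builder: stays the bare cliff.
    assert!(refine("portable_pty::CommandBuilder::get_env", Some("psql")).is_empty());
    // Tool that runs project code: absent from the table by design.
    assert!(refine("std::process::Command::new", Some("make")).is_empty());
}
```

With a denylist, the `get_env("psql")` case would have reached `head_effects` and produced Db; the allowlist rejects it before the literal is ever inspected, which errs toward under-refining rather than fabricating.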
