Crate mollify_core

§mollify-core

Analysis orchestration. Builds the graph, runs the engines, and assembles the kind-discriminated mollify_types::Report envelopes. Engines: dead-code, dependency hygiene, architecture (cycles/layers/contracts/policies), complexity + hotspots, duplication, type-health, security, cohesion, commented-code, coverage, and supply-chain — all folded into audit.

Modules§

agents
Agent-integration installer.
apihygiene
API-hygiene checks. Currently: private-type leaks — a public function or method whose signature references a private (_Name) type the caller cannot name. fallow’s “private type leak” signal, brought to Python.
arch
Architecture engine: circular dependency detection (Tarjan SCC) plus named layer presets — ordered layers from .mollifyrc where a layer may import same/lower layers but importing a higher layer is a layer-violation. (layered/bulletproof use this directly; hexagonal / feature-sliced map onto forbidden/independence contracts — future.)
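The layer rule described for arch can be sketched as a rank comparison. This is a minimal illustration, not the crate's implementation: the function name `layer_violations`, the module-to-layer map, and the edge-list shape are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns the (importer, imported) edges that break layering.
/// `layers` is ordered lowest first; `layer_of` maps a module to its layer name.
fn layer_violations<'a>(
    layers: &[&'a str],
    layer_of: &HashMap<&'a str, &'a str>,
    imports: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    // Position in the ordered list is the layer's rank.
    let rank: HashMap<&str, usize> =
        layers.iter().enumerate().map(|(i, name)| (*name, i)).collect();
    let rank_of = |module: &str| -> Option<usize> {
        layer_of.get(module).and_then(|layer| rank.get(layer)).copied()
    };
    imports
        .iter()
        .copied()
        .filter(|&(from, to)| match (rank_of(from), rank_of(to)) {
            // Same or lower layer is allowed; a strictly higher layer is a violation.
            (Some(importer), Some(imported)) => imported > importer,
            // Modules outside any declared layer are not judged in this sketch.
            _ => false,
        })
        .collect()
}
```

With layers `["domain", "service", "api"]`, an import from a `domain` module into an `api` module is reported, while `api` importing `service` is not.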
baseline
Regression baselines: snapshot the set of finding fingerprints, then on a later run report only what’s new relative to that snapshot. This is the “no new issues” CI gate (complementary to git-attribution --gate new-only): it works without git and survives file moves, because fingerprints are content-derived (RESEARCH.md §2.11 — evidence-preserving).
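The "only what's new" behaviour of baseline amounts to a set difference over fingerprints. A minimal sketch, assuming fingerprints are plain strings; the function name `new_since_baseline` is hypothetical and the real snapshot format is not shown on this page.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: fingerprints present now but absent from the snapshot.
fn new_since_baseline(baseline: &BTreeSet<String>, current: &BTreeSet<String>) -> Vec<String> {
    // BTreeSet::difference yields items in sorted order, so the output is deterministic.
    current.difference(baseline).cloned().collect()
}
```

Findings that disappeared since the snapshot are simply ignored here; a gate built on this would fail only when the returned list is non-empty.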
cohesion
Class-cohesion engine (LCOM*, Henderson-Sellers). Measures how much a class’s methods share instance attributes; a class whose methods touch disjoint attribute sets is doing several unrelated jobs and is a split candidate.
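For reference, the Henderson-Sellers metric is LCOM* = (m − Σ μ(A) / a) / (m − 1), where m is the number of methods, a the number of attributes, and μ(A) the number of methods that touch attribute A. The sketch below computes it from a method-to-attributes map; the function name and input shape are assumptions, and how the crate extracts attribute usage is not covered.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper: LCOM* for one class, or None when the metric is undefined
/// (fewer than two methods, or no instance attributes at all).
fn lcom_star(methods: &BTreeMap<&str, BTreeSet<&str>>) -> Option<f64> {
    let m = methods.len();
    let attrs: BTreeSet<&str> = methods.values().flatten().copied().collect();
    let a = attrs.len();
    if m < 2 || a == 0 {
        return None;
    }
    // Summing mu(A) over attributes equals counting every (method, attribute) pair.
    let total_accesses: usize = methods.values().map(|set| set.len()).sum();
    let mean_methods_per_attr = total_accesses as f64 / a as f64;
    Some((m as f64 - mean_methods_per_attr) / (m as f64 - 1.0))
}
```

A result of 0.0 means every method touches every attribute; 1.0 means each attribute is used by a single method, which is the split-candidate case.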
commented
Commented-out-code detection (eradicate / flake8-eradicate E800). Flags comment lines whose stripped text parses as Python code (import, def, return, assignments, control flow) rather than prose. Tool directives (noqa, type:, mypy:, TODO, mollify:, shebangs) are never flagged. Orthogonal to reachability — it’s about dead text, not dead symbols.
complexity
Complexity engine. Flags functions whose cyclomatic or cognitive complexity exceeds a threshold. (Churn × complexity hotspot ranking — the unfilled FOSS Python niche — is planned via git log --numstat; PLAN.md §3.5.)
config
.mollifyrc.json configuration: severity overrides (per rule or category), ignore globs, and complexity thresholds. Absent config → sensible defaults.
coverage
Runtime-coverage merge — the “cold path” signal. Cross-references the static function map against a coverage.py JSON report (coverage json): a function that is statically reachable but has zero executed lines is a strong delete/triage candidate. This is fallow’s paid differentiator, here free (RESEARCH.md §6) — Python makes it cheap (PEP 669 / SlipCover).
deadcode
Dead-code engine: reachability-based unused files and unused top-level symbols, with confidence tiers (RESEARCH.md §4 / PLAN.md §4).
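Reachability-based detection of unused files can be pictured as a breadth-first walk of the import graph from the entry points. This is a simplified sketch under stated assumptions: the function name, the adjacency-map shape, and the idea that every module appears as a key are all hypothetical, and confidence tiers are left out.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Hypothetical helper: modules never reached from any entry point.
fn unreachable_modules<'a>(
    imports: &HashMap<&'a str, Vec<&'a str>>,
    entry_points: &[&'a str],
) -> BTreeSet<&'a str> {
    let mut seen: BTreeSet<&str> = entry_points.iter().copied().collect();
    let mut queue: VecDeque<&str> = entry_points.iter().copied().collect();
    while let Some(module) = queue.pop_front() {
        // Follow each import edge once; modules with no entry have no outgoing edges.
        for &dep in imports.get(module).into_iter().flatten() {
            if seen.insert(dep) {
                queue.push_back(dep);
            }
        }
    }
    imports.keys().copied().filter(|m| !seen.contains(m)).collect()
}
```

A module that imports reachable code but is itself imported by nothing reachable still counts as unreachable, which is the intended reading of "unused file".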
deps
Dependency-hygiene engine: declared-but-unused and imported-but-undeclared distributions. Parses pyproject.toml (PEP 621 + Poetry + PEP 735 groups).
dupes
Duplication engine — exact token-clone detection via suffix array + LCP.
explain
mollify explain <rule> — human-readable semantics for a rule id, with no analysis run. Keeps the “evidence, not decisions” contract legible: every rule states what it proves, its confidence ceiling, and how to act on it.
fingerprint
Stable, deterministic finding fingerprints: <rule>:<8 hex>.
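To make the `<rule>:<8 hex>` shape concrete, here is a sketch that hashes the rule id and some content with 32-bit FNV-1a. The hash choice and the hashed inputs are assumptions for illustration only; this page does not say which hash or which fields the crate actually uses.

```rust
/// Hypothetical helper: builds `<rule>:<8 hex>` from a rule id and finding content.
fn fingerprint(rule: &str, content: &str) -> String {
    // 32-bit FNV-1a (an assumed stand-in hash): offset basis, then xor-and-multiply per byte.
    let mut hash: u32 = 0x811c_9dc5;
    // A zero byte separates the two fields so ("ab", "c") and ("a", "bc") differ.
    for byte in rule.bytes().chain(std::iter::once(0u8)).chain(content.bytes()) {
        hash ^= u32::from(byte);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    // {:08x} pads to exactly eight lowercase hex digits.
    format!("{rule}:{hash:08x}")
}
```

Because only content is hashed, and no path or line number, the same finding keeps its fingerprint when the file moves.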
fix
Safe auto-fix: removes only confidence: certain, auto_fixable unused symbols and unused imports (never files, never lower-confidence findings). Dry-run by default at the CLI; this module computes a plan and can apply it.
git
Git integration for the PR gate. Computes changed files (working tree + staged + optionally vs a base ref) and changed line ranges so findings can be attributed introduced-vs-inherited at line granularity (parsed from git diff --unified=0), with file-level as the fallback.
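Line-level attribution rests on the hunk headers that `git diff --unified=0` prints, of the form `@@ -old[,count] +new[,count] @@`. The sketch below extracts the added-line range from one header; the function name is hypothetical and the crate's own parser may differ.

```rust
/// Hypothetical helper: inclusive (first, last) added line in the new file,
/// or None for pure deletions and for lines that are not hunk headers.
fn added_range(header: &str) -> Option<(u32, u32)> {
    let rest = header.strip_prefix("@@ ")?;
    // The new-file side is the first token starting with '+'.
    let new_side = rest.split_whitespace().find(|tok| tok.starts_with('+'))?;
    let spec = &new_side[1..];
    let (start, count) = match spec.split_once(',') {
        Some((s, c)) => (s.parse::<u32>().ok()?, c.parse::<u32>().ok()?),
        // An omitted count means exactly one line.
        None => (spec.parse::<u32>().ok()?, 1),
    };
    if count == 0 {
        None
    } else {
        Some((start, start + count - 1))
    }
}
```

A finding whose line falls inside one of these ranges can be treated as introduced; anything else in a changed file is inherited.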
hotspots
Churn × complexity hotspot ranking — a refactor-priority signal that is genuinely unfilled in FOSS Python tooling (RESEARCH.md §8.3). A file that is both complex and frequently changed is where bugs cluster.
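The ranking idea can be sketched as a product score sorted in descending order. The plain product, the tuple layout, and the function name are assumptions; the crate may normalise or weight the two inputs differently.

```rust
/// Hypothetical helper: (path, churn, complexity) rows ranked by churn * complexity.
fn rank_hotspots(files: &[(&str, u32, u32)]) -> Vec<(String, u64)> {
    let mut scored: Vec<(String, u64)> = files
        .iter()
        .map(|&(path, churn, complexity)| {
            // Widen before multiplying so large repositories cannot overflow.
            (path.to_string(), u64::from(churn) * u64::from(complexity))
        })
        .collect();
    // Highest score first; ties broken by path so the order is deterministic.
    scored.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    scored
}
```

A file with zero commits in the window scores zero regardless of complexity, which matches the intent: complex but stable code is a lower refactor priority.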
installed
Installed-environment introspection. When a virtualenv is present, reads *.dist-info metadata from site-packages to (a) map import names to distributions accurately (beyond the static alias table) and (b) know which distributions are actually installed — which lets deps distinguish a transitive dependency (installed but undeclared) from a genuinely missing one (not installed at all). Best-effort: absent venv → None.
known
Built-in knowledge: the Python standard-library top-level module set, and a curated import-name → distribution-name alias table (the cv2 → opencv-python long tail). A maintained alias table is a durable moat (RESEARCH.md §3.5).
members
Unused class members (methods + class-level attributes) and unused enum members — vulture’s signature signal and fallow’s largest static analysis area, brought to Python.
metrics
Code-metrics engine (radon / wily parity): raw size counts, per-function complexity rollups, and the Maintainability Index. Unlike the other engines this emits measurements, not findings — a MetricsReport.
plugins
Framework awareness — the dominant false-positive killer for Python dead-code analysis (RESEARCH.md §4). A symbol registered with a framework via a decorator (a Flask/FastAPI route, a Celery task, a pytest fixture, a Django signal receiver, a click/typer command, a Pydantic validator, …) is reached even with zero in-repo callers.
policy
Declarative rule packs (policies). A policy bans an import and/or a call, optionally scoped to path substrings. Unlike the heuristic engines this is pure data → deterministic, no false-positive guessing: a banned import that literally appears is a Certain violation. Modeled on fallow’s policy packs but expressed in Python terms (RESEARCH.md §5).
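Because a policy is pure data, checking one import against it is a pair of string tests. A minimal sketch covering the import half only: the `Policy` struct, its field names, and the decision to also match submodules of the banned package are assumptions for this example.

```rust
/// Hypothetical policy shape: one banned import, optionally scoped to path substrings.
struct Policy {
    banned_import: &'static str,
    path_scopes: Vec<&'static str>,
}

/// Hypothetical helper: true when `imported`, seen in file `path`, violates the policy.
fn violates(policy: &Policy, path: &str, imported: &str) -> bool {
    // An empty scope list means the policy applies everywhere.
    let in_scope = policy.path_scopes.is_empty()
        || policy.path_scopes.iter().any(|scope| path.contains(*scope));
    // Assumption: banning `requests` also bans `requests.adapters`, but not `requests_mock`.
    let banned = imported == policy.banned_import
        || imported.starts_with(&format!("{}.", policy.banned_import));
    in_scope && banned
}
```

There is no heuristic step, so a match can be reported at the highest confidence tier, as the module description states.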
sarif
SARIF 2.1.0 output for code-scanning platforms (GitHub, GitLab).
security
Security engine — a deterministic candidate producer (bandit-style). It emits syntactic candidates; it never decides exploitability (the candidate/verifier split — RESEARCH.md §2.11). Maps parser SecurityHits to findings with per-rule confidence.
suffix
Linear-time suffix array (SA-IS) + LCP (Kasai), over an integer alphabet.
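The two structures can be illustrated on a small integer sequence. Note the swap: this sketch builds the suffix array by plain comparison sorting, which is not linear-time SA-IS, and pairs it with Kasai's LCP algorithm. Function names are hypothetical.

```rust
/// Suffix array by direct sorting (a simple stand-in, NOT the linear-time SA-IS).
fn suffix_array(s: &[u32]) -> Vec<usize> {
    let mut sa: Vec<usize> = (0..s.len()).collect();
    sa.sort_by(|&a, &b| s[a..].cmp(&s[b..]));
    sa
}

/// Kasai's algorithm: lcp[k] is the common-prefix length of suffixes sa[k-1] and sa[k];
/// lcp[0] is 0 by convention.
fn lcp_kasai(s: &[u32], sa: &[usize]) -> Vec<usize> {
    let n = s.len();
    let mut rank = vec![0usize; n];
    for (i, &pos) in sa.iter().enumerate() {
        rank[pos] = i;
    }
    let mut lcp = vec![0usize; n];
    let mut h = 0usize;
    for i in 0..n {
        if rank[i] == 0 {
            h = 0;
            continue;
        }
        let j = sa[rank[i] - 1];
        while i + h < n && j + h < n && s[i + h] == s[j + h] {
            h += 1;
        }
        lcp[rank[i]] = h;
        // The next suffix shares at least h - 1 symbols with its predecessor.
        h = h.saturating_sub(1);
    }
    lcp
}
```

For token-clone detection, a long LCP entry between adjacent suffixes marks two positions in the token stream that begin with the same run of tokens.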
supplychain
Supply-chain analysis: cross-reference pinned/locked dependency versions against a local advisory database and flag versions that fall in a known vulnerable range (vulnerable-dependency).
trace
mollify trace <module> — the static dependency neighborhood of a module: what it imports (callees, “down”) and what imports it (callers, “up”). A lightweight, deterministic answer to “what breaks if I touch this?” built straight from the import graph (fallow’s trace, in Python terms).
typehealth
Type-health engine: annotation coverage for public functions. A Python-specific signal with no fallow analog (RESEARCH.md §8: clean white space). Flags fully untyped public functions, meaning functions that take parameters but carry no parameter annotations and no return type.
version
A pragmatic PEP 440 subset for matching package versions against advisory constraint ranges. Not a full PEP 440 implementation: it handles release segments (1.2.3), an optional pre-release tag (a/b/rc), and the operators == != < <= > >= ~=. Epochs, local versions, and === are out of scope (documented; we degrade to “no match” rather than guess).
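The matching behaviour can be sketched for release segments alone. This example is narrower than the module described above: it omits pre-release tags entirely, and all function names are hypothetical. It does follow the stated policy of returning "no match" for anything it cannot parse.

```rust
use std::cmp::Ordering;

/// Parses "1.2.3" into [1, 2, 3]; any non-numeric segment yields None.
fn parse_release(text: &str) -> Option<Vec<u64>> {
    text.split('.').map(|seg| seg.parse::<u64>().ok()).collect()
}

/// Compares release segments, padding the shorter side with zeros (1.0 == 1.0.0).
fn cmp_release(a: &[u64], b: &[u64]) -> Ordering {
    for i in 0..a.len().max(b.len()) {
        let x = a.get(i).copied().unwrap_or(0);
        let y = b.get(i).copied().unwrap_or(0);
        if x != y {
            return x.cmp(&y);
        }
    }
    Ordering::Equal
}

/// Hypothetical helper: does `version` satisfy one constraint such as ">=1.2"?
fn satisfies(version: &str, constraint: &str) -> bool {
    // Two-character operators are listed first so "<=" is not read as "<".
    const OPS: [&str; 7] = ["==", "!=", "<=", ">=", "~=", "<", ">"];
    let constraint = constraint.trim();
    let Some(op) = OPS.iter().copied().find(|op| constraint.starts_with(*op)) else {
        return false;
    };
    let (Some(v), Some(c)) = (
        parse_release(version.trim()),
        parse_release(constraint[op.len()..].trim()),
    ) else {
        return false; // unparseable input (including "===") degrades to no match
    };
    let ord = cmp_release(&v, &c);
    match op {
        "==" => ord == Ordering::Equal,
        "!=" => ord != Ordering::Equal,
        "<=" => ord != Ordering::Greater,
        ">=" => ord != Ordering::Less,
        "<" => ord == Ordering::Less,
        ">" => ord == Ordering::Greater,
        // Compatible release: at least c, and equal on all but c's last segment.
        "~=" => {
            c.len() >= 2
                && ord != Ordering::Less
                && (0..c.len() - 1).all(|i| v.get(i).copied().unwrap_or(0) == c[i])
        }
        _ => false,
    }
}
```

An advisory range such as `>=1.4,<1.4.7` would be checked by requiring every comma-separated constraint to hold.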

Structs§

Inspection
A per-file evidence bundle: the matched module, its findings, and its import neighborhood. Shared by mollify inspect (CLI) and the mollify_inspect MCP tool.

Constants§

DEFAULT_ADVISORY_DB
The default advisory DB path checked by audit when present.

Functions§

analyze_text
File-local diagnostics from an in-memory buffer (no disk, no graph) — the live LSP path for textDocument/didChange. Covers the intra-file rules (security, unused variables/parameters, complexity, commented-out code); cross-file rules (dead exports, deps, architecture) are produced by the full audit on save. Returns sorted findings, honoring inline suppressions.
apply_suppressions
Drop findings silenced by an inline # mollify: ignore[<rule>] comment on the finding’s line (or a bare # mollify: ignore matching any rule).
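The suppression rule can be sketched as a per-line text check. This is a naive illustration: the function name is hypothetical, a comma-separated rule list inside the brackets is an assumption, and the marker is found by substring search, so a marker inside a string literal would also match.

```rust
/// Hypothetical helper: is `rule` silenced by an inline comment on this source line?
fn is_suppressed(line: &str, rule: &str) -> bool {
    const MARKER: &str = "# mollify: ignore";
    let Some(idx) = line.find(MARKER) else {
        return false;
    };
    let rest = &line[idx + MARKER.len()..];
    match rest.strip_prefix('[') {
        // No bracket follows: a bare marker silences every rule.
        None => true,
        // Bracketed form: only the listed rule ids are silenced.
        Some(inner) => match inner.split_once(']') {
            Some((rules, _)) => rules.split(',').any(|r| r.trim() == rule),
            None => false, // unterminated bracket: treat as not suppressed
        },
    }
}
```

Filtering a findings list then reduces to keeping each finding whose source line does not suppress its rule id.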
arch_report
mollify arch — circular dependencies (boundary presets later).
audit_report
mollify audit — the unified pass across all engines. Produces a quality score over the combined findings.
build_graph
Build the graph for a project root once, to be shared across engines.
complexity_report
mollify complexity / mollify health — complexity hotspots.
coverage_report
mollify coverage — cold-path analysis from a coverage.py JSON report.
dead_code_report
mollify dead-code — reachability-based unused files/symbols.
deps_report
mollify deps — dependency hygiene.
dupes_report
mollify dupes — duplication / clone families.
graph_export
Export the module import graph as Graphviz DOT or Mermaid flowchart.
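As an illustration of the DOT side of the export, the sketch below renders an edge list as a Graphviz digraph. The function name, the graph name `imports`, and the edge-list input are assumptions; the real function's signature is not shown on this page.

```rust
/// Hypothetical helper: renders (importer, imported) edges as a Graphviz digraph.
fn to_dot(edges: &[(&str, &str)]) -> String {
    let mut out = String::from("digraph imports {\n");
    for (from, to) in edges {
        // Quoting node ids keeps dotted module names valid in DOT.
        out.push_str(&format!("    \"{from}\" -> \"{to}\";\n"));
    }
    out.push_str("}\n");
    out
}
```

The resulting text can be piped to `dot -Tsvg` to produce a diagram.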
inspect
Build the evidence bundle for a single file.
into_report
Wrap a findings report in the right Report variant for a given category.
list_topology
Topology listing for mollify list / mollify_list.
security_report
mollify security — security candidates (deterministic; review before acting).
supply_chain_report
mollify supply-chain — match pinned/locked dependency versions against a local advisory database (vulnerable-dependency). The DB is an input file, so analysis stays deterministic and offline.
supply_chain_report_with
Like supply_chain_report but against an already-loaded advisory set (e.g. fetched live by the CLI). Keeps the network out of mollify-core.
types_report
mollify types — type-annotation health + API-hygiene (private-type leaks).