§mollify-core
Analysis orchestration. Builds the graph, runs the engines, and assembles the
kind-discriminated mollify_types::Report envelopes. Engines: dead-code,
dependency hygiene, architecture (cycles/layers/contracts/policies),
complexity + hotspots, duplication, type-health, security, cohesion,
commented-code, coverage, and supply-chain — all folded into audit.
Modules§
- agents
- Agent-integration installer.
- apihygiene
- API-hygiene checks. Currently covers private-type leaks: a public function or method whose signature references a private (_Name) type the caller cannot name. fallow’s “private type leak” signal, brought to Python.
- arch
- Architecture engine: circular dependency detection (Tarjan SCC) plus named layer presets. Ordered layers are read from .mollifyrc; a layer may import same/lower layers, but importing a higher layer is a layer-violation. (layered/bulletproof use this directly; hexagonal / feature-sliced map onto forbidden/independence contracts, which are future work.)
- baseline
- Regression baselines: snapshot the set of finding fingerprints, then on a later run report only what’s new relative to that snapshot. This is the “no new issues” CI gate (complementary to git-attribution --gate new-only): it works without git and survives file moves, because fingerprints are content-derived (RESEARCH.md §2.11, evidence-preserving).
- cohesion
- Class-cohesion engine (LCOM*, Henderson-Sellers). Measures how much a class’s methods share instance attributes; a class whose methods touch disjoint attribute sets is doing several unrelated jobs and is a split candidate.
- commented
- Commented-out-code detection (eradicate / flake8-eradicate E800). Flags comment lines whose stripped text parses as Python code (import, def, return, assignments, control flow) rather than prose. Tool directives (noqa, type:, mypy:, TODO, mollify:, shebangs) are never flagged. Orthogonal to reachability: it’s about dead text, not dead symbols.
- complexity
- Complexity engine. Flags functions whose cyclomatic or cognitive complexity exceeds a threshold. (Churn × complexity hotspot ranking, the unfilled FOSS Python niche, is planned via git log --numstat; PLAN.md §3.5.)
- config
- .mollifyrc.json configuration: severity overrides (per rule or category), ignore globs, and complexity thresholds. Absent config → sensible defaults.
- coverage
- Runtime-coverage merge, the “cold path” signal. Cross-references the static function map against a coverage.py JSON report (coverage json): a function that is statically reachable but has zero executed lines is a strong delete/triage candidate. This is fallow’s paid differentiator, here free (RESEARCH.md §6); Python makes it cheap (PEP 669 / SlipCover).
- deadcode
- Dead-code engine: reachability-based unused files and unused top-level symbols, with confidence tiers (RESEARCH.md §4 / PLAN.md §4).
- deps
- Dependency-hygiene engine: declared-but-unused and imported-but-undeclared distributions. Parses pyproject.toml (PEP 621 + Poetry + PEP 735 groups).
- dupes
- Duplication engine — exact token-clone detection via suffix array + LCP.
- explain
- mollify explain <rule>: human-readable semantics for a rule id, with no analysis run. Keeps the “evidence, not decisions” contract legible: every rule states what it proves, its confidence ceiling, and how to act on it.
- fingerprint
- Stable, deterministic finding fingerprints: <rule>:<8 hex>.
- fix
- Safe auto-fix: removes only unused symbols and unused imports that are both confidence: certain and auto_fixable (never files, never lower-confidence findings). Dry-run by default at the CLI; this module computes a plan and can apply it.
- git
- Git integration for the PR gate. Computes changed files (working tree + staged + optionally vs a base ref) and changed line ranges so findings can be attributed introduced-vs-inherited at line granularity (parsed from git diff --unified=0), with file-level as the fallback.
- hotspots
- Churn × complexity hotspot ranking — a refactor-priority signal that is genuinely unfilled in FOSS Python tooling (RESEARCH.md §8.3). A file that is both complex and frequently changed is where bugs cluster.
- installed
- Installed-environment introspection. When a virtualenv is present, reads *.dist-info metadata from site-packages to (a) map import names to distributions accurately (beyond the static alias table) and (b) know which distributions are actually installed, which lets deps distinguish a transitive dependency (installed but undeclared) from a genuinely missing one (not installed at all). Best-effort: absent venv → None.
- known
- Built-in knowledge: the Python standard-library top-level module set, and a curated import-name → distribution-name alias table (the cv2 → opencv-python long tail). A maintained alias table is a durable moat (RESEARCH.md §3.5).
- members
- Unused class members (methods + class-level attributes) and unused enum members — vulture’s signature signal and fallow’s largest static analysis area, brought to Python.
- metrics
- Code-metrics engine (radon / wily parity): raw size counts, per-function complexity rollups, and the Maintainability Index. Unlike the other engines this emits measurements, not findings: a MetricsReport.
- plugins
- Framework awareness — the dominant false-positive killer for Python dead-code analysis (RESEARCH.md §4). A symbol registered with a framework via a decorator (a Flask/FastAPI route, a Celery task, a pytest fixture, a Django signal receiver, a click/typer command, a Pydantic validator, …) is reached even with zero in-repo callers.
- policy
- Declarative rule packs (policies). A policy bans an import and/or a call, optionally scoped to path substrings. Unlike the heuristic engines this is pure data → deterministic, no false-positive guessing: a banned import that literally appears is a Certain violation. Modeled on fallow’s policy packs but expressed in Python terms (RESEARCH.md §5).
- sarif
- SARIF 2.1.0 output for code-scanning platforms (GitHub, GitLab).
- security
- Security engine: a deterministic candidate producer (bandit-style). It emits syntactic candidates; it never decides exploitability (the candidate/verifier split, RESEARCH.md §2.11). Maps parser SecurityHits to findings with per-rule confidence.
- suffix
- Linear-time suffix array (SA-IS) + LCP (Kasai), over an integer alphabet.
- supplychain
- Supply-chain analysis: cross-reference pinned/locked dependency versions against a local advisory database and flag versions that fall in a known vulnerable range (vulnerable-dependency).
- trace
- mollify trace <module> shows the static dependency neighborhood of a module: what it imports (callees, “down”) and what imports it (callers, “up”). A lightweight, deterministic answer to “what breaks if I touch this?” built straight from the import graph (fallow’s trace, in Python terms).
- typehealth
- Type-health engine — annotation coverage for public functions. A Python-specific signal with no fallow analog (RESEARCH.md §8: clean white space). Flags fully-untyped public functions (params, but zero annotations and no return type).
- version
- A pragmatic PEP 440 subset for matching package versions against advisory constraint ranges. Not a full PEP 440 implementation: it handles release segments (1.2.3), an optional pre-release tag (a/b/rc), and the operators == != < <= > >= ~=. Epochs, local versions, and === are out of scope (documented; we degrade to “no match” rather than guess).
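The version module's "degrade to no match" behaviour can be illustrated with a minimal sketch. It is not the crate's implementation: the names parse_release, cmp_release and release_matches are hypothetical, and the sketch is narrower than the module described above, since it handles release segments only and treats pre-release tags as unparsable.

```rust
use std::cmp::Ordering;

/// Parse a release-only version such as "1.2.3" into numeric segments.
/// Epochs ("1!2.0"), local versions ("1.0+x") and pre-release tags
/// are not handled in this sketch and yield None.
fn parse_release(s: &str) -> Option<Vec<u64>> {
    s.trim()
        .split('.')
        .map(|seg| {
            if !seg.is_empty() && seg.bytes().all(|b| b.is_ascii_digit()) {
                seg.parse::<u64>().ok()
            } else {
                None
            }
        })
        .collect()
}

/// Compare two releases, padding the shorter one with zeros (1.2 == 1.2.0).
fn cmp_release(a: &[u64], b: &[u64]) -> Ordering {
    for i in 0..a.len().max(b.len()) {
        let x = a.get(i).copied().unwrap_or(0);
        let y = b.get(i).copied().unwrap_or(0);
        match x.cmp(&y) {
            Ordering::Equal => continue,
            other => return other,
        }
    }
    Ordering::Equal
}

/// Match one version against one constraint such as ">=1.2" or "~=1.4.2".
/// Anything that cannot be parsed returns false ("no match") instead of guessing.
fn release_matches(version: &str, constraint: &str) -> bool {
    // Two-character operators come first so "<=" is not read as "<".
    const OPS: [&str; 7] = ["~=", "==", "!=", "<=", ">=", "<", ">"];
    let c = constraint.trim();
    let Some(op) = OPS.iter().copied().find(|op| c.starts_with(*op)) else {
        return false;
    };
    let (Some(v), Some(spec)) = (parse_release(version), parse_release(&c[op.len()..])) else {
        return false;
    };
    let ord = cmp_release(&v, &spec);
    match op {
        "==" => ord == Ordering::Equal,
        "!=" => ord != Ordering::Equal,
        "<" => ord == Ordering::Less,
        "<=" => ord != Ordering::Greater,
        ">" => ord == Ordering::Greater,
        ">=" => ord != Ordering::Less,
        "~=" => {
            // Compatible release: at least `spec`, and equal on every
            // segment except the last one of `spec`.
            let k = spec.len() - 1;
            k >= 1
                && ord != Ordering::Less
                && (0..k).all(|i| v.get(i).copied().unwrap_or(0) == spec[i])
        }
        _ => false,
    }
}
```

In this sketch "===1.0" is picked up as "==" followed by the unparsable "=1.0", so it falls into the "no match" path without a dedicated check.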
Structs§
- Inspection
- A per-file evidence bundle: the matched module, its findings, and its import neighborhood. Shared by mollify inspect (CLI) and the mollify_inspect MCP tool.
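The import neighborhood carried by the bundle is described under trace as what a module imports ("down") and what imports it ("up"). A minimal sketch of that lookup follows. The Neighborhood struct, the neighborhood function and the (importer, imported) edge representation are hypothetical and not the crate's types, and the sketch assumes a direct, one-hop neighborhood.

```rust
use std::collections::BTreeSet;

/// Direct import neighborhood of one module (hypothetical shape).
#[derive(Debug, PartialEq)]
struct Neighborhood {
    /// Modules this module imports ("down").
    down: BTreeSet<String>,
    /// Modules that import this module ("up").
    up: BTreeSet<String>,
}

/// `edges` holds (importer, imported) pairs taken from an import graph.
/// BTreeSet keeps the result sorted, so output is deterministic.
fn neighborhood(edges: &[(&str, &str)], module: &str) -> Neighborhood {
    let mut down = BTreeSet::new();
    let mut up = BTreeSet::new();
    for &(importer, imported) in edges {
        if importer == module {
            down.insert(imported.to_string());
        }
        if imported == module {
            up.insert(importer.to_string());
        }
    }
    Neighborhood { down, up }
}
```

Under these assumptions, the "up" set is the list of modules to re-check after editing the target module.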
Constants§
- DEFAULT_ADVISORY_DB
- The default advisory DB path checked by audit when present.
Functions§
- analyze_text
- File-local diagnostics from an in-memory buffer (no disk, no graph): the live LSP path for textDocument/didChange. Covers the intra-file rules (security, unused variables/parameters, complexity, commented-out code); cross-file rules (dead exports, deps, architecture) are produced by the full audit on save. Returns sorted findings, honoring inline suppressions.
- apply_suppressions
- Drop findings silenced by an inline # mollify: ignore[<rule>] comment on the finding’s line (or a bare # mollify: ignore matching any rule).
- arch_report
- mollify arch: circular dependencies (boundary presets later).
- audit_report
- mollify audit: the unified pass across all engines. Produces a quality score over the combined findings.
- build_graph
- Build the graph for a project root once, to be shared across engines.
- complexity_report
- mollify complexity / mollify health: complexity hotspots.
- coverage_report
- mollify coverage: cold-path analysis from a coverage.py JSON report.
- dead_code_report
- mollify dead-code: reachability-based unused files/symbols.
- deps_report
- mollify deps: dependency hygiene.
- dupes_report
- mollify dupes: duplication / clone families.
- graph_export
- Export the module import graph as Graphviz DOT or Mermaid flowchart.
- inspect
- Build the evidence bundle for a single file.
- into_report
- Wrap a findings report in the right Report variant for a given category.
- list_topology
- Topology listing for mollify list / mollify_list.
- security_report
- mollify security: security candidates (deterministic; review before acting).
- supply_chain_report
- mollify supply-chain: match pinned/locked dependency versions against a local advisory database (vulnerable-dependency). The DB is an input file, so analysis stays deterministic and offline.
- supply_chain_report_with
- Like supply_chain_report but against an already-loaded advisory set (e.g. fetched live by the CLI). Keeps the network out of mollify-core.
- types_report
- mollify types: type-annotation health + API-hygiene (private-type leaks).
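The inline-suppression rule described for apply_suppressions (a # mollify: ignore[<rule>] comment on the finding's line, or a bare # mollify: ignore for any rule) might look roughly like the sketch below. The Finding struct and the functions is_suppressed and drop_suppressed are hypothetical, not the crate's signatures. The sketch also assumes a single rule id inside the brackets and does not check whether the marker sits inside a string literal.

```rust
/// Hypothetical, simplified finding: a rule id and a 1-based line number.
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    rule: String,
    line: usize,
}

/// True if `source_line` carries a suppression comment that covers `rule`.
fn is_suppressed(source_line: &str, rule: &str) -> bool {
    const MARKER: &str = "# mollify: ignore";
    let Some(pos) = source_line.find(MARKER) else {
        return false;
    };
    let rest = &source_line[pos + MARKER.len()..];
    match rest.strip_prefix('[') {
        // Bracketed form: only the named rule is silenced.
        Some(inner) => match inner.find(']') {
            Some(end) => inner[..end].trim() == rule,
            None => false, // unterminated bracket: do not guess
        },
        // Bare form: silences any rule on this line.
        None => rest.is_empty() || rest.starts_with(char::is_whitespace),
    }
}

/// Keep only the findings whose own line does not suppress them.
fn drop_suppressed(findings: Vec<Finding>, source: &str) -> Vec<Finding> {
    let lines: Vec<&str> = source.lines().collect();
    findings
        .into_iter()
        .filter(|f| {
            // Findings with an out-of-range line number are kept as-is.
            let text = f.line.checked_sub(1).and_then(|i| lines.get(i));
            !text.map_or(false, |l| is_suppressed(l, &f.rule))
        })
        .collect()
}
```

The rule ids used to exercise this sketch, such as unused-import, are placeholders and not confirmed mollify rule names.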