Crate nyx_scanner

Multi-language static vulnerability scanner.

Tree-sitter parsing, petgraph CFGs, SSA-based dataflow, and cross-file taint analysis with a capability-based sanitizer system. Supports Rust, C, C++, Java, Go, PHP, Python, Ruby, TypeScript, and JavaScript.

This crate is both the nyx binary and a library for programmatic scanning. Most internal modules are public for testing and downstream tooling, but the stable contract is scan_no_index plus the types it returns.

For a description of how the analysis pipeline works, see the how-it-works handbook. Per-detector documentation lives on the taint, cfg_analysis, state, patterns, and auth_analysis module pages.

§Entry points

scan_no_index runs a full two-pass scan over a directory tree and returns a flat list of commands::scan::Diag values. It does not touch a SQLite index; every file is analysed from disk on each call.

use nyx_scanner::{scan_no_index, utils::Config};
use std::path::Path;

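// Start from the default configuration; adjust it in code as needed.
let config = Config::default();
// Analyse every file under the root from disk. unwrap() panics if the scan fails.
let findings = scan_no_index(Path::new("/path/to/project"), &config).unwrap();
// Print the rule ID and sink location of each finding.
for diag in &findings {
    println!("{} at {}:{}", diag.id, diag.path, diag.line);
}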

For incremental rescanning backed by a SQLite index, use commands::scan::scan_with_index_parallel directly.

§Key types

Type	Purpose
`utils::config::Config`	Top-level scanner config (load from `nyx.conf` or construct in code)
`commands::scan::Diag`	A single finding: location, severity, rule ID, structured evidence
`evidence::Evidence`	Source/sink spans, flow steps, sanitizer annotations, engine notes
`evidence::Confidence`	Low / Medium / High confidence tag
`labels::Cap`	Bitflag capability set describing what a taint flow can reach
`symbol::Lang`	Supported language enum
`symbol::FuncKey`	Canonical cross-file function identity
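
To make the phrase "bitflag capability set" concrete, here is a minimal, self-contained sketch of the general technique. It does not use the real `labels::Cap`: the stand-in `Cap` type, the flag names `SHELL`, `SQL`, and `HTML`, and the assumption that a sanitizer clears bits from a flow's set are all hypothetical and chosen only for illustration.

```rust
// Hypothetical stand-in for a bitflag capability set; NOT the real `labels::Cap`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Cap(u8);

impl Cap {
    // Illustrative flags only; the crate's actual flags may differ.
    const SHELL: Cap = Cap(1 << 0);
    const SQL: Cap = Cap(1 << 1);
    const HTML: Cap = Cap(1 << 2);

    /// Union of two capability sets.
    const fn union(self, other: Cap) -> Cap {
        Cap(self.0 | other.0)
    }

    /// Capabilities that remain after `removed` is cleared,
    /// e.g. by a sanitizer (assumed behaviour).
    const fn without(self, removed: Cap) -> Cap {
        Cap(self.0 & !removed.0)
    }

    /// True when the two sets share at least one capability.
    const fn intersects(self, other: Cap) -> bool {
        self.0 & other.0 != 0
    }
}

fn main() {
    // An unsanitised flow that could reach any of the three sink kinds.
    let tainted = Cap::SHELL.union(Cap::SQL).union(Cap::HTML);

    // After a hypothetical HTML-escaping sanitizer, only the HTML bit is cleared.
    let escaped = tainted.without(Cap::HTML);

    println!("reaches HTML sink: {}", escaped.intersects(Cap::HTML));
    println!("reaches SQL sink:  {}", escaped.intersects(Cap::SQL));
}
```

The appeal of a bitflag representation is that set union, removal, and the source-meets-sink test are each a single bitwise operation.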

§Reading findings

Each commands::scan::Diag carries:

path, line, col — source location of the sink
id — rule identifier (e.g. taint-unsanitised-flow, cfg-auth-gap)
severity — Critical / High / Medium / Low / Info
confidence — Low / Medium / High; capped at Medium when an engine budget was hit
rank_score — deterministic attack-surface score for truncation ordering
evidence — optional evidence::Evidence with source/sink spans, flow steps, and engine_notes::EngineNote values describing precision loss

Engine notes record when an analysis bound was hit. A finding carrying EngineNote::OriginsTruncated or EngineNote::SccBudgetExhausted is still a genuine finding, but the engine produced it with less information than it would have had without the cap.
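
The fields listed above lend themselves to simple post-processing. The sketch below filters findings by severity, orders them by rank_score, and flags those whose evidence carries an engine note. It is written against local stand-in types so that it compiles on its own: the struct layouts, the `f64` type of `rank_score`, the `engine_notes` field name, and the helper functions `triage` and `hit_engine_bound` are assumptions, not the crate's actual definitions.

```rust
// Stand-in types mirroring the documented fields; the real types live in
// nyx_scanner and may differ in layout and field types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Severity { Info, Low, Medium, High, Critical } // declared lowest to highest

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum EngineNote { OriginsTruncated, SccBudgetExhausted }

#[derive(Clone, Debug)]
struct Evidence { engine_notes: Vec<EngineNote> } // field name is assumed

#[derive(Clone, Debug)]
struct Diag {
    id: String,
    path: String,
    line: usize,
    severity: Severity,
    rank_score: f64, // numeric type is assumed
    evidence: Option<Evidence>,
}

/// Keep findings at or above `min`, highest rank_score first.
fn triage(mut findings: Vec<Diag>, min: Severity) -> Vec<Diag> {
    findings.retain(|d| d.severity >= min);
    findings.sort_by(|a, b| b.rank_score.total_cmp(&a.rank_score));
    findings
}

/// True when the finding's evidence records at least one engine note.
fn hit_engine_bound(diag: &Diag) -> bool {
    diag.evidence.as_ref().map_or(false, |e| !e.engine_notes.is_empty())
}

/// Hand-written sample data standing in for the output of a scan.
fn sample() -> Vec<Diag> {
    let diag = |id: &str, path: &str, line, severity, rank_score, notes: Option<Vec<EngineNote>>| Diag {
        id: id.to_string(),
        path: path.to_string(),
        line,
        severity,
        rank_score,
        evidence: notes.map(|engine_notes| Evidence { engine_notes }),
    };
    vec![
        diag("taint-unsanitised-flow", "src/a.rs", 10, Severity::High, 80.0,
             Some(vec![EngineNote::OriginsTruncated])),
        diag("cfg-auth-gap", "src/b.rs", 20, Severity::Critical, 95.0, None),
        diag("taint-unsanitised-flow", "src/c.rs", 30, Severity::Low, 99.0,
             Some(vec![EngineNote::SccBudgetExhausted])),
    ]
}

fn main() {
    for d in triage(sample(), Severity::Medium) {
        let note = if hit_engine_bound(&d) { " (engine bound hit)" } else { "" };
        println!("{} at {}:{}{}", d.id, d.path, d.line, note);
    }
}
```

Findings with an engine note are kept in the output, in line with the guidance above that they are still genuine; the note is surfaced so a reviewer can weigh it.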

§Module map

Module	Role
`ast`	Tree-sitter parsing and two-pass analysis dispatch
`cfg`	CFG construction from ASTs
`ssa`	SSA lowering and optimization passes
`taint`	Forward SSA taint analysis
`cfg_analysis`	Structural CFG checks (auth gaps, resource leaks, error paths)
`state`	Resource lifecycle and state-machine analysis
`patterns`	Pattern-based AST checks
`auth_analysis`	Missing authorization / ownership checks
`callgraph`	Whole-program call graph and SCC analysis
`summary`	Per-function summaries for cross-file resolution
`labels`	Source, sanitizer, and sink rule registries per language
`symex`	Symbolic execution for witness generation and path feasibility
`abstract_interp`	Interval and string bounds propagation for sink suppression
`constraint`	Path constraint solving and infeasible-path pruning
`evidence`	Finding provenance and confidence types
`suppress`	Inline `nyx:ignore` directive handling
`output`	JSON and SARIF serialization
`database`	SQLite index pool and schema
`walk`	Filesystem traversal with batched delivery

§Modules

abstract_interp: Abstract interpretation framework.
ast: Tree-sitter parsing and two-pass analysis for all supported languages.
auth_analysis: Missing authorization and ownership checks (Rust-primary).
callgraph: Whole-program call graph built from pass-1 function summaries.
cfg: Intra-procedural control-flow graph construction.
cfg_analysis: CFG structural analysis: dominator-based checks over intra-procedural CFGs.
cli: Command-line interface definition via clap.
commands: Subcommand handlers and top-level dispatch.
constraint: Path constraint solving for infeasible path pruning.
convergence_telemetry: Convergence-loop telemetry: per-batch and per-file JSONL sidecar.
database: SQLite connection pool and schema for the incremental index.
engine_notes: Provenance notes attached to findings when the engine has hit an internal budget, widening, or lowering cap.
errors: Error types used throughout the scanner.
evidence: Structured evidence and confidence types for scan diagnostics.
fmt: Console output formatting for scan diagnostics.
interop: Explicit cross-language call-graph bridge edges.
labels: Per-language source, sanitizer, and sink rule registries.
output: Finding serialization and output routing.
patterns: AST pattern matching: tree-sitter queries over dangerous structural shapes.
pointer: Field-sensitive Steensgaard alias / points-to analysis.
rank: Attack surface ranking for scan diagnostics.
rust_resolve: Rust-specific module-path derivation and use declaration resolution.
server
ssa: SSA IR, lowering, and optimization passes.
state: State-model analysis: resource lifecycle and authentication state tracking.
summary: Per-function summaries for cross-file taint analysis.
suppress: Inline per-finding suppression via source-code comments.
symbol: Core language and function identity types.
symex: Symbolic execution targeting: candidate selection and constraint analysis for taint findings.
taint: Forward SSA taint analysis: the primary vulnerability detection engine.
utils: Shared utilities and configuration.
walk: Filesystem walker with batched path delivery.

§Functions

scan_no_index: Run a two-pass scan over root without an incremental index.