big-code-analysis-cli
bca analyzes source code and emits per-file structured metrics,
aggregated reports, AST dumps, node lookups, and more.
Migrating from the flag-style CLI? The CLI is now subcommand-driven. See the migration guide for old-form -> new-form mappings of every flag.
Installation
Usage
The global options describe what to walk (paths, includes/excludes, parallelism, language overrides). The command picks what to do with each file, with command-specific options as needed.
Commands
| Command | Purpose |
|---|---|
| `metrics` | Per-file metric output (`-O json/yaml/toml/cbor`, `-o DIR`). |
| `ops` | Per-file operand/operator output (same formats as `metrics`). |
| `report <FORMAT>` | Aggregated report. `markdown` today; `html` reserved. |
| `dump` | AST dump to stdout. |
| `find <NODE>...` | Find nodes of one or more types. |
| `count <NODE>...` | Count nodes of one or more types. |
| `functions` | List functions/methods and their spans. |
| `strip-comments` | Remove comments from source files (`--in-place`). |
| `preproc` | Build preprocessor-data JSON for C/C++ analysis. |
| `list-metrics [names\|descriptions]` | List computable metrics. |
Run bca <COMMAND> --help for command-specific options.
Global options
- `-p, --paths <FILE>...`: input files or directories.
- `-I, --include [<GLOB>...]`: include files matching pattern.
- `-X, --exclude [<GLOB>...]`: exclude files matching pattern.
- `-j, --num-jobs <N>`: worker threads.
- `-l, --language-type <LANG>`: force a language instead of inferring.
- `--ls <LINE_START>` / `--le <LINE_END>`: line range (used by `dump`, `find`).
- `-w, --warning`: print warnings (skipped files, unrecognized languages).
- `--no-skip-generated`: disable auto-skip of files marked as generated (see Skipping generated code).
- `--report-skipped`: log a `skipped (generated): <path>` line to stderr for every file the generated-code detector excludes.
- `--preproc-data <FILE>`: consume an existing preproc JSON during C/C++ analysis. Build one with `bca preproc`.
Global options work both before and after the subcommand.
Building with a subset of languages
The shipped bca binary compiles every supported tree-sitter grammar
in. The big-code-analysis-cli crate pins the library's
all-languages feature set explicitly, so passing
--no-default-features or a custom --features list to
cargo build -p big-code-analysis-cli does not drop grammars
from the resulting binary — feature selection on the CLI crate is
not honoured (see #252 for the rationale: dropping a
grammar silently from a user-facing binary would surface as
"language X stopped working" rather than a build error).
Consumers who need a reduced feature set should embed the
big-code-analysis library in their own Rust code and control
feature selection in their own Cargo.toml. See the library's
per-language Cargo features chapter for the full
list of features and a worked example.
Examples
Per-file JSON metrics:
Aggregated markdown quality report:
AST dump for one file:
List all metrics with one-line descriptions:
Skipping generated code
Generated bindings (protobuf stubs, OpenAPI clients, lex/yacc output,
build-system plumbing) inflate metrics for code no human will refactor.
By default, bca scans the first ~50 lines / 5 KiB of each file for a
generated-code marker and skips matches before parsing.
Recognized markers (case-insensitive):
- `@generated`: Facebook / Meta convention; also emitted by buck2, rustfmt, prettier, and many code generators.
- `DO NOT EDIT`: Go's `// Code generated by … DO NOT EDIT.` is the canonical form; the bare phrase is also widely copied (Bazel, protoc, OpenAPI clients).
- `GENERATED CODE`: Lizard's marker, recognized for compatibility.
A marker phrase that appears only deep in the file body (past the scan window) does not trigger the skip.
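
The scan-window rule can be illustrated with a minimal Rust sketch. This is not bca's actual detector: the function name `looks_generated`, the constants, the plain substring match, and the reading of "~50 lines / 5 KiB" as "whichever limit is reached first" are all assumptions made for illustration.

```rust
// Hypothetical limits mirroring the "~50 lines / 5 KiB" window described above.
const MAX_SCAN_LINES: usize = 50;
const MAX_SCAN_BYTES: usize = 5 * 1024;

// The three recognized markers, upper-cased so matching is case-insensitive.
const MARKERS: [&str; 3] = ["@GENERATED", "DO NOT EDIT", "GENERATED CODE"];

/// Returns true if any marker appears inside the scan window at the top of
/// `source`. Assumption: the window ends at whichever limit is hit first.
fn looks_generated(source: &[u8]) -> bool {
    // Cap the window by bytes first.
    let window = &source[..source.len().min(MAX_SCAN_BYTES)];
    // Lossy decoding, so a window cut mid-character or a non-UTF-8 file
    // cannot cause a panic.
    let text = String::from_utf8_lossy(window);
    // Then cap by lines and look for a marker on each line.
    text.lines().take(MAX_SCAN_LINES).any(|line| {
        let upper = line.to_ascii_uppercase();
        MARKERS.iter().any(|marker| upper.contains(*marker))
    })
}

fn main() {
    let generated = b"// Code generated by protoc-gen-go. DO NOT EDIT.\npackage pb\n";
    let handwritten = b"package main\n\nfunc main() {}\n";
    println!("{}", looks_generated(generated)); // prints "true"
    println!("{}", looks_generated(handwritten)); // prints "false"
}
```

In this sketch a marker placed on line 51 or later, or beyond the first 5 KiB, is never seen, which matches the behavior described above for marker phrases deep in the file body.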
To restore the previous behavior and analyze everything, pass
--no-skip-generated. To audit which files were excluded, pass
--report-skipped; the CLI logs skipped (generated): <path> to stderr
for each file.