big-code-analysis-cli 1.1.0

Tool to compute and export code metrics
# big-code-analysis-cli

`bca` analyzes source code and emits per-file structured metrics,
aggregated reports, AST dumps, node lookups, and more.

> **Migrating from the flag-style CLI?** The CLI is now subcommand-driven.
> See the [migration guide](../big-code-analysis-book/src/migration.md)
> for old-form -> new-form mappings of every flag.

## Installation

```sh
cd big-code-analysis-cli/
cargo build
```

## Usage

```sh
bca [GLOBAL OPTIONS] <COMMAND> [COMMAND OPTIONS]
```

The global options describe *what to walk* (paths, includes/excludes,
parallelism, language overrides). The command picks *what to do* with each
file, with command-specific options as needed.

## Commands

| Command | Purpose |
| --- | --- |
| `metrics` | Per-file metric output (`-O json/yaml/toml/cbor`, `-o DIR`). |
| `ops` | Per-file operand/operator output (same formats as `metrics`). |
| `report <FORMAT>` | Aggregated report. `markdown` today; `html` reserved. |
| `dump` | AST dump to stdout. |
| `find <NODE>...` | Find nodes of one or more types. |
| `count <NODE>...` | Count nodes of one or more types. |
| `functions` | List functions/methods and their spans. |
| `strip-comments` | Remove comments from source files (`--in-place`). |
| `preproc` | Build preprocessor-data JSON for C/C++ analysis. |
| `list-metrics [names\|descriptions]` | List computable metrics. |

Run `bca <COMMAND> --help` for command-specific options.

## Global options

- `-p, --paths <FILE>...` — input files or directories.
- `-I, --include [<GLOB>...]` — include files matching pattern.
- `-X, --exclude [<GLOB>...]` — exclude files matching pattern.
- `-j, --num-jobs <N>` — worker threads.
- `-l, --language-type <LANG>` — force a language instead of inferring.
- `--ls <LINE_START>` / `--le <LINE_END>` — line range (used by `dump`,
  `find`).
- `-w, --warning` — print warnings (skipped files, unrecognized
  languages).
- `--no-skip-generated` — disable auto-skip of files marked as generated
  (see [Skipping generated code](#skipping-generated-code)).
- `--report-skipped` — log a `skipped (generated): <path>` line to stderr
  for every file the generated-code detector excludes.
- `--preproc-data <FILE>` — consume an existing preproc JSON during C/C++
  analysis. Build one with `bca preproc`.

Global options work both before and after the subcommand.

## Building with a subset of languages

The shipped `bca` binary compiles in every supported tree-sitter
grammar. The `big-code-analysis-cli` crate pins the library's
`all-languages` feature set explicitly, so passing
`--no-default-features` or a custom `--features` list to
`cargo build -p big-code-analysis-cli` does **not** drop grammars
from the resulting binary: feature selection on the CLI crate is
not honoured. See [#252][issue-252] for the rationale: silently
dropping a grammar from a user-facing binary would surface as
"language X stopped working" rather than as a build error.

Consumers who need a reduced feature set should embed the
`big-code-analysis` library in their own Rust code and control
feature selection in their own `Cargo.toml`. See the library's
[per-language Cargo features][cargo-features] chapter for the full
list of features and a worked example.

[cargo-features]: https://dekobon.github.io/big-code-analysis/library/cargo-features.html
[issue-252]: https://github.com/dekobon/big-code-analysis/issues/252

## Examples

Per-file JSON metrics:

```sh
bca --paths ./src metrics -O json -o ./out/
```

Aggregated markdown quality report:

```sh
bca --paths "$PWD" --num-jobs "$(nproc)" \
    report markdown --top 20 --strip-prefix "$PWD/"
```

AST dump for one file:

```sh
bca --paths ./file.rs dump
```

List all metrics with one-line descriptions:

```sh
bca list-metrics descriptions
```

## Skipping generated code

Generated bindings (protobuf stubs, OpenAPI clients, lex/yacc output,
build-system plumbing) inflate metrics for code no human will refactor.
By default, `bca` scans the first ~50 lines / 5 KiB of each file for a
generated-code marker and skips matches before parsing.

Recognized markers (case-insensitive):

- `@generated` — Facebook / Meta convention; also emitted by buck2,
  rustfmt, prettier, and many code generators.
- `DO NOT EDIT` — Go's `// Code generated by … DO NOT EDIT.` is the
  canonical form; the bare phrase is also widely copied (Bazel, protoc,
  OpenAPI clients).
- `GENERATED CODE` — Lizard's marker, recognized for compatibility.

A marker phrase that appears only deep in the file body (past the scan
window) does not trigger the skip.
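
To preview which files the detector is likely to skip, the scan described above can be approximated in plain shell. This is a rough sketch, not `bca`'s actual implementation: the helper name `looks_generated` is hypothetical, and it assumes the window ends at whichever limit is reached first (50 lines or 5120 bytes).

```sh
#!/bin/sh
# Hypothetical helper: approximates the generated-code scan described above.
# Assumption: the scan window is the first 50 lines, capped at 5120 bytes.
# Returns 0 when a recognized marker appears inside that window.
looks_generated() {
    head -n 50 -- "$1" | head -c 5120 |
        grep -qiE '@generated|DO NOT EDIT|GENERATED CODE'
}

# Report every file passed on the command line that matches a marker.
for f in "$@"; do
    if looks_generated "$f"; then
        printf 'likely skipped: %s\n' "$f"
    fi
done
```

Saved as a script, it takes file paths as arguments and prints the ones whose header carries a marker. The results may differ from `bca` at the edges of the scan window.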

To restore the previous behavior and analyze everything, pass
`--no-skip-generated`. To audit which files were excluded, pass
`--report-skipped`; the CLI logs `skipped (generated): <path>` to stderr
for each file.
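
For an audit, the logged lines can be reduced to a plain list of paths. The sketch below assumes each relevant stderr line contains the literal text `skipped (generated): ` followed by the path, as described above. The helper name `skipped_paths` and the file name `skipped.log` are illustrative, not part of the CLI.

```sh
#!/bin/sh
# Hypothetical helper: reads log text on stdin and prints only the skipped paths.
# Assumption: the path follows the literal text "skipped (generated): ";
# anything before that text (for example a logger prefix) is discarded.
skipped_paths() {
    sed -n 's/^.*skipped (generated): //p'
}

# Example usage (requires bca on PATH):
#   bca --paths ./src --report-skipped metrics -O json -o ./out/ 2> skipped.log
#   skipped_paths < skipped.log | sort
```

Unrelated stderr lines, such as the warnings enabled by `--warning`, are ignored because they do not match the pattern.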