big-code-analysis

big-code-analysis is a hard fork of the rust-code-analysis project. This project is an unapologetic vibe-coded fork that seeks to add as many features and functions as fast as possible.

Nonetheless, it is still a Rust library to analyze and extract information from source code written in many different programming languages. It is built on Tree-sitter, a parser generator tool and incremental parsing library.

A command line tool called bca is provided to interact with the API of the library in an easy way.

This tool can be used to:

Call big-code-analysis API
Print nodes and metrics information
Export metrics in different formats
Generate a Markdown or HTML quality-metrics report (bca report markdown / bca report html)

In addition, we provide a bca-web tool to use the library through a REST API.

Live example reports

bca runs against its own source on every push to main and publishes the result alongside the documentation:

HTML hotspot report: https://dekobon.github.io/big-code-analysis/reports/index.html
Markdown PR/MR comment: https://dekobon.github.io/big-code-analysis/reports/report.md

The wiring lives in .github/workflows/pages.yml. For downstream projects, the CI integration recipe is the canonical adoption guide — it documents the recommended pinned-release install path (with BCA_VERSION + sha256 pin) plus a cargo install alternative. The in-tree pages.yml workflow builds bca from the current checkout because main may carry CLI artifact schemas that no released bca supports yet — see the schema-compatibility note in the recipe before copying that pattern.

Usage

big-code-analysis supports many programming languages and computes a wide variety of metrics. You can find up-to-date documentation at Documentation.

On the Commands page, there is a list of commands that can be run to get information about metrics, nodes, and other general data provided by this software.

Using as a library

big-code-analysis is published on crates.io and can be embedded directly. The crate is on the 1.x line and ships under a written stability contract: the public API surface is held stable across patch and minor bumps, and breaking shape changes are reserved for the next major bump. Metric values may still drift across minor bumps when a grammar pin moves or a metric definition is fixed — see STABILITY.md for the full versioning contract, MSRV policy, escape hatches, and exactly what we do and do not promise within 1.x.

For task-oriented walkthroughs — quick start, in-memory analysis, walking FuncSpace results, and error handling — see the Using as a Library section of the book.

Python bindings (PyO3) live in big-code-analysis-py/ and ship the same metric pipeline as a Python package. See the book's Python Bindings section for the install matrix, batch / async / SARIF recipes, and the full error taxonomy.

Per-language Cargo features

Every tree-sitter grammar is gated behind a per-language Cargo feature. The default feature set is all-languages, so a bare

big-code-analysis = "2.0.0"

pulls every grammar in (matching the library's historical behaviour and what the bca / bca-web binaries ship). Library consumers that only need a subset of languages can opt out of the defaults and re-enable just the grammars they want:

big-code-analysis = { version = "2.0.0", default-features = false, features = ["rust", "typescript"] }

Supported language features: bash, c, cpp, csharp, elixir, go, groovy, irules, java, javascript, kotlin, lua, mozcpp, mozjs, perl, php, python, ruby, rust, tcl, typescript.

The irules feature adds F5 iRules (a Tcl dialect; extensions .irule / .irules).

The cpp feature backs the Cpp LANG variant with the upstream community tree-sitter-cpp grammar and, with the Ccomment and Preproc C-family helper variants, pulls in tree-sitter-cpp, bca-tree-sitter-ccomment, and bca-tree-sitter-preproc together. The c feature (added in #721) backs the dedicated C LANG variant with upstream tree-sitter-c and owns .c; it shares the same ccomment / preproc C-family helpers. .h stays on Cpp.

The opt-in mozcpp feature adds the Mozcpp LANG variant, the Mozilla/Gecko C++ dialect (vendored bca-tree-sitter-mozcpp fork), which owns no file extensions and is selected only by name (--language mozcpp, manifest, or API), mirroring mozjs for .jsm. Since #720, cpp no longer pulls bca-tree-sitter-mozcpp (a breaking dep-set change for --no-default-features consumers).
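The extension-ownership rules above can be sketched as a small lookup. This is an illustrative, standalone model only: language_for_extension and resolve_language are hypothetical helpers, not functions exported by big-code-analysis, and the table covers only the extensions named in this section.

```rust
/// Hypothetical helper (not part of the big-code-analysis API): maps a file
/// extension to the language feature that owns it, using only the ownership
/// rules stated in this section.
pub fn language_for_extension(ext: &str) -> Option<&'static str> {
    match ext {
        "c" => Some("c"),                     // the c feature owns .c
        "h" => Some("cpp"),                   // .h stays on Cpp
        "irule" | "irules" => Some("irules"), // F5 iRules extensions
        _ => None,                            // extensions not covered by this sketch
    }
}

/// Hypothetical helper: an explicit language name wins over extension lookup.
/// This is how a variant that owns no extensions (such as mozcpp) gets
/// selected in this model.
pub fn resolve_language<'a>(explicit: Option<&'a str>, ext: &str) -> Option<&'a str> {
    match explicit {
        Some(name) => Some(name),
        None => language_for_extension(ext),
    }
}
```

In this model, resolve_language(Some("mozcpp"), "h") yields "mozcpp", while resolve_language(None, "h") falls back to "cpp". The real selection logic may differ; consult the book for the authoritative extension table.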

The LANG enum keeps every variant defined regardless of the active feature set; selecting a LANG variant whose feature is off returns Err(MetricsError::LanguageDisabled(LANG)) from every dispatch entry point (analyze, metrics_from_tree, action, get_ops, the deprecated get_function_spaces* shims, and LANG::tree_sitter_language). The set of compiled-in variants is queryable via LANG::is_enabled.
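The pattern described above (every variant always defined, with disabled ones rejected at dispatch time) can be illustrated with a toy model. This sketch does not use the real crate: Lang, MetricsError, ENABLED, and this analyze function are simplified stand-ins with assumed signatures, and a constant slice plays the role of the compile-time Cargo feature set.

```rust
/// Toy stand-in for the crate's LANG enum: every variant exists
/// regardless of which languages are enabled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lang {
    Rust,
    Typescript,
    Python,
}

/// Toy stand-in for the crate's error type.
#[derive(Debug, PartialEq, Eq)]
pub enum MetricsError {
    LanguageDisabled(Lang),
}

/// Assumption for this sketch: models a build with only the `rust` and
/// `typescript` features turned on. The real crate decides this at compile
/// time from Cargo features, not from a runtime list.
const ENABLED: &[Lang] = &[Lang::Rust, Lang::Typescript];

impl Lang {
    /// Reports whether this variant is in the modelled feature set.
    pub fn is_enabled(self) -> bool {
        ENABLED.contains(&self)
    }
}

/// Hypothetical dispatch entry point: rejects disabled languages up front,
/// then returns a placeholder "metric" (the number of source lines).
pub fn analyze(lang: Lang, source: &str) -> Result<usize, MetricsError> {
    if !lang.is_enabled() {
        return Err(MetricsError::LanguageDisabled(lang));
    }
    Ok(source.lines().count())
}
```

A caller can either check is_enabled before dispatching or match on the LanguageDisabled error afterwards; the exact signatures in the real crate are documented in the book and on docs.rs.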

Building

The repository ships a Makefile that wraps every common build, test, lint, and docs task. Run make help for the full list, and make check-tools to verify the optional tools are installed.

make build           # debug build of the entire workspace
make build-release   # optimised release build

If you prefer to run cargo directly, or want to build a single crate:

cargo build                              # library only
cargo build -p big-code-analysis-cli     # CLI only
cargo build -p big-code-analysis-web     # web server only
cargo build --workspace                  # everything in one shot

Testing

make test           # cargo test --workspace --all-features --lib --bins --tests
make test-doc      # cargo test --workspace --all-features --doc
make pre-commit    # full local gate: fmt-check, clippy, tests, udeps, lint families

make pre-commit is the recommended gate before committing — it is equivalent to what CI runs. If GNU Make 4 or any of the optional tools are unavailable, the raw cargo invocation still works:

cargo test --workspace --all-features --verbose

Updating insta tests

We use insta for snapshot testing. To update the snapshot tests, install cargo-insta first.

make insta-review   # cargo insta test --review

This command runs the tests, generates the new snapshot references, and lets you review them.

Updating grammars

See the Update grammars guide to learn how to update language grammars.

Contributing

If you want to contribute to the development of this software, have a look at the guidelines contained in our Developers Guide.

Licenses

Mozilla-defined grammars are released under the MIT license.
big-code-analysis, big-code-analysis-cli and big-code-analysis-web are released under the Mozilla Public License v2.0.

big-code-analysis 2.0.0