big-code-analysis is a library for analyzing and extracting information from source code written in many different programming languages.
You can find the source code of this software on GitHub, while issues and feature requests can be posted on the respective GitHub Issue Tracker.
§Quick start
Most callers want the recommended entry points exposed in prelude:
use big_code_analysis::prelude::*;
let source = b"fn main() {}";
let space = analyze(
Source::new(LANG::Rust, source),
MetricsOptions::default(),
).expect("Rust source parses");
println!("cognitive sum: {}", space.metrics.cognitive.cognitive_sum());
§Supported Languages
Each grammar is gated behind a per-language Cargo feature; the
default all-languages feature enables every grammar so the
historical “every language compiled in” behaviour is preserved.
Library consumers that only need a subset can opt out of the
defaults — see Per-language Cargo features in the book.
- Bash (bash)
- C/C++ (cpp, also exposes the internal Ccomment/Preproc helpers)
- C# (csharp)
- Elixir (elixir)
- Go (go)
- Groovy (groovy)
- Java (java)
- JavaScript (javascript)
- JavaScript, Firefox-internal “MozJS” (mozjs)
- Kotlin (kotlin)
- Lua (lua)
- Perl (perl)
- PHP (php)
- Python (python)
- Ruby (ruby)
- Rust (rust)
- Tcl (tcl)
- TSX (typescript)
- TypeScript (typescript)
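As a rough illustration of what per-language feature gating means in Rust, the sketch below uses `#[cfg(feature = "...")]` so that code for a language is only compiled when its feature is enabled. The function `compiled_languages` is hypothetical and is not part of this crate; only the feature names `rust` and `python` are taken from the list above.

```rust
/// Hypothetical helper: lists the grammars compiled into this build.
/// Each `push` only exists in the binary when the matching Cargo
/// feature is enabled at compile time.
fn compiled_languages() -> Vec<&'static str> {
    #[allow(unused_mut)] // `langs` is never mutated when no feature is on
    let mut langs = Vec::new();

    #[cfg(feature = "rust")]
    langs.push("rust");

    #[cfg(feature = "python")]
    langs.push("python");

    langs
}

fn main() {
    // The output depends on which features the build enables.
    println!("compiled grammars: {:?}", compiled_languages());
}
```

A consumer that disables the default features and enables only a subset would see only that subset here, which is the practical effect of opting out of all-languages.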
§Supported Metrics
- ABC: it measures the size of a source code based on assignments, branches, and conditions.
- CC: it calculates the code complexity by examining the control flow of a program. Both standard and modified flavours are exposed: the modified variant collapses all case/match arms inside a single switch/match/when/select into one decision point.
- Cognitive Complexity: it measures how difficult it is to understand a unit of code.
- SLOC: it counts the number of lines in a source file.
- PLOC: it counts the number of physical lines (instructions) contained in a source file.
- LLOC: it counts the number of logical lines (statements) contained in a source file.
- CLOC: it counts the number of comments in a source file.
- BLANK: it counts the number of blank lines in a source file.
- HALSTEAD: it is a suite that provides several measures, such as the effort required to maintain the analyzed code, the size in bits to store the program, the difficulty to understand the code, an estimate of the number of bugs present in the codebase, and an estimate of the time needed to implement the software.
- MI: it is a suite of measures for evaluating the maintainability of software.
- NOM: it counts the number of functions and closures in a file/trait/class.
- NEXITS: it counts the number of possible exit points from a method/function.
- NARGS: it counts the number of arguments of a function/method.
- NPA: it counts the number of public attributes of a class.
- NPM: it counts the number of public methods of a class.
- WMC: it is the sum of the complexities of all methods in a class.
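The difference between the standard and modified CC flavours can be sketched on a toy model. The `Decision` enum and both functions below are hypothetical and do not reflect the library's implementation; the sketch also assumes the common convention that complexity starts at 1 and that the standard flavour counts each arm as one decision point.

```rust
/// Toy decision point (hypothetical, not a library type).
#[derive(Debug, Clone, Copy)]
enum Decision {
    /// A single branch such as an `if` or a loop.
    Branch,
    /// A switch/match with the given number of arms.
    Switch { arms: usize },
}

/// Standard flavour (assumed): every arm is its own decision point.
fn standard_cc(decisions: &[Decision]) -> usize {
    1 + decisions
        .iter()
        .map(|d| match d {
            Decision::Branch => 1,
            Decision::Switch { arms } => *arms,
        })
        .sum::<usize>()
}

/// Modified flavour: all arms of one switch collapse into one decision point.
fn modified_cc(decisions: &[Decision]) -> usize {
    1 + decisions
        .iter()
        .map(|d| match d {
            Decision::Branch => 1,
            Decision::Switch { .. } => 1,
        })
        .sum::<usize>()
}

fn main() {
    // Two plain branches and one four-arm switch.
    let body = [
        Decision::Branch,
        Decision::Branch,
        Decision::Switch { arms: 4 },
    ];
    println!("standard: {}", standard_cc(&body)); // prints "standard: 7"
    println!("modified: {}", modified_cc(&body)); // prints "modified: 4"
}
```

Under these assumptions the two flavours only diverge for code containing multi-arm switches, which is why the modified value is often preferred for dispatch-heavy functions.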
Re-exports§
pub use crate::output::CSV_EXTENSION;
pub use crate::output::CSV_HEADER;
pub use crate::output::Dump;
pub use crate::output::DumpCfg;
pub use crate::output::OffenderRecord;
pub use crate::output::Severity;
pub use crate::output::TOOL_ID;
pub use crate::output::dump_node;
pub use crate::output::dump_ops;
pub use crate::output::dump_ops;
pub use crate::output::dump_root;
pub use crate::output::write_checkstyle;
pub use crate::output::write_clang_warning;
pub use crate::output::write_csv;
pub use crate::output::write_msvc_warning;
pub use crate::output::write_sarif;
pub use ::tree_sitter;
Modules§
- metrics
- Per-metric implementations.
- output
- Output formatters: CSV, SARIF, Checkstyle, clang/MSVC warning lines, and AST/metric pretty-dumps used by bca and the offender reporters.
- prelude
- Recommended entry points for the 90% case.
Structs§
- Ast
- Parse-once, compute-many handle.
- AstCallback
- Type tag identifying the AST extraction action; carries no data.
- AstCfg
- Configuration options for retrieving the nodes of an AST.
- AstNode
- Information on an AST node.
- AstPayload
- The payload of an Ast request.
- AstResponse
- The response of an AST request.
- BashCode
- Per-language code type tag for Bash; carries no data.
- CcommentCode
- Per-language code type tag for Ccomment; carries no data.
- CodeMetrics
- All metrics data.
- CommentRm
- Type tag identifying the comment-removal action; carries no data.
- CommentRmCfg
- Configuration options for removing comments from a code.
- ConcurrentRunner
- A runner to process files concurrently.
- Count
- Count of different types of nodes in a code.
- CountCfg
- Configuration options for counting different types of nodes in a code.
- CppCode
- Per-language code type tag for Cpp; carries no data.
- CsharpCode
- Per-language code type tag for Csharp; carries no data.
- Cursor
- An AST cursor.
- ElixirCode
- Per-language code type tag for Elixir; carries no data.
- FilesData
- Data related to files.
- Find
- Type tag identifying the node-find action; carries no data.
- FindCfg
- Configuration options for finding different types of nodes in a code.
- FuncSpace
- Function space data.
- Function
- Type tag identifying the function-extraction action; carries no data.
- FunctionCfg
- Configuration options for detecting the span of each function in a code.
- FunctionSpan
- Function span data.
- GoCode
- Per-language code type tag for Go; carries no data.
- GroovyCode
- Per-language code type tag for Groovy; carries no data.
- JavaCode
- Per-language code type tag for Java; carries no data.
- JavascriptCode
- Per-language code type tag for Javascript; carries no data.
- KotlinCode
- Per-language code type tag for Kotlin; carries no data.
- LuaCode
- Per-language code type tag for Lua; carries no data.
- MetricSet
- Bitfield of selected metrics.
- Metrics
- Type tag identifying the metric-computation action; carries no data.
- MetricsCfg
- Configuration options for computing the metrics of a code.
- MetricsOptions
- Per-traversal options for [metrics_with_options].
- MozjsCode
- Per-language code type tag for Mozjs; carries no data.
- Node
- An AST node.
- Ops
- All operands and operators of a space.
- OpsCfg
- Configuration options for retrieving all the operands and operators in a code.
- OpsCode
- Type tag identifying the operator/operand extraction action; carries no data.
- ParseMetricError
- Error returned by Metric::from_str when the input is not a recognised metric name.
- PerlCode
- Per-language code type tag for Perl; carries no data.
- PhpCode
- Per-language code type tag for Php; carries no data.
- PreprocCode
- Per-language code type tag for Preproc; carries no data.
- PreprocFile
- Preprocessor data of a C/C++ file.
- PreprocResults
- Preprocessor data of a series of C/C++ files.
- PythonCode
- Per-language code type tag for Python; carries no data.
- RubyCode
- Per-language code type tag for Ruby; carries no data.
- RustCode
- Per-language code type tag for Rust; carries no data.
- Source
- In-memory source bundle handed to analyze.
- TclCode
- Per-language code type tag for Tcl; carries no data.
- TsxCode
- Per-language code type tag for Tsx; carries no data.
- TypescriptCode
- Per-language code type tag for Typescript; carries no data.
Enums§
- ConcurrentErrors
- Series of errors that might happen when processing files concurrently.
- LANG
- The list of supported languages.
- Metric
- One metric computed by the analysis walker.
- MetricKind
- Stable metric identifier set that suppression markers can name.
- MetricsError
- Error returned by the library’s metric-computation entry points.
- SpaceKind
- The list of supported space kinds.
- SuppressionPolicy
- Whether downstream consumers (threshold checking, audit logging) should honor parsed suppression markers.
- SuppressionScope
- Which metrics a suppression marker covers.
Traits§
- Callback
- A trait for callback functions.
- LanguageInfo
- Static identification of a language code tag.
Functions§
- action
- Runs a function, which implements the Callback trait, on a code written in one of the supported languages.
- analyze
- Compute every metric for a Source.
- fix_includes
- Constructs a dependency graph of the include directives in a C/C++ file.
- get_from_emacs_mode
- Detects the language associated to the input Emacs mode.
- get_from_ext
- Detects the language associated to the input file extension.
- get_function_spaces (Deprecated)
- Returns all function spaces data of a code.
- get_function_spaces_with_options (Deprecated)
- Returns all function spaces data of a code, applying the per-traversal flags in options (e.g. exclude_tests: true to elide Rust #[cfg(test)] / #[test] subtrees from every metric). Equivalent to get_function_spaces when options is the default.
- get_language_for_file
- Detects the language of a code using the extension of a file.
- get_macros
- Returns the macros contained in a C/C++ file.
- get_ops
- Returns all operators and operands of each space in a code.
- guess_language
- Guesses the language of a code.
- is_generated
- Returns true when buf looks like generated code: its leading window (first ~50 lines or first 5 KiB, whichever is smaller) contains a known marker phrase. Matching is case-insensitive for the marker and never allocates on the negative path.
- metrics_from_tree
- Returns all function spaces data of a code, reusing a caller-supplied tree_sitter::Tree instead of running the bundled parser.
- preprocess
- Extracts preprocessor data from a C/C++ file and inserts these data in a PreprocResults object.
- read_file
- Reads a file, normalising all CR-only and CRLF line endings to LF.
- read_file_with_eol
- Reads a file, normalising all CR-only and CRLF line endings to LF, and ensures the buffer ends with exactly one \n. Returns None for files ≤ 3 bytes or files that appear to be non-UTF-8.
- write_file
file - Writes data to a file.
Type Aliases§
- BashParser
- The Bash language parser.
- CcommentParser
- The Ccomment language parser.
- CppParser
- The Cpp language parser.
- CsharpParser
- The Csharp language parser.
- ElixirParser
- The Elixir language parser.
- GoParser
- The Go language parser.
- GroovyParser
- The Groovy language parser.
- JavaParser
- The Java language parser.
- JavascriptParser
- The Javascript language parser.
- KotlinParser
- The Kotlin language parser.
- LuaParser
- The Lua language parser.
- MozjsParser
- The Mozjs language parser.
- PerlParser
- The Perl language parser.
- PhpParser
- The Php language parser.
- PreprocParser
- The Preproc language parser.
- PythonParser
- The Python language parser.
- RubyParser
- The Ruby language parser.
- RustParser
- The Rust language parser.
- Span
- Start and end positions of a node in a code in terms of rows and columns.
- TclParser
- The Tcl language parser.
- TsxParser
- The Tsx language parser.
- TypescriptParser
- The Typescript language parser.