cargo-cola 1.0.2

Security static analyzer for Rust. Analyzes MIR to detect vulnerabilities. (Requires nightly)
cargo-cola-1.0.2 is not a library.

Rust-cola

Version License

Security static analyzer for Rust code.

Note: Requires nightly Rust. Target code must compile to extract MIR (Mid-level Intermediate Representation).

Installation

rustup toolchain install nightly

git clone https://github.com/Opus-the-penguin/Rust-cola.git

cd Rust-cola

cargo build --release

Binary: target/release/cargo-cola

The CLI is named cargo-cola; once built, invoke cargo-cola ... for scans.

Linux only: Requires OpenSSL development libraries (libssl-dev on Debian/Ubuntu, openssl-devel on Fedora/RHEL). macOS and Windows use native TLS and need no extra dependencies.

Note: The examples/ directory contains intentionally vulnerable code patterns for testing Rust-COLA's detection capabilities. These crates may have unmaintained or vulnerable dependencies by design and are not part of the distributed tool.

Usage

Rust-cola works with LLM-assisted triage.

  1. Run the scan on your target project:

    cargo-cola --crate-path <PATH>
    

    The scan emits an artifact set at out/cola/, including an LLM prompt.

  2. Point your LLM at out/cola/llm-prompt.md:

    • VS Code + Copilot: Reference the file in chat
    • Claude/ChatGPT: Upload or paste the file contents
  3. The LLM follows the prompt instructions and saves its report to out/cola/security-report.md.

Without LLM (air-gapped/data residency): Use --no-llm-prompt for standalone analysis. The raw-report.md contains all findings but without LLM false-positive filtering.

Options

Flag Description
--crate-path <PATH> Target crate or workspace (default: .)
--out-dir <PATH> Output directory (default: out/cola)
--config <PATH> Path to configuration file (YAML format)
--report <PATH> Generate standalone heuristic report
--sarif <PATH> Custom SARIF output path
--llm-prompt <PATH> Path for LLM prompt file
--llm-endpoint <URL> LLM API endpoint
--llm-model <NAME> Model name (e.g., gpt-4, llama3)
--llm-temperature <FLOAT> Sampling temperature (default: 0.0)
--exclude-tests <bool> Exclude test code (default: true)
--with-audit Run cargo-audit to check dependencies
--rulepack <PATH> Additional rules from YAML
--rules Print loaded rule metadata
--fail-on-findings <bool> Exit with code 1 when findings are produced (default: true)

Run cargo-cola --help for the full list.

VS Code Copilot Chat workflow

If you prefer to stay inside VS Code and let Copilot Chat drive the scan, a lightweight flow is:

  • run cargo-cola scan on /<PATH> to point at the target crate/workspace.
  • emit artifacts to /<PATH>/out/cola/ (or specify --out-dir <PATH> explicitly).
  • read and execute instructions in /<PATH>/out/cola/llm-prompt.md once the artifacts land.
  • save final security report in /<PATH>/out/cola/security-report.md so the scan and triage output live together.

This skips the manual paste workflow and keeps the Scan -> LLM triage -> report loop inside one chat context.

Workflow

Source Code -> MIR -> Rule Engine -> Taint* -> Raw Findings -> LLM Triage -> Security Report

Taint stage runs only for rules that opt into MIR data-flow today; pattern-only rules skip straight to Raw Findings.

The LLM triage step applies a structured analysis:

Verify -> Guards -> Prune -> Exploit -> Impact -> Severity -> Fix

Output Artifacts

By default, all artifacts are written to out/cola/ relative to your current working directory (not the target crate). Existing files are timestamped to avoid overwriting.

File Description
manifest.json Metadata and paths for all generated artifacts
mir.json MIR extraction (functions, blocks, statements)
ast.json AST extraction (modules, functions, structs)
hir.json HIR extraction for researchers (optional, requires --features hir-driver)
raw-findings.json Raw findings from all rules (pre-LLM validation)
raw-findings.sarif Raw SARIF 2.1.0 output with all findings (includes codeContext and suppressions for audit trail)
raw-report.md Standalone report without LLM validation
llm-prompt.md Prompt file for LLM triage
security-report.md LLM-generated report (created after LLM triage)

Rules

126 rules grouped by vulnerability category:

Category Rules Examples
Memory Safety 24 Transmute misuse, uninitialized memory, Box leaks, raw pointer escapes, slice safety, self-referential structs, returned refs to locals, UnsafeCell aliasing, lazy init poison, use-after-free
Concurrency 21 Mutex across await, blocking in async, Send/Sync violations, executor starvation, closure escaping refs, cancellation safety, async drop correctness, panic in Drop, task panic propagation
Input Validation 15 Env vars, stdin, unicode, deserialization, division by untrusted, timestamp overflow, binary deser, regex DoS, integer overflow, allocation size
FFI 11 Allocator mismatch, CString pointer misuse, packed fields, panic in FFI, WASM linear memory OOB, WASM host trust, WASM capability leaks
Web Security 14 TLS validation, CORS, cookies, passwords in logs, Content-Length, template injection, unsafe Send across async
Injection 10 SQL injection, command injection, path traversal, SSRF, log injection
Resource Management 10 File permissions, open options, infinite iterators, unbounded allocations
Code Quality 9 Dead stores, assertions, crate-wide allow, RefCell, commented code, unwrap in hot paths
Cryptography 8 Weak hashes (MD5/SHA1), weak ciphers, hardcoded keys, timing attacks, PRNG bias
Supply Chain 4 RUSTSEC advisories, yanked crates, auditable dependencies, proc-macro side effects

Detection Levels

Rules use different analysis techniques with varying precision:

Level Method Precision Rules %
Heuristic Pattern matching on MIR text Good 63 50%
Structural MIR statement/terminator analysis Better 26 21%
Dataflow Intra-function value tracking High 32 25%
Interprocedural Cross-function taint tracking Highest 5 4%

See the Rule Development Guide for custom rules and YAML rulepacks.

Interprocedural Analysis

Some rules (SQL injection, path traversal, SSRF, etc.) track data flow across function calls. This is memory-intensive on large codebases, so analysis has built-in limits configurable via YAML:

cargo-cola --config cargo-cola.yaml --crate-path .

Example cargo-cola.yaml:

analysis:
  max_path_depth: 8          # Maximum call chain depth (default: 8)
  max_flows_per_source: 200  # Flows per source function (default: 200)
  max_visited: 1000          # Functions visited per exploration (default: 1000)
  max_total_flows: 5000      # Total inter-procedural flows (default: 5000)
  max_functions_for_ipa: 10000  # Skip IPA for crates larger than this (default: 10000)

See examples/cargo-cola.yaml for a complete example.

Etymology

cola = COde Lexical Analyzer. Also, cola removes rust.

License

MIT - See LICENSE for details.

Contributing

Please file issues with feedback or suggestions.