colgrep 1.0.2

Quick Start

Install:

# macOS / Linux
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/lightonai/next-plaid/releases/latest/download/colgrep-installer.sh | sh

# Windows (PowerShell)
powershell -c "irm https://github.com/lightonai/next-plaid/releases/latest/download/colgrep-installer.ps1 | iex"

Search:

colgrep "database connection pooling"

The first run builds the index automatically. No setup, no config, no dependencies.

Search Modes

ColGREP supports three search modes: semantic, regex, and hybrid (both combined).

Semantic Search

Find code by meaning, even when keywords don't match exactly:

colgrep "function that retries HTTP requests"
colgrep "error handling in API layer"
colgrep "authentication middleware" ./src

Regex Search

Use -e for traditional pattern matching (ERE syntax by default):

colgrep -e "async fn\s+\w+"
colgrep -e "TODO|FIXME|HACK"
colgrep -e "impl\s+Display" --include="*.rs"

Hybrid Search

Combine regex filtering with semantic ranking. Regex narrows the candidates, semantics ranks them:

# Find async functions, rank by "error handling"
colgrep -e "async fn" "error handling"

# Find Result types, rank by "database operations"
colgrep -e "Result<" "database operations" --include="*.rs"

# Find TODOs, rank by relevance to "security"
colgrep -e "TODO" "security concerns"

CLI Reference

Search Options

Flag	Long	Description
`-e`	`--pattern`	Regex pre-filter (ERE syntax)
`-E`	`--extended-regexp`	ERE mode (default, kept for grep compat)
`-F`	`--fixed-strings`	Treat `-e` as literal string
`-w`	`--word-regexp`	Whole-word match for `-e`
`-k`	`--results`	Number of results (default: 15)
`-n`	`--lines`	Context lines to show (default: 6)
`-l`	`--files-only`	List matching files only
`-c`	`--content`	Show full function/class content
`-r`	`--recursive`	Recursive (default, for grep compat)
`-y`	`--yes`	Auto-confirm indexing
	`--json`	JSON output
	`--code-only`	Skip docs/config files
	`--include`	Filter by glob (e.g., `"*.rs"`)
	`--exclude`	Exclude files by glob
	`--exclude-dir`	Exclude directories
	`--model`	Override ColBERT model
	`--no-pool`	Disable embedding pooling
	`--pool-factor`	Set pool factor (default: 2)

Filtering

# By file extension
colgrep --include="*.py" "database query"
colgrep --include="*.{ts,tsx}" "React component"

# By path pattern
colgrep --include="src/**/*.rs" "config parsing"
colgrep --include="**/tests/**" "test helper"

# Exclude files or directories
colgrep --exclude="*.test.ts" "component"
colgrep --exclude-dir="vendor" --exclude-dir="node_modules" "import"

# Search specific paths
colgrep "error handling" ./src/api ./src/auth

# Code-only (skip markdown, yaml, json, etc.)
colgrep --code-only "authentication logic"

Glob pattern syntax:

Pattern	Matches
`*.py`	All Python files
`*.{ts,tsx}`	TypeScript and TSX files
`src/**/*.rs`	Rust files under `src/`
`**/tests/**`	Files in any `tests/` directory
`*_test.go`	Go test files

Output Modes

# Default: filepath:lines with context
colgrep "authentication"

# Files only (like grep -l)
colgrep -l "database queries"

# Full content with syntax highlighting
colgrep -c "authentication handler" -k 5

# JSON for scripting
colgrep --json "auth" | jq '.[] | .unit.file'
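For scripting beyond jq, the JSON output can also be consumed from Python. The following is a minimal sketch, assuming the output is a JSON array whose items carry a `unit.file` field (the only field confirmed by the jq filter above). The `files_from_results` helper and the sample payload are hypothetical.

```python
import json


def files_from_results(raw_json: str) -> list[str]:
    """Return unique file paths, in ranked order, from `colgrep --json` output."""
    results = json.loads(raw_json)
    seen = set()
    files = []
    for item in results:
        path = item["unit"]["file"]  # same field the jq filter reads
        if path not in seen:
            seen.add(path)
            files.append(path)
    return files


# Hypothetical sample; real output is expected to carry more fields per result.
sample = (
    '[{"unit": {"file": "src/auth.py"}},'
    ' {"unit": {"file": "src/db.py"}},'
    ' {"unit": {"file": "src/auth.py"}}]'
)
print(files_from_results(sample))
```

In a real script, the string would come from something like `subprocess.run(["colgrep", "--json", "auth"], capture_output=True, text=True).stdout`.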

Subcommands

Command	Description
`colgrep status`	Show index status for current project
`colgrep clear`	Clear index for current project
`colgrep clear --all`	Clear all indexes
`colgrep set-model <ID>`	Change the default ColBERT model
`colgrep settings`	View or modify configuration
`colgrep --stats`	Show search statistics for all indexes

Configuration

# Show current config
colgrep settings

# Set default results count
colgrep settings --k 20

# Set default context lines
colgrep settings --n 10

# Use INT8 quantized model (faster inference)
colgrep settings --int8

# Use FP32 full precision (more accurate)
colgrep settings --fp32

# Set embedding pool factor (2 = 50% smaller index, 1 = full precision)
colgrep settings --pool-factor 2

# Set parallel encoding sessions (default: CPU count, max 16)
colgrep settings --parallel 8

# Set batch size per session (default: 1 for CPU, 64 for CUDA)
colgrep settings --batch-size 2

# Enable verbose output by default
colgrep settings --verbose

# Reset a value to default (pass 0)
colgrep settings --k 0 --n 0

Change Model

# Temporary (single query)
colgrep "query" --model lightonai/LateOn-Code

# Permanent (clears existing indexes)
colgrep set-model lightonai/LateOn-Code

# Private HuggingFace model
HF_TOKEN=hf_xxx colgrep set-model myorg/private-model

Config stored at ~/.config/colgrep/config.json.

Agent Integrations

Agent	Install	Uninstall
Claude Code	`colgrep --install-claude-code`	`colgrep --uninstall-claude-code`
OpenCode	`colgrep --install-opencode`	`colgrep --uninstall-opencode`
Codex	`colgrep --install-codex`	`colgrep --uninstall-codex`

Restart your agent after installing.

Claude Code Integration

The Claude Code integration installs session and task hooks that:

Inject colgrep usage instructions into the agent's system prompt
Check index health before activating (skips if >3000 chunks need indexing or index is desynced)
Propagate colgrep instructions to spawned sub-agents via task hooks

This means Claude Code automatically uses colgrep as its primary search tool when the index is ready.

Complete Uninstall

Remove colgrep from all AI tools, clear all indexes, and delete all data:

colgrep --uninstall

How It Works

flowchart LR
    A["Source files"] --> B["Tree-sitter\nParse AST"]
    B --> C["5-Layer Analysis"]
    C --> D["Structured Text"]
    D --> E["ColBERT Encoder\nLateOn-Code-edge\n17M params"]
    E --> F["PLAID Index\nQuantized\nMemory-mapped"]
    F --> G["Search"]

    style A fill:#4a90d9,stroke:#357abd,color:#fff
    style B fill:#50b86c,stroke:#3d9956,color:#fff
    style C fill:#50b86c,stroke:#3d9956,color:#fff
    style D fill:#50b86c,stroke:#3d9956,color:#fff
    style E fill:#e8913a,stroke:#d07a2e,color:#fff
    style F fill:#e8913a,stroke:#d07a2e,color:#fff
    style G fill:#9b59b6,stroke:#8445a0,color:#fff

1. Parse

Tree-sitter parses source files into ASTs and extracts code units: functions, methods, classes, constants, and raw code blocks (module-level statements not covered by other units). This gives 100% file coverage.

2. Analyze (5 Layers)

Each code unit is enriched with five layers of analysis:

Layer	Extracts	Example
AST	Signature, parameters, return type, docstring, parent class	`def fetch(url: str) -> Response`
Call Graph	Outgoing calls + reverse `called_by`	`Calls: range, client.get`
Control Flow	Loops, branches, error handling, cyclomatic complexity	`has_error_handling: true`
Data Flow	Variable declarations and assignments	`Variables: i, e`
Dependencies	Imports used within the function	`Uses: client, RequestError`
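To make the layers concrete, here is a rough sketch that derives a few of the same signals with Python's standard `ast` module. It is a stand-in for illustration only: colgrep uses tree-sitter, and the `analyze` and `call_name` helpers are hypothetical names, not part of colgrep.

```python
import ast

SOURCE = '''
def fetch_with_retry(url, max_retries=3):
    """Fetches data from a URL with retry logic."""
    for i in range(max_retries):
        try:
            return client.get(url)
        except RequestError as e:
            if i == max_retries - 1:
                raise e
'''


def call_name(node: ast.AST) -> str:
    """Render a call target such as `range` or `client.get` as dotted text."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return f"{call_name(node.value)}.{node.attr}"
    return "<expr>"


def analyze(func: ast.FunctionDef) -> dict:
    """Collect call-graph, control-flow, and data-flow facts for one function."""
    calls, variables = [], []
    flags = {"has_loops": False, "has_branches": False, "has_error_handling": False}
    for node in ast.walk(func):
        if isinstance(node, ast.Call):  # call graph: outgoing calls
            calls.append(call_name(node.func))
        if isinstance(node, (ast.For, ast.While)):  # control flow
            flags["has_loops"] = True
        if isinstance(node, ast.If):
            flags["has_branches"] = True
        if isinstance(node, ast.Try):
            flags["has_error_handling"] = True
        # Data flow: names bound by assignment, loop targets, or `except ... as`.
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            variables.append(node.id)
        if isinstance(node, ast.ExceptHandler) and node.name:
            variables.append(node.name)
    return {"name": func.name, "calls": calls, "variables": variables, **flags}


func = ast.parse(SOURCE).body[0]
info = analyze(func)
print(info)
```

The dictionary keys mirror the layer names in the table; the reverse `called_by` edges and import tracking would need a pass over the whole project and are left out here.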

3. Build Structured Text

Each unit is converted to a structured text representation before embedding. This gives the model richer signal than raw code alone:

Function: fetch_with_retry
Signature: def fetch_with_retry(url: str, max_retries: int = 3) -> Response
Description: Fetches data from a URL with retry logic.
Parameters: url, max_retries
Returns: Response
Calls: range, client.get
Variables: i, e
Uses: client, RequestError
Code:
def fetch_with_retry(url: str, max_retries: int = 3) -> Response:
    """Fetches data from a URL with retry logic."""
    for i in range(max_retries):
        try:
            return client.get(url)
        except RequestError as e:
            if i == max_retries - 1:
                raise e
File: src / utils / http client http_client.py

File paths are normalized for better semantic matching: separators become spaces, snake_case and CamelCase are split (e.g., HttpClient → http client).
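The normalization rule can be sketched in a few lines. This is a simplified illustration, not colgrep's code: the helper names are hypothetical, and unlike the structured text shown earlier it does not keep the original filename alongside the split form.

```python
import re


def normalize_identifier(name: str) -> str:
    """Split snake_case and CamelCase into lowercase words."""
    # Insert a space at lower-to-upper boundaries: "HttpClient" -> "Http Client".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Underscores and hyphens also separate words.
    spaced = re.sub(r"[_\-]+", " ", spaced)
    return spaced.lower().strip()


def normalize_path(path: str) -> str:
    """Replace path separators with spaces and split each component."""
    parts = [p for p in re.split(r"[\\/]+", path) if p]
    return " ".join(normalize_identifier(p) for p in parts)


print(normalize_identifier("HttpClient"))
print(normalize_path("src/utils/http_client.py"))
```

Splitting identifiers this way lets a natural-language query such as "http client" overlap with tokens that would otherwise be fused into a single identifier.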

4. Encode with ColBERT

The ColBERT model produces multi-vector embeddings: ~300 token-level vectors of dimension 128 per code unit (instead of a single vector). At query time, each query token finds its best match across all document tokens (MaxSim scoring). This preserves fine-grained information that single-vector models lose.
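MaxSim itself is simple to state in code. The sketch below uses tiny 2-dimensional vectors in plain Python for readability; real embeddings have dimension 128, and the toy vectors are made up for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs, doc_vecs):
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Two query tokens; each document is a list of token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # has a strong match for each query token
doc_b = [[0.6, 0.4], [0.5, 0.5]]              # only lukewarm matches

print(maxsim(query, doc_a), maxsim(query, doc_b))
```

`doc_a` scores higher because each query token finds its own close match, which is the fine-grained signal a single pooled vector would average away.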

The default model is LateOn-Code-edge (17M parameters), optimized for code search and fast enough to run on CPU.

5. Index with PLAID

The PLAID algorithm compresses multi-vector embeddings with product quantization (2-bit or 4-bit) and stores them in a memory-mapped index. Embedding pooling (default factor: 2) further reduces index size by ~50%. Indexes support incremental updates so only changed files are re-encoded.
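A back-of-the-envelope estimate shows why quantization and pooling matter. The sketch assumes the quoted bit width applies per embedding dimension and ignores centroid IDs, SQLite metadata, and other overhead, so treat the numbers as orders of magnitude rather than actual index sizes. `residual_bytes` is a hypothetical helper.

```python
def residual_bytes(num_vectors: int, dim: int = 128, bits: int = 2, pool_factor: int = 1) -> int:
    """Approximate bytes for the quantized vectors of one code unit."""
    pooled = -(-num_vectors // pool_factor)  # ceiling division: vectors left after pooling
    return pooled * dim * bits // 8


full_fp32 = 300 * 128 * 4  # ~300 uncompressed float32 vectors of dimension 128
for bits in (2, 4):
    for pool in (1, 2):
        size = residual_bytes(300, bits=bits, pool_factor=pool)
        print(f"{bits}-bit, pool factor {pool}: {size} bytes (fp32: {full_fp32})")
```

Under these assumptions, a pool factor of 2 halves the vector count and therefore the estimate, which matches the ~50% reduction stated above.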

6. Search

The search pipeline:

Encode the query with ColBERT (single ONNX session, fast)
Pre-filter by metadata if --include, --exclude, --exclude-dir or --code-only are set (SQLite)
If -e pattern is provided: regex filter candidates, then score semantically
MaxSim scoring against the PLAID index
Demote test functions by -1 unless the query mentions "test"
Find representative lines using weighted token matching with a sliding window
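The filter-then-rank part of the pipeline can be sketched as follows. This is an illustration under assumptions, not colgrep's implementation: semantic scores are taken as precomputed, the demotion is modeled as subtracting 1 from the score, and the `Candidate` type, its `is_test` flag, and `hybrid_rank` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    code: str
    score: float          # semantic (MaxSim) score, assumed precomputed
    is_test: bool = False  # how test functions are detected is not modeled here


def hybrid_rank(candidates, query, pattern=None, k=15):
    """Regex-filter candidates, demote tests, and return the top-k by score."""
    if pattern is not None:
        regex = re.compile(pattern)
        candidates = [c for c in candidates if regex.search(c.code)]
    demote = "test" not in query.lower()

    def final_score(c):
        return c.score - 1.0 if (demote and c.is_test) else c.score

    return sorted(candidates, key=final_score, reverse=True)[:k]


units = [
    Candidate("fetch", "async fn fetch() {}", 20.0),
    Candidate("test_fetch", "async fn test_fetch() {}", 20.5, is_test=True),
    Candidate("load", "fn load() {}", 25.0),
]
print([c.name for c in hybrid_rank(units, "error handling", pattern=r"async fn")])
print([c.name for c in hybrid_rank(units, "test error handling", pattern=r"async fn")])
```

In the first call the regex drops `load` and the test function falls below `fetch` after demotion; in the second the query mentions "test", so no demotion applies.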

Index Management

# Check index status
colgrep status

# Clear current project index
colgrep clear

# Clear all indexes
colgrep clear --all

# Show statistics
colgrep --stats

Indexes are stored outside the project directory:

Platform	Location
Linux	`~/.local/share/colgrep/indices/`
macOS	`~/Library/Application Support/colgrep/indices/`
Windows	`%APPDATA%\colgrep\indices\`

Each project gets a directory named {project}-{hash8}. Inside:

index/ — PLAID vector index + SQLite metadata
state.json — File hashes for incremental updates
project.json — Canonical project path
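Hash-based incremental updates can be sketched like this. The layout of state.json is not documented here, so the sketch assumes a plain mapping from file path to content hash; SHA-256 and the `diff_state` helper are likewise assumptions.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex digest of a file's contents (SHA-256 chosen for illustration)."""
    return hashlib.sha256(data).hexdigest()


def diff_state(previous: dict[str, str], current_files: dict[str, bytes]):
    """Compare stored hashes with current contents.

    Returns (to_encode, removed): paths that are new or changed, and paths
    that disappeared since the last run.
    """
    current = {path: content_hash(data) for path, data in current_files.items()}
    to_encode = sorted(p for p, h in current.items() if previous.get(p) != h)
    removed = sorted(p for p in previous if p not in current)
    return to_encode, removed


previous = {
    "a.py": content_hash(b"x = 1\n"),
    "b.py": content_hash(b"y = 2\n"),
    "old.py": content_hash(b"pass\n"),
}
current_files = {"a.py": b"x = 1\n", "b.py": b"y = 3\n", "c.py": b"z = 0\n"}
to_encode, removed = diff_state(previous, current_files)
print(to_encode, removed)
```

Only `b.py` (changed) and `c.py` (new) would need re-encoding, while `a.py` is skipped and `old.py` is dropped from the index.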

ColGREP automatically detects and repairs index/metadata desync from interrupted operations.

Supported Languages

Code (25 languages, tree-sitter AST parsing)

Language	Extensions
Python	`.py`
TypeScript	`.ts`, `.tsx`
JavaScript	`.js`, `.jsx`, `.mjs`
Go	`.go`
Rust	`.rs`
Java	`.java`
C	`.c`, `.h`
C++	`.cpp`, `.cc`, `.cxx`, `.hpp`, `.hxx`
C#	`.cs`
Ruby	`.rb`
Kotlin	`.kt`, `.kts`
Swift	`.swift`
Scala	`.scala`, `.sc`
PHP	`.php`
Lua	`.lua`
Elixir	`.ex`, `.exs`
Haskell	`.hs`
OCaml	`.ml`, `.mli`
R	`.r`, `.rmd`
Zig	`.zig`
Julia	`.jl`
SQL	`.sql`
Vue	`.vue`
Svelte	`.svelte`
HTML	`.html`, `.htm`

Text & Config (11 formats, document-level extraction)

Format	Extensions
Markdown	`.md`
Plain text	`.txt`, `.rst`
AsciiDoc	`.adoc`
Org	`.org`
YAML	`.yaml`, `.yml`
TOML	`.toml`
JSON	`.json`
Dockerfile	`Dockerfile`
Makefile	`Makefile`
Shell	`.sh`, `.bash`, `.zsh`
PowerShell	`.ps1`

Installation

Pre-built Binaries (Recommended)

# macOS / Linux
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/lightonai/next-plaid/releases/latest/download/colgrep-installer.sh | sh

# Windows (PowerShell)
powershell -c "irm https://github.com/lightonai/next-plaid/releases/latest/download/colgrep-installer.ps1 | iex"

Cargo

cargo install colgrep

Build from Source

git clone https://github.com/lightonai/next-plaid.git
cd next-plaid
cargo install --path colgrep

Build Features

Feature	Platform	Description
`accelerate`	macOS	Apple Accelerate for vector operations
`coreml`	macOS	CoreML for model inference
`openblas`	Linux	OpenBLAS for vector operations
`cuda`	Linux/Windows	NVIDIA CUDA for model inference
`tensorrt`	Linux	NVIDIA TensorRT for model inference
`directml`	Windows	DirectML for model inference

# macOS with Apple Accelerate + CoreML (recommended for M-series)
cargo install --path colgrep --features "accelerate,coreml"

# Linux with OpenBLAS
cargo install --path colgrep --features openblas

# Linux with CUDA
cargo install --path colgrep --features cuda

# Combine features
cargo install --path colgrep --features "openblas,cuda"

The openblas feature requires the OpenBLAS development libraries to be installed first:

# Debian/Ubuntu
sudo apt install libopenblas-dev

# Fedora/RHEL
sudo dnf install openblas-devel

# Arch
sudo pacman -S openblas

Then build with cargo install --path colgrep --features openblas.

ONNX Runtime

ONNX Runtime is downloaded automatically on first use. No manual installation required.

Lookup order:

ORT_DYLIB_PATH environment variable
Python environments (pip/conda/venv)
System paths
Auto-download to ~/.cache/onnxruntime/

Python SDK

The colgrep-parser package exposes the tree-sitter parser and 5-layer analysis as a Python library (built with PyO3/maturin). It covers the parsing layer only, so neither ONNX Runtime nor an index is needed.

pip install git+https://github.com/lightonai/next-plaid.git#subdirectory=colgrep/python-sdk

from colgrep_parser import parse_code

code = '''
def fetch_with_retry(url: str, max_retries: int = 3) -> Response:
    """Fetches data from a URL with retry logic."""
    for i in range(max_retries):
        try:
            return client.get(url)
        except RequestError as e:
            if i == max_retries - 1:
                raise e
'''

units = parse_code(code, "http_client.py")
for unit in units:
    print(unit.description())

Key functions:

Function	Description
`parse_code(code, filename)`	Parse source, auto-detect language
`parse_code(code, filename, merge=True)`	Merge all units into one (deduped metadata)
`parse_code_with_language(code, filename, lang)`	Parse with explicit language
`detect_language(filename)`	Detect language from filename
`supported_languages()`	List all supported languages

Each CodeUnit exposes all 5 analysis layers: name, signature, docstring, parameters, return_type, calls, called_by, variables, imports, complexity, has_loops, has_branches, has_error_handling, code, and more.

See python-sdk/README.md for the full API reference.

Environment Variables

Variable	Description
`ORT_DYLIB_PATH`	Path to ONNX Runtime library
`XDG_DATA_HOME`	Override data directory
`XDG_CONFIG_HOME`	Override config directory
`HF_TOKEN`	HuggingFace token for private models
`HUGGING_FACE_HUB_TOKEN`	Alternative HF token variable

License

Apache-2.0