🦇 batless
Machine-Readable Code Analysis for AI and Automation
Symbol indexes, token-estimated context, semantic chunks — structured output that AI assistants can't produce themselves
🎯 Why batless?
AI assistants like Claude Code have native tools for reading files, searching, and listing directories. What they don't have is structured analysis output:
- Symbol index: navigate code without loading full content
- Token estimation: gate context decisions before loading a file
- Compressed context: language-aware comment and blank-line stripping
- Semantic chunks: split large files at declaration boundaries
- Content hash: detect changes without loading content
These are the outputs batless is built for. For plain file viewing, use cat, bat, or your editor.
Core guarantee: batless will NEVER wait for user input or block your pipeline.
🚀 Quick Start
Installation
Option A: Pre-built Binaries (Fastest)
- Linux (x86_64)
- macOS (Intel)
- macOS (Apple Silicon)
Option B: Via Cargo
Option C: Homebrew (macOS/Linux)
Basic Usage
- Symbol index (`--mode=index`): structure without loading full content
- Multi-file symbol index (`--mode=index` on a directory): walks the directory and emits one NDJSON line per file
- Raw AST (`--mode=ast`): full tree-sitter parse tree for deep structural analysis
- Token estimation (`--profile`): check size before loading into AI context
- Compressed AI context (`--strip-comments`, `--strip-blank-lines`)
- Semantic streaming chunks for large files (`--chunk-strategy semantic`)
- Plain text (`--plain`), for piping to other tools
- Version info as JSON (`--version-json`)
🌟 What Makes batless Special
🏆 Feature Comparison
| Feature | batless | `bat` | `cat` / built-in Read |
|---|---|---|---|
| Never Blocks | ✅ Guaranteed | ❌ Uses pager | ✅ |
| Symbol Index (`--mode=index`) | ✅ AST-backed | ❌ | ❌ |
| Raw AST (`--mode=ast`) | ✅ tree-sitter | ❌ | ❌ |
| Multi-file Index (directory) | ✅ NDJSON walk | ❌ | ❌ |
| LLM Token Estimation | ✅ Per-profile | ❌ | ❌ |
| Semantic Chunking | ✅ tree-sitter | ❌ | ❌ |
| Comment/Blank Stripping | ✅ Language-aware | ❌ | ❌ |
| Content Hash | ✅ SHA-256 | ❌ | ❌ |
| JSON Output | ✅ First-class | ❌ | ❌ |
| Syntax Highlighting | ❌ Use `bat` | ✅ Rich | ❌ |
| Interactive Human Use | ❌ Not the goal | ✅ | ✅ |
🚀 Core Capabilities
Non-Blocking Guarantees
- 🚫 NEVER uses a pager - no `less`, no `more`, no blocking
- ⚡ NEVER waits for input - always streams output immediately
- 🔄 NEVER hangs in pipes - safe for `|`, `>`, and subprocess calls
- 📊 ALWAYS returns quickly - even on huge files (streaming architecture)
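Given these guarantees a caller should not need special handling, but a defensive wrapper costs little. The following is a minimal Python sketch, assuming `batless` is on `PATH` and accepts the file as a positional argument. `build_command` and `run_batless` are hypothetical helper names, and the closed stdin and timeout are belt-and-braces choices made by this sketch, not requirements stated by batless.

```python
import subprocess


def build_command(path, mode="json", extra=()):
    """Assemble an argv list. The positional file argument is an assumption."""
    return ["batless", f"--mode={mode}", *extra, path]


def run_batless(path, mode="json", extra=(), timeout=10.0):
    """Run batless with stdin closed and a timeout as a defensive backstop."""
    result = subprocess.run(
        build_command(path, mode, extra),
        stdin=subprocess.DEVNULL,  # no input is offered, so nothing can wait on it
        capture_output=True,
        text=True,
        timeout=timeout,           # raises subprocess.TimeoutExpired if exceeded
        check=True,                # raises CalledProcessError on a non-zero exit
    )
    return result.stdout


# Only the command construction is exercised here; run_batless needs the binary.
print(build_command("src/main.rs", extra=("--hash",)))
```

Passing an argv list rather than a shell string avoids quoting problems with unusual file names.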
Language Support
- 🔍 Language auto-detection with manual override (`--language`)
- 🌳 AST-backed analysis for Rust, Python, JavaScript, TypeScript (regex fallback for others)
- 🌐 Universal plain output — works with any text-based file format
Smart Output Modes
- 📊 Multiple output modes: plain, JSON, summary, index, ast
- 📏 Smart limiting by lines (`--max-lines`) and/or bytes (`--max-bytes`)
- 💾 Memory efficient - true streaming, never loads full files
- 🎯 Predictable behavior - same output in terminal or pipe
- 🧠 Dual-view summaries - `lines` always retains the full file while `summary_lines` carries the condensed view
- 🔢 Token-aware JSON - `token_count` reflects the full file even when the sampled `tokens` array is capped (~2K entries), and `tokens_truncated` tells you when sampling kicked in
Built for Automation
- 🤖 AI-optimized JSON output with metadata, tokens, and summaries
- 📋 Summary mode extracts functions, classes, imports only
- 🔤 Token extraction for LLM context processing
- 🚫 Clean defaults - no decorations unless requested
- 📦 Single ~2MB binary with minimal dependencies
🚫 What batless is NOT
batless has a focused design philosophy. It intentionally does NOT provide:
Features We Don't Implement (By Design)
| Feature | Why Not? | Use Instead |
|---|---|---|
| Pattern Search | That's grep's job | `grep -rn "pattern" path/` |
| Arbitrary Line Ranges | Beyond our scope | sed -n '10,50p' file |
| File Globbing | Shell handles this | batless *.py (shell expands) |
| Interactive Paging | We're non-blocking | Use bat or less |
| Git Integration | Keep it simple | Use git diff or bat |
| File Management | Not a file browser | ls, find, fd |
| Text Editing | Viewer only | Use your editor |
Common Misconceptions
❌ "batless is a drop-in replacement for bat"
✅ Reality: batless is purpose-built for automation and AI, not interactive use
❌ "batless should add grep-like search"
✅ Reality: Unix philosophy - do one thing well. Use grep for searching
❌ "batless needs more features like bat"
✅ Reality: Less is more. Our constraints are features for automation
When NOT to Use batless
- 👤 Interactive code review: Use `bat` - it has better human-focused features
- 🔍 Searching code: Use `grep`, `rg` (ripgrep), or `ag` (The Silver Searcher)
- 📝 Editing files: Use your favorite editor
- 📊 Complex analysis: Use language-specific tools (pylint, rust-analyzer, etc.)
- 🎨 Pretty printing: Use `bat` with its full decoration suite
Our Philosophy
Do ONE thing well: produce structured, machine-readable code analysis that
AI assistants can't generate themselves. For everything else — plain viewing,
searching, interactive use — there's already a better tool.
📖 Usage Examples
Basic File Viewing
- Syntax highlighted output
- Plain text, no colors (`--plain`)
- With line numbers (`-n`, `--number`)
- Limit output (`--max-lines`, `--max-bytes`)
AI & Automation Workflows
- JSON output for LLM processing (`--mode=json`)
- Extract code structure only (`--mode=summary`)
- CI/CD context generation
- Machine-readable metadata
JSON structure tips:
`lines` always contains the full file content (even when `--summary` is enabled), while `summary_lines` carries the condensed view. The payload now exposes `total_lines_exact`, `token_count`, and `tokens_truncated` so downstream tools can distinguish between fully processed files and sampled metadata.
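A minimal Python sketch of how a downstream tool might use these fields. The sample payload is hand-written for illustration and is not captured batless output. `select_view` is a hypothetical helper name, and the `kind` values are guesses. Only the field names come from this README.

```python
# Hand-written sample shaped like the documented --mode=json payload.
SAMPLE = {
    "file": "src/lib.rs",
    "lines": [
        "use std::fs;",
        "",
        "pub fn read(path: &str) -> String {",
        "    fs::read_to_string(path).unwrap()",
        "}",
    ],
    "summary_lines": [
        {"line": "use std::fs;", "line_number": 1, "end_line": 1, "kind": "import"},
        {"line": "pub fn read(path: &str) -> String {", "line_number": 3,
         "end_line": 5, "kind": "function"},
    ],
    "total_lines": 5,
    "total_lines_exact": True,
    "truncated": False,
}


def select_view(payload, prefer_summary=True):
    """Return (text_lines, is_complete) for a parsed payload."""
    summary = payload.get("summary_lines")
    if prefer_summary and summary:
        text = [item["line"] for item in summary]
    else:
        # lines holds plain strings, or {"n", "text"} objects with --with-line-numbers
        text = [ln["text"] if isinstance(ln, dict) else ln for ln in payload["lines"]]
    # Treat the payload as complete only if the count is exact and nothing was cut.
    complete = bool(payload.get("total_lines_exact")) and not payload.get("truncated", False)
    return text, complete


condensed, complete = select_view(SAMPLE)
print(len(condensed), "summary lines; complete:", complete)
```

Checking `total_lines_exact` and `truncated` together lets a caller decide whether to trust the payload as a full view of the file or to fetch more.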
Pipeline Integration
- Use as a PAGER replacement: `PAGER="batless --plain"`
- Process multiple files
- Combine with grep
- Stream stdin
Custom Profiles
- Use an AI-optimized profile (`--profile <PROFILE>`)
- Interactive configuration wizard
- List available profiles
🎨 Configuration
Themes
batless supports multiple color themes for syntax highlighting:
- List available themes
- Use a specific theme
Available Themes
batless currently includes 7 carefully curated themes:
- InspiredGitHub - Clean, GitHub-inspired light theme
- Solarized (dark) - Popular dark theme with excellent contrast
- Solarized (light) - Light variant of the Solarized theme
- base16-eighties.dark - Retro 80s-inspired dark theme
- base16-mocha.dark - Warm, chocolate-toned dark theme
- base16-ocean.dark - Cool, oceanic dark theme
- base16-ocean.light - Light variant of the ocean theme
Try each theme with your own code to find the one that works best for your workflow.
Note: Theme examples are available in docs/themes/ and can be regenerated with `./scripts/generate-theme-showcase.sh`.
Language Detection
- Auto-detect (default)
- Force a specific language (`--language <LANG>`)
- List supported languages (`--list-languages`)
Custom Profiles
Create custom profiles in ~/.batless/profiles/:
# ~/.batless/profiles/my-profile.toml
= "my-profile"
= 1000
= "medium"
= true
Use with: `--custom-profile ~/.batless/profiles/my-profile.toml`
Shell Completions
batless includes built-in shell completion support for bash, zsh, fish, and PowerShell.
Bash
- Generate and install completions (`batless --generate-completions bash`)
- Or install system-wide
- Then reload your shell or source the completion file
Zsh
- Generate and install completions (`batless --generate-completions zsh`)
- Add to your ~/.zshrc (if not already present): `fpath=(~/.zsh/completions $fpath)` and `autoload -U compinit && compinit`
- Then reload your shell
Fish
- Generate and install completions (`batless --generate-completions fish`)
- Completions are automatically loaded in new fish shells
PowerShell
# Generate and add to your profile
batless --generate-completions powershell | Out-String | Invoke-Expression
# Or save to your profile for persistence
batless --generate-completions powershell >> $PROFILE
🔧 CLI Options
Output Modes
- `--mode <MODE>` - Output mode: `plain`, `json`, `summary`, `index`, `ast`
- `--plain` - Plain text output (equivalent to `--mode=plain`)
- `--mode=json` - Structured JSON output for automation
- `--mode=summary` - Extract only key code structures
- `--mode=index` - Machine-readable symbol table (kind, name, line ranges, visibility); pass a directory to walk it and emit one NDJSON line per file
- `--mode=ast` - Raw tree-sitter parse tree as JSON (Rust, Python, JavaScript, TypeScript, TSX; `"root": null` for other languages)
Limiting Output
- `--max-lines <N>` - Limit output to N lines
- `--max-bytes <N>` - Limit output to N bytes
- `--lines <START:END>` - Select specific line range (e.g., `10:50`, `:100`, `50:`)
Display Options
- `-n, --number` - Show line numbers (`cat -n` compatibility)
- `-b, --number-nonblank` - Number non-blank lines only (`cat -b` compatibility)
- `--language <LANG>` - Force specific language syntax
AI/Automation Features
- `--include-identifiers` - Include extracted code identifiers in JSON output (`--include-tokens` still works as an alias)
- `--with-line-numbers` - JSON `lines` array uses `{"n": N, "text": "..."}` objects instead of plain strings
- `--hash` - Include SHA-256 content hash in JSON output (for change detection)
- `--strip-comments` - Strip comment-only lines from output
- `--strip-blank-lines` - Strip blank lines from output
- `--chunk-strategy <STRATEGY>` - Streaming chunk strategy: `line` (default) or `semantic` (splits at top-level declaration boundaries for Rust/Python/JS/TS)
- `--summary` - Add code summary to JSON output
- `--profile <PROFILE>` - Use AI-optimized profile (`claude` 20K lines, `claude-max` 150K lines, `copilot`, `chatgpt`, `gemini`, `assistant`)
- `--custom-profile <PATH>` - Load custom profile from file
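The `semantic` strategy is described as splitting at top-level declaration boundaries. The idea can be illustrated independently of batless with a short Python sketch. This is not batless's implementation: `semantic_chunks`, its parameters, and the greedy packing rule are all assumptions made for illustration.

```python
def semantic_chunks(lines, decl_starts, max_lines):
    """Split lines into chunks that only break where a declaration begins.

    lines       - list of source lines
    decl_starts - 1-based line numbers where top-level declarations begin
    max_lines   - soft cap; a single declaration longer than this stays whole
    """
    # Candidate cut points as 0-based indexes, always including start and end.
    cuts = sorted({0, len(lines), *(s - 1 for s in decl_starts if 0 < s <= len(lines))})
    segments = [lines[a:b] for a, b in zip(cuts, cuts[1:])]

    chunks, current = [], []
    for segment in segments:
        # Start a new chunk rather than split a declaration across two chunks.
        if current and len(current) + len(segment) > max_lines:
            chunks.append(current)
            current = []
        current = current + segment
    if current:
        chunks.append(current)
    return chunks


source = [
    "import os",
    "",
    "def a():",
    "    return 1",
    "",
    "def b():",
    "    return 2",
]
for chunk in semantic_chunks(source, decl_starts=[3, 6], max_lines=4):
    print(len(chunk), repr(chunk[0]))
```

Unlike a fixed line count, every chunk after the first starts at a declaration, so no function body is cut in half.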
JSON Output Fields
When using --mode=json, the output includes:
| Field | Type | Description |
|---|---|---|
| `file` | string | File path |
| `language` | string or null | Detected language |
| `lines` | array | File lines (strings, or `{"n","text"}` objects with `--with-line-numbers`) |
| `total_lines` | integer | Line count in original file |
| `total_lines_exact` | boolean | Whether `total_lines` covers the full file |
| `total_bytes` | integer | File size in bytes |
| `truncated` | boolean | Whether output was truncated |
| `encoding` | string | Detected encoding |
| `summary_lines` | array or null | Summary items `{line, line_number, end_line, kind}` |
| `identifiers` | array or null | Extracted code identifiers (with `--include-identifiers`) |
| `identifier_total` | integer or null | Total identifier count |
| `file_hash` | string or null | SHA-256 hex digest (with `--hash`) |
| `estimated_llm_tokens` | integer or null | Heuristic LLM token estimate (when a profile is active) |
| `token_model` | string or null | Model used for token estimation |
| `compression_ratio` | number or null | Ratio of original to stripped lines (with `--strip-*` flags) |
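Two of these fields support cheap decisions before content is loaded: `file_hash` for change detection and `estimated_llm_tokens` for budget gating. The following is a minimal Python sketch under stated assumptions: `should_reload` and `fits_budget` are hypothetical helper names, the in-memory dictionary stands in for whatever cache a real tool would use, and the sample payload is hand-written.

```python
def should_reload(payload, seen_hashes):
    """Return True when the file's content hash differs from the one seen before."""
    digest = payload.get("file_hash")
    if digest is None:
        # --hash was not passed, so a change cannot be ruled out.
        return True
    changed = seen_hashes.get(payload["file"]) != digest
    seen_hashes[payload["file"]] = digest
    return changed


def fits_budget(payload, budget_tokens):
    """Return True/False against the budget, or None when no estimate is present."""
    estimate = payload.get("estimated_llm_tokens")
    if estimate is None:
        return None
    return estimate <= budget_tokens


# Hand-written sample; the digest is a placeholder, not a real SHA-256 of any file.
sample = {"file": "src/lib.rs", "file_hash": "a" * 64, "estimated_llm_tokens": 1200}
cache = {}
print(should_reload(sample, cache))  # first sighting
print(should_reload(sample, cache))  # same hash again
print(fits_budget(sample, 1000))
```

Both helpers treat a null field as "unknown" rather than guessing, which mirrors the nullable types in the table.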
When using --mode=index, the output includes:
| Field | Type | Description |
|---|---|---|
| `file` | string | File path |
| `language` | string or null | Detected language |
| `symbol_count` | integer | Number of symbols found |
| `symbols` | array | Symbol table entries |
| `symbols[].kind` | string | `function`, `struct`, `class`, `impl`, `trait`, etc. |
| `symbols[].name` | string | Symbol identifier name |
| `symbols[].line_start` | integer | 1-based start line |
| `symbols[].line_end` | integer or null | 1-based end line |
| `symbols[].signature` | string | First declaration line |
| `symbols[].visibility` | string or null | `pub`, `private`, `export`, `local` |
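When a directory is indexed, each NDJSON line is one such record. A minimal Python sketch of grouping symbols across files follows. The two sample records are hand-written to match the fields above and are not real batless output, and `symbols_by_kind` is a hypothetical helper name.

```python
import json
from collections import defaultdict

# Hand-written NDJSON: one JSON object per line, one line per file.
SAMPLE_NDJSON = "\n".join([
    json.dumps({"file": "src/lib.rs", "language": "Rust", "symbol_count": 2, "symbols": [
        {"kind": "struct", "name": "Config", "line_start": 1, "line_end": 4,
         "signature": "pub struct Config {", "visibility": "pub"},
        {"kind": "function", "name": "load", "line_start": 6, "line_end": 12,
         "signature": "fn load(path: &str) -> Config {", "visibility": "private"},
    ]}),
    json.dumps({"file": "tools/gen.py", "language": "Python", "symbol_count": 1, "symbols": [
        {"kind": "function", "name": "main", "line_start": 3, "line_end": 9,
         "signature": "def main():", "visibility": None},
    ]}),
])


def symbols_by_kind(ndjson_text):
    """Group (file, name, line_start) tuples by symbol kind across all records."""
    grouped = defaultdict(list)
    for raw in ndjson_text.splitlines():
        if not raw.strip():
            continue  # tolerate blank lines between records
        record = json.loads(raw)
        for symbol in record["symbols"]:
            grouped[symbol["kind"]].append(
                (record["file"], symbol["name"], symbol["line_start"])
            )
    return dict(grouped)


for kind, entries in symbols_by_kind(SAMPLE_NDJSON).items():
    print(kind, entries)
```

Because each line parses on its own, the same loop works on a stream without holding the whole index in memory.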
When using --mode=ast, the output includes:
| Field | Type | Description |
|---|---|---|
| `file` | string | File path |
| `language` | string or null | Detected language |
| `mode` | string | `"ast"` |
| `parser` | string | `"tree-sitter-rust"` etc., or `"none"` for unsupported languages |
| `total_lines` | integer | Line count |
| `total_bytes` | integer | File size in bytes |
| `root` | object or null | Root parse tree node; null when parser is `"none"` |
| `root.type` | string | Node kind (e.g., `"source_file"`, `"module"`) |
| `root.start` | [row, col] | 0-based start position |
| `root.end` | [row, col] | 0-based end position |
| `root.text` | string or null | Node text for leaf nodes (≤256 chars) |
| `root.children` | array or null | Child nodes (same shape, max depth 64) |
| `root.is_error` | boolean or null | Present and true for error recovery nodes |
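Since every child node has the same shape as `root`, the tree can be traversed with one small generator. The following Python sketch assumes a parsed payload with the fields above. The sample tree is hand-written rather than real batless output, and `walk`, `node_type_counts`, and `error_nodes` are hypothetical helper names.

```python
from collections import Counter

# Hand-written sample shaped like the documented --mode=ast payload.
SAMPLE_AST = {
    "file": "hello.py",
    "language": "Python",
    "mode": "ast",
    "parser": "tree-sitter-python",
    "total_lines": 1,
    "total_bytes": 6,
    "root": {"type": "module", "start": [0, 0], "end": [0, 5], "children": [
        {"type": "expression_statement", "start": [0, 0], "end": [0, 5], "children": [
            {"type": "identifier", "start": [0, 0], "end": [0, 5], "text": "hello"},
        ]},
    ]},
}


def walk(node):
    """Yield every node depth-first; iterative, so depth never hits the recursion limit."""
    stack = [node] if node else []
    while stack:
        current = stack.pop()
        yield current
        # children may be absent or null on leaf nodes
        stack.extend(reversed(current.get("children") or []))


def node_type_counts(ast_payload):
    """Count how often each node kind appears in the tree."""
    return Counter(n["type"] for n in walk(ast_payload.get("root")))


def error_nodes(ast_payload):
    """Collect nodes flagged as error-recovery nodes."""
    return [n for n in walk(ast_payload.get("root")) if n.get("is_error")]


print(node_type_counts(SAMPLE_AST))
print(len(error_nodes(SAMPLE_AST)), "error nodes")
```

A null `root` (parser `"none"`) yields no nodes, so callers get empty results for unsupported languages without a special case.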
Configuration
- `--list-languages` - Show all supported languages
Utility
- `--version` - Show version information
- `--version-json` - Machine-readable version metadata
- `--help` - Show detailed help information
🤖 AI Assistant Integration
batless is designed to work seamlessly with AI coding assistants:
- Claude Code: use batless in Claude Code workflows
- GitHub Copilot CLI: generate context for Copilot
- ChatGPT / other LLMs: generate structured context
See docs/AI_INTEGRATION.md for detailed integration guides.
🏗️ Architecture
batless is built with:
- Rust - Memory safety and performance
- syntect - Syntax highlighting engine
- Streaming architecture - Memory-efficient processing
- Modular design - Clean separation of concerns
See docs/ARCHITECTURE.md for technical details.
🤝 Contributing
We welcome contributions! Please see:
- CONTRIBUTING.md - Contribution guidelines
- CODE_OF_CONDUCT.md - Community standards
- docs/PHILOSOPHY_AND_SCOPE.md - Project philosophy
Development Setup
- Clone the repository
- Build
- Run tests
- Run with an example file
📊 Performance
- Startup time: <5ms typical on modern hardware
- Binary size: ~2MB (minimal dependencies)
- Memory usage: Constant (streaming architecture)
- Throughput: Limited only by syntax highlighting speed
Note: Performance varies by hardware. These figures were measured on a typical developer workstation.
📜 License
MIT License - see LICENSE for details.
🔗 Links
- Documentation: docs/
- Changelog: CHANGELOG.md
- Releases: GitHub Releases
- Issues: GitHub Issues
- Crates.io: crates.io/crates/batless
🙏 Acknowledgments
Built with ❤️ for automation, AI assistants, and modern CLI workflows