# Glimpse Development Guide
A blazingly fast tool for peeking at codebases. Perfect for loading your codebase into an LLM's context.
## Task Tracking
Check `.todo.md` for current tasks and next steps. Keep it updated:
- Mark items `[x]` when completed
- Add new tasks as they're discovered
- Reference it before asking "what's next?"
## Commits
Use `jj` for version control. Always commit after completing a phase:
```bash
jj commit -m "feat: add glimpse-code crate scaffolding"
```
Use conventional commit prefixes:
- `feat` - new feature
- `fix` - bug fix
- `refactor` - restructure without behavior change
- `chore` - maintenance, dependencies, config
- `docs` - documentation only
- `test` - adding or updating tests
## Build Commands
```bash
cargo build # debug build
cargo build --release # release build
cargo run -- <args> # run with arguments
cargo run -- . # analyze current directory
cargo run -- --help # show help
```
## Test Commands
```bash
cargo test # run all tests
cargo test test_name # run single test by name
cargo test test_name -- --nocapture # run test with stdout
cargo test -- --test-threads=1 # run tests sequentially
```
## Lint & Format
```bash
cargo fmt # format all code
cargo fmt -- --check # check formatting (CI)
cargo clippy # run linter
cargo clippy -- -D warnings # fail on warnings (CI)
```
## Project Structure
```
glimpse/
├── src/
│ ├── main.rs # binary entry point
│ ├── lib.rs # library root
│ ├── cli.rs # CLI arg parsing
│ ├── analyzer.rs # directory processing
│ ├── output.rs # output formatting
│ ├── core/ # config, tokenizer, types, source detection
│ ├── fetch/ # git clone, url/html processing
│ ├── tui/ # file picker
│ └── code/ # code analysis (extract, graph, index, resolve)
├── tests/ # integration tests
├── languages.yml # language definitions for source detection
├── registry.toml # tree-sitter grammar registry
└── build.rs # generates language data from languages.yml
```
## Code Style
### No Comments
Code should be self-documenting. The only acceptable documentation is:
- Brief `///` docstrings on public API functions that aren't obvious
- `//!` module-level docs when necessary
```rust
// BAD: explaining what code does
// Check if the file is a source file
if is_source_file(path) { ... }
// BAD: inline comments
let name = path.file_name(); // get the filename
// GOOD: self-documenting code, no comments needed
if is_source_file(path) { ... }
// GOOD: docstring for non-obvious public function
/// Extract interpreter from shebang line and exec pattern
fn extract_interpreter(data: &str) -> Option<String> { ... }
```
### Import Order
Group imports in this order, separated by blank lines:
1. `std` library
2. External crates (alphabetical)
3. Internal crates - prefer `super::` over `crate::` when possible
```rust
use std::fs;
use std::path::{Path, PathBuf};
use anyhow::Result;
use serde::{Deserialize, Serialize};
use super::types::FileEntry; // preferred for sibling modules
use crate::config::Config; // only when super:: won't reach
```
### Error Handling
- Use `anyhow::Result` for fallible functions
- Propagate errors with `?` operator
- Use `.expect("message")` only when failure is a bug
- Never use `.unwrap()` outside of tests
- Use `anyhow::bail!` for early returns with errors
### Naming Conventions
- `snake_case` for functions, methods, variables, modules
- `PascalCase` for types, traits, enums
- `SCREAMING_SNAKE_CASE` for constants
- Prefer descriptive names over abbreviations
- Boolean functions: `is_`, `has_`, `can_`, `should_`
### Type Definitions
- Derive common traits: `Debug`, `Clone`, `Serialize`, `Deserialize`
- Put derives in consistent order
- Use `pub` sparingly - only what's needed
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FileEntry {
pub path: PathBuf,
pub content: String,
pub size: u64,
}
```
### Function Style
- Keep functions focused and small
- Use early returns for guard clauses
- Prefer iterators and combinators over loops when clearer
- Use `impl Trait` for return types when appropriate
### Testing
- Tests live in `#[cfg(test)] mod tests` at bottom of file
- Use descriptive test names: `test_<what>_<condition>`
- Use `tempfile` for filesystem tests
- Group related assertions
### Patterns to Follow
- Use `Option` combinators: `.map()`, `.and_then()`, `.unwrap_or()`
- Use `Result` combinators: `.map_err()`, `.context()`
- Prefer `&str` over `String` in function parameters
- Use `impl AsRef<Path>` for path parameters when flexible
- Use builders for complex configuration
### Patterns to Avoid
- Comments explaining what code does (code should be obvious)
- Deeply nested code (use early returns)
- Magic numbers (use named constants)
- `clone()` when borrowing works
- `Box<dyn Error>` (use `anyhow::Error`)
- Panicking in library code
## Adding New Language Support
To add support for a new programming language, you need to:
1. Find the tree-sitter grammar repository
2. Add the language configuration to `registry.toml`
3. Write and test tree-sitter queries
4. Verify the language exists in `languages.yml` (for file detection)
### Step 1: Find the Tree-Sitter Grammar
**Always verify the grammar repo from the [tree-sitter wiki](https://github.com/tree-sitter/tree-sitter/wiki/List-of-parsers)** before using it. Look for:
- Official parsers under `tree-sitter/` org
- Community parsers under `tree-sitter-grammars/` org
- Language-specific orgs (e.g., `nix-community/tree-sitter-nix`, `fwcd/tree-sitter-kotlin`)
Clone the grammar repo to examine its structure:
```bash
# Using repo-explorer or manually
git clone https://github.com/<org>/tree-sitter-<lang>
```
Key files to examine:
- `grammar.js` - the grammar definition
- `src/node-types.json` - all node types in the AST
- `queries/tags.scm` - existing tag queries (if any)
- `queries/highlights.scm` - syntax highlighting queries (helpful reference)
### Step 2: Add to registry.toml
Add a new `[[language]]` section to `registry.toml`:
```toml
[[language]]
name = "mylang"
extensions = ["ml", "mli"]
repo = "https://github.com/org/tree-sitter-mylang"
branch = "master"
symbol = "tree_sitter_mylang" # C symbol name from bindings
color = "#HEXCOLOR" # from languages.yml
definition_query = """
# Query for function/method definitions
"""
call_query = """
# Query for function calls
"""
import_query = """
# Query for imports/includes
"""
[language.lsp]
binary = "mylang-lsp"
args = []
```
### Step 3: Write Tree-Sitter Queries
Queries use S-expression syntax. Key captures:
- `@name` - the function/symbol name (required)
- `@body` - the function body
- `@doc` - documentation comments
- `@qualifier` - object/module for qualified calls (e.g., `obj` in `obj.method()`)
- `@path` - import path
- `@function.definition` / `@reference.call` / `@import` - node type markers
#### Definition Query Pattern
```scheme
(
(comment)* @doc
.
(function_definition
name: (identifier) @name
body: (_) @body) @function.definition
)
```
#### Call Query Pattern
```scheme
(call_expression
function: [
(identifier) @name
(member_expression
object: (_) @qualifier
property: (identifier) @name)
]) @reference.call
```
#### Import Query Pattern
```scheme
(import_statement
source: (string) @path) @import
```
### Step 4: Test Queries
**Always test queries before committing.** Use the tree-sitter CLI:
```bash
# Install tree-sitter CLI if needed
nix-shell -p tree-sitter nodejs python3
# Navigate to the grammar repo
cd /path/to/tree-sitter-mylang
# Generate the parser (if needed)
tree-sitter generate
# Write your query to a .scm file
cat > queries/test-definition.scm << 'EOF'
(function_definition
name: (identifier) @name) @function.definition
EOF
# Test against a sample file
tree-sitter query queries/test-definition.scm sample.ml
```
Create a comprehensive test file that covers:
- Simple function definitions
- Functions with various argument patterns
- Nested functions
- Method definitions (if applicable)
- Different call patterns (simple, qualified, chained)
- Various import styles
Example test output:
```
sample.ml
pattern: 0
capture: 1 - function.definition, start: (5, 2), end: (5, 24)
capture: 0 - name, start: (5, 2), end: (5, 12), text: `myFunction`
```
### Step 5: Verify Language Detection
Ensure the language exists in `languages.yml` with correct:
- `extensions` - file extensions
- `type: programming`
- `language_id` - unique ID
Most common languages are already in `languages.yml` (sourced from GitHub Linguist).
### Step 6: Build and Test
```bash
# Build glimpse
cargo build
# Test on a real file
cargo run -- code path/to/file.ml:function_name
# Test with callers
cargo run -- code path/to/file.ml:function_name --callers -d 2
```
### Query Writing Tips
1. **Use alternatives `[...]`** for multiple patterns:
```scheme
(call_expression
function: [
(identifier) @name
(member_expression property: (identifier) @name)
])
```
2. **Use predicates** for filtering:
```scheme
((identifier) @name
(#eq? @name "import"))
((identifier) @name
(#match? @name "^fetch.*"))
```
3. **Handle optional nodes** with `?`:
```scheme
(function_definition
name: (identifier) @name
parameters: (parameters)? @params)
```
4. **Anchor with `.`** for adjacent siblings:
```scheme
(comment)* @doc
.
(function_definition) @function
```
5. **Examine the AST** when queries don't match:
```bash
tree-sitter parse sample.ml
```
### Common Pitfalls
- **Wrong node names**: Always check `src/node-types.json` for exact names
- **Missing field names**: Some nodes use positional children, not named fields
- **Nested structures**: Languages with currying or chaining need multiple patterns
- **External scanner**: Some grammars have custom scanners in `src/scanner.c`
### LSP Configuration
#### Finding Download URLs
1. Check the LSP's GitHub releases page for pre-built binaries
2. Use the GitHub API to get exact asset names:
```bash
curl -s https://api.github.com/repos/OWNER/REPO/releases/latest | jq '.assets[].name'
```
3. Identify the URL pattern and platform-specific target names
#### Install Methods (in priority order)
| URL download | `url_template` | None | rust-analyzer, lua-language-server |
| npm | `npm_package` | npm or bun | pyright, typescript-language-server |
| go | `go_package` | go toolchain | gopls |
| cargo | `cargo_crate` | cargo | nil |
If no install method is configured, users must install the LSP manually.
Some LSPs are bundled with their language toolchain and cannot be auto-installed:
- `ruby-lsp` - installed via `gem install ruby-lsp`
- `sourcekit-lsp` - comes with Xcode/Swift toolchain
- `metals` - installed via Coursier (`cs install metals`) or SDKMAN
#### Configuration Options
```toml
[language.lsp]
binary = "lsp-server" # executable name
args = ["--stdio"] # CLI arguments
# URL-based download (preferred when binaries available)
version = "1.0.0"
url_template = "https://github.com/org/repo/releases/download/{version}/lsp-{version}-{target}.tar.gz"
archive = "tar.gz" # or "zip", "gz", "tar.xz"
binary_path = "bin/server" # path within archive (optional)
# Package manager installs (fallback when no binaries)
npm_package = "pkg-name" # install via npm/bun
go_package = "pkg/path@latest" # install via go
cargo_crate = "crate-name" # install via cargo
[language.lsp.targets] # map rust target triple to release asset name
"x86_64-unknown-linux-gnu" = "linux-x64"
"aarch64-unknown-linux-gnu" = "linux-arm64"
"x86_64-apple-darwin" = "darwin-x64"
"aarch64-apple-darwin" = "darwin-arm64"
```