code-search-cli 0.3.3

Intelligent code search tool for tracing text (UI text, function names, variables) to implementation code
# Code Search (cs)

**Intelligent code search tool for tracing text to implementation code**

## Problem Statement

Modern IDEs such as the JetBrains suite provide powerful code navigation through context menus (find definition, find references, and so on). However, these IDEs are resource-intensive and not always practical for quick searches or lightweight environments.

Developers frequently need to search for text in their codebase, whether it's:
- **UI text** from user bug reports: "The 'Add New' button isn't working"
- **Function names** for understanding code flow: "What does `processPayment` call?"
- **Variable names** for refactoring: "Where is `userId` used?"
- **Error messages** for debugging: "Where does 'Invalid token' come from?"

For UI text specifically, the search turns into a multi-step manual process when i18n is involved:

1. Search for the text: `rg 'add new' -F`
2. Manually scan results to find translation files (e.g., `en.yml`)
3. Open the translation file and locate the key: `add_new: 'add new'`
4. Examine the YAML structure to find the full key path: `invoice.labels.add_new`
5. Search for the key usage: `rg 'invoice.labels.add_new' -F`
6. Finally locate the implementation in `components/invoices.ts`

This manual process is time-consuming, error-prone, and interrupts the development flow.

## Solution

`code-search` (abbreviated as `cs`) is a lightweight CLI tool that automates code discovery workflows:
- **Smart text search**: Find any text (UI text, code, error messages) in your codebase
- **i18n-aware tracing**: Automatically follows references from UI text through translation files to implementation
- **Call graph tracing**: Trace function calls forward or backward to understand code flow

### i18n Text Tracing

```bash
$ cs 'add new'

'add new'
   |
   |-> 'add_new: add new' at line 56 of en.yml
                |
                |-> 'invoice.labels.add_new' as the structure
                         |
                         |-> I18n.t('invoice.labels.add_new') at line 128 of components/invoices.ts
```
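
The first two hops of that chain (text to translation entry, entry to full key path) can be sketched in a few lines. This is a minimal illustration, not the tool's actual implementation: the `flatten` and `keys_for_text` helpers and the `translations` mapping are hypothetical, and the translation file is assumed to be already parsed into nested dictionaries.

```python
from typing import Any, Dict, Iterator, List, Tuple


def flatten(tree: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted_key, value) pairs for every leaf string in a nested mapping."""
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        elif isinstance(value, str):
            yield path, value


def keys_for_text(tree: Dict[str, Any], text: str) -> List[str]:
    """Return dotted keys whose translated value contains `text`, ignoring case."""
    needle = text.lower()
    return [key for key, value in flatten(tree) if needle in value.lower()]


# Hypothetical parsed content of en.yml (locale root omitted for brevity).
translations = {"invoice": {"labels": {"add_new": "add new", "edit": "edit"}}}

print(keys_for_text(translations, "Add New"))  # ['invoice.labels.add_new']
```

The resulting key is what would then be searched for in code, which is the final hop shown in the tree above.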

### Call Graph Tracing

Trace function calls forward (what does this function call?) or backward (who calls this function?):

```bash
# Forward trace - what does bar() call?
$ cs 'bar' --trace

bar
|-> zoo1 (utils.ts:45)
|-> zoo2 (helpers.ts:23)
|-> zoo3 (api.ts:89)

# Backward trace - who calls bar()?
$ cs 'bar' --traceback

blah1 -> foo1 -> bar
blah2 -> foo2 -> bar

# Control trace depth (default: 3, max: 10)
$ cs 'bar' --trace --depth 5
```
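
To illustrate how a depth limit and cycle detection interact in a forward trace, here is a minimal sketch. It assumes the call graph has already been extracted into a dictionary; the `CALLS` data and the `trace` function are hypothetical and say nothing about how `cs` stores or walks its graph internally.

```python
from typing import Dict, List, Set

# Hypothetical call graph: function name -> functions it calls.
CALLS: Dict[str, List[str]] = {
    "bar": ["zoo1", "zoo2"],
    "zoo1": ["bar"],  # calls back into bar, forming a cycle
    "zoo2": ["zoo3"],
    "zoo3": [],
}


def trace(name: str, depth: int = 3, max_depth: int = 10) -> List[str]:
    """Return indented lines describing what `name` calls, up to `depth` levels."""
    depth = min(depth, max_depth)
    lines: List[str] = [name]

    def walk(current: str, level: int, path: Set[str]) -> None:
        if level > depth:
            return  # depth limit reached
        for callee in CALLS.get(current, []):
            indent = "  " * (level - 1)
            if callee in path:
                # Already on the current call path: report it, do not recurse.
                lines.append(f"{indent}|-> {callee} (cycle)")
                continue
            lines.append(f"{indent}|-> {callee}")
            walk(callee, level + 1, path | {callee})

    walk(name, 1, {name})
    return lines


print("\n".join(trace("bar")))
```

Tracking the set of functions on the current path (rather than a global visited set) lets the same function appear in separate branches while still stopping recursion.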

## Key Features

- **Universal Text Search**: Find any text - UI text, function names, variable names, error messages
- **Smart Translation Tracing**: Finds translation keys (e.g., `t('invoice.add')`) and traces them back to their definition in YAML files.
- **Direct Text Search**: Searches for the text directly in the codebase in addition to translation keys, ensuring all occurrences are found.
- **Call Graph Tracing**: Traces function calls forward (what does this call?) and backward (who calls this?).
- **i18n Format Support**: Understands YAML/JSON translation file structures
- **Pattern Recognition**: Identifies common i18n patterns (I18n.t, t(), $t, etc.) and function definitions
- **Tree Visualization**: Clear visual representation of the reference chain
- **Depth Control**: Configurable trace depth to prevent explosion in large codebases
- **Cycle Detection**: Handles recursive/circular calls without hanging
- **Lightweight**: Embeds the ripgrep library for fast searching, so no external `rg` installation is required
- **No IDE Required**: Works in any terminal environment

## Use Cases

- **Bug Triage**: Quickly locate implementation from user-reported issues (UI text or error messages)
- **Code Exploration**: Find where functions, variables, or constants are defined and used
- **Call Flow Analysis**: Understand what a function does by tracing its calls
- **Impact Analysis**: Find all callers of a function before refactoring
- **i18n Workflow**: Trace UI text through translation files to verify correct implementation
- **Debugging**: Locate error message sources or trace variable usage
- **Onboarding**: Help new developers understand code organization and data flow
- **Quick Navigation**: Fast code navigation without heavy IDE overhead

## Supported Patterns

### Translation File Formats
- YAML (Rails i18n, Ruby)
- JSON (JavaScript/TypeScript i18n)
- Properties files (Java)

### i18n Function Patterns
- Ruby: `I18n.t('key')`, `t('key')`
- JavaScript/TypeScript: `i18n.t('key')`, `$t('key')`, `t('key')`
- React: `useTranslation()`, `<Trans>`
- Vue: `$t('key')`, `{{ $t('key') }}`
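
As a rough illustration of how the call-style patterns above can be recognized, the following sketch extracts keys with a single regular expression. The pattern and the `extract_keys` helper are assumptions made for this example, not the tool's actual matcher, and component-style usage such as `<Trans>` or `useTranslation()` is not covered.

```python
import re
from typing import List

# Matches I18n.t('key'), i18n.t("key"), $t('key') and t('key').
# Illustrative only: the lookbehind keeps names like alert('x') from matching.
I18N_CALL = re.compile(r"""(?<!\w)(?:[Ii]18n\.t|\$t|t)\(\s*(['"])([\w.]+)\1""")


def extract_keys(source: str) -> List[str]:
    """Return every translation key passed to a recognized i18n call."""
    return [match.group(2) for match in I18N_CALL.finditer(source)]


sample = """
I18n.t('invoice.labels.add_new')
{{ $t("invoice.title") }}
<%= t('invoice.edit') %>
alert('not.a.key')
"""

print(extract_keys(sample))
# ['invoice.labels.add_new', 'invoice.title', 'invoice.edit']
```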

## Custom File Extensions

By default, the tool recognizes common code file extensions (`.ts`, `.tsx`, `.js`, `.jsx`, `.vue`, `.rb`, `.py`, `.java`, `.php`, `.rs`, `.go`, `.cpp`, `.c`, `.cs`, `.kt`, `.swift`). 

For projects with custom file extensions, use the `--include-extensions` flag:

```bash
# Include files with custom extensions
cs "search text" --include-extensions html.ui,vue.custom

# Multiple extensions (comma-separated)
cs "search text" --include-extensions erb.rails,blade.php,twig.html

# Extensions with or without leading dot work the same
cs "search text" --include-extensions .html.ui,.vue.custom
```

This is particularly useful for:
- Custom framework file extensions (e.g., `.html.ui` for UI frameworks)
- Template engines with compound extensions (e.g., `.erb.rails`, `.blade.php`)
- Domain-specific file types (e.g., `.vue.custom`, `.component.ts`)
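
The "with or without leading dot" behavior and compound-extension matching can be pictured as follows. This is a sketch under stated assumptions: `parse_extensions` and `has_extension` are hypothetical helper names, and matching by file-name suffix is one plausible way to handle compound extensions, not necessarily what the tool does.

```python
from typing import List


def parse_extensions(raw: str) -> List[str]:
    """Split a comma-separated list and normalize each entry to a leading dot."""
    result: List[str] = []
    for item in raw.split(","):
        ext = item.strip().lstrip(".")
        if ext and f".{ext}" not in result:
            result.append(f".{ext}")
    return result


def has_extension(path: str, extensions: List[str]) -> bool:
    """True if the file name ends with any of the (possibly compound) extensions."""
    return any(path.endswith(ext) for ext in extensions)


extensions = parse_extensions(".html.ui,vue.custom")
print(extensions)                                            # ['.html.ui', '.vue.custom']
print(has_extension("templates/invoice.html.ui", extensions))  # True
print(has_extension("public/index.html", extensions))          # False
```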

## Partial Key Matching

The tool automatically finds common i18n patterns where developers cache namespaces:

```javascript
// Common pattern: Cache namespace to avoid repetition
const labels = I18n.t('invoice.labels');
const addButton = labels.t('add_new');
const editButton = labels.t('edit');

// Deeper namespace caching
const invoiceNS = I18n.t('invoice');
const addLabel = invoiceNS.labels.t('add_new');
```

When searching for "add new", the tool finds:
- **Translation file**: `invoice.labels.add_new: "add new"`
- **Namespace usage**: `I18n.t('invoice.labels')` (parent namespace)
- **Relative key usage**: `labels.t('add_new')` (child key)

This works by generating strategic partial keys:
- **Full key**: `invoice.labels.add_new`
- **Without first segment**: `labels.add_new` (matches `labels.t('add_new')`)
- **Without last segment**: `invoice.labels` (matches `I18n.t('invoice.labels')`)
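
The three candidates listed above can be generated mechanically from the full key. The sketch below uses a hypothetical `partial_keys` function to show the idea; the real tool may generate more or fewer variants.

```python
from typing import List


def partial_keys(full_key: str) -> List[str]:
    """Return the full key plus its variants without the first and last segment."""
    segments = full_key.split(".")
    candidates = [full_key]
    if len(segments) > 1:
        candidates.append(".".join(segments[1:]))   # without first segment
        candidates.append(".".join(segments[:-1]))  # without last segment
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(candidates))


print(partial_keys("invoice.labels.add_new"))
# ['invoice.labels.add_new', 'labels.add_new', 'invoice.labels']
```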

### Example Usage

```bash
# Basic i18n search
$ cs "add new"
=== Translation Files ===
config/locales/en.yml:4:invoice.labels.add_new: "add new"

=== Code References ===
app/components/invoices.ts:14:I18n.t('invoice.labels.add_new')
components/InvoiceManager.vue:3:{{ $t('invoice.labels.add_new') }}

# Include custom file types
$ cs "add new" --include-extensions html.ui,erb.rails
=== Translation Files ===
config/locales/en.yml:4:invoice.labels.add_new: "add new"

=== Code References ===
app/components/invoices.ts:14:I18n.t('invoice.labels.add_new')
components/InvoiceManager.vue:3:{{ $t('invoice.labels.add_new') }}
templates/invoice.html.ui:23:i18n('invoice.labels.add_new')
views/invoice.erb.rails:45:<%= t('invoice.labels.add_new') %>
```

## Roadmap

### Phase 1: Core Functionality
- [x] Project setup and architecture
- [x] YAML translation file parsing
- [x] Text-to-key mapping
- [x] Key-to-code tracing
- [x] Basic tree visualization

### Phase 2: Call Graph Tracing
- [x] Function definition detection (JS, Ruby, Python, Rust)
- [x] Forward call tracing (`--trace`)
- [x] Backward call tracing (`--traceback`)
- [x] Depth limiting and cycle detection

### Phase 3: Enhanced Features ✅ Complete
- [x] JSON translation support
- [x] Multiple i18n pattern detection
- [x] Tree-sitter for improved accuracy

### Phase 4: Advanced Features
- [ ] Interactive navigation
- [x] Multi-language project support
- [x] Caching for performance
- [ ] Editor integration (VSCode, Vim)

## Architecture

Built on a foundation of proven tools:
- **ripgrep library**: Embedded fast text searching (no external installation required)
- **Regex patterns**: Function definition and call detection
- **YAML/JSON parsers**: Translation file processing
- **Tree builders**: Visual output formatting

## Performance

Typical search performance on real-world projects:

- **Small projects** (<100 files): 10-20ms
- **Medium projects** (100-1000 files): 20-70ms
- **Large projects** (1000+ files): 70-200ms

Optimizations:
- Embedded ripgrep library (no external process overhead)
- Two-tier caching (in-memory LRU + persistent backend)
- Smart file filtering with early exit on no-match files
- Tree-sitter AST parsing for accurate function detection

Benchmarked on real-world codebases, including the open-source Discourse project (389 YAML files, ~50MB total).
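
For readers unfamiliar with the two-tier caching idea, a minimal sketch follows: a small in-memory LRU sits in front of a slower persistent store. The `TwoTierCache` class is hypothetical, a plain dictionary stands in for the persistent backend, and the write-through policy is an assumption of this example; see `CACHING.md` for the actual design.

```python
from collections import OrderedDict
from typing import Dict, Optional


class TwoTierCache:
    """In-memory LRU in front of a persistent store (a dict stands in for it here)."""

    def __init__(self, capacity: int, backend: Optional[Dict[str, str]] = None) -> None:
        self.capacity = capacity
        self.memory: "OrderedDict[str, str]" = OrderedDict()
        self.backend: Dict[str, str] = backend if backend is not None else {}

    def get(self, key: str) -> Optional[str]:
        if key in self.memory:
            self.memory.move_to_end(key)  # mark as most recently used
            return self.memory[key]
        value = self.backend.get(key)
        if value is not None:
            self._remember(key, value)  # promote to the fast tier
        return value

    def put(self, key: str, value: str) -> None:
        self.backend[key] = value  # write-through (assumed policy)
        self._remember(key, value)

    def _remember(self, key: str, value: str) -> None:
        self.memory[key] = value
        self.memory.move_to_end(key)
        if len(self.memory) > self.capacity:
            self.memory.popitem(last=False)  # evict the least recently used entry


cache = TwoTierCache(capacity=2)
for name, result in [("a", "1"), ("b", "2"), ("c", "3")]:
    cache.put(name, result)

print("a" in cache.memory)  # False: evicted from memory
print(cache.get("a"))       # 1: still served from the backend
```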

### Advanced Optimization: Bottom-Up Parsing

The codebase includes a sophisticated **bottom-up parsing optimization** for YAML/JSON translation files:

- **Binary search algorithm** for O(log n) parent key finding
- **Ancestor caching** to avoid redundant tree traversal
- **20-100x speedup potential** on large files with targeted queries

**Status**: Fully implemented with comprehensive tests, but temporarily disabled while a key path construction issue is debugged. See `BOTTOM_UP_PARSING.md` for technical details.

This optimization represents significant engineering effort and will be enabled in a future release once the edge case is resolved.
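
One way to picture bottom-up key path construction with binary search is sketched below. It is an illustration of the general idea only, not the algorithm in `BOTTOM_UP_PARSING.md`: the `index_lines` and `key_path` functions are hypothetical, the input is assumed to be simple block-style YAML without blank lines, comments, lists, or multi-line scalars, and ancestor caching is not shown.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

YAML_TEXT = """\
en:
  invoice:
    title: Invoices
    labels:
      add_new: add new
      edit: edit
  user:
    name: Name
"""

Index = Tuple[List[Tuple[int, str]], Dict[int, List[int]]]


def index_lines(text: str) -> Index:
    """Record (indent, key) per line and, per indent width, the sorted line numbers."""
    entries: List[Tuple[int, str]] = []
    by_indent: Dict[int, List[int]] = {}
    for number, line in enumerate(text.splitlines()):
        indent = len(line) - len(line.lstrip(" "))
        key = line.strip().split(":", 1)[0]
        entries.append((indent, key))
        by_indent.setdefault(indent, []).append(number)
    return entries, by_indent


def key_path(index: Index, line_number: int) -> str:
    """Build the dotted key for a 0-based line by walking up through its ancestors."""
    entries, by_indent = index
    indent, key = entries[line_number]
    parts = [key]
    current = line_number
    while indent > 0:
        # The parent is the nearest preceding line with a smaller indent.
        parent = -1
        for width, lines in by_indent.items():
            if width < indent:
                pos = bisect_left(lines, current)  # binary search per indent level
                if pos > 0:
                    parent = max(parent, lines[pos - 1])
        if parent < 0:
            break
        indent, key = entries[parent]
        parts.append(key)
        current = parent
    return ".".join(reversed(parts))


index = index_lines(YAML_TEXT)
print(key_path(index, 4))  # en.invoice.labels.add_new (locale root kept in this sketch)
```

Starting from the matched line and walking upward touches only the ancestors of that line, so the rest of the file never has to be turned into a tree.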

## Installation

### Homebrew (macOS/Linux)
```bash
brew tap weima/code-search https://github.com/weima/code-search
brew install cs
```

### NPM (Cross-platform)
```bash
npm install -g code-search-cli
```

### Cargo (Rust)
```bash
cargo install code-search-cli
```

### From Binary
Download the pre-compiled binary for your platform from [Releases](https://github.com/weima/code-search/releases).

## Usage

### Basic Usage

```bash
# i18n text tracing (default mode)
cs "button text"

# Search in specific directory
cs "button text" /path/to/project

# Case-sensitive search
cs "ExactText" -s
cs "ExactText" --case-sensitive

# Case-insensitive search (explicit)
cs "text" -i
cs "text" --ignore-case
```

### Call Graph Tracing

```bash
# Forward call tracing (what does this function call?)
cs "functionName" --trace

# Backward call tracing (who calls this function?)
cs "functionName" --traceback

# Both directions
cs "functionName" --trace-all

# Custom depth (default: 3, max: 10)
cs "functionName" --trace --depth 5
```

### Search Options

```bash
# Word boundary matching (whole words only)
cs "function" -w
cs "function" --word-regexp

# Regular expression search
cs "fn \w+\(" --regex

# Glob pattern filtering (search only in matching files)
cs "text" -g "*.ts"
cs "text" --glob "*.js" --glob "*.tsx"

# Exclude patterns from search
cs "text" --exclude test,spec,mock

# Include custom file extensions
cs "text" --include-extensions html.ui,vue.custom
```

### File Search

```bash
# Search for files by name only (skip content search)
cs "filename" -f
cs "filename" --file-only
```

### Cache Management

```bash
# Clear the search result cache
cs --clear-cache
```

### Output Options

```bash
# Simple machine-readable output (no progress indicators)
cs "text" --simple

# Verbose output with detailed parse error messages
cs "text" --verbose
```

### Examples

```bash
# Find UI text with custom file types
cs "Add New" --include-extensions html.ui,erb.rails

# Case-sensitive search excluding test files
cs "ClassName" -s --exclude test,spec

# Search TypeScript files only
cs "interface" -g "*.ts"

# Find function calls with regex
cs "handleClick.*\(" --regex -g "*.tsx"

# Trace function calls to depth 5
cs "processPayment" --trace --depth 5

# Simple output for AI agents (no progress indicators)
cs "error message" --simple
```

### Help

```bash
cs --help
```

## Testing

The project maintains high test coverage to ensure reliability:

- **234+ passing tests** across all test suites
- **Unit tests**: Core parsing, search logic, and algorithm correctness
- **Integration tests**: End-to-end workflows and real-world scenarios
- **Benchmark tests**: Performance validation with Criterion framework

Run tests locally:
```bash
# All tests
cargo test

# With output
cargo test -- --nocapture

# Benchmarks
cargo bench
```

Test fixtures include real-world translation files from production projects (Rails, React, Vue, Discourse) to ensure robust handling of edge cases.

## Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

For technical implementation details, see:
- [BOTTOM_UP_PARSING.md](BOTTOM_UP_PARSING.md) - Advanced optimization algorithm
- [CACHING.md](CACHING.md) - Two-tier caching architecture
- [STRATEGY.md](STRATEGY.md) - Project vision and roadmap

## License

Apache License 2.0 - See [LICENSE](LICENSE) file for details.