blockwatch 0.2.25

Language-agnostic linter that keeps your code and documentation in sync and valid
# BlockWatch

[![Build Status](https://github.com/mennanov/blockwatch/actions/workflows/rust.yml/badge.svg)](https://github.com/mennanov/blockwatch/actions)
[![codecov](https://codecov.io/gh/mennanov/blockwatch/graph/badge.svg?token=LwUfGTZ551)](https://codecov.io/gh/mennanov/blockwatch)
[![Crates.io](https://img.shields.io/crates/v/blockwatch)](https://crates.io/crates/blockwatch)
[![Downloads](https://img.shields.io/crates/d/blockwatch)](https://crates.io/crates/blockwatch)

BlockWatch is a linter that keeps your code, documentation, and configuration in sync and enforces strict formatting and
validation rules.

<p>
  <img src="demo.gif" alt="BlockWatch Demo">
</p>

It helps you avoid broken docs and messy config files by enforcing rules directly in your comments. You can link code to
documentation, auto-sort lists, ensure uniqueness, and even validate content with Regex, AI, or custom Lua scripts.

It works with almost any language (Rust, Python, JS, Go, Markdown, YAML, etc.) and can run on your entire repo or just
your VCS diffs.

## Features

[//]: # (<block name="available-validators">)

- **Drift Detection**: Link a block of code to its documentation. If you change the code but forget the docs, BlockWatch
  alerts you.
- **Strict Formatting**: Enforce sorted lists (`keep-sorted`) and unique entries (`keep-unique`) so you don't have to
  nitpick in code reviews.
- **Content Validation**: Check lines against Regex patterns (`line-pattern`) or enforce block size limits (
  `line-count`).
- **AI Rules**: Use natural language to validate code or text with `check-ai` (e.g., "Must mention 'banana'").
- **Lua Scripting**: Write custom validation logic in Lua scripts (`check-lua`).
- **Flexible**: Run it on specific files, glob patterns, or just your unstaged changes.

[//]: # (</block>)

## Installation

### Homebrew (macOS/Linux)

```shell
brew tap mennanov/blockwatch
brew install blockwatch
```

### From Source (Rust)

```shell
cargo install blockwatch
```

### Prebuilt Binaries

Check the [Releases](https://github.com/mennanov/blockwatch/releases) page for prebuilt binaries.

## Quick start example

1. Add a special `block` tag in the comments of any supported file (see [Supported Languages](#supported-languages)),
   like this:

   ```python
   user_ids = [
       # <block keep-sorted keep-unique>
       "cherry",
       "apple",
       "apple",
       "banana",
       # </block>
   ]
   ```

2. Run `blockwatch`:

   ```shell
   blockwatch
   ```

   BlockWatch will fail and tell you that the list is not sorted and has duplicate entries.

3. Fix the order and uniqueness:

   ```python
   user_ids = [
       # <block keep-sorted keep-unique>
       "apple",
       "banana",
       "cherry",
       # </block>
   ]
   ```

4. Run `blockwatch` again:

   ```shell
   blockwatch
   ```

   Now it passes!

## How It Works

You define rules using HTML-like tags inside your comments.

### Linking Code Blocks (`affects`)

This ensures that when you change one block of code, you also have to update every block it is linked to.

**src/lib.rs**:

```rust
// <block affects="README.html:supported-langs">
pub enum Language {
    Rust,
    Python,
}
// </block>
```

**README.html**:

```html
<!-- <block name="supported-langs"> -->
<ul>
    <li>Rust</li>
    <li>Python</li>
</ul>
<!-- </block> -->
```

If you modify the enum in `src/lib.rs`, BlockWatch will fail until you touch the corresponding block `supported-langs`
in `README.html` as well.
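
The rule above can be modeled roughly as follows. This is a minimal Python sketch of the idea, not BlockWatch's actual implementation: the `find_drift` function, the block identifiers, and the shapes of its inputs are all hypothetical.

```python
def find_drift(affects, modified):
    """Return (source, target) pairs where the source changed but the target did not.

    affects:  maps a source block id to the list of block ids it affects.
    modified: set of block ids whose content changed in the diff.
    Both shapes are assumptions made for this illustration.
    """
    violations = []
    for source, targets in affects.items():
        if source not in modified:
            continue  # an unchanged block imposes no obligations
        for target in targets:
            if target not in modified:
                violations.append((source, target))
    return violations


# Hypothetical ids: the enum block in src/lib.rs affects the README block.
affects = {"src/lib.rs:language-enum": ["README.html:supported-langs"]}

# Only the enum changed, so the linked README block is reported.
only_code = find_drift(affects, {"src/lib.rs:language-enum"})
# Both blocks changed together, so there is nothing to report.
both = find_drift(affects, {"src/lib.rs:language-enum", "README.html:supported-langs"})

print(only_code)
print(both)
```

The check is one-directional in this sketch: editing only the README block does not flag the enum, because only the enum declares `affects`.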

### Enforce Sort Order (`keep-sorted`)

Keep lists alphabetized. Default is `asc` (ascending).

```python
# <block keep-sorted>
"apple",
"banana",
"cherry",
# </block>
```

If the list is not sorted alphabetically, BlockWatch will fail until you fix the order.

#### Sort by Regex

You can sort by a specific part of the line using a regex capture group named `value`.

```python
items = [
    # <block keep-sorted="asc" keep-sorted-pattern="id: (?P<value>\d+)">
    "id: 1  apple",
    "id: 2  banana",
    "id: 3  orange",
    # </block>
]
```

#### Numeric Sort (`keep-sorted-format`)

By default, values are compared lexicographically (as strings). This means `"10"` sorts before `"2"` because `"1" < "2"`
character-by-character. Use `keep-sorted-format="numeric"` to compare values as numbers instead.

```python
numbers = [
    # <block keep-sorted keep-sorted-format="numeric">
    2
    10
    20
    # </block>
]
```

This works with `keep-sorted-pattern` to extract numeric values from lines with mixed content:

```python
items = [
    # <block keep-sorted keep-sorted-format="numeric" keep-sorted-pattern="id: (?P<value>\d+)">
    "id: 2  banana",
    "id: 10 orange",
    "id: 20 apple",
    # </block>
]
```

Without `keep-sorted-format="numeric"`, the example above would fail because `"10"` is lexicographically less than
`"2"`.
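
The difference between the two comparison modes can be reproduced outside BlockWatch. The Python sketch below is only an illustration under stated assumptions: `sort_keys` and `is_sorted` are hypothetical helpers, and BlockWatch's own extraction and comparison logic may differ in detail.

```python
import re

# Same pattern as in the block attribute; Python also supports (?P<value>...).
PATTERN = re.compile(r"id: (?P<value>\d+)")

lines = [
    '"id: 2  banana",',
    '"id: 10 orange",',
    '"id: 20 apple",',
]


def sort_keys(lines, numeric):
    """Extract the `value` group from each line, optionally converting it to int."""
    keys = [PATTERN.search(line).group("value") for line in lines]
    return [int(key) for key in keys] if numeric else keys


def is_sorted(keys):
    """True if every key is less than or equal to the key that follows it."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


lexicographic_ok = is_sorted(sort_keys(lines, numeric=False))
numeric_ok = is_sorted(sort_keys(lines, numeric=True))
print(lexicographic_ok, numeric_ok)  # False True
```

As strings, `"2"` is greater than `"10"`, so the string comparison reports the list as unsorted. As integers, 2 < 10 < 20 holds.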

### Enforce Unique Lines (`keep-unique`)

Prevent duplicates in a list.

```python
# <block keep-unique>
"user_1",
"user_2",
"user_3",
# </block>
```

#### Uniqueness by Regex

Just like sorting, you can check uniqueness based on a specific regex match.

```python
ids = [
    # <block keep-unique="^ID:(?P<value>\d+)">
    "ID:1 Alice",
    "ID:2 Bob",
    "ID:1 Carol",  # Violation: ID:1 is already used
    # </block>
]
```
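
A rough Python model of this check is shown below. It is a sketch, not BlockWatch's implementation: `find_duplicates` is a hypothetical helper, the input lines are bare strings without the surrounding list punctuation, and skipping lines that do not match the pattern is an assumption.

```python
import re

PATTERN = re.compile(r"^ID:(?P<value>\d+)")


def find_duplicates(lines):
    """Return (line_number, key, first_seen_line) for every repeated key."""
    seen = {}
    duplicates = []
    for number, line in enumerate(lines, start=1):
        match = PATTERN.search(line)
        if match is None:
            continue  # assumption: lines without a match are not compared
        key = match.group("value")
        if key in seen:
            duplicates.append((number, key, seen[key]))
        else:
            seen[key] = number
    return duplicates


duplicates = find_duplicates(["ID:1 Alice", "ID:2 Bob", "ID:1 Carol"])
print(duplicates)  # [(3, '1', 1)]
```

Only the captured `value` is compared, so `"ID:1 Alice"` and `"ID:1 Carol"` collide even though the full lines differ.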

### Regex Validation (`line-pattern`)

Ensure every line matches a specific regex pattern.

```python
slugs = [
    # <block line-pattern="^[a-z0-9-]+$">
    "valid-slug",
    "another-one",
    # </block>
]
```

### Enforce Line Count (`line-count`)

Enforce the number of lines in a block.
Supported operators: `<`, `>`, `<=`, `>=`, `==`.

```python
# <block line-count="<=5">
"a",
"b",
"c"
# </block>
```
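
To make the operator semantics concrete, here is a small Python sketch that evaluates a `line-count` expression against a given number of lines. `check_line_count` is a hypothetical helper written for illustration. How BlockWatch itself parses the attribute, including whether it tolerates whitespace, is not specified here.

```python
import operator
import re

# The five operators listed above, mapped to Python comparison functions.
OPERATORS = {
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
}


def check_line_count(expression, line_count):
    """Return True if `line_count` satisfies an expression such as "<=5"."""
    match = re.fullmatch(r"\s*(<=|>=|==|<|>)\s*(\d+)\s*", expression)
    if match is None:
        raise ValueError(f"unsupported line-count expression: {expression!r}")
    op, limit = match.group(1), int(match.group(2))
    return OPERATORS[op](line_count, limit)


print(check_line_count("<=5", 3))  # True: the three-line block above passes
print(check_line_count("<=5", 6))  # False: six lines exceed the limit
```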

### Validate with AI (`check-ai`)

Use an LLM to validate logic or style.

```html
<!-- <block check-ai="Must mention the company name 'Acme Corp'"> -->
<p>Welcome to Acme Corp!</p>
<!-- </block> -->
```

#### Targeted AI Checks

Use `check-ai-pattern` to send only specific parts of the text to the LLM.

```python
prices = [
    # <block check-ai="Prices must be under $100" check-ai-pattern="\$(?P<value>\d+)">
    "Item A: $50",
    "Item B: $150",  # Violation
    # </block>
]
```

#### Supported environment variables

[//]: # (<block name="check-ai-env-vars">)

- `BLOCKWATCH_AI_API_KEY`: API Key.
- `BLOCKWATCH_AI_MODEL`: Model name (default: `gpt-5-nano`).
- `BLOCKWATCH_AI_API_URL`: Custom OpenAI compatible API URL (optional).

[//]: # (</block>)

### Validate with Lua Scripts (`check-lua`)

Run custom validation logic using a Lua script. The script must define a global `validate(ctx, content)` function that
returns `nil` if validation passes or a string error message if it fails.

```python
colors = [
    # <block check-lua="scripts/validate_colors.lua">
    'red',
    'green',
    'blue',
    # </block>
]
```

**scripts/validate_colors.lua**:

```lua
function validate(ctx, content)
    if content:find("purple") then
        return "purple is not an allowed color"
    end
    return nil
end
```

The `validate` function receives two arguments:

- `ctx` — a table with the following fields:
    - `ctx.file` — the source file path.
    - `ctx.line` — the line number of the block's start tag.
    - `ctx.attrs` — a table of all block attributes.
- `content` — the trimmed text content of the block.

<!-- <block name="lua-safety-modes"> -->

#### Lua safety mode

By default, Lua scripts run in a **sandboxed** mode with only the `coroutine`, `table`, `string`, `utf8`, and `math`
standard libraries available. The `io`, `os`, and `package` libraries are **not** loaded, preventing file system access,
command execution, and loading of external modules.

You can change the security level by setting the `BLOCKWATCH_LUA_MODE` environment variable:

```shell
# Allow IO and OS libraries (memory-safe, but with file/system access)
BLOCKWATCH_LUA_MODE=safe blockwatch

# Allow all libraries including C module loading (unsafe)
BLOCKWATCH_LUA_MODE=unsafe blockwatch
```

| `BLOCKWATCH_LUA_MODE` | Libraries available                                                   | Security Level                      |
|-----------------------|-----------------------------------------------------------------------|-------------------------------------|
| `sandboxed` (default) | `coroutine`, `table`, `string`, `utf8`, `math`                        | Most secure - No file/OS access     |
| `safe`                | All memory-safe libraries (including `io`, `os`, `package`)           | Memory-safe - Allows file/OS access |
| `unsafe`              | All Lua standard libraries with no restrictions (including C modules) | Unsafe - Full system access         |

<!-- </block> -->

## Usage

### Run Locally

Validate all blocks in your project:

```shell
# Check everything
blockwatch

# Check specific files
blockwatch "src/**/*.rs" "**/*.md"

# Ignore files matching a glob pattern
blockwatch "**/*.rs" --ignore "**/generated/**"
```

> **Tip:** Quote glob patterns so that the shell does not expand them before BlockWatch sees them.

### Check Only What Changed

Pipe a git diff to BlockWatch to validate only the blocks you touched. This is perfect for pre-commit hooks.

```shell
# Check unstaged changes
git diff --patch | blockwatch

# Check staged changes
git diff --cached --patch | blockwatch

# Check changes in a specific file only
git diff --patch path/to/file | blockwatch

# Check changes and some other (possibly unchanged) files
git diff --patch | blockwatch "src/always_checked.rs" "**/*.md"
```

### Listing Blocks

You can list all blocks that BlockWatch finds without running any validation. This is useful for auditing your blocks or
debugging your configuration.

```shell
# List all blocks in the current directory
blockwatch list

# List blocks in specific files
blockwatch list "src/**/*.rs" "**/*.md"

# List only blocks affected by current changes
git diff | blockwatch list
```

The output is a JSON object that maps each file path to the list of blocks found in that file.

#### Example Output

[//]: # (<block name="list-output-example">)

```json
{
  "README.md": [
    {
      "name": "available-validators",
      "line": 18,
      "column": 10,
      "is_content_modified": false,
      "attributes": {
        "name": "available-validators"
      }
    }
  ]
}
```

[//]: # (</block>)
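
Because the report is plain JSON, it is easy to post-process. The Python sketch below parses the example output above and collects the blocks whose content was modified. The field names come from that example; the helper names `all_blocks` and `modified_blocks` are hypothetical, and using `.get("name")` is a precaution in case a block has no `name` field.

```python
import json

# The example report from above, embedded so the sketch is self-contained.
# In practice this text would be the standard output of `blockwatch list`.
report_text = """
{
  "README.md": [
    {
      "name": "available-validators",
      "line": 18,
      "column": 10,
      "is_content_modified": false,
      "attributes": {
        "name": "available-validators"
      }
    }
  ]
}
"""


def all_blocks(report):
    """Flatten the report into (path, name, line) tuples."""
    return [
        (path, block.get("name"), block["line"])
        for path, blocks in report.items()
        for block in blocks
    ]


def modified_blocks(report):
    """Keep only the blocks flagged with is_content_modified."""
    return [
        (path, block.get("name"), block["line"])
        for path, blocks in report.items()
        for block in blocks
        if block["is_content_modified"]
    ]


report = json.loads(report_text)
print(all_blocks(report))       # [('README.md', 'available-validators', 18)]
print(modified_blocks(report))  # []
```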

### CI Integration

#### Pre-commit Hook

Add this to `.pre-commit-config.yaml`:

```yaml
- repo: local
  hooks:
    - id: blockwatch
      name: blockwatch
      entry: bash -c 'git diff --patch --cached --unified=0 | blockwatch'
      language: system
      stages: [ pre-commit ]
      pass_filenames: false
```

#### GitHub Action

Add this to `.github/workflows/your_workflow.yml`:

```yaml
- uses: mennanov/blockwatch-action@v1
```

## Supported Languages

BlockWatch supports comments in:

[//]: # (<block name="supported-grammar" keep-sorted="asc">)

- Bash
- C#
- C/C++
- CSS
- Go (with `go.mod`, `go.sum` and `go.work` support)
- HTML
- Java
- JavaScript
- Kotlin
- Makefile
- Markdown
- PHP
- Python
- Ruby
- Rust
- SQL
- Swift
- TOML
- TypeScript
- XML
- YAML

[//]: # (</block>)

## CLI Options

[//]: # (<block name="cli-docs">)

- **List Blocks**: `blockwatch list` outputs a JSON report of all found blocks.
- **Extensions**: Map custom extensions: `blockwatch -E cxx=cpp`
- **Disable Validators**: `blockwatch -d check-ai`
- **Enable Validators**: `blockwatch -e keep-sorted`
- **Ignore Files**: `blockwatch --ignore "**/generated/**"`

[//]: # (</block>)

## Known Limitations

- Deleted blocks are ignored.
- Files with unsupported grammar are ignored.

## Contributing

Contributions are welcome! A good place to start is
by [adding support for a new grammar](https://github.com/mennanov/blockwatch/pull/2).

### Run Tests

```shell
cargo test
```