
# Test Suite Strengthening Plan

Goal: reach rust-analyzer-grade test coverage for every LSP feature in php-lsp, with fast, non-brittle, diff-friendly tests.

## Principle: two tiers

rust-analyzer splits tests into two tiers with different cost models:

1. **Unit tier** — co-located `#[cfg(test)] mod tests` in each feature module, driving the handler directly on an in-memory `Analysis` facade. Hundreds per feature, milliseconds each.
2. **Slow tier** — `tests/slow/` drives the real LSP server through an in-memory channel. One happy-path test per feature. Guarded by `PHP_LSP_SKIP_SLOW=1`.

Today php-lsp is ~all slow tier. Flipping this ratio is the whole plan.

---

## Phase 1 — Foundation (1–2 days)

### 1.1 Wire up `tests/common/fixture.rs`

Already parses `//- /path`, `$0`, `// ^^^ error: …`. No call sites. Export through `common/mod.rs`.
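As an illustration of the `//- /path` part of the DSL, a minimal splitter might look like the sketch below. It is not the existing `fixture.rs` implementation. The name `parse_fixture`, the choice to return workspace-relative paths, and the rule that text before the first header is ignored are all assumptions made for the example.

```rust
/// Splits a multi-file fixture into `(relative_path, text)` pairs.
/// Hypothetical helper: the real parser in `tests/common/fixture.rs` may differ.
pub fn parse_fixture(src: &str) -> Vec<(String, String)> {
    let mut files: Vec<(String, String)> = Vec::new();
    for line in src.lines() {
        if let Some(path) = line.strip_prefix("//- /") {
            // A header line starts a new file; the leading `/` is dropped.
            files.push((path.trim().to_string(), String::new()));
        } else if let Some((_, text)) = files.last_mut() {
            // Every other line belongs to the most recently opened file.
            text.push_str(line);
            text.push('\n');
        }
        // Lines before the first header (e.g. the blank line that follows
        // `r#"`) are ignored.
    }
    files
}
```

Keeping the output as plain pairs lets the same fixture string be written to a tempdir for the slow tier or fed to an in-memory facade for the unit tier.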

Port the three primitives from rust-analyzer `crates/test-utils/src/lib.rs`:

- `extract_offset(&str) -> (Position, String)` — single `$0`
- `extract_range(&str) -> (Range, String)` — `$0…$0`
- `extract_annotations(&str) -> Vec<(Range, String)>` — **generic** caret extraction, not diagnostic-specific
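A minimal sketch of the first two primitives, assuming local `Position` and `Range` structs as stand-ins for the `lsp_types` equivalents and assuming columns are counted in UTF-16 code units (the LSP default). The helper `position_at` is hypothetical and exists only for this example.

```rust
/// Stand-ins for `lsp_types::Position` / `Range`, so the sketch has no dependencies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Position { pub line: u32, pub character: u32 }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Range { pub start: Position, pub end: Position }

const MARKER: &str = "$0";

/// Converts a byte offset into a zero-based line / UTF-16 column pair.
fn position_at(text: &str, offset: usize) -> Position {
    let before = &text[..offset];
    let line = before.matches('\n').count() as u32;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let character = before[line_start..].encode_utf16().count() as u32;
    Position { line, character }
}

/// Removes the single `$0` marker and returns its position in the cleaned text.
#[track_caller]
pub fn extract_offset(src: &str) -> (Position, String) {
    let idx = src.find(MARKER).expect("fixture has no $0 marker");
    let mut text = String::with_capacity(src.len() - MARKER.len());
    text.push_str(&src[..idx]);
    text.push_str(&src[idx + MARKER.len()..]);
    assert!(!text.contains(MARKER), "expected exactly one $0 marker");
    (position_at(&text, idx), text)
}

/// Removes a `$0…$0` pair and returns the range between the two markers.
#[track_caller]
pub fn extract_range(src: &str) -> (Range, String) {
    let start = src.find(MARKER).expect("fixture has no opening $0");
    let rest = &src[start + MARKER.len()..];
    let len = rest.find(MARKER).expect("fixture has no closing $0");
    let text = format!("{}{}{}", &src[..start], &rest[..len], &rest[len + MARKER.len()..]);
    let range = Range {
        start: position_at(&text, start),
        end: position_at(&text, start + len),
    };
    (range, text)
}
```

`$0` is a safe marker for PHP fixtures because a PHP variable name cannot start with a digit.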

Make `extract_annotations` return raw `(Range, message)` pairs. Each feature decides how to interpret the message (`def`, `ref`, `error: …`, `hint: …`, etc.).
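One way the generic extraction could work is sketched below, under several assumptions: annotations are whole-line `//` comments, the carets point at the same columns on the nearest preceding non-annotation line, and line numbers refer to the text with the annotation lines left in place. `Position`, `Range`, and `parse_annotation` are local stand-ins introduced for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Position { pub line: u32, pub character: u32 }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Range { pub start: Position, pub end: Position }

/// Parses one line as `// ^^^ message`; returns (start column, end column, message).
fn parse_annotation(line: &str) -> Option<(u32, u32, String)> {
    let comment = line.find("//")?;
    // Only whole-line comments count; trailing comments after code are ignored.
    if !line[..comment].trim().is_empty() {
        return None;
    }
    let after = &line[comment + 2..];
    let pad = after.len() - after.trim_start_matches(' ').len();
    let carets = after[pad..].chars().take_while(|&c| c == '^').count();
    if carets == 0 {
        return None; // an ordinary comment, not an annotation
    }
    // Everything up to the carets is ASCII, so byte offsets equal columns here.
    let start = comment + 2 + pad;
    let message = after[pad + carets..].trim().to_string();
    Some((start as u32, (start + carets) as u32, message))
}

/// Returns raw `(Range, message)` pairs; callers interpret the message.
pub fn extract_annotations(text: &str) -> Vec<(Range, String)> {
    let mut out = Vec::new();
    let mut target_line: Option<u32> = None; // last line that was not an annotation
    for (idx, line) in text.lines().enumerate() {
        match parse_annotation(line) {
            Some((start, end, message)) => {
                if let Some(line) = target_line {
                    let range = Range {
                        start: Position { line, character: start },
                        end: Position { line, character: end },
                    };
                    out.push((range, message));
                }
            }
            None => target_line = Some(idx as u32),
        }
    }
    out
}
```

Because the message is returned untouched, a references test can match on `def` and `ref` while a diagnostics test splits on `error:` or `hint:`.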

### 1.2 Add `#[track_caller]` to every `check_*` helper

Failures then point to the test, not the helper. Single largest debugging-friction win.
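A small sketch of the attribute in use. The helper names `check_rendered`, `check_hover_text`, and `attributed_line` are hypothetical; only `#[track_caller]` and `std::panic::Location` are standard library features.

```rust
use std::panic::Location;

/// Reports the source line that a failure inside a helper would be attributed to.
#[track_caller]
pub fn attributed_line() -> u32 {
    Location::caller().line()
}

/// Hypothetical `check_*` helper: compares rendered output with the expectation.
#[track_caller]
pub fn check_rendered(actual: &str, expected: &str) {
    // Because of `#[track_caller]`, a panic raised here is reported at the
    // call site in the test, not at this line.
    assert_eq!(actual.trim(), expected.trim(), "rendered output differs");
}

/// Helpers that wrap other helpers need the attribute too; otherwise the
/// reported location stops at the wrapper.
#[track_caller]
pub fn check_hover_text(rendered: &str, expected: &str) {
    check_rendered(rendered, expected);
}
```

The attribute has to be present on every function in the helper chain, which is why the plan applies it to all `check_*` helpers and not just the outermost one.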

### 1.3 Introduce an in-process `Analysis` facade

```rust
// src/test_harness.rs (cfg(test) or cfg(feature = "test-utils"))
pub struct Analysis { backend: Backend, workspace: TempDir }
impl Analysis {
    pub fn from_fixture(src: &str) -> Self;
    pub fn hover(&self, path: &str, pos: Position) -> Option<Hover>;
    pub fn diagnostics(&self, path: &str) -> Vec<Diagnostic>;
    pub fn references(&self, path: &str, pos: Position) -> Vec<Location>;
    // …one method per feature, calling the handler function directly
}
```

No tokio runtime, no debounce, no JSON-RPC. This is the rust-analyzer `ide::Analysis` pattern. All Phase 3 unit tests target this.

---

## Phase 2 — `expect_test` as the default (1 day)

Already a dev-dep, used in ~20 files. Make it the standard. Eliminate manual JSON path extraction (`resp["result"].as_array()…`).

For each feature, write one renderer producing deterministic text:

```rust
fn render_references(locs: &[Location], files: &FileMap) -> String;
fn render_hover(h: &Hover) -> String;
fn render_workspace_edit(we: &WorkspaceEdit, files: &FileMap) -> String;
```

- Paths relativized to the workspace root.
- Ranges as `1:5..1:12`.
- One entity per line; stable ordering.
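The three formatting rules above can be sketched for references as follows. This is a simplified illustration, not the planned signature: it takes a workspace root string where the plan takes `&FileMap`, uses local stand-in structs where the real code would use `lsp_types`, and prints line and column values exactly as stored.

```rust
use std::fmt::Write;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Position { pub line: u32, pub character: u32 }

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Range { pub start: Position, pub end: Position }

/// Stand-in for `lsp_types::Location`, with a plain path where the real type has a URI.
#[derive(Debug, Clone)]
pub struct Location { pub path: String, pub range: Range }

/// Renders one location per line as `path line:col..line:col`, sorted by path and range.
pub fn render_references(locs: &[Location], root: &str) -> String {
    let mut rows: Vec<(&str, Range)> = locs
        .iter()
        .map(|loc| {
            // Relativize to the workspace root so output is machine-independent.
            let rel = loc.path.strip_prefix(root).unwrap_or(&loc.path);
            (rel.trim_start_matches('/'), loc.range)
        })
        .collect();
    rows.sort(); // stable, deterministic ordering: path first, then range

    let mut out = String::new();
    for (path, r) in rows {
        writeln!(
            out,
            "{} {}:{}..{}:{}",
            path, r.start.line, r.start.character, r.end.line, r.end.character
        )
        .expect("writing to a String cannot fail");
    }
    out
}
```

Sorting inside the renderer means a test never depends on the order in which the handler happens to return results.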

Tests become:

```rust
let out = analysis.references_at("main.php", "foo");
expect![[r#"
    Greeter.php 3:9..3:12 (decl)
    main.php 5:3..5:6
"#]].assert_eq(&out);
```

`UPDATE_EXPECT=1 cargo test` rewrites in place.

---

## Phase 3 — Per-feature unit tests (the bulk of the work)

For each LSP feature, add `#[cfg(test)] mod tests` at the bottom of `src/<feature>.rs`, backed by Phase 1 + 2.

### Target coverage matrix

| Feature | Minimum scenarios |
|---|---|
| `hover` | function, method, static method, property, class, interface, enum, enum case, constant, namespaced symbol, variable (via `TypeMap`), parameter, `use`-imported symbol, docblock `@param`/`@return`, missing symbol |
| `definition` / `declaration` / `type_definition` | each above axis × cross-file × PSR-4-resolved × stubbed built-in × `vendor/` |
| `references` | declaration site, `use` statement, method call, static call, property access, interface → implementing methods, trait → using classes, enum case usage, `new` expression |
| `rename` | `prepareRename` bounds + invalid sites, function, method (across subclasses), property, class (triggers `file_rename`), namespace |
| `file_rename` | PSR-4 folder move, single-file rename, collision rollback |
| `completion` | `->`, `::`, `\` prefix, `use` import completion, keyword, local variable, parameter, fuzzy-ranked order, docblock `@param`/`@return` |
| `diagnostics` | every `mir_php` diagnostic kind, `relatedInformation` link present, severity correct (caret DSL) |
| `code_actions` | each generate/implement/phpdoc/type/promote/extract/inline/organize scenario via `check_assist(before, after)`. Also assert deferred vs eager (`command`/`data` vs inline `edit`) |
| `signature_help` | nested calls, variadic, trailing comma, active parameter index, docblock overloads |
| `call_hierarchy` / `type_hierarchy` | interface → impl, trait → using class, abstract → concrete, multi-level inheritance |
| `inlay_hints` | parameter names, promoted constructor params, return types, disabled settings |
| `semantic_tokens` | full / range / delta correctness, every token kind |
| `folding` / `selection_range` | nested classes/functions, heredoc, match arms |
| `document_symbols` | methods, nested classes, traits, enums, anonymous classes |
| `document_highlight` | variable scope isolation, property vs variable |

### Example: caret annotations let one test assert many things

```rust
check_references(r#"
//- /Greeter.php
<?php class Greeter {
    public function hello() {}
                  //^^^^^ def
}

//- /main.php
<?php $g = new Greeter();
$g->hel$0lo();
  //^^^^^ ref
"#);
```

One test, two annotations, zero hand-written `Position { line, character }`.

---

## Phase 4 — Cut the slow tier to essentials

Keep `tests/e2e_*.rs` but shrink. For each feature: **one** happy-path test proving the full JSON-RPC round trip (capabilities → `did_open` → request → response). Edge cases live in Phase 3.

Rationale: rust-analyzer has ~20 slow-tests for the entire server. php-lsp has ~150 e2e tests, and roughly 95% of what they assert does not need the server.

### Harness additions

- `await_indexed()`: block on a specific notification (add a `php-lsp/indexingFinished` custom notification, or reuse the first successful `workspace/symbol` after scan). Removes the `sleep(50ms)` calls in the robustness tests.
- `skip_slow_tests()`: gate honoring `PHP_LSP_SKIP_SLOW=1`, checked at the top of every slow test.
- Move slow tests into `tests/slow/` as their own test binary (Cargo discovers `tests/slow/main.rs` as the `slow` target), so `cargo test --test slow` runs only the slow tier and `cargo test --lib` runs only the unit tier.
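A possible shape for the gate, assuming that only the exact value `1` means "skip" and that the helper returns a `bool` the test checks before doing any work. The split into `should_skip` is an assumption made so the decision can be tested without touching the process environment.

```rust
use std::env;

/// Pure decision function: only the exact value "1" requests a skip.
pub fn should_skip(value: Option<&str>) -> bool {
    matches!(value, Some("1"))
}

/// Returns `true` when the current slow test should return early.
pub fn skip_slow_tests() -> bool {
    let skip = should_skip(env::var("PHP_LSP_SKIP_SLOW").ok().as_deref());
    if skip {
        eprintln!("skipping slow test: PHP_LSP_SKIP_SLOW=1");
    }
    skip
}

// Intended use at the top of a slow test (hypothetical test name):
//
// #[test]
// fn hover_round_trip() {
//     if skip_slow_tests() {
//         return;
//     }
//     // … drive the server over the in-memory channel …
// }
```

A skipped test still shows up as passed in the `cargo test` summary, so the `eprintln!` is the only trace that it did not actually run.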

### Migration discipline

Never delete an e2e test before its unit-tier replacement exists and passes.

---

## Phase 5 — Coverage for things currently unmeasured

### 5.1 Salsa cache-hit assertions

Instrument `RootDatabase` with query counters. Tests:

```rust
let analysis = Analysis::from_fixture(…);
let before = analysis.query_stats();
analysis.edit("A.php", …);       // edit unrelated file
analysis.hover("B.php", pos);
let after = analysis.query_stats();
assert_eq!(after.parse_cache_hits - before.parse_cache_hits, n_files - 1);
```

Catches regressions like "did_change invalidates every file". Mirrors rust-analyzer `analysis-stats`.
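The counter idea can be shown with a toy model that does not use salsa at all. `ToyDb`, its revision-keyed cache, and the word-count "parse result" are invented for this sketch; only the field name `parse_cache_hits` and the `query_stats()` accessor come from the test above.

```rust
use std::collections::HashMap;

#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct QueryStats {
    pub parse_cache_hits: u64,
    pub parse_cache_misses: u64,
}

/// Toy stand-in for a memoized `parse` query, keyed by a per-file revision.
#[derive(Default)]
pub struct ToyDb {
    sources: HashMap<String, (u64, String)>, // path -> (revision, text)
    parsed: HashMap<String, (u64, usize)>,   // path -> (revision parsed at, result)
    stats: QueryStats,
}

impl ToyDb {
    /// Sets a file's text and bumps its revision.
    pub fn edit(&mut self, path: &str, text: &str) {
        let rev = self.sources.get(path).map_or(0, |(rev, _)| rev + 1);
        self.sources.insert(path.to_string(), (rev, text.to_string()));
    }

    /// "Parses" a file (here: counts words), reusing the cached result if the
    /// file has not changed since it was last parsed.
    pub fn parse(&mut self, path: &str) -> usize {
        let (rev, text) = self.sources.get(path).expect("unknown file");
        let rev = *rev;
        if let Some((cached_rev, result)) = self.parsed.get(path) {
            if *cached_rev == rev {
                self.stats.parse_cache_hits += 1;
                return *result;
            }
        }
        self.stats.parse_cache_misses += 1;
        let result = text.split_whitespace().count();
        self.parsed.insert(path.to_string(), (rev, result));
        result
    }

    pub fn query_stats(&self) -> QueryStats {
        self.stats
    }
}
```

With two warmed files, editing one and re-parsing both yields one hit and one miss, which is the `n_files - 1` relationship the real assertion checks.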

### 5.2 `ministubs.php`

Single flag-gated fixture:

```php
// ^^ iterator
interface Iterator { … }
// ^^ pdo
class PDO { … }
```

Fixture header: `//- ministubs: iterator,pdo`. Loader strips unflagged sections. Unit tests get only the stubs they need — much faster than loading the full bundled stubs.
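A sketch of the loader's stripping step, assuming that a `// ^^ flag` line opens a section running until the next marker, that lines before the first marker are always kept, and that marker lines themselves are dropped. The function names `parse_ministubs_header` and `filter_ministubs` are hypothetical.

```rust
/// Parses `//- ministubs: a,b` into the list of requested flags.
pub fn parse_ministubs_header(line: &str) -> Option<Vec<&str>> {
    let flags = line.strip_prefix("//- ministubs:")?;
    Some(
        flags
            .split(',')
            .map(str::trim)
            .filter(|flag| !flag.is_empty())
            .collect(),
    )
}

/// Keeps only the sections of the stub file whose flag is enabled.
pub fn filter_ministubs(stubs: &str, enabled: &[&str]) -> String {
    let mut out = String::new();
    let mut keep = true; // the prelude before the first marker is always kept
    for line in stubs.lines() {
        if let Some(flag) = line.strip_prefix("// ^^ ") {
            // A marker line switches the current section on or off.
            keep = enabled.contains(&flag.trim());
            continue; // marker lines are never emitted
        }
        if keep {
            out.push_str(line);
            out.push('\n');
        }
    }
    out
}
```

Filtering before parsing keeps the per-test cost proportional to the stubs a test actually requests.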

### 5.3 Property-based tests (`proptest`)

Small, high-payoff surface:

- `word_at` never panics on arbitrary UTF-8 input.
- `canonicalize_workspace_edit` is idempotent.
- FQN resolution round-trips: identifier → resolve via `use` imports → stable FQN.

### 5.4 Cross-file incremental correctness

Currently one ignored test (`server does not proactively republish diagnostics when a dependency changes`). Fix the gap, then make it the canonical test:

```rust
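check_incremental(
    // Initial workspace: B.php passes a string where A.php declares an int.
    r#"
//- /A.php
<?php function foo(int $x) {}
//- /B.php
<?php foo("str");
        //^^^^^ error: expected int
"#,
    // The change under test: only A.php is edited.
    |a| a.edit("A.php", "<?php function foo(string $x) {}"),
    // Expected state afterwards: B.php has no diagnostics.
    r#"
//- /B.php
<?php foo("str");
"#,
);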
```

---

## Phase 6 — Performance regression gate

`benches/` exists but nothing blocks on it. Add:

- **`iai-callgrind`** benches for: cold workspace scan (symfony-demo), hot request after scan, hover after 1000 edits. Deterministic — works in CI.
- **`cargo xtask stats`** subcommand mirroring `rust-analyzer analysis-stats`: prints salsa cache sizes, memoization hit rate, diagnostic counts per fixture. Snapshot and diff in CI.

---

## Migration order

1. Phase 1.1 + 1.2 (foundation). **Zero existing test changes.**
2. Phase 1.3 + Phase 2 on `hover` as the pilot. Prove the pattern.
3. Phase 3 feature by feature. Parallelizable. Start with **references, diagnostics, code_actions** — highest coverage gains.
4. Phase 4 only after Phase 3 has replaced the coverage. Never delete an e2e test before its unit-tier replacement lands.
5. Phase 5 / 6 as follow-ups.

## Expected outcomes

- ~10× test count at ~3× wall time (current time is dominated by JSON-RPC and workspace scans, not analysis).
- Near-zero hand-written `Position { line, character }`. Carets + `$0` cover it.
- Failures point at tests, not helpers. `#[track_caller]` everywhere.
- One fixture syntax across both tiers — the same string literal runs in either.
- `UPDATE_EXPECT=1` becomes the refactor primitive: rename a diagnostic message, run with the flag, commit the diff.

## Explicit non-goals

- **`insta`** — `expect_test` inline blocks are strictly better for this codebase's size. No separate `.snap` directory.
- **Full `test-utils` crate extraction** — overkill until there are multiple consuming crates.
- **Mock parser / FS** — rust-analyzer doesn't mock either; fixture tempdirs are fast enough.

---

## Reference patterns (rust-analyzer)

- `crates/test-utils/src/fixture.rs` — `//- /path` DSL parser
- `crates/test-utils/src/lib.rs` — `extract_offset`, `extract_range`, `extract_annotations`
- `crates/test-utils/src/minicore.rs` — flag-gated std shim
- `crates/ide/src/hover/tests.rs` — canonical `check(fixture, expect!)` shape
- `crates/ide-diagnostics/src/tests.rs` — `check_diagnostics` / `check_fix` via caret annotations
- `crates/ide-assists/src/tests.rs` — `check_assist(assist_id, before, after)` for code actions
- `crates/rust-analyzer/tests/slow-tests/{main.rs,support.rs}` — in-process server via `Connection::memory()`, `wait_until_workspace_is_loaded`