# Plan — Drive line coverage from 72 % to 90 %
> Status: **Phases 1–4 shipped.** Landed at 86 % line coverage (planned 90 %; shortfall is macOS-specific `system::*` paths and unsafe `libc::statvfs` that can't be reached from a non-privileged test harness without trait-based mocking of the syscall itself).
> Owner: TBD.
> Scope: SDK only. Doesn't include the `netwatch-agent` integration suite.
## Why
Today's 72 % line coverage came almost entirely from extracting pure parsers and exercising them with captured-output fixtures. That's the cheap, high-value tier. The remaining 28 % falls into three buckets that need different techniques:
1. **Pure parsers still inline with I/O** (`disk.rs`). Same recipe as `system.rs` / `config.rs` — extraction is mechanical and gets us 5–8 percentage points.
2. **Branches in tested files that no fixture currently exercises** (uncovered error paths in `connections.rs`, `network_intel.rs`, `health.rs`). Add fixtures.
3. **Code that touches the OS by design** (`platform/linux.rs`, `system::measure_cpu_usage`, `system::measure_cpu_per_core`, `disk::collect_disk_io`). Either accept a coverage haircut here, or introduce a thin seam (`trait FsReader` injected via `&dyn`) so a test can stand in for `/sys` and `/proc/stat`.
This plan turns each bucket into concrete, sized work.
## Current baseline (per file, as of `ab1e306`)
| File | Line coverage | Bucket |
| --- | --- | --- |
| `collectors/process_bandwidth.rs` | 99 % | (already done) |
| `collectors/traffic.rs` | 94 % | (already done) |
| `collectors/config.rs` | 85 % | (already done) |
| `collectors/connections.rs` | 78 % | bucket 2 |
| `collectors/network_intel.rs` | 74 % | bucket 2 |
| `collectors/health.rs` | 65 % | bucket 2 (some bucket 3) |
| `collectors/system.rs` | 62 % | bucket 3 (CPU sampling) |
| `collectors/disk.rs` | 0 % | bucket 1 |
| `platform/linux.rs` | 0 % | bucket 3 |
| **Total** | **72 %** | — |
CI floor is set at 65 % (`--fail-under-lines 65` in the workflow). The plan is to raise it as each phase lands.
## Phasing
### Phase 1 — extract `disk.rs` parsers (1 day)
Mirror the `system.rs` / `config.rs` work exactly. Lift the parsers out as module-scope `pub fn`s and add fixture tests.
- `parse_proc_diskstats(&str) -> Option<DiskIo>` — sums sector deltas (×512), excludes `loop`, `ram`, `dm-*`. Tests for typical kernel output, the empty `Module` line, virtual devices skipped, the `disk-stats` format vs the older 11-field format.
- `parse_proc_mounts(&str) -> Vec<MountEntry>` — filters to real `/dev/*`, returns `(device, mount_point, fstype)` tuples. Tests for the snap mounts (`/dev/loop12 on /snap/...`), for filesystem types we want to skip (`tmpfs`, `proc`, `sysfs`, `cgroup2`), and for the macOS variant (returned by `mount` not `/proc/mounts`).
- `parse_macos_mount(&str)` — same, for the `mount` output format.
Coverage payoff: `disk.rs` jumps from 0 % to ~80 %. **Total: ~72 % → ~78 %.**
After landing, raise CI floor to 70 %.
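As a rough illustration of the extraction described above, here is a minimal sketch of what `parse_proc_diskstats` could look like once it takes a `&str` instead of reading `/proc/diskstats` itself. The `DiskIo` field names and the `SECTOR_BYTES` constant are assumptions made for this sketch, not the SDK's actual definitions.

```rust
/// Aggregate I/O totals in bytes. Field names are assumptions for this sketch.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct DiskIo {
    pub read_bytes: u64,
    pub write_bytes: u64,
}

/// `/proc/diskstats` reports sector counts in fixed 512-byte units.
const SECTOR_BYTES: u64 = 512;

/// Sums sectors read and written over every non-virtual device line.
/// Returns `None` when no line could be parsed.
pub fn parse_proc_diskstats(content: &str) -> Option<DiskIo> {
    let mut total = DiskIo::default();
    let mut seen = false;
    for line in content.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        // major, minor, name, plus at least 11 counters (the older layout).
        if fields.len() < 14 {
            continue;
        }
        let name = fields[2];
        if ["loop", "ram", "dm-"].iter().any(|p| name.starts_with(*p)) {
            continue;
        }
        // 0-based column 5 is sectors read, column 9 is sectors written.
        let (Ok(read), Ok(written)) = (fields[5].parse::<u64>(), fields[9].parse::<u64>()) else {
            continue;
        };
        total.read_bytes = total.read_bytes.saturating_add(read.saturating_mul(SECTOR_BYTES));
        total.write_bytes = total.write_bytes.saturating_add(written.saturating_mul(SECTOR_BYTES));
        seen = true;
    }
    seen.then_some(total)
}
```

Because the function only sees text, a fixture test is a one-liner over a captured snapshot. Note that this sketch does not deduplicate whole disks against their partitions; whether the real parser should is a separate decision.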
### Phase 2 — fixture-fill the uncovered branches in `connections.rs`, `network_intel.rs`, `health.rs` (3–4 days)
These files already have tests for the happy paths. The gaps are error paths and rare-format cases.
#### `connections.rs` (78 % → ~92 %)
Branches currently uncovered:
- IPv6 RTT extraction in `parse_nettop_output` — currently skipped pending IPv6 normalisation. Add the parsing and the test.
- `lsof -F` records that are missing one or more of the `pcPtTn` fields. Capture real-world `lsof -i -F` output in which a record lacks the optional fields.
- `parse_ss_output` lines without a valid `users:` field (root processes hidden when running unprivileged).
- The "macOS netstat-only" path in `count_established_connections` and `collect_tcp_states` (currently only the Linux path has fixture coverage).
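To make the `users:` case above concrete, the sketch below shows one way the process-attribution step could be isolated and tested for both the present and the absent field. The helper name `parse_ss_users` is hypothetical; the real `parse_ss_output` may structure this differently.

```rust
/// Extracts `(process_name, pid)` from the `users:(("name",pid=N,fd=M))`
/// suffix that `ss -p` appends. Returns `None` when the field is absent or
/// malformed, which is what an unprivileged run sees for other users' sockets.
/// Hypothetical helper, not the SDK's actual function.
pub fn parse_ss_users(line: &str) -> Option<(String, u32)> {
    let rest = line.split_once("users:((\"")?.1;
    let (name, rest) = rest.split_once('"')?;
    let rest = rest.split_once("pid=")?.1;
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    let pid = digits.parse().ok()?;
    Some((name.to_string(), pid))
}
```

A fixture pair (one line with the field, one without) is enough to cover both branches.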
#### `network_intel.rs` (74 % → ~90 %)
The four detectors are well-covered on the trigger path, but the **alert lifecycle** isn't:
- `tick()` pruning of stale port-scan windows.
- `tick()` pruning of expired DNS outstanding transactions (`DNS_OUTSTANDING_TIMEOUT_SECS`).
- `tick()` aging of alerts past `ACTIVE_ALERT_TTL_SECS`.
- The `BW_ALERT_CLEAR_RATIO` recovery path (alert that fires, then clears).
- `MAX_TRACKED_*` eviction caps.
Each of these is ~5 lines of test setup: build a collector, seed events at known timestamps, advance simulated time by calling `tick()` repeatedly, assert state.
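The seed, advance, assert shape described above might look like the following. This is a deliberately tiny stand-in: `AlertTracker` and its methods are hypothetical, and the value given to `ACTIVE_ALERT_TTL_SECS` is an assumption chosen only so the example has a number to work with.

```rust
/// Assumed value for illustration; the real constant lives in `network_intel.rs`.
const ACTIVE_ALERT_TTL_SECS: u64 = 300;

/// Hypothetical miniature of the alert-lifecycle state.
#[derive(Default)]
pub struct AlertTracker {
    /// (alert message, time raised in seconds)
    active: Vec<(String, u64)>,
}

impl AlertTracker {
    pub fn raise(&mut self, message: &str, now_secs: u64) {
        self.active.push((message.to_string(), now_secs));
    }

    /// Drops every alert whose age has reached the TTL.
    pub fn tick(&mut self, now_secs: u64) {
        self.active
            .retain(|(_, raised)| now_secs.saturating_sub(*raised) < ACTIVE_ALERT_TTL_SECS);
    }

    pub fn active_count(&self) -> usize {
        self.active.len()
    }
}

#[test]
fn alert_ages_out_after_ttl() {
    let mut tracker = AlertTracker::default();
    tracker.raise("port scan from 10.0.0.9", 1_000);
    tracker.tick(1_000 + ACTIVE_ALERT_TTL_SECS - 1);
    assert_eq!(tracker.active_count(), 1); // still inside the TTL
    tracker.tick(1_000 + ACTIVE_ALERT_TTL_SECS);
    assert_eq!(tracker.active_count(), 0); // aged out
}
```

Passing the timestamp into `tick` keeps the test deterministic; if the real `tick()` reads a clock internally, the same test needs a clock seam first.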
#### `health.rs` (65 % → ~85 %)
The big gap is `run_ping` itself (it shells out to `ping`). Two options:
- **Don't test it** — accept the haircut. The parsers (`parse_loss`, `parse_avg_rtt`) are already well-covered and they're the bug-prone bit.
- **Test the smoke path** — a live test that runs `run_ping("127.0.0.1")` and asserts loss < 50 %. Skip on unprivileged hosts where loopback ping is restricted.
Recommend the live smoke test; it adds confidence that the SDK still talks to the system `ping`.
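A minimal sketch of such a smoke test follows, assuming the test may skip itself when `ping` is missing or exits unsuccessfully. `ping_loss` and this version of `parse_loss` are hypothetical stand-ins written for the example; the SDK's real `run_ping` and `parse_loss` signatures may differ.

```rust
use std::process::Command;

/// Stand-in loss parser: finds the number directly before "% packet loss",
/// e.g. "0% packet loss" (Linux iputils) or "0.0% packet loss" (macOS).
pub fn parse_loss(output: &str) -> Option<f64> {
    let idx = output.find("% packet loss")?;
    let head = &output[..idx];
    // The last segment after any non-numeric character is the percentage.
    let number = head.rsplit(|c: char| !(c.is_ascii_digit() || c == '.')).next()?;
    number.parse().ok()
}

/// Sends one echo request. Returns `None` when `ping` is unavailable or
/// fails, so the caller can treat that as "skip" rather than "fail".
pub fn ping_loss(host: &str) -> Option<f64> {
    let out = Command::new("ping").args(["-c", "1", host]).output().ok()?;
    if !out.status.success() {
        return None;
    }
    parse_loss(&String::from_utf8_lossy(&out.stdout))
}

#[test]
fn loopback_ping_smoke() {
    let Some(loss) = ping_loss("127.0.0.1") else {
        eprintln!("skipping: ping unavailable or restricted on this host");
        return;
    };
    assert!(loss < 50.0);
}
```

Skipping silently trades signal for portability; marking the test `#[ignore]` and running it only in a dedicated CI job is an alternative if silent skips are a concern.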
Coverage payoff: ~6 percentage points across the three files. **Total: ~78 % → ~84 %.**
After landing, raise CI floor to 78 %.
### Phase 3 — seams for `system::measure_cpu_*` and `platform::collect_interface_stats` (3–5 days)
This is where coverage gets architectural. Both functions read live OS state and sleep. Without a seam, they can only be exercised by integration tests on real hosts.
Two design options:
#### Option A — fakeable filesystem trait
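Introduce a crate-private trait:
```rust
use std::path::Path;

/// Read-only filesystem seam; production and tests supply different implementations.
pub(crate) trait FsReader {
    fn read_to_string(&self, path: &Path) -> std::io::Result<String>;
}
```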
Production code uses a `RealFs` implementation that calls `std::fs::read_to_string`. Tests construct collectors with a `FakeFs` map of `path → contents`.
- Pros: parses everything against fixtures, gets `platform/linux.rs` and the CPU-sampling code paths to ~95 % covered, no behaviour change in production.
- Cons: threading the trait through `measure_cpu_usage`, `read_memory`, and friends changes their public signatures. The alternative is to keep the public signatures, accept a `&impl FsReader` only in `pub(crate)` helpers, and have the public functions call those helpers with `RealFs`. The latter has zero public-API impact.
Recommend the latter: the public functions stay as thin wrappers that pass `RealFs` to the injectable helpers, which is a common Rust idiom for adding a test seam without touching the public API.
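A sketch of how the pieces could fit together under the zero-API-impact variant is shown below. `RealFs`, `FakeFs`, and `FsReader` come from the plan; the `with_file` builder and the `read_mem_total_kb` / `read_mem_total_kb_with` pair are hypothetical names invented for this example.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

pub(crate) trait FsReader {
    fn read_to_string(&self, path: &Path) -> io::Result<String>;
}

/// Production implementation: defers to the real filesystem.
pub(crate) struct RealFs;

impl FsReader for RealFs {
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
}

/// Test implementation: an in-memory map of path to contents.
#[derive(Default)]
pub(crate) struct FakeFs {
    files: HashMap<PathBuf, String>,
}

impl FakeFs {
    pub(crate) fn with_file(mut self, path: &str, contents: &str) -> Self {
        self.files.insert(PathBuf::from(path), contents.to_string());
        self
    }
}

impl FsReader for FakeFs {
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such fixture"))
    }
}

/// Public entry point: no trait appears in the signature.
pub fn read_mem_total_kb() -> io::Result<u64> {
    read_mem_total_kb_with(&RealFs)
}

/// Crate-private helper that tests call with a `FakeFs`.
pub(crate) fn read_mem_total_kb_with(fs: &impl FsReader) -> io::Result<u64> {
    let text = fs.read_to_string(Path::new("/proc/meminfo"))?;
    text.lines()
        .find_map(|line| line.strip_prefix("MemTotal:"))
        .and_then(|value| value.split_whitespace().next())
        .and_then(|value| value.parse().ok())
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "MemTotal missing"))
}
```

In a test, `read_mem_total_kb_with(&FakeFs::default().with_file("/proc/meminfo", "MemTotal: 16384 kB\n"))` exercises the whole parse path without touching `/proc`.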
#### Option B — `#[cfg(test)]` substitution
Replace `std::fs::read_to_string` with a thin wrapper that, under `#[cfg(test)]`, reads from a `thread_local!` map.
- Pros: no API change at all.
- Cons: less explicit; the seam is invisible to readers; test setup is a bit magic.
Recommend Option A.
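For comparison, Option B could be sketched roughly as follows. The `FAKE_FILES` name and the error message are assumptions; the point is only to show where the "magic" lives, since nothing at the call site reveals that the read is being intercepted.

```rust
use std::io;
use std::path::Path;

#[cfg(test)]
use std::{cell::RefCell, collections::HashMap, path::PathBuf};

// Per-thread fixture map; it only exists in test builds.
#[cfg(test)]
thread_local! {
    static FAKE_FILES: RefCell<HashMap<PathBuf, String>> = RefCell::new(HashMap::new());
}

/// Production build: a plain pass-through.
#[cfg(not(test))]
pub(crate) fn read_to_string(path: &Path) -> io::Result<String> {
    std::fs::read_to_string(path)
}

/// Test build: served from the thread-local map instead of the disk.
#[cfg(test)]
pub(crate) fn read_to_string(path: &Path) -> io::Result<String> {
    FAKE_FILES.with(|files| {
        files
            .borrow()
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no fixture registered"))
    })
}

#[test]
fn reads_seeded_fixture() {
    FAKE_FILES.with(|files| {
        files
            .borrow_mut()
            .insert(PathBuf::from("/proc/stat"), "cpu 1 2 3 4\n".to_string());
    });
    assert_eq!(read_to_string(Path::new("/proc/stat")).unwrap(), "cpu 1 2 3 4\n");
}
```

Because the map is thread-local, each test thread gets its own fixtures, but code under test that spawns threads would see an empty map there.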
#### What this unlocks
After the seam exists:
- `system::measure_cpu_usage` — fixture: two `/proc/stat` snapshots → assert the computed %.
- `system::measure_cpu_per_core` — same.
- `platform::linux::collect_interface_stats` — fixture: synthetic `/sys/class/net/eth0/statistics/*` files → assert the returned `InterfaceStats`. macOS gets the same treatment with a `CommandRunner` trait if needed.
- `disk::collect_disk_io` — fixture: synthetic `/proc/diskstats`.
Coverage payoff: ~6 percentage points. **Total: ~84 % → ~90 %.**
After landing, raise CI floor to 85 %.
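The "two `/proc/stat` snapshots" fixture above reduces to a small delta computation. The sketch below assumes busy time is everything except `idle` and `iowait` over the first eight counters; `CpuTimes`, `parse_cpu_line`, and `cpu_usage_percent` are hypothetical names, and the real `measure_cpu_usage` may weigh the fields differently.

```rust
/// Cumulative tick counts from the aggregate `cpu` line.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CpuTimes {
    pub busy: u64,
    pub idle: u64,
}

/// Parses the aggregate `cpu ` line (not the per-core `cpuN` lines).
pub fn parse_cpu_line(stat: &str) -> Option<CpuTimes> {
    let line = stat.lines().find(|l| l.starts_with("cpu "))?;
    // user nice system idle iowait irq softirq steal; guest time is already in user.
    let ticks: Vec<u64> = line
        .split_whitespace()
        .skip(1)
        .take(8)
        .map(|field| field.parse().ok())
        .collect::<Option<_>>()?;
    if ticks.len() < 4 {
        return None;
    }
    let idle = ticks[3] + ticks.get(4).copied().unwrap_or(0);
    let total: u64 = ticks.iter().sum();
    Some(CpuTimes { busy: total - idle, idle })
}

/// Busy share of the interval between two snapshots, in percent.
/// Returns `None` if counters went backwards or no ticks elapsed.
pub fn cpu_usage_percent(prev: CpuTimes, curr: CpuTimes) -> Option<f64> {
    let busy = curr.busy.checked_sub(prev.busy)?;
    let idle = curr.idle.checked_sub(prev.idle)?;
    let total = busy + idle;
    if total == 0 {
        return None;
    }
    Some(busy as f64 * 100.0 / total as f64)
}
```

With the seam in place, the sampling function only needs to read the file twice through the injected reader and hand both strings to these pure functions, so the sleep between samples is the only part left untested.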
### Phase 4 — property-based tests (optional, 1 week)
For the highest-stakes parsers (`parse_ss_output`, `parse_nettop_output`, `parse_proc_meminfo`), add `proptest` generators that produce structurally-valid synthetic outputs and assert round-trip / invariant properties.
- `parse_ss_output(format(c)).unwrap() == c` for any synthetic `Connection`.
- `parse_proc_meminfo` never returns more "used" than "total".
- `parse_nettop_output` never returns negative RTTs.
This won't move the coverage % much (the parsers are already at high %), but it catches the bugs the fixture suite never thought of.
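To show the shape of the round-trip property without pulling in a dependency, the sketch below swaps `proptest` for a hand-rolled xorshift generator. Everything here is illustrative: the simplified `Connection` type, the `format_connection` / `parse_connection` pair, and the `ss`-like line layout are assumptions, and the real suite would use `proptest` strategies to get shrinking of failing cases.

```rust
use std::net::Ipv4Addr;

/// Simplified stand-in for the SDK's connection record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Connection {
    pub local: (Ipv4Addr, u16),
    pub remote: (Ipv4Addr, u16),
}

/// Renders a connection as an `ss`-like line (layout assumed for the sketch).
pub fn format_connection(c: &Connection) -> String {
    format!("ESTAB 0 0 {}:{} {}:{}", c.local.0, c.local.1, c.remote.0, c.remote.1)
}

pub fn parse_connection(line: &str) -> Option<Connection> {
    // Skip state, recv-q, send-q.
    let mut cols = line.split_whitespace().skip(3);
    let local = parse_endpoint(cols.next()?)?;
    let remote = parse_endpoint(cols.next()?)?;
    Some(Connection { local, remote })
}

fn parse_endpoint(s: &str) -> Option<(Ipv4Addr, u16)> {
    let (ip, port) = s.rsplit_once(':')?;
    Some((ip.parse().ok()?, port.parse().ok()?))
}

/// Tiny xorshift64 generator standing in for proptest strategies.
/// The seed must be non-zero.
pub struct XorShift(pub u64);

impl XorShift {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

pub fn arbitrary_connection(rng: &mut XorShift) -> Connection {
    let mut endpoint = || {
        let bits = rng.next_u64();
        (Ipv4Addr::from((bits >> 16) as u32), bits as u16)
    };
    let local = endpoint();
    let remote = endpoint();
    Connection { local, remote }
}

/// Property: parsing the formatted form yields the original value.
pub fn round_trip_holds(cases: usize, seed: u64) -> bool {
    let mut rng = XorShift(seed);
    (0..cases).all(|_| {
        let c = arbitrary_connection(&mut rng);
        parse_connection(&format_connection(&c)).as_ref() == Some(&c)
    })
}
```

The generator only produces structurally valid lines; the malformed-input side of the parser still needs fixtures or a separate "never panics" property.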
## Targets and floor schedule
| Milestone | Target line coverage | CI floor (%) |
| --- | --- | --- |
| Today | 72 % | 65 |
| Phase 1 | ~78 % | 70 |
| Phase 2 | ~84 % | 78 |
| Phase 3 | ~90 % | 85 |
| Phase 4 | ~92 % | 88 |
The floor lags the target by 4–8 percentage points so legitimate refactors don't accidentally fail CI on a one-line uncovered helper.
## What we explicitly won't pursue
- **100 % coverage.** Some `unsafe` libc blocks (in `system::macos::read_load_avg`) would require linker-level mocking to exercise — not worth the complexity for one function call.
- **Mocking the system `ping` binary** in a way that returns crafted output. Either we live-test against `127.0.0.1` (preferred) or we don't test `run_ping` at all.
- **Coverage for the eBPF programs** (when phase 1 of the eBPF plan ships). BPF coverage tooling exists (`bpftool prog profile`) but isn't a fit for unit tests; rely on the privileged-container integration suite.
## Tooling housekeeping
- **Codecov / Coveralls upload.** The CI job already produces `lcov.info` as an artifact. A 5-line addition to publish to Codecov gives us per-PR diff summaries. Worth doing alongside Phase 1.
- **Coverage badge in README.** Codecov auto-generates one. Slot it next to the existing crates.io and license badges.
- **Diff coverage gate** (Codecov's "patch coverage"). Stricter than the absolute floor: every PR must cover ≥ 80 % of the lines it adds. Saves us from review fatigue around "did this PR add tests?".
## Estimate
| Phase | Effort | Coverage gain |
| --- | --- | --- |
| 1 | 1 day | +6 pp |
| 2 | 3–4 days | +6 pp |
| 3 | 3–5 days | +6 pp |
| 4 | 1 week (opt) | +2 pp + bug-finding |
End state: **~90 % line coverage** with a CI floor of 85 %, plus property tests for the highest-leverage parsers and per-PR diff coverage in the review surface.