# Performance improvement ideas

Ideas considered but not yet implemented, roughly in order of impact.

## Auto-detect thread count

`lib.rs` — `Config::new` currently defaults to 5 threads:

```rust
None => NonZero::new(5).expect("Default number of threads was invalid"),
```

Use the parallelism the platform reports instead, keeping 5 as a fallback if the query fails:

```rust
None => std::thread::available_parallelism().unwrap_or(NonZero::new(5).unwrap()),
```
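As a minimal sketch of how the default could be resolved in isolation, the following assumes a hypothetical helper named `resolve_thread_count` (not part of the crate) that receives the optional user-supplied value. An explicit value always wins; otherwise the platform is asked, and 5 is used only if that query returns an error.

```rust
use std::num::NonZero;
use std::thread;

/// Hypothetical helper: picks the worker count from an optional user setting.
fn resolve_thread_count(requested: Option<NonZero<usize>>) -> NonZero<usize> {
    requested.unwrap_or_else(|| {
        // `available_parallelism` returns an `io::Result`, so it can fail on
        // platforms or sandboxes where the value cannot be determined.
        thread::available_parallelism()
            .unwrap_or(NonZero::new(5).expect("fallback thread count must be non-zero"))
    })
}
```

Note that `available_parallelism` reports an estimate of usable parallelism, which may be lower than the number of physical cores when the process is restricted.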

## Pre-create `memmem::Finder` once, reuse across all files

`file_type.rs` — `does_file_match_query` creates a new `Finder` inside the function, so it preprocesses the query pattern once per file:

```rust
pub fn does_file_match_query(mut file: &fs::File, query: &str) -> bool {
    let finder = memmem::Finder::new(query);  // ← rebuilt per file
```

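`Finder::new` preprocesses the needle up front (comparable in spirit to Boyer-Moore table construction), so that cost is paid again for every file searched. Build the `Finder` once and pass it in by reference, or store it in `Config`. A `Finder` borrows its needle, so storing it may require `into_owned()`.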
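The "build once, reuse everywhere" pattern can be illustrated without the `memchr` dependency. The sketch below uses a hypothetical `PrebuiltSearcher` type, a simple Horspool searcher standing in for `memmem::Finder`. It is not how `memmem` works internally; it only shows the shape of the change. The names `PrebuiltSearcher` and `count_matching` are assumptions made for this example.

```rust
/// Stand-in for `memmem::Finder`: a Horspool searcher whose skip table is
/// built once in `new` and reused by every call to `is_match`.
struct PrebuiltSearcher {
    needle: Vec<u8>,
    skip: [usize; 256],
}

impl PrebuiltSearcher {
    fn new(needle: &[u8]) -> Self {
        // Default shift for bytes not in the needle: the full needle length.
        let mut skip = [needle.len(); 256];
        // For every byte except the last, record its distance from the needle's end.
        let last = needle.len().saturating_sub(1);
        for (i, &byte) in needle.iter().enumerate().take(last) {
            skip[byte as usize] = last - i;
        }
        Self { needle: needle.to_vec(), skip }
    }

    fn is_match(&self, haystack: &[u8]) -> bool {
        let m = self.needle.len();
        if m == 0 {
            return true; // an empty needle matches trivially
        }
        let mut pos = 0;
        while pos + m <= haystack.len() {
            if &haystack[pos..pos + m] == self.needle.as_slice() {
                return true;
            }
            // Shift by the table entry for the byte under the needle's last position.
            pos += self.skip[haystack[pos + m - 1] as usize];
        }
        false
    }
}

/// Hypothetical caller: the searcher is built once, outside the per-file loop.
fn count_matching(files: &[&[u8]], query: &str) -> usize {
    let searcher = PrebuiltSearcher::new(query.as_bytes());
    files
        .iter()
        .copied()
        .filter(|contents| searcher.is_match(contents))
        .count()
}
```

With the real crate, the equivalent change would be to construct the `Finder` before iterating over files and have `does_file_match_query` accept it as a parameter instead of the raw query string.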

## Fix bug in `does_file_match_query`: always extends with full 2048 bytes

`file_type.rs` — `full.extend(buf)` always extends by 2048 bytes regardless of how many were actually read:

```rust
let bytes = file.read(&mut buf);
if bytes.unwrap_or(0) == 0 {
    break false;
}
// ...
full.extend(buf);  // should be full.extend_from_slice(&buf[..n]), where n is the byte count returned by read
```

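If `file.read()` returns fewer than 2048 bytes, the rest of `buf` still holds stale data from a prior read, and that data is appended to the search buffer. This can cause false positives (a match formed partly from stale bytes) and false negatives (stale bytes inserted in the middle of a match that spans two reads).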

A match that straddles two 2048-byte chunks is also not handled correctly. After searching each chunk, keep only the last `query.len() - 1` bytes as overlap, since a straddling match can have at most that many bytes in the earlier chunk:

```rust
// `saturating_sub` avoids a usize underflow when the query is empty.
let overlap = query.len().saturating_sub(1);
if full.len() > overlap {
    full.drain(..full.len() - overlap);
}
```
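Both fixes can be combined into one loop. The sketch below is a std-only illustration, not the crate's actual function: `reader_contains` is a hypothetical name, it accepts any `Read` so it can be exercised without real files, and a naive `windows` scan stands in for the `memmem::Finder` search.

```rust
use std::io::{self, Read};

/// Chunk size matching the 2048-byte buffer described above.
const CHUNK_SIZE: usize = 2048;

/// Hypothetical helper: returns true if `needle` occurs anywhere in `reader`.
/// A naive `windows` scan stands in for `memmem::Finder` to stay std-only.
fn reader_contains<R: Read>(mut reader: R, needle: &[u8]) -> io::Result<bool> {
    if needle.is_empty() {
        return Ok(true); // an empty needle matches trivially
    }
    let overlap = needle.len() - 1;
    let mut buf = [0u8; CHUNK_SIZE];
    let mut window: Vec<u8> = Vec::with_capacity(CHUNK_SIZE + overlap);
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            return Ok(false); // end of input without a match
        }
        // Append only the bytes that were actually read, never the stale tail.
        window.extend_from_slice(&buf[..n]);
        if window.windows(needle.len()).any(|w| w == needle) {
            return Ok(true);
        }
        // Keep the last `needle.len() - 1` bytes so a match spanning chunks is still found.
        if window.len() > overlap {
            window.drain(..window.len() - overlap);
        }
    }
}
```

Unlike the snippet above, which treats a read error as end of file via `unwrap_or(0)`, this version propagates errors with `?`. Which behavior is preferable for the crate is a separate design decision.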

## Increase `BufReader` capacity

`lib.rs` — `search_file_line_by_line` uses the default 8 KB `BufReader` buffer. Increasing it reduces syscalls for large files:

```rust
// Current:
let lines = io::BufReader::new(file).lines();

// Better:
let lines = io::BufReader::with_capacity(64 * 1024, file).lines();
```
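A self-contained sketch of the larger buffer in use follows. The helper `matching_line_numbers` and the constant `READ_BUFFER_SIZE` are hypothetical names for this example. The function is generic over `Read` so it can be tried on in-memory data, and the 64 KiB figure is the suggestion above rather than a measured optimum.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Buffer size suggested above; worth benchmarking against real workloads.
const READ_BUFFER_SIZE: usize = 64 * 1024;

/// Hypothetical helper: returns the 1-based numbers of lines containing `query`.
fn matching_line_numbers<R: Read>(reader: R, query: &str) -> io::Result<Vec<usize>> {
    // Each refill of the buffer reads up to 64 KiB instead of the default size.
    let buffered = BufReader::with_capacity(READ_BUFFER_SIZE, reader);
    let mut matches = Vec::new();
    for (index, line) in buffered.lines().enumerate() {
        let line = line?; // propagates I/O errors and invalid UTF-8
        if line.contains(query) {
            matches.push(index + 1);
        }
    }
    Ok(matches)
}
```

The gain depends on file size: for files smaller than the default buffer, the number of reads is unchanged, so the larger allocation only pays off on bigger inputs.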