yosh 0.2.1

A POSIX-compliant shell implemented in Rust
# TODO

## Job Control: Known Limitations

- [ ] `disown` builtin — not implemented (non-POSIX extension)
- [ ] `suspend` builtin — not implemented
- [ ] Pipeline command display in `jobs` output uses placeholder format — improve to reconstruct shell syntax
- [ ] Task 7 (`fg` job-termios replay) has no direct PTY assertion — Task 9/10 verify end-state only (Task 6 shell-restore). On macOS/BSD, `/bin/cat`'s `read()` inherits `SIG_DFL` for SIGCONT and BSD does not auto-restart `read()` without `SA_RESTART`, so cat exits with EINTR immediately after `fg`. Linux auto-restarts `read()` on terminals for `SIG_DFL` signals, masking this asymmetry. Revisit by using a sleep/read-loop helper that retries on EINTR, or by reading `tcgetattr` directly via the PTY master between `fg\r` and cat's exit (the diagnosis details currently live in the `DEVIATION` comment of `test_pty_termios_preserved_across_suspend_fg` in `tests/pty_interactive.rs`).
- [ ] `JobTable.shell_tmodes` is a one-time startup snapshot — `stty` invoked at the interactive prompt modifies the real terminal but not the cached snapshot, so the post-foreground shell-restore overwrites user-applied `stty` changes (`src/interactive/mod.rs` + `src/env/jobs.rs`). Matches glibc manual behavior; revisit if user reports surface.
- [ ] `wait_for_foreground_job` Stopped arm writes `JobStatus::Stopped` twice — once via `update_status(pid, ...)` (per-pid lookup, resets `notified=false`) and again via `record_stopped_state(job_id, ...)` (per-job_id lookup). The second write is the same value, so harmless, but redundant. Pre-existing carry-over from before the helper extraction; cleanup would either drop the `update_status` call (and merge the `notified=false` reset into `record_stopped_state`) or have `record_stopped_state` skip the status write when called from this site. Code-review follow-up from 2026-04-26 terminal-state followup branch (`src/exec/mod.rs`).
- [ ] `Repl::new` comment phrasing — the Task 4 annotation says the symmetric guard is "inside `wait_for_foreground_job`'s `restore_shell_termios_if_interactive`", but `restore_shell_termios_if_interactive` is a sibling method called *after* `wait_for_foreground_job` returns (also called from `builtin_fg` and pipeline). Tighten to "in `restore_shell_termios_if_interactive`, which runs after each foreground wait" (`src/interactive/mod.rs`). Code-review follow-up from 2026-04-26 terminal-state followup branch.

## Code Format Drift

- [ ] `src/env/jobs.rs` test asserts use deprecated single-line-broken `assert!(expr,\n    "msg")` style — `test_job_saved_tmodes_defaults_none`, `test_job_table_shell_tmodes_defaults_none`, `test_set_shell_tmodes_stores_value` all trigger `rustfmt --edition 2024 --check` diffs because rustfmt prefers the multi-line trailing-comma form. Pre-existing across the file; would block any future rustfmt CI gate. Reformat to the canonical multi-line form when next touching these tests (`src/env/jobs.rs`).
- [ ] Project-wide `cargo fmt --check` drift — many files trigger rustfmt diffs because of manual column alignment or single-line `assert!`/`criterion_group!`/`CacheKey::for_runtime` calls that rustfmt rewrites. Affected sites observed during 2026-04-29 plugin commands:exec branch wrap-up: `benches/plugin_bench.rs:93` (`criterion_group!` body), `crates/yosh-plugin-api/src/lib.rs:6+` (capability constant alignment — `pub const CAP_VARIABLES_READ:  u32 = 0x01;` style), `crates/yosh-plugin-manager/src/{config,install,update,sync}.rs`, `src/exec/mod.rs:898`, `src/lexer/mod.rs:606,622`, `src/parser/mod.rs:1041,1493,1500,1564,1571,1606`. Pre-existing; CI gate would surface them all at once. Reformat in a dedicated commit when ready to enforce.

## History: Known Limitations

- [ ] `HISTCONTROL` colon-separated values — bash supports `ignoredups:ignorespace` but current implementation only accepts single values like `ignoreboth` (`src/interactive/history.rs`)
- [ ] `history.save()` silently ignores write errors — disk-full or permission errors are swallowed (`src/interactive/history.rs`)
- [ ] `suggest()` linear scan performance — iterates all history entries on each keystroke; acceptable for HISTSIZE ≤ 500, may need caching or indexing for larger histories (`src/interactive/history.rs`)

## Future: Interactive Mode Enhancements

- [ ] `ENV` tilde expansion PTY test — `ENV=~/foo` tilde expansion is only exercised on interactive startup; add PTY test to verify `~` and `~user` cases (`tests/pty_interactive.rs`)
- [ ] Multiline editing — visual multiline editing with cursor movement across lines
- [ ] `set -o interactive` flag management
- [ ] Interactive-specific trap behavior — SIGTERM/SIGQUIT ignored by default
- [ ] `CLICOLOR=0` support in `should_colorize()` — disable colors even on TTY when `CLICOLOR=0` is set; many CLI tools support this alongside `NO_COLOR` (`src/main.rs`)
- [ ] Bash-style prompt escapes — `\w` (working directory), `\u` (username), `\h` (hostname), etc.
- [ ] History expansion — `!!` (last command), `!n` (by number)
- [ ] Right-aligned prompt (`PS1_RIGHT`) — starship-style right-side prompt display based on terminal width (`src/interactive/line_editor.rs`)
- [ ] Pre-prompt hook timeout — protect against slow `pre_prompt` plugins blocking prompt display; consider timeout or async approach (`src/plugin/mod.rs`)
- [ ] Prompt segment API — structured segment registration for multiple plugins to contribute prompt sections without PS1 conflicts (`src/plugin/`, `crates/yosh-plugin-sdk/`)
- [ ] Ctrl+C / empty-Enter type distinction — both return `Ok(Some(""))` from `read_line`; introduce a dedicated variant for clearer intent (`src/interactive/line_editor.rs`, `src/interactive/mod.rs`)
- [ ] Parse status edge-case tests — `||` continuation, `for...do` incomplete, nested structures, unterminated here-document (`tests/interactive.rs`)
- [ ] Tab completion: `CompletionUI`/`FuzzySearchUI` filtered/total display — both UIs show `N/N` instead of `filtered/total` because original count is not tracked (`src/interactive/completion.rs`, `src/interactive/fuzzy_search.rs`)
- [ ] Tab completion: unify `read_line` and `read_line_with_completion` — `read_line` is now only used by tests; consider merging into a single method (`src/interactive/line_editor.rs`)
- [ ] Syntax highlighting: color palette customization — allow users to override colors via environment variables like `YOSH_COLOR_KEYWORD=blue` (`src/interactive/highlight.rs`)
- [ ] Syntax highlighting: double-quote `$` expansion uses inline scanning — deeply nested cases like `"$(foo "$(bar)")"` may highlight incorrectly; consider mode-stack approach (`src/interactive/highlight.rs`)
- [ ] Syntax highlighting: `redraw()` ANSI optimization — currently calls `reset_style()` on every style change; could reduce escape sequences with diff-based rendering (`src/interactive/line_editor.rs`)
- [ ] Emacs keybindings: `~/.inputrc` config file — Keymap struct is separated for future configurability but no config file reading is implemented (`src/interactive/keymap.rs`)
- [ ] Emacs keybindings: undo group boundary on space — spec says space triggers undo group boundary but implementation defers boundary to next non-space char; undo granularity is slightly coarser than readline (`src/interactive/line_editor.rs`)
- [ ] Emacs keybindings: PTY E2E tests — kill/yank round-trip, undo, word movement, numeric arg scenarios not covered by PTY tests (`tests/pty_interactive.rs`)
- [ ] PTY tests: remaining `thread::sleep` after send — autosuggest/tab completion/syntax highlight/`set -m` tests still rely on 50–200ms fixed waits for UI render or child startup (not raw-mode races); if CI flakiness appears on those paths, migrate them to condition-based waits similar to `wait_for_raw_mode` (`tests/pty_interactive.rs`)

## Future: Plugin System Enhancements

- [ ] WASI surface lockdown deviation from spec §6 — `src/plugin/linker.rs` registers the full `wasmtime_wasi::add_to_linker_sync` surface rather than the spec-prescribed `clocks` + `random` subset, because cargo-component's wasip2 adapter pulls in `wasi:io`, `wasi:cli/*`, `wasi:filesystem`, and `wasi:sockets` transitively for any Rust component (even plugins that touch only the `yosh:plugin/*` host imports). Privacy is still enforced by the empty `WasiCtx` (no preopens, no stdio, no env, no args), but the linker surface is wider than the spec implied. Revisit if a future cargo-component release stops emitting unused WASI imports, or if a hand-built core-wasm pipeline becomes practical.
- [ ] Spec §8.4 "metadata cannot reach host APIs" — covered at the host-internal level via `src/plugin/host.rs::tests::metadata_contract_*` (every real host import returns `Err(Denied)` when `HostContext.env` is null). A contrived plugin whose `metadata()` calls `cwd()` would test the same invariant but requires SDK plumbing to override the trait's default `metadata` body, which Task 6 deferred. If the SDK gains an `override_metadata` hook in the future, add the integration-level companion.
- [ ] Spec §8.10 "WASI surface lockdown" integration test — currently covered indirectly by `src/plugin/linker.rs::tests::linker_construction_smoke` and the empty-`WasiCtx` isolation property. A hand-crafted wasm component that imports `wasi:cli/stdout` and asserts an unsatisfied-import error at instantiate would be a stronger negative test, but requires fixture authoring (raw wasm) outside the cargo-component pipeline. Defer until a fixture pattern is established.
- [ ] Plugin runtime limits (fuel / memory caps / pre-prompt timeout) — out of scope for v0.2.0 per spec §10; add wasmtime fuel metering and per-call memory caps when ready.
- [ ] Cargo workspace profile warning — `tests/plugins/{test_plugin,trap_plugin}/Cargo.toml` declare `[profile.release]` blocks that Cargo emits "profiles for the non root package will be ignored" on every build. cargo-component honors them but stock Cargo does not. Either hoist the profile config to the workspace root `Cargo.toml` (gated on `wasm32-wasip2` target) or document the warning as expected. Cosmetic but noisy.
- [ ] Spec §8.6–§8.8 cwasm field-mutation tests at integration level — `tests/plugin.rs` covers `t06` (cwasm missing) and `t09` (wasm SHA mismatch) end-to-end, but per-field mutation of `wasmtime_version` / `target_triple` / `engine_config_hash` is currently only unit-tested in `src/plugin/cache.rs::tests`. Adding integration smokes would require a fixture-cwasm builder helper.
- [ ] `benches/plugin_bench.rs` output noise — each iteration triggers `test_plugin`'s `print()` calls, polluting stdout during benchmark runs. Add a silent `noop_cmd` to `test_plugin` for bench use, or capture stdio inside the bench harness.
- [ ] Runtime plugin load/unload — builtin commands `plugin load <path>` / `plugin unload <name>` for dynamic management
- [ ] Workspace default package: `cargo test` without `-p` or `--workspace` may not find yosh tests — document in CLAUDE.md or set `default-members` in workspace config (`Cargo.toml`)
- [ ] `yosh-plugin update` help: add `#[arg(value_name = "PLUGIN")]` to show `[PLUGIN]` instead of `[NAME]` in help output (`crates/yosh-plugin-manager/src/main.rs`)
- [ ] `verify.rs` reads entire file into memory for SHA-256 — use streaming `Digest::update()` for large binaries (`crates/yosh-plugin-manager/src/verify.rs`)
- [ ] `GitHubClient` public API error type — `find_asset_url`, `latest_version`, `download` still return `Result<_, String>`; promote internal `GitHubApiError` to a public error type so callers can match on structured variants instead of string messages (`crates/yosh-plugin-manager/src/github.rs`)
- [ ] Integration tests: add checksum mismatch re-download test and partial failure (404) test per spec (`crates/yosh-plugin-manager/tests/`)
- [ ] `update_no_changes_preserves_file_contents` test asserts content equivalence but not file mtime — a future regression that drops the `if any_updated` guard at `src/update.rs:122` and writes unconditionally would still produce byte-identical content (since `set_plugin_version` was not called) and the test would pass. Strengthen by capturing `fs::metadata(&config_path).modified()` before and after the call (`crates/yosh-plugin-manager/src/update.rs`). Code-review follow-up from 2026-04-28 plugin-update toml_edit migration.
- [ ] Error-string prefix consistency: `update.rs::set_plugin_version` returns `"config 'plugin' key is not an array of tables"` while `install.rs::write_plugin_entry` returns `"'plugin' key is not an array of tables"` (no `config ` prefix). Both compose under `cmd_*`'s `yosh-plugin: ` prefix, but the two modules disagree on style. Align on one form when next touching either site (`crates/yosh-plugin-manager/src/install.rs`, `crates/yosh-plugin-manager/src/update.rs`). Code-review follow-up from 2026-04-28 plugin-update toml_edit migration.
- [ ] `src/plugin/host.rs` is now ~970 lines after the `commands:exec` addition. Consider splitting into `src/plugin/host/{mod,variables,filesystem,io,files}.rs` so each capability owns a focused file (`HostContext` + `WasiView` impl + `null_env_ctx` helper stay in `mod.rs`). Code-review follow-up from 2026-04-29 plugin files-rw branch.
- [ ] `files:write` host ops (`write-file`/`append-file`/`create-dir`) collapse `io::ErrorKind::NotFound` into `IoFailed` rather than mapping to `ErrorCode::NotFound` like the read ops do. Spec §4 error-mapping table is written as if uniform across all 8 functions; either add a footnote acknowledging the write-side asymmetry or actually map NotFound on the write side (e.g., parent-dir-missing on `write-file`/`create-dir`). Pre-decided as acceptable during implementation but worth revisiting if a plugin author wants the parent-not-found distinction (`src/plugin/host.rs`, `docs/superpowers/specs/2026-04-29-plugin-files-rw-capability-design.md` §4). Code-review follow-up from 2026-04-29 plugin files-rw branch.
- [ ] `FileStat::is_symlink` is effectively always `false` because `host_files_metadata` uses `std::fs::metadata` (which follows symlinks). Document in the SDK doc comment that the field is currently always `false` for symlinks-on-disk and that detecting them requires the future `symlink_metadata` host import (Spec §10 Open Questions already lists adding it) (`crates/yosh-plugin-sdk/src/lib.rs`, `crates/yosh-plugin-api/wit/yosh-plugin.wit`). Code-review follow-up from 2026-04-29 plugin files-rw branch.
- [ ] Restore WIT inline doc-comments stripped during implementation — spec §2 includes design-intent comments on `interface files` (e.g. `// Lightweight stat. Extended in the future by adding new functions, never by changing this record's shape.`, `// basename only, not full path`, `// Read group — gated by CAP_FILES_READ`) that were dropped from `crates/yosh-plugin-api/wit/yosh-plugin.wit`. These are the only WIT-level documentation telling future authors why each record/group is shaped the way it is. Cosmetic but high-value for future maintainers. Code-review follow-up from 2026-04-29 plugin files-rw branch.
- [ ] `yosh-plugin-sdk` public functions have no doc comments — every `read_file` / `read_to_string` / `write_file` / `exists` / `create_dir_all` / `remove_dir_all` etc. is undocumented. Several behaviors are non-obvious and easy to misuse: (a) `exists(path)` returns `false` not just for missing files but ALSO for `Denied` and any I/O error (caller cannot tell apart); (b) `read_to_string` collapses `String::from_utf8` failure onto `ErrorCode::InvalidArgument`, the same code used for empty-path errors at the host layer (a plugin author calling `read_to_string` on a binary file could mistake it for "I passed a bad path"); (c) `files:write` lets the plugin clobber any user-writable file, which the spec calls out in its threat model but the SDK doesn't surface to consumers. Add a doc comment to each public helper in `crates/yosh-plugin-sdk/src/lib.rs`, mirroring `std::fs` documentation style. Code-review follow-up from 2026-04-29 plugin files-rw branch.
- [ ] `kish-plugin-*` test fixture names — `crates/yosh-plugin-manager/src/config.rs:199,214` (`"kish-plugin-git-status"`) and `crates/yosh-plugin-manager/src/install.rs:362-368` (`"kish-plugin-foo"`) still use literal `kish-plugin-*` names left over from before the kish→yosh rename. Cosmetic but inconsistent with the env-var/CLI cleanup in v0.2.0. Replace with `yosh-plugin-*` next time these tests are touched. Code-review follow-up from 2026-04-29 plugin-manager rate-limit hint branch.
- [ ] `latest_version_403_with_token_no_hint` test asserts `err.contains("403")` — too loose, would pass on any incidental "403" substring. Tighten to `err.contains("HTTP 403")` to pin the exact `Display` form (`HTTP 403 (rate-limited or unauthorized)`) of `GitHubApiError::RateLimited`. Code-review Suggestion from 2026-04-29 plugin-manager rate-limit hint branch (`crates/yosh-plugin-manager/src/github.rs`).
- [ ] `find_asset_url_429_with_token_no_hint` test missing — `latest_version_403_with_token_no_hint` covers the token-set 4xx branch on `latest_version`, but the symmetric `find_asset_url` token-set 429 path is only circuit-covered (same Display + same `if self.token.is_none()` guard). Add an explicit test mirroring the no-token 429 fixture but with `with_token`, asserting hint absence + status surfaced. Whole-branch reviewer follow-up from 2026-04-29 plugin-manager rate-limit hint branch (`crates/yosh-plugin-manager/src/github.rs`).
- [ ] `capability_round_trip` test in `crates/yosh-plugin-api/src/lib.rs` enumerates only the original 8 `Capability` variants — `FilesRead`, `FilesWrite`, and `CommandsExec` are each covered by their own dedicated round-trip tests, but the function name implies completeness. Either expand the array to all 11 variants or rename to `capability_round_trip_legacy_set`. Code-review follow-up from 2026-04-29 plugin commands:exec branch.
- [ ] `host_commands_exec` env-inheritance design comment — the function takes `&mut HostContext` but only uses `env_mut()` as the metadata-contract null guard; CWD/env inheritance happens implicitly via `std::process::Command::new` defaults. Add a one-line comment in `src/plugin/host.rs::host_commands_exec` explaining the design choice (spec §5: "CWD is the shell's current directory; environment is the shell's full environment"). Code-review follow-up from 2026-04-29 plugin commands:exec branch.
- [ ] `yosh-plugin-sdk` could grow an `exec_to_string` helper that wraps `exec()` and returns `Result<(String, i32), ErrorCode>` (mirrors `read_to_string` vs `read_file`). Plugin authors invoking `exec()` for line-counting use cases will commonly write `String::from_utf8_lossy(&out.stdout)`. Code-review follow-up from 2026-04-29 plugin commands:exec branch (`crates/yosh-plugin-sdk/src/lib.rs`).
- [ ] `test_plugin::run-echo` arm absorbs `Err(ErrorCode::InvalidArgument)` into the catch-all `Err(_) => 1` (`tests/plugins/test_plugin/src/lib.rs`). The variant cannot fire because `program = "echo"` is fixed, but a comment clarifying that intent would help future readers who modify the arm to take `program` from `args`. Code-review follow-up from 2026-04-29 plugin commands:exec branch.
- [ ] `test_helpers::load_plugin_with_caps` no-allowlist ergonomics — all 15 callers in `tests/plugin.rs` pass `&[]` for the new `allowed_commands` parameter introduced in T6. Consider a no-allowlist convenience method or a `Default` impl so common test setups are less verbose. Code-review follow-up from 2026-04-29 plugin commands:exec branch (`src/plugin/mod.rs`).
- [ ] `t23_commands_exec_exact_pattern_rejects_extra_args` function name is ambiguous against `t22_commands_exec_pattern_not_allowed_without_match` — both end in `Handled(101)`, so the name doesn't immediately distinguish "exact-length pattern (no `:*` suffix)" from "non-matching argv". Rename to e.g. `t23_commands_exec_exact_pattern_no_glob_rejects_extra_args` or update the doc comment to lead with the distinguishing constraint (`tests/plugin.rs`). Code-review follow-up from 2026-04-29 plugin commands:exec branch.
- [ ] `resolve_cdpath_empty_entry_is_dot` test in `src/builtin/regular.rs:783` calls `std::env::set_current_dir(tempdir.path())` then lets the tempdir drop, leaving the lib-test process cwd pointing at a deleted directory. Any subsequent test that spawns a subshell (e.g., `host_commands_exec_captures_stderr_separately`) sees a "shell-init: error retrieving current directory" warning leak into stderr. Mitigated at the consumer side via `ends_with` in commit `715ffd6`; root-cause fix is to capture the original cwd at test entry and restore on exit (or use `set_current_dir(tempdir)` only after replacing the tempdir's drop semantics). Code-review follow-up from 2026-04-29 plugin commands:exec branch.

## Future: Code Quality Improvements

- [ ] `JobTable::update_status` per-process status tracking — currently overwrites the overall `job.status` on each child exit; if per-process status tracking (e.g., `$PIPESTATUS` array) is needed in the future, the `Job` struct will need a `Vec<(Pid, JobStatus)>` field instead of a single `status` (`src/env/jobs.rs`)
- [ ] `skip_balanced_*` unterminated input tests — `skip_balanced_parens`, `skip_balanced_braces`, `skip_balanced_double_parens` all return `bytes.len()` on unterminated input but none have tests for this behavior (`src/expand/mod.rs`)
- [ ] `find_in_path` vs `lookup_in_path` — `find_in_path` returns `Option<PathBuf>` (exec-only); `lookup_in_path` returns 3-state `PathLookup` for 126/127 distinction. Consider making `find_in_path` a thin wrapper over `lookup_in_path` to remove the near-duplicate directory walk (`src/exec/command.rs`)
- [ ] `exec_regular_builtin` "internal error" guards for `wait` / `fg`/`bg`/`jobs` / `command` are growing — consider factoring "Executor-requiring builtins" into an explicit classification or dispatch table instead of per-name guards (`src/builtin/mod.rs`)
- [ ] `render_verbose` Function arm has no unit test — `command -V <function>` branch exercised only through E2E; add a focused unit test in `src/builtin/command.rs` tests module
- [ ] `preview_command` has no direct unit tests — only exercised via E2E; add focused tests for compound-command / unexpandable-word fallback and pipeline first-command extraction (`src/exec/mod.rs`)
- [ ] `JobSpecError::Ambiguous` fully qualified at 3 call sites in `src/exec/mod.rs` (builtin_wait/fg/bg) — add a module-level `use crate::env::jobs::JobSpecError;` for readability
- [ ] `highlight_scanner.rs` `KEYWORDS` duplicates POSIX §2.4 list — `src/interactive/highlight_scanner.rs:66-69` defines its own copy of the 16 reserved words, separate from the canonical `crate::lexer::reserved::RESERVED_WORDS`. Consolidate once the contextual subsets (`COMMAND_POSITION_KEYWORDS` includes `"time"`, command-position restoration logic) are re-expressed in terms of the canonical list (`src/interactive/highlight_scanner.rs`)
- [ ] `cargo fmt --check -- <path>` misreads edition — rustfmt 1.8.0 / Rust 1.94.1 fails to parse let-chain syntax as edition 2024 when invoked with explicit file paths despite `Cargo.toml` specifying `edition = "2024"`, producing spurious fmt errors. Workaround: invoke `rustfmt --edition 2024 --check <path>` directly. Revisit when upstream rustfmt catches up.
- [ ] `parse_compound_list` non-empty regression tests are incomplete — only `nonempty_if_parses_ok` exists in `src/parser/mod.rs`. Add parallel `nonempty_while_parses_ok` / `nonempty_until_parses_ok` / `nonempty_for_parses_ok` / `nonempty_brace_group_parses_ok` / `nonempty_subshell_parses_ok` so future refactors cannot accidentally over-reject any individual context.
- [ ] LINENO update allocates a `String` per command — `exec_simple_command` / `exec_compound_command` call `cmd.line.to_string()` and go through `VarStore::set`. For tight loops this is ~500μs per 10k commands. If benchmarks ever show pressure, add `ShellEnv.exec.current_lineno: usize` and intercept `$LINENO` in `expand::param` to read that field directly, bypassing the alloc + HashMap write (`src/exec/simple.rs`, `src/exec/compound.rs`, `src/expand/param.rs`).
- [ ] `first_simple_cmd` duplicates `parse_first_simple` — both walk `Parser::new(src).parse_program()` to pull the first `SimpleCommand`, but with different APIs (`into_iter().next()` vs `[0]` indexing) and different panic messages. Consolidate or delete one (`src/parser/mod.rs`).
- [ ] Test helpers `first_simple_cmd` / `first_compound_cmd` use bare `unwrap()` — swap to `expect("program should contain …")` so future test failures pinpoint the step that produced `None` (`src/parser/mod.rs`).
- [ ] Extract `try_parse_assignment` value-construction walker into a private helper — the ~25-line match loop plus its 21-line doc comment dominates `try_parse_assignment` and will be swapped wholesale when sub-project 4 replaces `prev_was_literal` with escape metadata. A helper like `fn build_assignment_value_parts(after_eq: &str, remaining_parts: &[WordPart]) -> Vec<WordPart>` would make the doc comment a rustdoc `///`, keep `try_parse_assignment` focused on name/value splitting, and localize sub-project 4's diff (`src/parser/mod.rs`).
- [ ] `try_parse_assignment` `other.clone()` deep-copies CommandSub — the non-Literal branch clones each remaining `WordPart`, which for `$(...)` substitutions clones the embedded `Program`. Same inefficiency as the prior `extend_from_slice`, so not a regression, but consider consuming `Word` (take ownership) or draining `word.parts` to avoid the copy (`src/parser/mod.rs`).
- [ ] `parser::tests` duplicated `use ast::ParamExpr;` — three `assignment_rhs_param_*` tests each declare `use ast::ParamExpr;` inline. Hoist to the existing `use ast::{AndOrOp, CaseTerminator, CompoundCommandKind, RedirectKind, SeparatorOp, WordPart};` line at the top of `mod tests` for consistency (`src/parser/mod.rs`).
- [ ] `expand_assignment_builtin_args` FQN inconsistency — `src/exec/simple.rs:25-31` uses full paths (`crate::env::ShellEnv`, `crate::parser::ast::Word`, `crate::expand::expand_word_to_string`, `crate::expand::expand_words`) mixed with inline `use crate::parser::Parser; use crate::parser::ast::Assignment;`, while `Assignment`, `Word`, and `expand_words` are already imported at module level. Hoist `ShellEnv` and `Parser` to the module preamble and drop the redundant `crate::` prefixes for symmetry (`src/exec/simple.rs`).
- [ ] `expand_assignment_builtin_args` string round-trip — helper builds `"NAME=value"` strings that the builtin re-parses with `find('=')`. Lossless today, but couples the helper shape to the legacy builtin API. When a future refactor touches `builtin_export`/`builtin_readonly` signatures, consider passing `Vec<(String, Option<String>)>` directly to skip the round-trip (`src/exec/simple.rs`, `src/builtin/special.rs`).
- [ ] `parse_for_reserved_word_*_rejected` assertion OR clause is too broad — the sub-project-5 parser tests use `msg.contains("reserved word") || msg.contains("not a valid")`, but the `"not a valid"` side is never reached for inputs like `if`/`in` because those tokens pass `is_valid_name`. Tighten to `msg.contains("reserved word")` only so the assertion clearly pins the new Rule-5 rejection path (`src/parser/mod.rs`).
- [ ] `src/signal.rs` remaining hardcoded signal literals in tests — `test_signal_name_to_number_int` / `_sigint` / `_case_insensitive` / `_term` / `_kill`, `test_signal_number_to_name_2` / `_15` / `_9` still assert against raw numbers (1, 2, 9, 15). Numbers happen to be portable Linux/macOS, but the file now mixes `libc::SIG*` and raw literals. Replace with `libc::SIG*` for consistency (`src/signal.rs:381-421`). Code-review follow-up from 2026-04-20 signal-table fix.
- [ ] `RedirectState::apply` doc comment does not mention `save=false` fd leak on partial failure — spec §C declares this out of scope, but the code comment only says "rollback is a no-op" without flagging that opened fds from successful redirects preceding the failing one are still leaked. Add a sentence to the doc comment (`src/exec/redirect.rs:32-44`). Code-review follow-up from 2026-04-20 redirect self-heal.
- [ ] macOS CI job — Task 1 (SIGNAL_TABLE libc-const fix) corrects a bug that only manifests on macOS. Current CI only runs on Linux, so the regression test for the fix is not actually exercising the bug pre-fix. Add a GitHub Actions macOS runner to `cargo test` on every push so future signal-numbering regressions are caught. Spec cross-cutting concern from 2026-04-20 signal-table design.
- [ ] `test_signal_table_matches_libc_constants` / `test_handled_signals_match_libc_constants` duplicate a `match name => libc::SIG*` table — extract a private helper `fn name_to_libc(name: &str) -> Option<i32>` so adding a new entry to `SIGNAL_TABLE` only requires updating one arm, not both tests (`src/signal.rs:456-520`). Code-review Minor follow-up from 2026-04-20 signal-table fix.
- [ ] `exec_function_call` residual 2.1× overhead vs arithmetic loop (§4.2) — ~50 µs/call vs ~24 µs/iter at HEAD. Sub-benches are the prerequisite per `performance.md` §4.2 candidate #1: split into `exec_function_call_nopanic_guard` (replace `catch_unwind` with a Drop-guard scope popper), `exec_function_call_cached_environ`, `exec_function_call_smallvec_scope` to isolate which of the four suspected causes dominates. Then act on whatever the sub-benches reveal (`src/exec/function.rs:9-45`).
- [ ] Multi-byte IFS support in UTF-8 locale (bash-extension parity) — `field_split::split` currently matches IFS as an ASCII byte-set. `IFS="日"; set -- $"a日b"` yields `[a] [b]` under bash in UTF-8 locale (character-level match) but is silently ignored (post-fix A) or produces garbled bytes (pre-fix A) in yosh. POSIX leaves this locale-dependent; bash uses character-level matching when locale is multi-byte. Plan: introduce a `char`-level IFS match path (`char_indices` in `split_field`, char-mode `ifs` set) gated by locale detection. Deferred from the 2026-04-21 `append_byte` UTF-8 panic fix to keep scope minimal. See the brainstorming log for that fix; reference bash 3.2 behavior under `LC_ALL=en_US.UTF-8` as the target semantics.
- [ ] `fork + run-Rust-shell-code-in-child` is fundamentally POSIX-UB in MT contexts — even with `exit_child` helper, `exec_subshell` runs `self.exec_body(body)` in the child, which touches arbitrary Rust std (mutexes, allocators, env) and is technically only legal between `fork()` and `exec()` if all calls are async-signal-safe. Currently safe in practice because interactive shell parent is single-threaded; test harness is the exception. Long-term architectural consideration: reevaluate whether subshells should use `fork+exec` (separate yosh invocation with serialized state) instead of `fork+in-process interpreter`. Out of scope for the immediate fix; record to avoid forgetting the latent hazard.
- [ ] `TimeoutGuard::new` in `tests/pty_interactive.rs` hard-codes `TIMEOUT` as the restored value — if a future test changes the session timeout before calling `TimeoutGuard::new`, the guard will restore the wrong value. Either add a `debug_assert!` on the current timeout or provide a constructor that takes the prior value explicitly (`tests/pty_interactive.rs`).
- [ ] `PTY_DRAIN_MAX_BYTES = 8192` naming in `tests/pty_interactive.rs` is misleading — the constant is the regex upper bound, not the actual drain depth (which is bounded by the 300 ms timeout window). A one-sentence comment clarifying this would prevent future "this is obviously the drain limit" misreads.

## Future: E2E Test Expansion

- [ ] Builtin test POSIX_REF values could use more specific section numbers (e.g., `2.14.3` instead of `2.14 Special Built-In Utilities`)
- [ ] `fd_close.sh` test only checks exit code, not actual fd close effect
- [ ] Extend chapter-by-chapter POSIX coverage beyond XCU Chapter 2 — once the Chapter 2 coverage matrix stabilizes, add systematic E2E coverage for Chapter 4 Utilities (all shell-relevant builtins: special + regular, with option/edge-case matrices) and Chapter 8 Environment Variables. Reuse the `POSIX_REF`/`XFAIL` harness established for Chapter 2.
- [ ] Deepen Chapter 2 POSIX coverage to normative-requirement granularity — after the hybrid (representative + thin-section) coverage lands, enumerate every shall/must/should clause in XCU Chapter 2 and add one E2E test per normative requirement (est. +100–200 tests). Use `XFAIL` liberally to register gaps; the goal is to make each normative clause individually traceable to a test ID.
- [ ] `tilde_rhs_user_form.sh` documents absence of `EXPECT_OUTPUT` — the test omits `EXPECT_OUTPUT` because `~root` resolution is platform-dependent and verifies correctness in-script via `case`. Add a one-line comment explaining this so future contributors do not misread the omission as an oversight (`e2e/posix_spec/2_06_01_tilde_expansion/tilde_rhs_user_form.sh`).
- [ ] `tilde_rhs_command_prefix.sh` depends on external `sh -c` — the test uses `sh -c 'echo "$PREFIXED"'` to verify command-prefix assignment expansion, which cross-checks the external `sh` rather than yosh alone. If CI flakes arise on minimal Alpine/busybox environments, switch to a yosh-internal verification path (e.g., a builtin that echoes an env var) (`e2e/posix_spec/2_06_01_tilde_expansion/tilde_rhs_command_prefix.sh`).
- [ ] `readwrite_bidirectional.sh` description and name overstate body — test only exercises `exec 3<>file; exec 3<&-` (open-then-close smoke). Rename to `readwrite_opens_fd.sh` and reword DESCRIPTION to "N<>file opens the file without error" so readers don't expect an actual round-trip assertion (`e2e/posix_spec/2_07_redirection/readwrite_bidirectional.sh`).
- [ ] `readwrite_basic.sh` and `readwrite_param_expansion.sh` are near-duplicates — both do `echo X 1<>"$f"; cat "$f"`. Differentiate the parameter-expansion variant by embedding `$TEST_TMPDIR` directly in the redirect target (e.g. `echo roundtrip 1<>"${TEST_TMPDIR}/rw_pe_direct"`) so the redirect-target word-expansion code path is pinned, not the outer assignment (`e2e/posix_spec/2_07_redirection/`).
- [ ] `dup_input_*.sh` missing unquoted-fd variant — only `cat <&"$fd"` (quoted) is exercised. Add a `dup_input_unquoted_fd.sh` covering `fd=3; cat <&$fd` so future changes to word-expansion in redirect contexts cannot regress silently (`e2e/posix_spec/2_07_redirection/`).
- [ ] `dup_output_*.sh` missing unquoted-fd variant — symmetric gap to `dup_input_*.sh`. Only `echo hi >&"$fd"` (quoted) is exercised. Add a `dup_output_unquoted_fd.sh` covering `fd=3; echo hi >&$fd` when the DupInput counterpart is added so the §2.7.5/§2.7.6 suites stay symmetric (`e2e/posix_spec/2_07_redirection/`). Follow-up from 2026-04-24 DupOutput suite addition.
- [ ] Legacy `e2e/redirection/stderr_to_stdout.sh` migration — the §2.7.6 DupOutput suite under `e2e/posix_spec/2_07_redirection/` is now canonical. The legacy test covers only `2>&1` without `POSIX_REF` metadata. Either migrate it into `e2e/posix_spec/2_07_redirection/dup_output_stderr_to_stdout.sh` with a proper `POSIX_REF: 2.7.6 …` header, or delete if the new suite is judged sufficient. Follow-up from 2026-04-24 DupOutput suite addition.
- [ ] E2E test defensive `$TEST_TMPDIR` check — add `: "${TEST_TMPDIR:?TEST_TMPDIR not set}"` to the top of tests that rely on it, so standalone invocations fail with a clear error instead of writing to `/rw_basic`-style root paths (`e2e/posix_spec/2_07_redirection/`, `e2e/posix_spec/2_14_13_times/`).
- [ ] `times` operand rejection test missing — POSIX §2.14.13 says `times` takes no operands. Add `times_rejects_operand.sh` verifying non-zero exit (and `yosh:` stderr prefix) for `times foo` (`e2e/posix_spec/2_14_13_times/`).
- [ ] Rule 7/10 weak-intent tests need failure-signature comments — `rule07_not_at_word_position.sh`, `rule10_reserved_after_cmd_is_arg.sh`, `rule10_reserved_after_pipe_in_cmdpos.sh` pass but don't obviously document what a regression failure would look like. Add inline `# NOTE:` comments (mirroring the pattern in `rule10_reserved_quoted_not_recognized.sh`) explaining the expected observable vs. the buggy observable (`e2e/posix_spec/2_10_shell_grammar/`).
- [ ] §2.10.1 backslash-newline line-continuation test missing — a cheap and unambiguous §2.10.1 lexical test (`echo a\<newline>b` → `ab`) was omitted from the representative 3-test set. Add `line_continuation.sh` under `e2e/posix_spec/2_10_1_lexical/` when normative-granularity coverage expands.
- [ ] POSIX_REF format contract is undocumented — the current convention mixes `2.10.2 Rule N - <Name>`, `2.10.2 Rule N - <Name> (<discriminator>)`, `2.10 Shell Grammar - <Topic>`, and `2.X.Y <Section Name>` forms. A tooling-oriented grep like `grep -E 'POSIX_REF: 2\.10\.2 Rule'` will miss the topic-form entries (e.g. `terminator_semicolon_equals_newline.sh`). Document acceptable shapes in CLAUDE.md or an `e2e/README.md` once a second exceptional case appears.
- [ ] Rule 9 taxonomy needs a disambiguation note — the label reuses "Rule 9" across `Body of function`, `Body of compound_command`, and `Body of compound_list (<ctx>)` forms. The first is literal POSIX Rule 9; the other two generalize to the same grammar class. Add a single sentence to `CLAUDE.md` or `e2e/README.md` documenting that "Rule 9" is shorthand for compound_command/compound_list body violations across all contexts.
- [ ] E2E counterpart for `x=foo:\~/bin` escape case missing — `assignment_rhs_backslash_tilde_after_colon_stays_literal` pins the behavior at the AST level, but no E2E exercises the full parser→expander pipeline. Add `tilde_mixed_backslash_after_colon.sh` under `e2e/posix_spec/2_06_01_tilde_expansion/` asserting `x=foo:\~/bin; echo "$x"` outputs `foo:~/bin` literally.
- [ ] Double-quote escape coverage gap — only `e2e/quoting/backslash_in_double_quotes.sh` (`"\$HOME"` form) exercises the sub-project-4 `EscapedLiteral` switch inside double quotes. Manual checks confirm `"\\\\"`, `"\""`, and `` "\`" `` still round-trip correctly, but no E2E pins them. Add compound `backslash_dq_special_chars.sh` asserting the four POSIX-recognized double-quote escapes (`\$`, `\\`, `\"`, `` \` ``) each render to a single character (`e2e/quoting/` or `e2e/posix_spec/2_02_01_escape_character/`).
- [ ] `assignment_rhs_param_then_escaped_tilde_stays_literal` assertion is loose — the parser-unit test added in sub-project 4 Task 3 uses `!any(matches!(p, Tilde(_)))`, which catches the user-visible bug but wouldn't detect shape regressions (e.g., `/bin` being dropped from the value). Tighten to `assert_eq!(parts, vec![Parameter(var), lit(":"), EscapedLiteral("~"), lit("/bin")])` mirroring the sibling `assignment_rhs_param_then_tilde_no_colon_stays_literal` test style (`src/parser/mod.rs`).
- [ ] Full E2E suite occasional transient failures — one observed run showed `2 failed + 4 timedout` out of 374 while a back-to-back re-run was `374/374` clean. Specific tests that flaked were not captured. Next time flakes recur, save the full run via `./e2e/run_tests.sh 2>&1 | tee /tmp/e2e.log` and grep `\[FAIL\]|\[TIMEOUT\]` to identify them; determine whether the flakes concentrate in PTY-sensitive paths (expected per CLAUDE.md) or leak into non-PTY tests (would warrant isolation work).

## Future: Release Skill Enhancements

- [ ] `phase_push` remote tag upsert — currently only checks local tag existence; if the same tag already exists on origin, `git push origin <tag>` rejects. Add `git ls-remote --exit-code --tags origin <tag>` check before pushing (`.claude/skills/release/scripts/release.sh`)
- [ ] `test_plugin/Cargo.toml` version lag risk — `tests/plugins/test_plugin` is a workspace member but not in the `phase_bump` manifests list (not publishable). Currently safe because it depends on workspace crates only via `path =`; breaks if it ever adds `version = "..."` pins (`.claude/skills/release/scripts/release.sh`)
- [ ] `phase_publish` root-crate branch — the `if [[ "$crate" == "yosh" ]]` special case (bare `cargo publish` for root vs `cargo publish -p` for members) can be simplified to uniform `cmd=(cargo publish -p "$crate")` since cargo accepts `-p` on root crates too (`.claude/skills/release/scripts/release.sh`)
- [ ] e2e runner timer race-condition comment — `( sleep $TIMEOUT && kill -9 $_pid && echo timeout )` has a benign race: if the test exits just as `sleep` expires, `kill -9` returns ESRCH and the marker is not written. Behavior is correct (`wait $_pid` already captured the real exit code) but the race is undocumented. Add a short comment above the subshell explaining this (`e2e/run_tests.sh:186-192`). Code-review follow-up from 2026-04-22 release-perf work.
- [ ] `YOSH_E2E_NO_TIMEOUT` help wording — current `--help` text says "local use only"; tighten to "never set in CI or release.sh; individual runaway tests will hang forever" to prevent accidental production use (`e2e/run_tests.sh:35`). Code-review follow-up from 2026-04-22 release-perf work.
- [ ] `release.sh test` wall-time variance observation — after per-test-binary parallelization (2026-04-23), 3 back-to-back runs measured 95 s / 162 s / 178 s (±22 %, exceeds nominal ±20 % stability threshold). Root cause: `cargo test --no-run --workspace` incremental-check time varies with filesystem cache state (run 1 benefits from peak warmth). Not a correctness issue. If CI-based benchmarking is added, introduce a warm-up run before timed measurements to reduce first-run bias (`.claude/skills/release/scripts/release.sh`).