cargo-spellcheck
Check your spelling with hunspell and/or nlprule.
Use Cases
Run cargo spellcheck --fix or cargo spellcheck fix to fix all your
documentation comments in order to avoid nasty typos all over your source tree.
Meant as a helper simplifying review as well as improving CI checks after a
learning phase for custom/topic specific lingo.
See automation.md for instructions on how to use cargo-spellcheck in automated
contexts like CI/CD systems and git hooks.
Check For Spelling and/or Grammar Mistakes
Apply Suggestions Interactively
Implemented Features + Roadmap
- Parse doc comments from arbitrary files
- Decent error printing
- cargo-spellcheck check
- Spell checking using hunspell
- Merge multiline doc comments
- Handle multiline and fragmented mistakes (i.e. for grammar) #25
- Grammar check using nlprule
- Follow module declarations rather than blindly recurse
- Be commonmark/markdown aware
- Check README.md files #37
- Improve interactive user interface with crossterm
- Ellipsize overly long statements with ... #42
- Learn topic lingo and filter false-positive-suggestions #41
- Handle cargo workspaces #38
- Re-flow doc comments #39
- Collect dev comments as well #115
hunspell (dictionary based lookups) and nlprules (static grammar rules,
derived from languagetool) are currently the two supported checkers.
Configuration
Source
There are various ways to specify the configuration. The prioritization is as follows:
Explicit specification:
- Command line flags --cfg=... .
- Cargo.toml metadata pointing to a configuration file, e.g. "somewhere/cfg.toml", which will fail if specified and not existent on the filesystem.
If neither of those ways of specification is present, continue with the implicit:
- Cargo.toml metadata in the current working directory (CWD).
- Check the first argument's location if present, else the current working directory, for .config/spellcheck.toml.
- Fall back to per user configuration files:
  - Linux: /home/alice/.config/cargo_spellcheck/config.toml
  - Windows: C:\Users\Alice\AppData\Roaming\cargo_spellcheck\config.toml
  - macOS: /Users/Alice/Library/Preferences/cargo_spellcheck/config.toml
- Use the default, builtin configuration (see the config sub-command).
Since this is rather complex, add -vv to your invocation to see the info
level logs printed, which will contain the config path.
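The lookup order above can be read as a first-match search. The following is a minimal Python sketch of that order, not the code cargo-spellcheck actually runs. The function name resolve_config, its parameters, and the injected exists predicate are hypothetical. It also assumes that a --cfg path is used as given and that implicit candidates which do not exist are simply skipped; the text above does not state either behaviour.

```python
from pathlib import PurePosixPath
from typing import Callable, Optional


def resolve_config(
    cli_cfg: Optional[str],
    manifest_cfg: Optional[str],
    cwd_manifest_cfg: Optional[str],
    first_arg_dir: Optional[str],
    cwd: str,
    user_cfg: str,
    exists: Callable[[str], bool],
) -> Optional[str]:
    """Return the config path to load, or None for the builtin default."""
    # Explicit 1: --cfg=... wins over everything else (assumed: used as given).
    if cli_cfg is not None:
        return cli_cfg
    # Explicit 2: path from Cargo.toml metadata; a missing file is an error.
    if manifest_cfg is not None:
        if not exists(manifest_cfg):
            raise FileNotFoundError(manifest_cfg)
        return manifest_cfg
    # Implicit 1: Cargo.toml metadata found in the current working directory.
    if cwd_manifest_cfg is not None and exists(cwd_manifest_cfg):
        return cwd_manifest_cfg
    # Implicit 2: .config/spellcheck.toml next to the first argument, else CWD.
    base = first_arg_dir if first_arg_dir is not None else cwd
    project_cfg = str(PurePosixPath(base) / ".config" / "spellcheck.toml")
    if exists(project_cfg):
        return project_cfg
    # Implicit 3: per user configuration file.
    if exists(user_cfg):
        return user_cfg
    # Implicit 4: nothing found, fall back to the builtin configuration.
    return None


# Usage: only the project-level file exists, so it is chosen.
present = {"/work/crate/.config/spellcheck.toml"}
chosen = resolve_config(
    None, None, None, None, "/work/crate",
    "/home/alice/.config/cargo_spellcheck/config.toml",
    present.__contains__,
)
print(chosen)  # /work/crate/.config/spellcheck.toml
```

Passing the existence check in as a function keeps the sketch free of filesystem access, which makes the ordering easy to test in isolation.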
Format
# Project settings where a Cargo.toml exists and is passed
# ${CARGO_MANIFEST_DIR}/.config/spellcheck.toml
# Also take into account developer comments
= false
# Skip the README.md file as defined in the cargo manifest
= false
[]
# lang and name of `.dic` file
= "en_US"
# OS specific additives
# Linux: [ /usr/share/myspell ]
# Windows: []
# macOS [ /home/alice/Libraries/hunspell, /Libraries/hunspell ]
# Additional search paths, which take precedence over the default
# OS specific search dirs, searched in order, defaults last
# search_dirs = []
# Adds additional dictionaries, can be specified as
# absolute paths or relative in the search dirs (in this order).
# Relative paths are resolved relative to the configuration file
# which is used.
# Refer to `man 5 hunspell`
# or https://www.systutorials.com/docs/linux/man/4-hunspell/#lbAE
# on how to define a custom dictionary file.
extra_dictionaries = []
# If set to `true`, the OS specific default search paths
# are skipped and only explicitly specified ones are used.
skip_os_lookups = false
# Use the builtin dictionaries if none were found
# in the configured lookup paths.
# Usually combined with `skip_os_lookups=true`
# to enforce the `builtin` usage for consistent
# results across distributions and CI runs.
# Setting this will still use the dictionaries
# specified in `extra_dictionaries = [..]`
# for topic specific lingo.
= true
[]
# Transforms words that are provided by the tokenizer
# into word fragments based on the capture groups which are to
# be checked.
# If no capture groups are present, the matched word is whitelisted.
= ["^'([^\\s])'$", "^[0-9]+x$"]
# Accepts `alphabeta` variants if the checker provides a replacement suggestion
# of `alpha-beta`.
= true
# And the counterpart, which accepts words with dashes, when the suggestion has
# recommendations without the dashes. This is less common.
= false
[]
# Allows the user to override the default included
# exports of LanguageTool, with other custom
# languages
# override_rules = "/path/to/rules_binencoded.bin"
# override_tokenizer = "/path/to/tokenizer_binencoded.bin"
[]
# Reflows doc comments to adhere to a given maximum line width limit.
= 80
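The regular-expression transform and the two dash-related switches described in the comments above can be illustrated with a small Python sketch. It is an interpretation of those comments, not the tool's implementation: the function names fragments_to_check and accept_despite_suggestion and the parameters allow_concatenation and allow_dashes are hypothetical, and the patterns are the two example expressions from the configuration, written with Python's re module rather than the Rust regex engine.

```python
import re
from typing import List

# The two example patterns from the configuration above.
PATTERNS = [re.compile(r"^'([^\s])'$"), re.compile(r"^[0-9]+x$")]


def fragments_to_check(word: str) -> List[str]:
    """Return the pieces of `word` that still need spell checking."""
    for pattern in PATTERNS:
        match = pattern.match(word)
        if match is None:
            continue
        if pattern.groups == 0:
            # No capture groups: the matched word is whitelisted.
            return []
        # Capture groups: only the captured fragments are checked.
        return [g for g in match.groups() if g is not None]
    # No pattern matched: check the word unchanged.
    return [word]


def accept_despite_suggestion(
    word: str,
    suggestions: List[str],
    allow_concatenation: bool = True,
    allow_dashes: bool = False,
) -> bool:
    """Decide whether a flagged word is accepted because of its suggestions."""
    for suggestion in suggestions:
        # `alphabeta` is accepted when the checker suggests `alpha-beta`.
        if (allow_concatenation and "-" in suggestion
                and suggestion.replace("-", "") == word):
            return True
        # Counterpart: `alpha-beta` is accepted when `alphabeta` is suggested.
        if allow_dashes and "-" in word and word.replace("-", "") == suggestion:
            return True
    return False


print(fragments_to_check("'a'"))   # ['a']
print(fragments_to_check("10x"))   # []
print(accept_despite_suggestion("alphabeta", ["alpha-beta"]))  # True
```

The defaults of the two keyword arguments mirror the true and false values shown in the configuration listing.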
To increase verbosity, add -v (may be given multiple times).
Installation
cargo install --locked cargo-spellcheck
The --locked flag is the preferred way of installing to get the tested set of
dependencies.
Checkers
Available checker support
Hunspell
Requires a C++ compiler to compile the hunspell CXX source files which are part
of hunspell-sys.
Fedora 30+
Ubuntu 19.10+
Mac OS X
The environment variable LLVM_CONFIG_PATH needs to point to llvm-config.
NlpRules
When compiled with the default featureset which includes nlprules, the
resulting binary can only be distributed under the LGPLv2.1 since the rules
and tokenizer definitions are extracted from LanguageTool (which is itself
licensed under LGPLv2.1) as described by the library that is used for pulling
and integrating. Details are to be found in the README.md of the nlprule crate.
🎈 Contribute!
Contributions are very welcome!
Generally, the preferred way of doing so is to comment in an issue that you would like to tackle the implementation/fix.
This is usually followed by an initial PR where the implementation is then discussed and iteratively refined. No need to get it all correct the first time!