§harper-core
harper-core is the fundamental engine behind Harper, the grammar checker for developers.
harper-core is available on crates.io. However, improving the API is not currently a high priority.
Feel free to use harper-core in your projects.
If you run into issues, create a pull request.
§Features
concurrent: Whether to use thread-safe primitives (Arc vs Rc). Disabled by default.
It is not recommended unless you need thread safety (e.g. you want to use something like tokio).
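The effect of a feature like `concurrent` can be sketched with a type alias that switches between `Rc` and `Arc` at compile time. This is only an illustration of the idea: the alias name `Lrc` appears in the struct list of this crate, but the `cfg` wiring and the `share` helper below are assumptions, not code taken from harper-core.

```rust
// Hypothetical wiring: pick the pointer type based on a Cargo feature.
// With the feature disabled, the cheaper single-threaded `Rc` is used.
#[cfg(feature = "concurrent")]
pub type Lrc<T> = std::sync::Arc<T>;
#[cfg(not(feature = "concurrent"))]
pub type Lrc<T> = std::rc::Rc<T>;

/// Illustrative helper: hand out two handles to the same word list.
/// The list itself is not copied; only the reference count changes.
pub fn share(words: Vec<String>) -> (Lrc<Vec<String>>, Lrc<Vec<String>>) {
    let first = Lrc::new(words);
    let second = Lrc::clone(&first);
    (first, second)
}
```

Code written against the alias compiles unchanged in both configurations, which is why the switch can live behind a feature flag.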
Modules§
- expr: An Expr is a declarative way to express whether a certain set of tokens fulfills a criterion.
- language_detection: This module implements rudimentary, dictionary-based English language detection.
- linting: Frameworks and rules that locate errors in text.
- parsers: Adds support for parsing various programming and markup languages through a unified trait: Parser.
- patterns: Patterns are one of the more powerful ways to query text inside Harper, especially for beginners. They are a simplified abstraction over Expr.
- spell: Contains the relevant code for performing dictionary lookups and spellchecking (i.e. fuzzy dictionary lookups).
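To illustrate what a "fuzzy dictionary lookup" means in the spell entry, here is a minimal sketch that ranks dictionary entries by Levenshtein edit distance. The function names `edit_distance` and `fuzzy_lookup` are hypothetical, and this is not a description of how the spell module is actually implemented.

```rust
/// Levenshtein distance between two char sequences (illustrative helper).
pub fn edit_distance(a: &[char], b: &[char]) -> usize {
    // `prev[j]` holds the distance between the processed prefix of `a` and `b[..j]`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let insertion = curr[j] + 1;
            let deletion = prev[j + 1] + 1;
            curr.push(substitution.min(insertion).min(deletion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Return the dictionary entries within `max_distance` edits of `word`,
/// closest first (hypothetical API, not harper-core's).
pub fn fuzzy_lookup<'a>(word: &str, dictionary: &[&'a str], max_distance: usize) -> Vec<&'a str> {
    let word: Vec<char> = word.chars().collect();
    let mut hits: Vec<(usize, &'a str)> = dictionary
        .iter()
        .filter_map(|entry| {
            let chars: Vec<char> = entry.chars().collect();
            let distance = edit_distance(&word, &chars);
            (distance <= max_distance).then_some((distance, *entry))
        })
        .collect();
    // Stable sort: entries with equal distance keep their dictionary order.
    hits.sort_by_key(|(distance, _)| *distance);
    hits.into_iter().map(|(_, entry)| entry).collect()
}
```

A linear scan like this is fine for a small word list. A real spellchecker would likely need an index to keep lookups fast.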
Structs§
- AdverbData: Adverb can be a “junk drawer” category for words which don’t fit the other major categories. The typical adverbs are “adverbs of manner”, those derived from adjectives in -ly. Other adverbs (time, place, etc.) should probably not be considered adverbs for Harper’s purposes.
- ConjunctionData
- DeterminerData: Additional metadata for determiners.
- DialectFlags: A collection of bit flags used to represent enabled dialects.
- DictWordMetadata: This represents a “lexeme” or “headword” which is case-folded but affix-expanded. So not only lemmata but also inflected forms are stored here, with “horn” and “horns” each having their own lexeme, but “Ivy” and “ivy” sharing the same lexeme.
- Document: A document containing some amount of lexed and parsed English text.
- FatStringToken: Similar to a FatToken, but uses a String as the underlying store.
- FatToken: A Token that holds its content as a fat Vec<char> rather than as a Span.
- IgnoredLints: A structure that keeps track of lints that have been ignored by users.
- LintContext: A location-agnostic structure that attempts to capture the context and content in which a Lint occurred.
- Lrc: A single-threaded reference-counting pointer. ‘Rc’ stands for ‘Reference Counted’.
- Mask: Identifies portions of a char sequence that should not be ignored by Harper.
- NounData
- Number: Represents a written number.
- OrthFlags: A collection of bit flags used to represent orthographic properties of a word.
- PronounData
- Quote
- Span: A window in a [T] sequence.
- Token: Represents a semantic, parsed component of a Document.
- VerbData
- VerbFormFlags: A collection of bit flags used to represent verb forms.
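The Span entry above describes "a window in a [T] sequence". As a rough sketch of that idea, the hypothetical `WindowSpan` type below stores only a start and end index and borrows its content from the source slice on demand. The type and its methods are assumptions made for illustration and may differ from the real Span API.

```rust
/// Hypothetical stand-in for a span: a half-open index range `[start, end)`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct WindowSpan {
    pub start: usize,
    pub end: usize,
}

impl WindowSpan {
    pub fn new(start: usize, end: usize) -> Self {
        assert!(start <= end, "a span cannot end before it starts");
        Self { start, end }
    }

    /// Number of elements covered by the window.
    pub fn len(&self) -> usize {
        self.end - self.start
    }

    pub fn is_empty(&self) -> bool {
        self.start == self.end
    }

    /// Borrow the covered elements, or `None` if the window falls outside `source`.
    pub fn content<'a, T>(&self, source: &'a [T]) -> Option<&'a [T]> {
        source.get(self.start..self.end)
    }

    /// True if the two windows share at least one index.
    pub fn overlaps_with(&self, other: &Self) -> bool {
        self.start < other.end && other.start < self.end
    }
}
```

Storing indices instead of owned text is what separates a span-backed token from a "fat" one that carries its own `Vec<char>`.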
Enums§
- Currency: A national or international currency.
- Degree: Degree is a property of adjectives. Positive is not inflected; comparative is inflected with -er or comes after the word “more”; superlative is inflected with -est or comes after the word “most”.
- Dialect: A regional dialect.
- OrdinalSuffix
- Orthography: Orthography information.
- Punctuation
- TokenKind: The parsed value of a Token. Has a variety of queries available. If there is a query missing, it may be easy to implement by just calling the delegate_to_metadata macro.
- VerbForm
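The relationship between an enum such as Dialect and a flag set such as DialectFlags can be sketched as follows. Everything here is hypothetical: the variant names, the `u8` backing store, and the method names are illustrative assumptions, and the real types may be built differently (for example with a bit-flags helper crate).

```rust
/// Hypothetical dialect enum; the variant list is illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchDialect {
    American,
    British,
    Canadian,
    Australian,
}

impl SketchDialect {
    /// Each variant maps to one bit, based on its discriminant.
    fn bit(self) -> u8 {
        1 << (self as u8)
    }
}

/// Hypothetical flag set: one bit per enabled dialect.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct SketchDialectFlags(u8);

impl SketchDialectFlags {
    pub fn empty() -> Self {
        Self(0)
    }

    /// Return a copy with `dialect` enabled.
    pub fn with(self, dialect: SketchDialect) -> Self {
        Self(self.0 | dialect.bit())
    }

    pub fn contains(self, dialect: SketchDialect) -> bool {
        self.0 & dialect.bit() != 0
    }

    /// Number of enabled dialects.
    pub fn count(self) -> u32 {
        self.0.count_ones()
    }
}
```

A flag set lets a word be marked as valid in several dialects at once, which a single enum value cannot express.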
Traits§
- CharStringExt: Extensions to character sequences that make them easier to wrangle.
- LSend
- Masker: A Masker is a tool that can be composed to eliminate chunks of text from being parsed. They can be composed to do things like isolate comments from a programming language or disable linting for languages that have been determined to not be English.
- TokenStringExt: Extension methods for Token sequences that make them easier to wrangle and query.
- VecExt: Extensions on top of Vec that make certain common operations easier.
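The Masker description mentions isolating comments from a programming language. A minimal sketch of that use case, using a hypothetical `RegionMasker` trait rather than the crate's real Masker signature, might look like this:

```rust
use std::ops::Range;

/// Hypothetical stand-in for a masker: reports which char ranges of the
/// source should be checked. Everything outside them is ignored.
pub trait RegionMasker {
    fn allowed_regions(&self, source: &[char]) -> Vec<Range<usize>>;
}

/// Allows only the text that follows `//` on each line.
/// Naive on purpose: a `//` inside a string literal is treated as a comment.
pub struct LineCommentMasker;

impl RegionMasker for LineCommentMasker {
    fn allowed_regions(&self, source: &[char]) -> Vec<Range<usize>> {
        let mut regions = Vec::new();
        let mut line_start = 0;
        // `split_inclusive` keeps the '\n' so the running offset stays exact.
        for line in source.split_inclusive(|c| *c == '\n') {
            let marker = line.windows(2).position(|w| w[0] == '/' && w[1] == '/');
            if let Some(pos) = marker {
                let start = line_start + pos + 2;
                // Exclude the trailing newline from the allowed region.
                let newline = usize::from(line.last() == Some(&'\n'));
                let end = line_start + line.len() - newline;
                if start < end {
                    regions.push(start..end);
                }
            }
            line_start += line.len();
        }
        regions
    }
}
```

Because the output is just a list of ranges, two such maskers could be combined by intersecting their regions, which is one way to read the "can be composed" wording above.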
Functions§
- core_version: Return harper-core version.
- make_title_case
- make_title_case_str: A helper function for make_title_case that uses Strings instead of char buffers.
- remove_overlaps: A utility function that removes overlapping lints in a vector, keeping the more important ones.
- remove_overlaps_map: Remove overlapping lints from a map keyed by rule name, similar to remove_overlaps.
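To show what "removing overlapping lints, keeping the more important ones" can mean in practice, here is a greedy sketch over a hypothetical `SketchLint` type. It assumes that a lower `priority` number means a more important lint. Neither the type nor the greedy strategy is taken from the real remove_overlaps implementation.

```rust
/// Hypothetical lint: a half-open char range plus a priority.
/// Assumption: a lower `priority` value means a more important lint.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SketchLint {
    pub start: usize,
    pub end: usize,
    pub priority: u8,
}

/// Greedy overlap removal: visit lints from most to least important and
/// keep each one only if it does not overlap a lint that was already kept.
pub fn remove_overlaps_sketch(lints: &mut Vec<SketchLint>) {
    // Most important first; ties are broken by position in the text.
    lints.sort_by_key(|lint| (lint.priority, lint.start));

    let mut kept: Vec<SketchLint> = Vec::new();
    for lint in lints.drain(..) {
        let overlaps = kept
            .iter()
            .any(|k| lint.start < k.end && k.start < lint.end);
        if !overlaps {
            kept.push(lint);
        }
    }

    // Restore document order for the caller.
    kept.sort_by_key(|lint| lint.start);
    *lints = kept;
}
```

The quadratic scan keeps the sketch short. For large lint lists, a sweep over lints sorted by position would scale better.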
Type Aliases§
- CharString: A char sequence that improves cache locality. Most English words are fewer than 12 characters.
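The cache-locality remark can be illustrated with a small-buffer sketch: short words are stored inline, and only long ones spill to the heap. The `SmallChars` type and its inline capacity of 12 are assumptions chosen to match the sentence above. The sketch uses only the standard library and says nothing about how CharString is actually defined.

```rust
/// Assumed inline capacity, chosen to match "fewer than 12 characters".
const INLINE_CAPACITY: usize = 12;

/// Hypothetical small-buffer char sequence.
#[derive(Debug, Clone)]
pub enum SmallChars {
    /// Short words live directly inside the value: no heap allocation.
    Inline { len: usize, buf: [char; INLINE_CAPACITY] },
    /// Longer words fall back to an ordinary heap-allocated vector.
    Heap(Vec<char>),
}

impl SmallChars {
    pub fn from_word(word: &str) -> Self {
        let mut buf = ['\0'; INLINE_CAPACITY];
        let mut len = 0;
        for c in word.chars() {
            if len == INLINE_CAPACITY {
                // Too long for the inline buffer: store the whole word on the heap.
                return SmallChars::Heap(word.chars().collect());
            }
            buf[len] = c;
            len += 1;
        }
        SmallChars::Inline { len, buf }
    }

    pub fn as_slice(&self) -> &[char] {
        match self {
            SmallChars::Inline { len, buf } => &buf[..*len],
            SmallChars::Heap(chars) => chars.as_slice(),
        }
    }

    pub fn is_inline(&self) -> bool {
        matches!(self, SmallChars::Inline { .. })
    }
}
```

Keeping typical words inline means that iterating over many of them touches contiguous memory instead of chasing one heap pointer per word.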