# textstat
[](#contributors-)
[](https://crates.io/crates/textstat)
[](https://docs.rs/textstat)
[](https://github.com/trananhtung/textstat/actions/workflows/ci.yml)
[](#license)
[](#no_std)
**Readability metrics for English text.** Flesch Reading Ease, Flesch-Kincaid
Grade, Gunning Fog, SMOG, Automated Readability Index, Coleman-Liau — plus the
word / sentence / syllable counts they're built on. Inspired by Python's
[`textstat`](https://pypi.org/project/textstat/). Zero dependencies, `#![no_std]`.
```rust
let text = "The cat sat on the mat. The dog ran fast.";
assert_eq!(textstat::lexicon_count(text), 10); // words
assert_eq!(textstat::sentence_count(text), 2);
assert_eq!(textstat::syllable_count(text), 10);
let ease = textstat::flesch_reading_ease(text); // ~117 (very easy)
let grade = textstat::flesch_kincaid_grade(text); // ~ -1.8 (well below grade 1)
```
## Why textstat?
Rust's `readability` crates are *article extractors* (arc90/Mozilla Readability) —
they pull the main content out of a web page. **None of them compute readability
*scores*.** `textstat` fills that gap: drop-in functions for the standard formulas,
useful for content tooling, SEO, writing assistants, education, and accessibility
(WCAG) checks.
## Install
```toml
[dependencies]
textstat = "0.1"
```
## Metrics
| `flesch_reading_ease` | 0–100+ score (higher = easier) |
| `flesch_kincaid_grade` | U.S. school grade level |
| `gunning_fog` | U.S. school grade level |
| `smog_index` | U.S. school grade level |
| `automated_readability_index` | U.S. school grade level |
| `coleman_liau_index` | U.S. school grade level |
| `reading_time(text, wpm)` | estimated seconds to read |
### Counts
`lexicon_count` (words), `sentence_count`, `syllable_count`, `syllables(word)`,
`polysyllabic_count`, `char_count` (non-whitespace), `letter_count`.
## Accuracy
Syllables are estimated with a fast English heuristic (vowel groups + a silent-`e`
rule, including accented Latin vowels), not a pronunciation dictionary, so scores
are *close to* — not bit-identical with — dictionary-based tools. The metric
**formulas** are the standard published ones.
- `sentence_count` skips decimal points (`3.14`) and initialism dots (`U.S.A.`);
trailing abbreviation dots (`Dr.`) may still add one.
- Best results are on English text. Non-Latin scripts (no Latin vowels) fall back
to one syllable per word, so scores stay bounded rather than correct.
- Empty input yields `0` everywhere — no panics, no division by zero.
## no_std
`textstat` is `#![no_std]` (needs only `alloc`) with a dependency-free Newton's
-method `sqrt`, so it builds for bare-metal targets such as `thumbv7em-none-eabi`.
## Contributors ✨
This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind are welcome — code, docs, bug reports, ideas, reviews! See the [emoji key](https://allcontributors.org/docs/en/emoji-key) for how each contribution is recognized, and open a PR or issue to get involved.
Thanks goes to these wonderful people:
<table>
<tbody>
<tr>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/trananhtung"><img src="https://avatars.githubusercontent.com/u/30992229?v=4?s=100" width="100px;" alt="Tung Tran"/><br /><sub><b>Tung Tran</b></sub></a><br /><a href="https://github.com/trananhtung/textstat/commits?author=trananhtung" title="Code">💻</a> <a href="#maintenance-trananhtung" title="Maintenance">🚧</a></td>
</tr>
</tbody>
</table>
## License
Licensed under either of [Apache-2.0](LICENSE-APACHE) or [MIT](LICENSE-MIT) at
your option.