Rust_Grammar
Complete Professional Text Analysis Library & API
Rust_Grammar is a production-grade, comprehensive text analysis library written in Rust that provides detailed analysis of grammar, readability, style, and 19+ professional writing metrics. Built for maximum performance, reliability, and accuracy with zero-crash guarantees.
Table of Contents
- Overview
- Features
- Architecture
- Installation
- Configuration
- Usage
- REST API Reference
- Code Examples
- Data Models
- Error Handling
- Testing
- Deployment
- Contributing
- Changelog & Roadmap
- FAQ
- License & Credits
Overview
Project Description
Rust_Grammar is a comprehensive text analysis toolkit designed for writers, editors, developers, and content creators who need detailed insights into their writing. Whether you're analyzing academic papers, fiction manuscripts, business documents, or technical documentation, Rust_Grammar provides actionable feedback on grammar, style, readability, and writing quality.
Key Value Propositions
| Value | Description |
|---|---|
| Performance | Written in Rust for blazing-fast analysis (~500ms per 1000 words) |
| Accuracy | 95%+ sentence splitting, 85%+ passive voice detection, 90%+ syllable counting |
| Reliability | Zero crashes with comprehensive error handling using Result<T, E> |
| Completeness | 19+ professional analysis features in a single library |
| Flexibility | CLI, Library, and REST API interfaces |
| Extensibility | Modular architecture with configurable presets |
Target Audience
- Writers & Authors: Improve prose quality and readability
- Editors: Automated first-pass analysis
- Developers: Integrate text analysis into applications
- Educators: Teaching writing improvement
- Content Teams: Maintain consistent writing standards
Features
Complete Feature List (19+ Professional Features)
Grammar Checking
| Feature | Description |
|---|---|
| Subject-Verb Agreement | Detects mismatched subjects and verbs (e.g., "He are going") |
| Double Negative Detection | Finds double negatives (e.g., "don't have nothing") |
| Run-on Sentence Detection | Identifies excessively long compound sentences |
| Comma Splice Detection | Finds improperly joined independent clauses |
| Missing Punctuation | Detects sentences lacking end punctuation |
| Severity Levels | Issues categorized as Low, Medium, or High severity |
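Severity levels map naturally onto an ordered enum so callers can filter reports. A minimal sketch (type and function names here are illustrative, not the library's actual API):

```rust
// Illustrative sketch of severity-based filtering; the real library's
// types live in analysis_reports.rs and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Low,
    Medium,
    High,
}

struct Issue {
    message: &'static str,
    severity: Severity,
}

/// Keep only issues at or above a minimum severity.
fn filter_issues<'a>(issues: &'a [Issue], min: Severity) -> Vec<&'a Issue> {
    issues.iter().filter(|i| i.severity >= min).collect()
}

fn main() {
    let issues = [
        Issue { message: "Missing end punctuation", severity: Severity::Low },
        Issue { message: "Double negative", severity: Severity::Medium },
        Issue { message: "Subject-verb disagreement", severity: Severity::High },
    ];
    // Filter down to Medium and High issues only.
    let serious = filter_issues(&issues, Severity::Medium);
    println!("{} serious issue(s)", serious.len());
}
```

Deriving `Ord` on the enum gives the Low < Medium < High ordering for free, since Rust orders variants by declaration order.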
Style Analysis
| Feature | Description |
|---|---|
| Passive Voice Detection | 85%+ accuracy with confidence scoring (0.0-1.0) |
| Adverb Counting | Counts -ly words throughout the text |
| Hidden Verbs | Finds nominalizations (e.g., "decision" → "decide") |
| Overall Style Score | 0-100% rating based on multiple factors |
Readability Metrics
| Metric | Description |
|---|---|
| Flesch Reading Ease | 0-100 scale (higher = easier to read) |
| Flesch-Kincaid Grade Level | U.S. school grade level equivalent |
| SMOG Index | Readability formula for healthcare documents |
| Coleman-Liau Index | Character-based readability |
| Automated Readability Index | Character and word count based |
| Average Words per Sentence | Sentence complexity indicator |
| Average Syllables per Word | Vocabulary complexity indicator |
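The Flesch formulas above are standard published metrics computed from three raw counts. A self-contained sketch of both (the library derives the counts itself from the text):

```rust
// Flesch Reading Ease: 0-100 scale, higher scores read easier.
// Standard published formula from word, sentence, and syllable counts.
fn flesch_reading_ease(words: f64, sentences: f64, syllables: f64) -> f64 {
    206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
}

// Flesch-Kincaid Grade Level: U.S. school grade equivalent.
fn flesch_kincaid_grade(words: f64, sentences: f64, syllables: f64) -> f64 {
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
}

fn main() {
    // 100 words, 8 sentences, 130 syllables: fairly plain prose.
    let ease = flesch_reading_ease(100.0, 8.0, 130.0);
    let grade = flesch_kincaid_grade(100.0, 8.0, 130.0);
    println!("ease = {ease:.1}, grade = {grade:.1}");
}
```

Both formulas share the same two ratios (words per sentence, syllables per word), which is why the table also reports those averages directly.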
Glue Index (Sticky Sentences)
| Feature | Description |
|---|---|
| Overall Glue Index | Percentage of "glue words" (the, a, is, etc.) |
| Sticky Sentence Detection | Sentences with >45% glue words |
| Semi-Sticky Detection | Sentences with 35-45% glue words |
| Position Tracking | Character-level positions for highlighting |
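The glue index is the percentage of glue words in a sentence, classified against the documented thresholds. A minimal sketch (the real glue-word list is far larger, and function names here are illustrative):

```rust
// Percentage of glue words in a sentence. Punctuation is stripped and
// words lowercased before matching against the glue list.
fn glue_percentage(sentence: &str, glue_words: &[&str]) -> f64 {
    let words: Vec<&str> = sentence.split_whitespace().collect();
    if words.is_empty() {
        return 0.0;
    }
    let glue = words
        .iter()
        .filter(|w| {
            let norm = w
                .trim_matches(|c: char| !c.is_alphanumeric())
                .to_lowercase();
            glue_words.contains(&norm.as_str())
        })
        .count();
    glue as f64 / words.len() as f64 * 100.0
}

// Documented thresholds: >45% sticky, 35-45% semi-sticky.
fn classify(pct: f64) -> &'static str {
    if pct > 45.0 {
        "sticky"
    } else if pct >= 35.0 {
        "semi-sticky"
    } else {
        "normal"
    }
}

fn main() {
    // Tiny excerpt of a glue-word list for demonstration.
    let glue = ["the", "a", "an", "is", "was", "of", "to", "in", "it", "that"];
    let s = "It was the end of the day in the office.";
    let pct = glue_percentage(s, &glue);
    println!("{pct:.0}% glue -> {}", classify(pct));
}
```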
Sentence Pacing
| Category | Description |
|---|---|
| Fast-Paced | Sentences with <10 words |
| Medium-Paced | Sentences with 10-20 words |
| Slow-Paced | Sentences with >20 words |
| Distribution | Count and percentage per category |
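The pacing buckets follow directly from the word-count thresholds above; a one-function sketch:

```rust
// Pacing classification from the documented word-count thresholds:
// <10 words fast, 10-20 medium, >20 slow.
fn pacing(sentence: &str) -> &'static str {
    let words = sentence.split_whitespace().count();
    if words < 10 {
        "fast-paced"
    } else if words <= 20 {
        "medium-paced"
    } else {
        "slow-paced"
    }
}

fn main() {
    println!("{}", pacing("She ran."));
    println!("{}", pacing(&"word ".repeat(25)));
}
```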
Sentence Length Variation
| Feature | Description |
|---|---|
| Average Length | Mean word count per sentence |
| Standard Deviation | Variation in sentence length |
| Variety Score | 0-10 scale measuring length diversity |
| Very Long Detection | Sentences >30 words flagged |
| Shortest/Longest | Extremes identified |
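Mean and standard deviation are the raw ingredients of the variety score; a minimal sketch of computing them from per-sentence word counts (population standard deviation, which may differ from the library's exact choice):

```rust
// Mean and standard deviation of sentence lengths (in words).
// Uses the population standard deviation (divide by n).
fn length_stats(lengths: &[usize]) -> (f64, f64) {
    let n = lengths.len() as f64;
    let mean = lengths.iter().sum::<usize>() as f64 / n;
    let variance = lengths
        .iter()
        .map(|&l| (l as f64 - mean).powi(2))
        .sum::<f64>()
        / n;
    (mean, variance.sqrt())
}

fn main() {
    let (mean, sd) = length_stats(&[5, 15, 25]);
    println!("mean = {mean:.1} words, sd = {sd:.2}");
}
```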
Transition Words
| Feature | Description |
|---|---|
| Transition Count | Sentences containing transitions |
| Transition Percentage | Ratio of sentences with transitions |
| Unique Transitions | Count of distinct transition words |
| Most Common | Ranked list with frequencies |
| Multi-word Support | Detects phrases like "on the other hand" |
Repetition Detection
| Feature | Description |
|---|---|
| Overused Words | Words appearing >0.5% frequency |
| Repeated Phrases | 2-4 word phrase repetition |
| Echoes | Word repetition within 20 words |
| Position Tracking | Exact character positions |
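Echo detection amounts to remembering where each word last appeared and flagging repeats inside the 20-word window. A sketch under that assumption (a real implementation would likely also skip glue words, and names here are illustrative):

```rust
use std::collections::HashMap;

// Flag any word repeated within `distance` words of its previous
// occurrence. Words are lowercased and stripped of punctuation.
fn find_echoes(text: &str, distance: usize) -> Vec<(String, usize)> {
    let mut last_seen: HashMap<String, usize> = HashMap::new();
    let mut echoes = Vec::new();
    for (i, raw) in text.split_whitespace().enumerate() {
        let word: String = raw
            .chars()
            .filter(|c| c.is_alphanumeric())
            .collect::<String>()
            .to_lowercase();
        if word.is_empty() {
            continue;
        }
        if let Some(&prev) = last_seen.get(&word) {
            if i - prev <= distance {
                echoes.push((word.clone(), i));
            }
        }
        last_seen.insert(word, i);
    }
    echoes
}

fn main() {
    let text = "The storm hit the coast hard; the storm left quickly.";
    for (word, pos) in find_echoes(text, 20) {
        println!("echo: {word:?} at word index {pos}");
    }
}
```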
Sensory Language
| Sense | Example Words |
|---|---|
| Sight | see, bright, vivid, sparkle, glowing |
| Sound | hear, loud, whisper, echo, buzz |
| Touch | feel, soft, rough, texture, smooth |
| Smell | scent, aroma, fragrant, stench |
| Taste | flavor, sweet, savory, bitter |
Vague Word Detection
| Feature | Description |
|---|---|
| Vague Words | thing, stuff, nice, very, really |
| Vague Phrases | "kind of", "sort of", "a bit" |
| Frequency Count | Per-word usage statistics |
| Position Tracking | Character-level locations |
Cliché Detection
50+ common clichés are tracked, including:
- "avoid like the plague"
- "piece of cake"
- "think outside the box"
- "at the end of the day"
- "break the ice"
Consistency Checks
| Check Type | Examples |
|---|---|
| US vs UK Spelling | color/colour, analyze/analyse |
| Hyphenation | email/e-mail, online/on-line |
| Capitalization | Inconsistent title case |
Acronym Tracking
| Feature | Description |
|---|---|
| Total Count | All acronym occurrences |
| Unique Count | Distinct acronyms found |
| Frequency List | Ranked by usage |
Business Jargon Detection
Single words and phrases detected:
- synergy, leverage, paradigm
- "circle back", "touch base"
- "low-hanging fruit", "move the needle"
Complex Paragraph Detection
| Threshold | Description |
|---|---|
| Sentence Length | >20 words average |
| Syllables | >1.8 per word average |
| Position Tracking | Start/end character indices |
Conjunction Sentence Starts
Tracks sentences beginning with:
- and, but, or, so, yet, for, nor
Architecture
System Design Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ Rust_Grammar Architecture │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ CLI App │ │ REST API │ │ Library │ │ HTML Visual │ │
│ │ (main.rs) │ │(api-server) │ │ (lib.rs) │ │(visualizer) │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │ │
│ └──────────────────┴────────┬─────────┴──────────────────┘ │
│ │ │
│ ┌──────────▼──────────┐ │
│ │ TextAnalyzer │ │
│ │ (Core Engine) │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌───────────────────────────┼───────────────────────────┐ │
│ │ │ │ │
│ ┌──────▼──────┐ ┌───────▼───────┐ ┌───────▼───────┐ │
│ │ Grammar │ │Comprehensive │ │ Configuration │ │
│ │ Module │ │ Analysis │ │ System │ │
│ ├─────────────┤ ├───────────────┤ ├───────────────┤ │
│ │• Sentence │ │• Sticky Sent. │ │• YAML/TOML │ │
│ │ Splitter │ │• Pacing │ │• Presets │ │
│ │• Passive │ │• Transitions │ │• Thresholds │ │
│ │ Voice │ │• Sensory │ │• Feature │ │
│ │• Grammar │ │• Clichés │ │ Toggles │ │
│ │ Checker │ │• Jargon │ │ │ │
│ └──────┬──────┘ └───────┬───────┘ └───────────────┘ │
│ │ │ │
│ ┌──────▼──────────────────────────▼──────┐ │
│ │ Dictionaries Module │ │
│ ├─────────────────────────────────────────┤ │
│ │ • 200+ Abbreviations │ │
│ │ • 200+ Irregular Past Participles │ │
│ │ • 1000+ Syllable Counts │ │
│ │ • Glue Words, Transitions, Clichés │ │
│ └─────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ Error Handling Layer ││
│ │ Result<T, AnalysisError> throughout • Zero crashes guaranteed ││
│ └─────────────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────────────┘
Directory Structure
rust_grammar/
├── Cargo.toml # Project manifest and dependencies
├── README.md # This documentation
├── LICENSE # MIT License
├── config.example.yaml # Example configuration file
├── sample.txt # Sample text for testing
│
├── src/
│ ├── main.rs # CLI entry point with clap
│ ├── lib.rs # Core library exports
│ ├── error.rs # Custom error types (thiserror)
│ ├── config.rs # Configuration system
│ ├── word_lists.rs # Static word dictionaries
│ ├── analysis_reports.rs # Report data structures
│ ├── comprehensive_analysis.rs # All 19 analysis features
│ ├── visualizer.rs # HTML report generator
│ │
│ ├── bin/
│ │ ├── api-server.rs # Basic REST API server (1 endpoint)
│ │ └── api-server-enhanced.rs # Enhanced API with 6 endpoints
│ │
│ ├── dictionaries/
│ │ ├── mod.rs # Module exports
│ │ ├── abbreviations.rs # 200+ abbreviations
│ │ ├── irregular_verbs.rs # Past participles dictionary
│ │ └── syllable_dict.rs # 1000+ syllable counts
│ │
│ └── grammar/
│ ├── mod.rs # Module exports
│ ├── sentence_splitter.rs # Advanced sentence boundary detection
│ ├── passive_voice.rs # Confidence-scored detection
│ └── checker.rs # Grammar rules engine
│
├── tests/
│ └── integration_tests.rs # Comprehensive integration tests
│
├── benches/
│ └── performance.rs # Criterion benchmarks
│
└── docs/
├── API_DOCUMENTATION.md # REST API documentation
├── IMPLEMENTATION.md # Technical details
├── QUICKSTART.md # Quick setup guide
└── CHANGELOG.md # Version history
Installation
Prerequisites
| Requirement | Minimum Version | Notes |
|---|---|---|
| Rust | 1.75+ | Install via rustup |
| Cargo | Latest | Included with Rust |
| Git | Any recent | For cloning repository |
Method 1: Install from Crates.io
# Install as a library dependency
cargo add rust_grammar
# Or add to Cargo.toml manually:
# [dependencies]
# rust_grammar = "2.1"
Method 2: Install from Source
# Clone the repository
git clone <repository-url>
cd rust_grammar
# Build release version (optimized)
cargo build --release
# Verify installation
cargo test
# Optional: Install binary globally
cargo install --path .
Method 3: Build API Server
# Build enhanced API server (recommended)
cargo build --release --features server --bin api-server-enhanced
# Or build basic API server
cargo build --release --features server --bin api-server
Feature Flags
| Feature | Description | Default |
|---|---|---|
| cli | Command-line interface | ✅ Yes |
| server | REST API server | ❌ No |
| parallel | Rayon parallel processing | ✅ Yes |
| markdown | Markdown preprocessing | ✅ Yes |
| html | HTML parsing support | ✅ Yes |
| full | All features enabled | ❌ No |
Configuration
Configuration File Format
The analyzer supports both YAML and TOML configuration formats.
YAML Configuration (config.yaml)
# Text Analyzer Configuration
# Input validation settings
validation:
max_file_size_mb: 10
min_words: 10
max_words: null
timeout_seconds: 300
# Analysis behavior settings
analysis:
parallel_processing: true
cache_results: false
document_type: general # general|academic|fiction|business|technical
# Threshold settings
thresholds:
sticky_sentence_threshold: 40.0
overused_word_threshold: 0.5
echo_distance: 20
very_long_sentence: 30
complex_paragraph_sentence_length: 20.0
complex_paragraph_syllables: 1.8
passive_voice_max: 10
adverb_max: 20
# Feature toggles
features:
grammar_check: true
style_check: true
readability_check: true
consistency_check: true
sensory_analysis: true
cliche_detection: true
jargon_detection: true
echo_detection: true
# Output settings
output:
format: text # text|json|yaml|html
verbosity: normal # quiet|normal|verbose|debug
color: true
show_progress: true
Document Type Presets
| Preset | Use Case | Key Modifications |
|---|---|---|
| general | Default for most documents | Balanced settings |
| academic | Research papers, theses | Lenient on passive voice (max=20%) |
| fiction | Novels, short stories | Strict sticky sentences (35%) |
| business | Reports, proposals | Lenient glue words (45%) |
| technical | Documentation, manuals | Lenient complexity |
Usage
CLI Usage
# (Flag names below are illustrative; run `rust_grammar --help` for the exact CLI.)
# Basic analysis
rust_grammar sample.txt
# Comprehensive analysis (ALL 19 features)
rust_grammar sample.txt --comprehensive
# With document type preset
rust_grammar sample.txt --doc-type academic
# Save report to file
rust_grammar sample.txt --output report.txt
# JSON output
rust_grammar sample.txt --format json
# Visual HTML report
rust_grammar sample.txt --format html --output report.html
# Use custom configuration
rust_grammar sample.txt --config config.yaml
Starting the API Server
# Start enhanced API server (recommended)
cargo run --release --features server --bin api-server-enhanced
# Output:
# 🚀 Text Analyzer API running on http://0.0.0.0:2000
# 📝 POST to http://0.0.0.0:2000/analyze with JSON body: {"text": "your text"}
# 📊 POST to http://0.0.0.0:2000/score for scores only
# 📏 POST to http://0.0.0.0:2000/sentencelength for sentence length analysis
# 📖 POST to http://0.0.0.0:2000/readability for readability analysis
# 🎯 POST to http://0.0.0.0:2000/passivevoice for passive voice analysis
# 🔗 POST to http://0.0.0.0:2000/glueindex for glue index analysis
REST API Reference
The API server runs on http://0.0.0.0:2000 by default with CORS enabled.
API Servers
| Server | Binary | Endpoints | Use Case |
|---|---|---|---|
| Basic | api-server | 1 (/analyze) | Simple integration |
| Enhanced | api-server-enhanced | 6 | Full-featured applications |
Endpoint Summary
| Endpoint | Method | Description |
|---|---|---|
| /analyze | POST | Full analysis with all scores and issues |
| /score | POST | Scores only with ideal values and status |
| /sentencelength | POST | Detailed sentence length analysis |
| /readability | POST | Readability metrics and difficult paragraphs |
| /passivevoice | POST | Passive voice, adverbs, hidden verbs, and style |
| /glueindex | POST | Glue index and sticky sentences |
1. POST /analyze
Full analysis with comprehensive scores and all detected issues.
Request
Request Body:
Response
2. POST /score
Returns only scoring metrics with ideal values, status, and helpful messages.
Request
Request Body:
Response
3. POST /sentencelength
Detailed sentence length analysis with individual sentence positions.
Request
Request Body:
Response
4. POST /readability
Readability metrics with difficult paragraph identification.
Request
Request Body:
Response
Difficulty Levels:
"very hard"- Average word length >6.5 and sentence length >30"hard"- Average word length >5.5 and sentence length >25"slightly difficult"- Sentence length >20
5. POST /passivevoice
Comprehensive style analysis including passive voice, adverbs, hidden verbs, and more.
Request
Request Body:
Response
Analysis Categories:
| Category | Description |
|---|---|
| passiveVerbs | Passive voice constructions |
| hiddenVerbs | Nominalizations (e.g., "make a decision" → "decide") |
| adverbsList | Words ending in -ly |
| readabilityEnhancements | Weak constructions ("there is", "it was") |
| inclusiveLanguageImprovements | Gendered/non-inclusive language |
| emotionTells | Words like "felt", "seemed", "appeared" |
| styleImprovements | Filler words ("very", "really", "just") |
| businessJargon | Corporate buzzwords |
| longSubordinateClauses | Sentences with 3+ commas |
| repeatedSentenceStarts | Words used 3+ times to start sentences |
| styleGuideItems | Common grammar mistakes ("alot", "could of") |
6. POST /glueindex
Glue word analysis and sticky sentence detection.
Request
Request Body:
Response
Sentence Categories:
| Category | Glue Percentage | Description |
|---|---|---|
| sticky | >45% | High glue word density, needs revision |
| semi-sticky | 35-45% | Moderate glue word density |
Error Responses
All endpoints return consistent error responses:
| Status Code | Error | Description |
|---|---|---|
| 400 | Text cannot be empty | Empty text provided |
| 500 | Failed to analyze text: [details] | Analysis processing error |
Code Examples
Example 1: Rust Library Usage
// Illustrative only: exact exports and method names may differ;
// see the lib.rs exports for the real API.
use rust_grammar::TextAnalyzer;

fn main() {
    let analyzer = TextAnalyzer::new();
    // analyze returns Result<_, AnalysisError>, so failures never panic.
    match analyzer.analyze("The report was written yesterday. This is very important.") {
        Ok(report) => println!("{report:#?}"),
        Err(e) => eprintln!("Analysis failed: {e}"),
    }
}
Example 2: Python API Client
#!/usr/bin/env python3
"""Client for the Rust_Grammar enhanced API (illustrative sketch using requests)."""
import requests

API_BASE = "http://localhost:2000"


def analyze(text):
    """Full analysis with scores and issues."""
    r = requests.post(f"{API_BASE}/analyze", json={"text": text})
    r.raise_for_status()
    return r.json()


def score(text):
    """Get scores only with ideal values."""
    r = requests.post(f"{API_BASE}/score", json={"text": text})
    r.raise_for_status()
    return r.json()


def _post_paragraphs(endpoint, paragraphs):
    """POST paragraph data ({"text", "key"} objects) to an endpoint."""
    r = requests.post(f"{API_BASE}/{endpoint}", json={"data": paragraphs})
    r.raise_for_status()
    return r.json()


def sentence_length(paragraphs):
    """Analyze sentence length by paragraph."""
    return _post_paragraphs("sentencelength", paragraphs)


def readability(paragraphs):
    """Get readability metrics."""
    return _post_paragraphs("readability", paragraphs)


def passive_voice(paragraphs):
    """Get passive voice and style analysis."""
    return _post_paragraphs("passivevoice", paragraphs)


def glue_index(paragraphs):
    """Get glue index and sticky sentences."""
    return _post_paragraphs("glueindex", paragraphs)


# Example usage
if __name__ == "__main__":
    text = "The report was written yesterday. This is very important."

    # Full analysis
    full = analyze(text)
    # Scores only
    scores = score(text)

    # Paragraph-based analysis
    paragraphs = [{"text": text, "key": "para_0"}]
    lengths = sentence_length(paragraphs)
    reading = readability(paragraphs)
    style = passive_voice(paragraphs)
    glue = glue_index(paragraphs)
Example 3: JavaScript/Node.js Client
const API_BASE = 'http://localhost:2000';

// Illustrative: POST text to /analyze using fetch (Node 18+).
async function analyze(text) {
  const res = await fetch(`${API_BASE}/analyze`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}

// Example usage
analyze('The report was written yesterday.').then(console.log).catch(console.error);
Example 4: cURL Test Script
#!/bin/bash
# test-api.sh - Test all API endpoints
API_URL="http://localhost:2000"
TEXT='{"text": "The report was written yesterday. This is very important."}'
PARA='{"data": [{"text": "The report was written yesterday. This is very important.", "key": "para_0"}]}'
curl -s -X POST "$API_URL/analyze" -H "Content-Type: application/json" -d "$TEXT" | jq .
curl -s -X POST "$API_URL/score" -H "Content-Type: application/json" -d "$TEXT" | jq .
curl -s -X POST "$API_URL/sentencelength" -H "Content-Type: application/json" -d "$PARA" | jq .
curl -s -X POST "$API_URL/readability" -H "Content-Type: application/json" -d "$PARA" | jq .
curl -s -X POST "$API_URL/passivevoice" -H "Content-Type: application/json" -d "$PARA" | jq .
curl -s -X POST "$API_URL/glueindex" -H "Content-Type: application/json" -d "$PARA" | jq .
Data Models
Request Models
Simple Text Request (for /analyze, /score)
interface AnalyzeRequest {
text: string;
}
Paragraph Data Request (for /sentencelength, /readability, /passivevoice, /glueindex)
interface ParagraphRequest {
data: ParagraphData[];
}
interface ParagraphData {
text: string;
key: string; // Unique identifier for the paragraph
}
Response Models
interface Occurrence {
start: number; // Character position (start)
end: number; // Character position (end)
length: number; // Character length
string: string; // Matched text
paragraphKey: string; // Reference to source paragraph
}
interface SimpleScore {
current: number; // Current value
ideal: string; // Target range (e.g., "< 10%")
status: string; // "good" | "fair" | "needs improvement"
message: string; // Actionable guidance
}
interface ScoreDetail {
score: number;
percentage: number;
message?: string;
}
interface PercentageScore {
percentage: number;
count: number;
total: number;
message?: string;
occurrences?: Occurrence[];
}
interface CountScore {
count: number;
percentage?: number;
message?: string;
occurrences?: Occurrence[];
}
interface AnalysisIssue {
Id: string; // Unique identifier
start: number; // Character position (start)
end: number; // Character position (end)
length: number; // Character length
paragraphKey: string; // Source paragraph
string: string; // Matched text
type: string; // Issue category
suggestions: {
recommendation: string[];
};
}
Issue Types:
- PassiveVoice
- Grammar_SubjectVerbAgreement
- Grammar_DoubleNegative
- Grammar_RunOnSentence
- StickySentence
- OverusedWord
- Repetition
- Cliche
- VagueWord
- BusinessJargon
Error Handling
Error Types
API Error Handling
// Returns:
// 400 Bad Request - for empty text
// 500 Internal Server Error - for analysis failures
Testing
Running Tests
# Run all tests
cargo test
# Run with output
cargo test -- --nocapture
# Run specific tests
cargo test --test integration_tests
API Testing
# Start the server
cargo run --release --features server --bin api-server-enhanced &
# Run test script
./test-api.sh
# Or test individual endpoints
curl -s -X POST http://localhost:2000/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "The report was written yesterday."}' | jq .
Deployment
Running the API Server
# Build
cargo build --release --features server --bin api-server-enhanced
# Run directly
./target/release/api-server-enhanced
# Server starts on http://0.0.0.0:2000
# Run with PM2 (production)
pm2 start ecosystem.config.js
Docker Deployment
FROM rust:1.75 as builder
WORKDIR /app
COPY . .
RUN cargo build --release --features server --bin api-server-enhanced
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/api-server-enhanced /usr/local/bin/
EXPOSE 2000
CMD ["api-server-enhanced"]
PM2 Configuration
// ecosystem.config.js (illustrative; adjust name and paths as needed)
module.exports = {
  apps: [
    {
      name: 'rust-grammar-api',
      script: './target/release/api-server-enhanced',
      instances: 1,
      autorestart: true,
    },
  ],
};
Contributing
- Fork the repository
- Create a feature branch:
git checkout -b feature/new-feature - Make your changes
- Run tests:
cargo test - Format code:
cargo fmt - Create pull request
Changelog & Roadmap
v2.1.1 (Current)
- ✅ Enhanced API with 6 endpoints
- ✅ Paragraph-relative position tracking
- ✅ Unicode-safe character indexing
- ✅ Comprehensive style analysis
- ✅ Intelligent messaging system
Roadmap
- WebSocket real-time analysis
- Multi-language support
- VS Code extension
- WebAssembly build
FAQ
Q: What port does the API run on?
A: Port 2000 (bound to 0.0.0.0) by default.
Q: Is CORS enabled?
A: Yes, permissive CORS is enabled for all endpoints.
Q: What's the difference between the two API servers?
A: api-server has 1 endpoint (/analyze), while api-server-enhanced has 6 specialized endpoints.
Q: How are positions calculated?
A: All positions are character-based (not byte-based) for proper Unicode support.
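The distinction matters in Rust, where `str::find` returns byte offsets and multi-byte UTF-8 characters make byte and character indices diverge. A minimal demonstration:

```rust
// Byte vs character positions in UTF-8: 'é' occupies two bytes,
// so everything after it has a byte index one greater than its
// character index. The API reports character indices.
fn main() {
    let text = "café latte";
    let byte_pos = text.find('l').unwrap(); // byte offset
    let char_pos = text.chars().position(|c| c == 'l').unwrap(); // char offset
    println!("byte = {byte_pos}, char = {char_pos}");
}
```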
License & Credits
MIT License - Copyright (c) 2025 Eeman Majumder
Built with:
- Rust
- Axum - Web framework
- Tokio - Async runtime
- Serde - Serialization
- Tower-HTTP - HTTP middleware
Built with ❤️ using Rust 🦀