clockwords
Find and resolve natural-language time expressions in text.
clockwords scans free-form text for relative time expressions like "last Friday from 9 to eleven", "yesterday at 3pm", or "letzten Freitag von 9 bis 12 Uhr" and returns their byte-offset spans together with resolved DateTime<Utc> values. It supports English, German, French, and Spanish out of the box.
Built for real-time GUI applications (time-tracking, note-taking, calendars) where the user types naturally and the app highlights detected time references as they appear. Timezone-aware — times the user enters are interpreted in their local timezone (configurable, defaults to UTC).
Features
- Four languages: English, German, French, Spanish
- Timezone-aware: User input is interpreted in a configurable timezone (defaults to UTC for backward compatibility)
- Byte-offset spans: Directly usable for text highlighting in any GUI framework
- Resolved times: Every match resolves to a concrete DateTime<Utc> point or range
- Incremental typing support: Detects partial matches (e.g. "yester" while the user is still typing "yesterday")
- Accent-tolerant: Handles días/dias, à/a, mañana/manana, dernière/derniere
- Fast rejection: Aho-Corasick keyword prefilter skips text with no time-related words in sub-microsecond time
- Zero allocations on rejection: If no keywords are found, scan() returns immediately
- No unsafe code
- Defensive: All internal date arithmetic returns Option, so edge-case dates cannot cause panics
Quick Start
Add to your Cargo.toml:
[dependencies]
clockwords = "0.3"
Basic Usage
use ;
use chrono::Utc;
Output:
Found 'The last hour' at bytes 0..13 (TimeRange)
Resolved to: 2026-02-08T12:30:00Z .. 2026-02-08T13:30:00Z
Select Specific Languages
use clockwords::scanner_for_languages;
// Only English and German (argument form assumed: a slice of ISO 639-1 codes)
let scanner = scanner_for_languages(&["en", "de"]);
Timezone Support
By default, all times are interpreted in UTC. To interpret user input in a specific timezone, configure ParserConfig::timezone or use scan_with_tz():
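use clockwords::{default_scanner, ParserConfig, Tz};
use chrono::Utc;
// Option 1: Set timezone in config (assumes ParserConfig implements Default)
let config = ParserConfig { timezone: Tz::Europe__Berlin, ..Default::default() };
// Pass config when constructing the scanner (e.g. via TimeExpressionScanner::new)
// Option 2: Override per scan call
// (argument order assumed: text, reference time, timezone)
let scanner = default_scanner();
let matches = scanner.scan_with_tz("at 3pm", Utc::now(), Tz::Europe__Berlin);
// "3pm" is interpreted as 15:00 Berlin time → resolves to 14:00 UTC (in winter)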
When a timezone is set, all day boundaries (midnight), time-of-day values, and weekday calculations use the user's local timezone. The resolved output always remains in UTC. For example, with Europe/Berlin (CET, UTC+1 in winter):
- "today" at 23:30 UTC (= 00:30 CET next day) → the range covers the next calendar day in Berlin
- "at 3pm" → resolves to 14:00 UTC (not 15:00 UTC)
- "the last hour" → unchanged (duration-based, timezone-independent)
Supported Expressions
Relative Days
| Language | Examples |
|---|---|
| English | today, tomorrow, yesterday |
| German | heute, morgen, gestern |
| French | aujourd'hui, demain, hier |
| Spanish | hoy, mañana, ayer |
Resolves to a full-day Range (midnight to midnight in the configured timezone).
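The midnight-to-midnight rule can be worked through with plain integer arithmetic. The sketch below is not clockwords code: local_day_range_utc is a hypothetical helper, and it assumes a fixed UTC offset in seconds. The crate itself relies on chrono-tz, which also accounts for daylight saving time.

```rust
const DAY_SECS: i64 = 86_400;

/// Hypothetical helper (not part of clockwords).
/// Returns the half-open range [start, end) in UTC Unix seconds that covers
/// the local calendar day containing `now_utc`, for a zone with a fixed
/// offset from UTC (CET in winter = +3600).
fn local_day_range_utc(now_utc: i64, utc_offset_secs: i64) -> (i64, i64) {
    // Shift into local wall-clock time.
    let local = now_utc + utc_offset_secs;
    // Floor to local midnight; div_euclid also rounds correctly before 1970.
    let local_midnight = local.div_euclid(DAY_SECS) * DAY_SECS;
    // Shift the boundary back into UTC.
    let start = local_midnight - utc_offset_secs;
    (start, start + DAY_SECS)
}
```

With CET (+3600 s), a reference time of 23:30 UTC yields a range that starts at 23:00 UTC on the same day, which is midnight of the next calendar day in Berlin. Adding 15 hours to a range start gives 15:00 local time, i.e. 14:00 UTC.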
Relative Weekdays
| Language | Examples |
|---|---|
| English | last Friday, next Monday, this Wednesday |
| German | letzten Freitag, nächsten Montag, diesen Mittwoch |
| French | vendredi dernier, lundi prochain, ce mercredi |
| Spanish | el viernes pasado, el próximo lunes, este miércoles |
Resolves to a full-day Range (midnight to midnight in the configured timezone). French and Spanish support both pre- and post-positive word order (e.g. lundi prochain and prochain lundi). Spanish also supports el viernes que viene.
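The day arithmetic behind "last Friday" or "next Monday" can be sketched with modular arithmetic on weekday numbers. This is an illustration only: the function names are hypothetical, and it assumes that when today is the target weekday, "last" and "next" mean a full week away. clockwords may resolve that edge case differently.

```rust
/// Weekdays are numbered Monday = 0 through Sunday = 6.
/// Number of days to go back from `today` to the most recent `target`
/// weekday strictly before today (always in 1..=7).
fn days_back_to_last(target: u32, today: u32) -> u32 {
    let diff = (today + 7 - target) % 7;
    if diff == 0 { 7 } else { diff }
}

/// Number of days to go forward from `today` to the next `target`
/// weekday strictly after today (always in 1..=7).
fn days_ahead_to_next(target: u32, today: u32) -> u32 {
    let diff = (target + 7 - today) % 7;
    if diff == 0 { 7 } else { diff }
}
```

For example, on a Sunday (6), "last Friday" (4) is two days back and "next Monday" (0) is one day ahead.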
Day Offsets
| Language | Examples |
|---|---|
| English | in 4 days, two days ago, in three days |
| German | in 3 Tagen, vor zwei Tagen |
| French | dans 3 jours, il y a deux jours |
| Spanish | en 3 días, hace 2 dias |
Supports both digits and written-out number words (1–30).
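Accepting either digits or number words comes down to a small lookup. The following is a minimal sketch, not the crate's implementation: parse_count is a hypothetical name, the word table only covers English one to ten for brevity, and applying the 1–30 limit to digits as well is an assumption.

```rust
/// Hypothetical helper: turns "4" or "four" into 4.
/// Returns None for unknown words or values outside 1..=30.
fn parse_count(token: &str) -> Option<u32> {
    // Abbreviated table; a full version would list all words up to thirty.
    const WORDS: [&str; 10] = [
        "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine", "ten",
    ];
    let t = token.trim().to_ascii_lowercase();
    let n = match t.parse::<u32>() {
        Ok(n) => n,
        // Not a digit string: look the word up; its index + 1 is its value.
        Err(_) => WORDS.iter().position(|w| *w == t)? as u32 + 1,
    };
    (1..=30).contains(&n).then_some(n)
}
```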
Time Specifications
| Language | Examples |
|---|---|
| English | at 3pm, at 3 am, 13 o'clock, at 3:30pm, 11:30am, at 15:30 |
| German | um 15 Uhr, um 15:30 Uhr, um 15:30 |
| French | à 13h, à 13h30, à 13:30 |
| Spanish | a las 3, a las 15:30 |
Colon-delimited minutes (H:MM) are supported in all languages. In English, am/pm is optional — bare H:MM with at is treated as 24-hour time. French supports both h and : as separators (13h30 and 13:30).
Resolves to a Point in time.
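To make the am/pm and 24-hour rules concrete, here is a small standalone sketch of clock-time parsing. parse_clock is a hypothetical function, not the crate's API, and it only handles the bare time token (without "at", "um", "à" or "a las").

```rust
/// Hypothetical helper: parses "3pm", "3 am", "3:30pm", "11:30am" or "15:30"
/// into (hour, minute) on a 24-hour clock.
/// Returns None for malformed or out-of-range input.
fn parse_clock(input: &str) -> Option<(u32, u32)> {
    let s = input.trim().to_ascii_lowercase();
    // Split off an optional am/pm suffix.
    let (body, meridiem) = if let Some(b) = s.strip_suffix("am") {
        (b.trim_end(), Some(false))
    } else if let Some(b) = s.strip_suffix("pm") {
        (b.trim_end(), Some(true))
    } else {
        (s.as_str(), None)
    };
    // Split the hour from optional colon-delimited minutes.
    let (h, m) = match body.split_once(':') {
        Some((h, m)) => (h.parse::<u32>().ok()?, m.parse::<u32>().ok()?),
        None => (body.parse::<u32>().ok()?, 0),
    };
    if m > 59 {
        return None;
    }
    match meridiem {
        // 12-hour clock: 12am is midnight, 12pm is noon.
        Some(is_pm) if (1..=12).contains(&h) => {
            Some((h % 12 + if is_pm { 12 } else { 0 }, m))
        }
        Some(_) => None,
        // No suffix: treat the value as 24-hour time.
        None if h <= 23 => Some((h, m)),
        None => None,
    }
}
```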
Time Ranges
| Language | Examples |
|---|---|
| English | the last hour, last minute, between 9 and 12, from 9 to 12 |
| German | die letzte Stunde, von 9 bis 12 Uhr, zwischen 9 und 12 |
| French | la dernière heure, entre 9 et 12 heures |
| Spanish | la última hora, entre las 9 y las 12 |
English supports both between X and Y and from X to Y with number words (from nine to five).
Combined Expressions
Any day reference (relative day, weekday, or day offset) can be combined with a time specification or time range in a single expression. The entire phrase is detected as one match:
Relative day + time:
| Language | Examples |
|---|---|
| English | yesterday at 3pm, yesterday at 3:30pm, yesterday at 15:30, tomorrow between 9 and 12, yesterday from 9 to 11 |
| German | gestern um 15 Uhr, gestern um 15:30 Uhr, gestern um 15:30, gestern von 9 bis 12 Uhr |
| French | hier à 13h, hier à 13h30, hier à 13:30, hier entre 9 et 12 heures |
| Spanish | ayer a las 3, ayer a las 15:30, ayer entre las 9 y las 12 |
Weekday + time:
| Language | Examples |
|---|---|
| English | last Friday at 3pm, last Friday at 3:30pm, last Friday at 15:30, last Friday from 9 to eleven, next Monday between 9 and 12 |
| German | letzten Freitag um 15 Uhr, letzten Freitag um 15:30 Uhr, nächsten Montag um 9:15, diesen Mittwoch zwischen 9 und 11 |
| French | vendredi dernier à 13h, vendredi dernier à 13h30, vendredi dernier à 13:30, ce lundi à 14h30, ce mercredi entre 9 et 11 heures |
| Spanish | el viernes pasado a las 3, el viernes pasado a las 3:30, el próximo lunes a las 9:30, el pasado viernes entre las 9 y las 12 |
Combined expressions resolve to either a Point (day + time spec) or a Range (day + time range) on the specified day.
Architecture
How Scanning Works
     Input text
          │
          ▼
┌─────────────────────┐
│  Aho-Corasick       │  Fast keyword check (~ns)
│  Prefilter          │  Rejects text with no time words
└─────────┬───────────┘
          │ keywords found
          ▼
┌─────────────────────┐
│  Per-Language       │  Regex rules with resolver closures
│  Grammar Rules      │  Run for each enabled language
└─────────┬───────────┘
          │ raw matches
          ▼
┌─────────────────────┐
│  Deduplication      │  Prefer Complete > Partial, longer > shorter
│  & Sorting          │  Remove overlapping inferior matches
└─────────┬───────────┘
          │
          ▼
    Vec<TimeMatch>
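The deduplication stage can be sketched as a greedy selection. The types below (Confidence, RawMatch) and the dedup function are hypothetical stand-ins, and the tie-breaking order is an assumption based on the diagram labels, not the crate's actual code.

```rust
/// Declaration order gives Partial < Complete when compared.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Confidence {
    Partial,
    Complete,
}

/// Hypothetical raw match: a byte range plus a confidence level.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RawMatch {
    start: usize,
    end: usize,
    confidence: Confidence,
}

impl RawMatch {
    fn width(&self) -> usize {
        self.end - self.start
    }

    fn overlaps(&self, other: &RawMatch) -> bool {
        self.start < other.end && other.start < self.end
    }
}

/// Keeps the best match in every group of overlapping candidates and
/// returns the survivors sorted by start offset.
fn dedup(mut raw: Vec<RawMatch>) -> Vec<RawMatch> {
    // Best candidates first: Complete before Partial, then longer before shorter.
    raw.sort_by(|a, b| {
        b.confidence
            .cmp(&a.confidence)
            .then(b.width().cmp(&a.width()))
            .then(a.start.cmp(&b.start))
    });
    let mut kept: Vec<RawMatch> = Vec::new();
    for m in raw {
        // Drop any candidate that overlaps a better one already kept.
        if !kept.iter().any(|k| k.overlaps(&m)) {
            kept.push(m);
        }
    }
    kept.sort_by_key(|m| m.start);
    kept
}
```

Given candidates for "yesterday" (0..9) and "yesterday at 3pm" (0..16), only the longer one survives.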
Buffer-Rescan Strategy
Rather than maintaining an incremental parser state machine, clockwords re-scans the full text buffer on every call to scan(). This is the right trade-off for GUI text input:
- Input buffers are typically < 1 KB
- Full regex scan of a short buffer completes in microseconds
- Dramatically simpler than maintaining parser state across edits
- No edge cases around cursor position, insertions, or deletions
Type Overview
| Type | Description |
|---|---|
| TimeExpressionScanner | Main entry point; holds language parsers and prefilter |
| TimeMatch | A single match result: span + confidence + resolved time + kind |
| Span | Byte-offset range (start..end) for slicing the original text |
| ResolvedTime | Point(DateTime<Utc>) or Range { start, end } |
| MatchConfidence | Partial (user still typing) or Complete |
| ExpressionKind | RelativeDay, RelativeDayOffset, TimeSpecification, TimeRange, Combined |
| ParserConfig | Settings: report_partial (default true), max_matches (default 10), timezone (default Tz::UTC) |
| Tz | Re-exported from chrono-tz: IANA timezone (e.g. Tz::Europe__Berlin, Tz::US__Eastern) |
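Because spans are byte offsets rather than character counts, they slice the original string directly, but accented characters take more than one byte. The sketch below uses a stand-in Span struct with assumed field names (start, end) and hypothetical helpers; it is not the crate's own type.

```rust
/// Stand-in for a byte-offset span; field names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    /// Returns the covered substring, or None if the offsets are out of
    /// range or do not fall on UTF-8 character boundaries.
    fn slice<'a>(&self, text: &'a str) -> Option<&'a str> {
        text.get(self.start..self.end)
    }
}

/// Finds the first occurrence of `needle` and reports it as a byte span.
fn find_span(text: &str, needle: &str) -> Option<Span> {
    text.find(needle).map(|start| Span {
        start,
        end: start + needle.len(),
    })
}
```

In "Café mañana at 3pm", the word "mañana" occupies bytes 6..13 even though it starts at the sixth character, because "é" and "ñ" are two bytes each in UTF-8. GUI frameworks that index by characters or UTF-16 units need a conversion step.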
GUI Integration
clockwords is designed for real-time text highlighting. Here's how to wire it up:
use ;
use chrono::Utc;
Partial Match Highlighting
When the user types "I worked yester", the scanner returns a Partial match on "yester". Your GUI can show a dimmed or dotted underline to hint that a time expression is being formed. Once the user completes "yesterday", the match upgrades to Complete with a fully resolved time.
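One way to picture partial detection is a check for a keyword prefix at the end of the buffer. This is a simplified sketch under stated assumptions: trailing_partial is a hypothetical function, the comparison is case-sensitive, and the minimum length of 3 is borrowed from the keyword_prefixes() note in this README. The crate's actual matching goes through its grammar rules.

```rust
/// Hypothetical helper: if `text` ends with a proper prefix of `keyword`
/// that is at least `min_len` bytes long, returns the byte offset where
/// that prefix starts.
fn trailing_partial(text: &str, keyword: &str, min_len: usize) -> Option<usize> {
    // Try the longest proper prefix first ("yesterda", "yesterd", ...).
    for len in (min_len..keyword.len()).rev() {
        // Skip lengths that would split a multi-byte character.
        if !keyword.is_char_boundary(len) {
            continue;
        }
        if text.ends_with(&keyword[..len]) {
            return Some(text.len() - len);
        }
    }
    None
}
```

For "I worked yester" and the keyword "yesterday", this reports offset 9, so bytes 9..15 would be the span to underline as partial.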
To disable partial matching:
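use clockwords::ParserConfig;
// Assumes ParserConfig implements Default
let config = ParserConfig { report_partial: false, ..Default::default() };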
Adding a New Language
- Create src/lang/xx.rs (copy an existing language file as a template)
- Implement the LanguageParser trait:
  - lang_id(): return the ISO 639-1 code (e.g. "it")
  - keywords(): return Aho-Corasick trigger words
  - keyword_prefixes(): return typing prefixes (length >= 3)
  - parse(): call apply_rules() with your GrammarRule list
- Add number-word mappings to src/lang/numbers.rs
- Register the language in src/lib.rs → scanner_for_languages()
- Add tests in tests/
Each GrammarRule is a compiled regex paired with a resolver closure:
GrammarRule
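The pairing of a pattern with a resolver closure can be illustrated without the regex crate. Everything in this sketch is hypothetical (RuleSketch, apply_rules_sketch, last_hour_rule): a literal phrase stands in for the compiled regex, and times are plain Unix seconds rather than DateTime<Utc>. The real GrammarRule fields and signatures may differ.

```rust
/// Hypothetical rule shape: a literal phrase stands in for the compiled regex.
struct RuleSketch {
    phrase: &'static str,
    /// Maps "now" (Unix seconds) to a resolved (start, end) range.
    /// Returns None instead of panicking when the arithmetic overflows.
    resolve: Box<dyn Fn(i64) -> Option<(i64, i64)>>,
}

/// Runs every rule against the text and collects
/// (span start, span end, resolved range) for each rule that matches.
fn apply_rules_sketch(
    text: &str,
    now: i64,
    rules: &[RuleSketch],
) -> Vec<(usize, usize, (i64, i64))> {
    let mut out = Vec::new();
    for rule in rules {
        if let Some(start) = text.find(rule.phrase) {
            if let Some(range) = (rule.resolve)(now) {
                out.push((start, start + rule.phrase.len(), range));
            }
        }
    }
    out
}

/// Example rule: "last hour" resolves to the 3600 seconds before now.
fn last_hour_rule() -> RuleSketch {
    RuleSketch {
        phrase: "last hour",
        resolve: Box::new(|now: i64| Some((now.checked_sub(3600)?, now))),
    }
}
```

Keeping the resolver next to the pattern means a new expression type only needs one new rule entry, and returning Option from the resolver matches the defensive date arithmetic described in the feature list.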
Performance
| Scenario | Approximate Time |
|---|---|
| No keywords in text (fast rejection) | ~8 µs |
| Short sentence with 1 match | ~17 µs |
| Paragraph with multiple matches | ~18 µs |
The Aho-Corasick prefilter means that text without any time-related words is rejected in microseconds — the regex engine is never invoked.
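The early-exit structure can be shown with a deliberately naive stand-in. This sketch replaces Aho-Corasick with a simple loop over keywords, so it illustrates only the control flow, not the performance. The function names are hypothetical, the matching is case-sensitive, and the second stage is a placeholder for the grammar rules.

```rust
/// Naive stand-in for the keyword prefilter: true if any keyword occurs
/// in the text. Does not allocate.
fn has_time_keyword(text: &str, keywords: &[&str]) -> bool {
    keywords.iter().any(|k| text.contains(*k))
}

/// Two-stage scan: cheap rejection first, expensive work only if needed.
fn scan_sketch<'a>(text: &'a str, keywords: &[&str]) -> Vec<&'a str> {
    if !has_time_keyword(text, keywords) {
        // Early exit: an empty Vec does not allocate.
        return Vec::new();
    }
    // Placeholder for the grammar stage: report whole words that are keywords.
    text.split_whitespace()
        .filter(|w| keywords.contains(w))
        .collect()
}
```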
Running Tests
The test suite includes 141 integration tests + 1 doctest covering:
- All four languages with various expression types
- Combined weekday + time expressions across all languages
- Timezone-aware resolution (Europe/Berlin, US/Eastern, UTC)
- Cross-midnight timezone boundary handling
- Accent-tolerant variants (with and without diacritics)
- Embedded expressions in longer sentences
- Colon-delimited time parsing (3:30pm, 15:30, 13h30, 13:30)
- from X to Y with number words (nine to five)
- Incremental/partial matching
- Edge cases (empty input, no false positives)
- Cross-language default scanner
Running the TUI Demo
An interactive terminal demo is included:
Type time expressions and watch them get parsed in real time. Press ESC to quit.
Dependencies
| Crate | Purpose |
|---|---|
| chrono | Date/time types and arithmetic |
| chrono-tz | IANA timezone database for timezone-aware resolution |
| regex | Per-language grammar patterns |
| aho-corasick | Fast multi-keyword prefilter |
License
Licensed under the Apache License, Version 2.0 (LICENSE or http://www.apache.org/licenses/LICENSE-2.0).