astorion
Duckling-style parsing engine in Rust.
What is it?
Astorion is a Rust port of Duckling’s rule-based entity parsing pipeline.
Who is it for / why does it exist?
- Teams that want Duckling-like time/numeral parsing but prefer a Rust codebase.
- Contributors who want an engine + rule architecture that’s easy to extend with new dimensions/locales.
- Anyone experimenting with saturation-style parsing (discover nodes → combine them → resolve).
Quick start
RUSTLING_DEBUG_RULES=1
Example usage
Run the built-in CLI (prints a saturation summary + resolved tokens):
Or build a release binary:
Status & guarantees
- Status: alpha, CI ready.
- Stability: a minimal public API is stabilized; see "Public API" below.
- MSRV: 1.85.0 (see rust-version in Cargo.toml).
- Breaking changes: allowed at any time while 0.x.
Roadmap
- Improve parity with Duckling semantics (span/ranking behavior).
- Add locale scaffolding and additional dimensions.
Public API
Astorion now exposes a deliberately small, stable API surface intended for early adopters:
- parse(text) -> ParseResult
- parse_with(text, &Context, &Options) -> ParseResult
- Context, Options, Entity, and ParseResult
These items are re-exported at the crate root (crate::time_expr::parse, crate::time_expr::ParseResult, etc.).
All other modules, types, and debug/verbose entry points are considered internal and may change
without notice while the crate is in 0.x.
Contributing
See CONTRIBUTING.md.
Release process
See docs/release-process.md and CHANGELOG.md for versioning and release guidance.
License
MIT. See LICENSE.
Features
- Duckling-style rule engine: regex/predicate patterns, production closures, saturation to a fixed point.
- Span-based results with rule provenance (rule_name) and an evidence chain.
- Built-in CLI debug report (saturation passes, tokens, timings).
Installation
To use it as a dependency, add a path dependency:
[]
= "0.4.0"
CLI usage
The CLI is the primary interface and ships with usage, flags, and exit codes:
Options
| Option | Description |
|---|---|
| -i, --input <text> | Input text to parse. If omitted, Astorion reads the remaining args, or stdin when no args are provided. |
| --reference <timestamp> | Reference time in YYYY-MM-DDTHH:MM:SS (default: 2013-02-12T04:30:00). |
| --color | Force ANSI color output. |
| --no-color | Disable ANSI color output. |
| --regex-profile | Collect regex timing stats and print a profiling summary (adds overhead). See docs/regex-profiling.md for guidance. |
| -h, --help | Show help text. |
| -V, --version | Print version information. |
Set RUSTLING_DEBUG_RULES=1 to print rule filtering/production diagnostics. Detailed tips for interpreting the regex profiling report live in docs/regex-profiling.md.
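As an illustration of the --reference format listed in the options table, here is a minimal, std-only sketch that checks a YYYY-MM-DDTHH:MM:SS string and splits it into fields. The `ReferenceTime` struct and `parse_reference` function are hypothetical names invented for this example; how Astorion's CLI actually parses or validates the value is not documented here and may differ.

```rust
/// Calendar fields of a reference time (hypothetical type, not Astorion's).
/// No time zone is implied.
#[derive(Debug, PartialEq)]
struct ReferenceTime {
    year: u32,
    month: u32,
    day: u32,
    hour: u32,
    minute: u32,
    second: u32,
}

/// Parse a fixed-width `YYYY-MM-DDTHH:MM:SS` string.
/// Returns `None` on any shape or range error.
fn parse_reference(s: &str) -> Option<ReferenceTime> {
    let b = s.as_bytes();
    // Exactly 19 bytes, with separators at fixed offsets.
    if b.len() != 19
        || b[4] != b'-'
        || b[7] != b'-'
        || b[10] != b'T'
        || b[13] != b':'
        || b[16] != b':'
    {
        return None;
    }
    // Each field must consist of ASCII digits only (rejects signs and spaces).
    let field = |range: std::ops::Range<usize>| -> Option<u32> {
        let part = s.get(range)?;
        if part.bytes().all(|c| c.is_ascii_digit()) {
            part.parse().ok()
        } else {
            None
        }
    };
    let t = ReferenceTime {
        year: field(0..4)?,
        month: field(5..7)?,
        day: field(8..10)?,
        hour: field(11..13)?,
        minute: field(14..16)?,
        second: field(17..19)?,
    };
    // Coarse range checks only; days are not validated against the month.
    let in_range = (1..=12).contains(&t.month)
        && (1..=31).contains(&t.day)
        && t.hour < 24
        && t.minute < 60
        && t.second < 60;
    if in_range {
        Some(t)
    } else {
        None
    }
}

fn main() {
    // The documented default reference time.
    println!("{:?}", parse_reference("2013-02-12T04:30:00"));
}
```

The sketch deliberately stops at coarse range checks: a value such as 2013-02-31T00:00:00 passes, so a real implementation would likely hand the fields to a date library.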
How it works
At a high level, the engine repeatedly applies rules to grow a stash of Nodes, then resolves and filters the results.
flowchart TD
subgraph Inputs
A(["Raw input string"])
B(["Rule set<br/>Pattern + production"])
end
subgraph ParserLifecycle
C(["Parser::new / new_compiled"])
C1(["TriggerInfo::scan<br/>buckets + phrases"])
C2(["Select active rules<br/>bucket + phrase gating"])
C3(["Split rules<br/>regex_rules / predicate_rules"])
D(["run_rule_set(regex_rules)<br/>seed pass"])
E(["Deduplicate<br/>node_key + seen"])
F(["stash = stash.union(new)"])
subgraph SaturationLoop
direction TB
G(["Filter rules by deps<br/>dimensions_in_stash"])
H(["run_rule_set(predicate + regex)"])
I(["Deduplicate + union"])
J{New nodes?}
end
K(["resolve_filtered<br/>resolve_node then drop subsumed spans"])
L(["ResolvedToken<br/>value + span + rule"])
end
A --> C
B --> C
C --> C1 --> C2 --> C3 --> D --> E --> F --> G --> H --> I --> J
J -- yes --> G
J -- no --> K --> L
%% Styling
classDef input fill:transparent,stroke:#06B6D4,stroke-width:1.5px;
classDef setup fill:transparent,stroke:#6366F1,stroke-width:1.5px;
classDef loop fill:transparent,stroke:#10B981,stroke-width:1.5px;
classDef resolve fill:transparent,stroke:#F97316,stroke-width:1.5px;
classDef decision fill:transparent,stroke:#94A3B8,stroke-width:1.5px;
class A,B input;
class C,C1,C2,C3,D,E,F setup;
class G,H,I loop;
class K,L resolve;
class J decision;
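The seed pass, deduplication, and saturation loop in the diagram can be illustrated with a toy sketch. Everything here is a simplified assumption made for the example: the `Node` struct, `seed_numbers`, `combine_plus`, and `saturate` are hypothetical stand-ins, not Astorion's actual types or signatures. A digit scanner plays the role of the regex seed rules, and a single "a plus b" rule plays the role of the combining rules.

```rust
use std::collections::HashSet;

/// A parsed fragment: dimension, byte span, and value (hypothetical type).
/// The whole struct doubles as the deduplication key.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Node {
    dim: &'static str,
    start: usize,
    end: usize,
    value: i64,
}

/// Seed pass: emit one node per run of ASCII digits.
fn seed_numbers(text: &str) -> Vec<Node> {
    let bytes = text.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i].is_ascii_digit() {
            let start = i;
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            // Runs too long for i64 fall back to 0 in this toy example.
            let value: i64 = text[start..i].parse().unwrap_or(0);
            out.push(Node { dim: "number", start, end: i, value });
        } else {
            i += 1;
        }
    }
    out
}

/// Combining rule: `a`, then the literal " plus ", then `b` yields their sum.
fn combine_plus(text: &str, a: &Node, b: &Node) -> Option<Node> {
    if a.end <= b.start && text.get(a.end..b.start) == Some(" plus ") {
        Some(Node {
            dim: "number",
            start: a.start,
            end: b.end,
            value: a.value + b.value,
        })
    } else {
        None
    }
}

/// Grow the stash until a pass produces no unseen node (a fixed point).
fn saturate(text: &str) -> Vec<Node> {
    let mut stash: Vec<Node> = Vec::new();
    let mut seen: HashSet<Node> = HashSet::new();

    for n in seed_numbers(text) {
        if seen.insert(n.clone()) {
            stash.push(n);
        }
    }
    loop {
        let mut fresh = Vec::new();
        for a in &stash {
            for b in &stash {
                if let Some(n) = combine_plus(text, a, b) {
                    // `seen` rejects nodes that were already produced,
                    // which is what guarantees termination.
                    if seen.insert(n.clone()) {
                        fresh.push(n);
                    }
                }
            }
        }
        if fresh.is_empty() {
            break;
        }
        stash.extend(fresh);
    }
    stash
}

fn main() {
    for n in saturate("1 plus 2 plus 3") {
        println!("{:?}", n);
    }
}
```

For "1 plus 2 plus 3", the full-span node can be derived two ways, as (1 plus 2) plus 3 and as 1 plus (2 plus 3). The `seen` set keeps only one copy, which is the role the diagram assigns to the deduplication step.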
Key implementation touchpoints:
- Parser::new_compiled performs trigger scanning and rule activation.
- Parser::saturate runs an initial regex pass, then loops predicate-first until a fixed point.
- Parser::node_key + seen prevent unbounded growth from duplicate nodes.
- Parser::resolve_filtered resolves nodes, sorts by dimension/span, and drops spans contained by a wider match.
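The final filtering step can be sketched as follows. This is a minimal illustration, not the real Parser::resolve_filtered: the `Token` struct and `drop_subsumed` function are hypothetical, and the sketch assumes that containment is only checked between tokens of the same dimension, which the description above does not specify.

```rust
use std::cmp::Reverse;

/// A resolved match: dimension plus byte span (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Token {
    dim: &'static str,
    start: usize,
    end: usize,
}

/// Keep only tokens that are not contained in a wider (or identical)
/// already-kept token of the same dimension.
fn drop_subsumed(mut tokens: Vec<Token>) -> Vec<Token> {
    // Sort by dimension, then start ascending, then end descending,
    // so any span that contains another is visited first.
    tokens.sort_by_key(|t| (t.dim, t.start, Reverse(t.end)));

    let mut kept: Vec<Token> = Vec::new();
    for t in tokens {
        let subsumed = kept
            .iter()
            .any(|k| k.dim == t.dim && k.start <= t.start && t.end <= k.end);
        if !subsumed {
            kept.push(t);
        }
    }
    kept
}

fn main() {
    // Spans for an input such as "tomorrow at 5pm".
    let tokens = vec![
        Token { dim: "time", start: 0, end: 8 },
        Token { dim: "time", start: 12, end: 15 },
        Token { dim: "time", start: 0, end: 15 },
        Token { dim: "number", start: 12, end: 13 },
    ];
    for t in drop_subsumed(tokens) {
        println!("{:?}", t);
    }
}
```

Sorting wider spans first means one linear scan over the kept list is enough: if a container was itself dropped, whatever subsumed it also contains the current token.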