Crate two_timer

This crate provides a parse function to convert English time expressions into a pair of timestamps representing a time range. It converts “today” into the first and last moments of today, “May 6, 1968” into the first and last moments of that day, “last year” into the first and last moments of that year, and so on. It does this even for expressions generally interpreted as referring to a point in time, such as “3 PM”, though for these it always assumes a granularity of one second. For pointwise expressions the first moment is the point explicitly named. The parse function actually returns a 3-tuple consisting of the two timestamps and whether the expression is literally a range – two time expressions separated by a preposition such as “to”, “through”, “up to”, or “until”.

Example

extern crate two_timer;
use two_timer::{parse, Config};
extern crate chrono;
use chrono::naive::NaiveDate;

pub fn main() {
    let phrases = [
        "now",
        "this year",
        "last Friday",
        "from now to the end of time",
        "Ragnarok",
        "at 3:00 pm today",
        "5/6/69",
        "Tuesday, May 6, 1969 at 3:52 AM",
        "March 15, 44 BC",
        "Friday the 13th",
        "five minutes before and after midnight",
    ];
    // find the maximum phrase length for pretty formatting
    let max = phrases
        .iter()
        .max_by(|a, b| a.len().cmp(&b.len()))
        .unwrap()
        .len();
    for phrase in phrases.iter() {
        match parse(phrase, None) {
            Ok((d1, d2, _)) => println!("{:width$} => {} --- {}", phrase, d1, d2, width = max),
            Err(e) => println!("{:?}", e),
        }
    }
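    // and_hms is deprecated in recent chrono releases; and_hms_opt returns an Option instead of panicking
    let now = NaiveDate::from_ymd_opt(1066, 10, 14).unwrap().and_hms_opt(12, 30, 15).unwrap();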
    println!("\nlet \"now\" be some moment during the Battle of Hastings, specifically {}\n", now);
    let conf = Config::new().now(now);
    for phrase in phrases.iter() {
        match parse(phrase, Some(conf.clone())) {
            Ok((d1, d2, _)) => println!("{:width$} => {} --- {}", phrase, d1, d2, width = max),
            Err(e) => println!("{:?}", e),
        }
    }
}

produces

now                                    => 2019-02-03 14:40:00 --- 2019-02-03 14:41:00
this year                              => 2019-01-01 00:00:00 --- 2020-01-01 00:00:00
last Friday                            => 2019-01-25 00:00:00 --- 2019-01-26 00:00:00
from now to the end of time            => 2019-02-03 14:40:00 --- +262143-12-31 23:59:59.999
Ragnarok                               => +262143-12-31 23:59:59.999 --- +262143-12-31 23:59:59.999
at 3:00 pm today                       => 2019-02-03 15:00:00 --- 2019-02-03 15:01:00
5/6/69                                 => 1969-05-06 00:00:00 --- 1969-05-07 00:00:00
Tuesday, May 6, 1969 at 3:52 AM        => 1969-05-06 03:52:00 --- 1969-05-06 03:53:00
March 15, 44 BC                        => -0043-03-15 00:00:00 --- -0043-03-16 00:00:00
Friday the 13th                        => 2018-07-13 00:00:00 --- 2018-07-14 00:00:00
five minutes before and after midnight => 2019-02-02 23:55:00 --- 2019-02-03 00:05:00

let "now" be some moment during the Battle of Hastings, specifically 1066-10-14 12:30:15

now                                    => 1066-10-14 12:30:00 --- 1066-10-14 12:31:00
this year                              => 1066-01-01 00:00:00 --- 1067-01-01 00:00:00
last Friday                            => 1066-10-05 00:00:00 --- 1066-10-06 00:00:00
from now to the end of time            => 1066-10-14 12:30:00 --- +262143-12-31 23:59:59.999
Ragnarok                               => +262143-12-31 23:59:59.999 --- +262143-12-31 23:59:59.999
at 3:00 pm today                       => 1066-10-14 15:00:00 --- 1066-10-14 15:01:00
5/6/69                                 => 0969-05-06 00:00:00 --- 0969-05-07 00:00:00
Tuesday, May 6, 1969 at 3:52 AM        => 1969-05-06 03:52:00 --- 1969-05-06 03:53:00
March 15, 44 BC                        => -0043-03-15 00:00:00 --- -0043-03-16 00:00:00
Friday the 13th                        => 1066-07-13 00:00:00 --- 1066-07-14 00:00:00
five minutes before and after midnight => 1066-10-13 23:55:00 --- 1066-10-14 00:05:00

For the full grammar of time expressions, view the source of the parse function and scroll up. The grammar is provided at the top of the file.

Relative Times

It is common in English to use time expressions which must be interpreted relative to some context. The context may be verb tense, other events in the discourse, or other semantic or pragmatic clues. The two_timer parse function doesn’t attempt to infer context perfectly, but it does make some attempt to get the context right. So, for instance “last Monday through Friday”, said on Saturday, will end on a different day from “next Monday through Friday”. The general rules are

  1. a fully-specified expression in a pair will provide the context for the other expression
  2. a relative expression will be interpreted as appropriate given its order – the second expression describes a time after the first
  3. if neither expression is fully-specified, the first will be interpreted relative to “now” and the second relative to the first
  4. a moment interpreted relative to “now” will be assumed to be before now unless the configuration parameter default_to_past is set to false, in which case it will be assumed to be after now
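
To make rules 2 through 4 concrete, here is a minimal sketch for a pair of bare weekday names such as “Monday through Friday”. It is not two_timer's implementation: days are plain integers, a day's weekday is taken to be `day.rem_euclid(7)` with 0 standing for Monday, and the names `nearest_weekday` and `resolve_pair` are hypothetical. Treating “before” and “after” as strict is also an assumption of the sketch.

```rust
/// Nearest day strictly before (`before == true`) or strictly after `anchor`
/// that falls on `weekday` (0 = Monday in this sketch's numbering).
fn nearest_weekday(anchor: i64, weekday: i64, before: bool) -> i64 {
    if before {
        let back = (anchor - weekday).rem_euclid(7);
        anchor - if back == 0 { 7 } else { back }
    } else {
        let ahead = (weekday - anchor).rem_euclid(7);
        anchor + if ahead == 0 { 7 } else { ahead }
    }
}

/// Neither expression is fully specified, so the first weekday is resolved
/// relative to `now` (before it when `default_to_past` is true, after it
/// otherwise) and the second is resolved as the next match after the first.
fn resolve_pair(now: i64, first: i64, second: i64, default_to_past: bool) -> (i64, i64) {
    let start = nearest_weekday(now, first, default_to_past);
    let end = nearest_weekday(start, second, false);
    (start, end)
}
```

With `now` on day 12 (a Saturday in this numbering), `resolve_pair(12, 0, 4, true)` picks the Monday and Friday just past, while passing `false` picks the Monday and Friday of the coming week.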

The rules of interpretation for relative time expressions in ranges will likely be refined further in the future.

Clock Time

The parse function interprets expressions such as “3:00” as referring to time on a 24 hour clock, so “3:00” will be interpreted as “3:00 AM”. This is true even in ranges such as “3:00 PM to 4”, where the more natural interpretation might be “3:00 PM to 4:00 PM”.
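
The 24 hour reading can be illustrated with a small standalone sketch. The function `clock_seconds` is hypothetical and is not part of two_timer; it only handles the forms “H:MM”, “H:MM AM”, and “H:MM PM”, and returns seconds since midnight.

```rust
/// Seconds since midnight for "H:MM", "H:MM AM", or "H:MM PM".
/// Without a meridiem the hour is read on a 24 hour clock.
fn clock_seconds(expr: &str) -> Option<u32> {
    let lower = expr.trim().to_ascii_lowercase();
    // Split off an optional trailing "am" or "pm".
    let (time, meridiem) = if let Some(t) = lower.strip_suffix("am") {
        (t.trim_end(), Some(false))
    } else if let Some(t) = lower.strip_suffix("pm") {
        (t.trim_end(), Some(true))
    } else {
        (lower.as_str(), None)
    };
    let (h, m) = time.split_once(':')?;
    let hour: u32 = h.trim().parse().ok()?;
    let minute: u32 = m.trim().parse().ok()?;
    if minute > 59 {
        return None;
    }
    let hour = match meridiem {
        // No AM/PM: take the hour literally, so "3:00" is 03:00.
        None if hour < 24 => hour,
        // 12 AM maps to hour 0, 12 PM to hour 12.
        Some(is_pm) if (1..=12).contains(&hour) => hour % 12 + if is_pm { 12 } else { 0 },
        _ => return None,
    };
    Some(hour * 3600 + minute * 60)
}
```

Under this reading “3:00” and “3:00 AM” name the same moment, and only an explicit “PM” or an hour of 13 or more reaches the afternoon.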

Years Near 0

Since it is common to abbreviate years to the last two digits of the century, two-digit years will be interpreted as abbreviated unless followed by a suffix such as “B.C.E.” or “AD”. They will be interpreted by default as the nearest appropriate previous year to the current moment, so in 2010 “’11” will be interpreted as 1911, not 2011. If you set the configuration parameter default_to_past to false this is reversed, so “’11” in 2020 will be interpreted as 2111.
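
The arithmetic behind this rule might look like the following sketch. The function `expand_two_digit_year` is a hypothetical name, not two_timer's API, and the choice to let a two-digit year equal to the current year resolve to the current year is an assumption.

```rust
/// Expand a two-digit year `yy` (0..=99) relative to `current_year`.
/// With `default_to_past` the result is the nearest matching year that is
/// not after `current_year`; otherwise the nearest one that is not before it.
fn expand_two_digit_year(yy: i32, current_year: i32, default_to_past: bool) -> i32 {
    // First year of the current century, e.g. 2000 for 2010.
    let century = current_year - current_year.rem_euclid(100);
    let candidate = century + yy;
    if default_to_past && candidate > current_year {
        candidate - 100
    } else if !default_to_past && candidate < current_year {
        candidate + 100
    } else {
        candidate
    }
}
```

This reproduces the examples in the text (1911 for “’11” in 2010, 2111 for “’11” in 2020 with default_to_past set to false) and is consistent with “5/6/69” resolving to 1969 in 2019 and to 0969 in 1066 in the sample output.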

The Second Time in Ranges

For single expressions, like “this year”, “today”, “3:00”, or “next month”, the second of the two timestamps is straightforward – it is the end of the relevant temporal unit. “1971” will be interpreted as the first moment of the first day of 1971 through, but excluding, the first moment of the first day of 1972, so the second timestamp will be this first excluded moment.
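
The half-open convention can be sketched at day resolution with plain tuples. The `Date` alias and the functions here are illustrative assumptions; two_timer itself returns chrono timestamps rather than tuples.

```rust
/// (year, month, day); tuples compare lexicographically, which matches
/// chronological order for valid dates.
type Date = (i32, u32, u32);

/// A year as a half-open interval: first day included, first day of the
/// following year excluded.
fn year_interval(year: i32) -> (Date, Date) {
    ((year, 1, 1), (year + 1, 1, 1))
}

/// A month as a half-open interval, rolling over into January as needed.
fn month_interval(year: i32, month: u32) -> (Date, Date) {
    let next = if month == 12 { (year + 1, 1, 1) } else { (year, month + 1, 1) };
    ((year, month, 1), next)
}

/// Membership test: the start is included, the end is not.
fn contains(interval: (Date, Date), date: Date) -> bool {
    interval.0 <= date && date < interval.1
}
```

Because the second value is the first excluded moment, adjacent units such as 1971 and 1972 share a boundary without overlapping.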

When the parsed expression describes a range, we’re really dealing with two potentially overlapping pairs of timestamps and the choice of the terminal timestamp gets trickier. The general rule is that if the second interval is shorter than a day, its first timestamp is the first excluded moment, so “today to 3:00 PM” means the first moment of the day up to, but excluding, 3:00 PM. If the second interval is a day or longer, which of its timestamps is used varies according to the preposition. “This week up to Friday” excludes all of Friday. “This week through Friday” includes all of Friday. Prepositions are assumed to fall into either the “to” class or the “through” class. You may also use a series of dashes as a synonym for “through”, so “this week - fri” is equivalent to “this week through Friday”. For the most current list of prepositions in each class, consult the grammar used for parsing, but as of the moment, these are the rules:

        up_to => [["to", "until", "up to", "till"]]
        through => [["up through", "through", "thru"]] | r("-+")

Pay Periods

I’m writing this library in anticipation of, for the sake of amusement, rewriting JobLog in Rust. This means I need the time expressions parsed to include pay periods. Pay periods, though, are defined relative to some reference date – a particular Sunday, say – and have a variable period. two_timer, and JobLog, assume pay periods are of a fixed length and tile the timeline without overlap, so a pay period of a calendrical month is problematic.

If you need to interpret “last pay period”, say, you will need to specify when this pay period began, or when some pay period began or will begin, and a pay period length in days. The parse function has a second optional argument, a Config object, whose chief function outside of testing is to provide this information. So, for example, you could do this:

let (reference_time, _, _) = parse("5/6/69", None).unwrap();
let config = Config::new().pay_period_start(Some(reference_time.date()));
let (t1, t2, _) = parse("next pay period", Some(config)).unwrap();
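
The tiling assumption can be made explicit with a little integer arithmetic. This is a sketch rather than two_timer's code: days are plain integers, and `pay_period_containing` and `next_pay_period` are hypothetical helper names.

```rust
/// The half-open pay period (start, end) containing `day`, given that some
/// pay period starts on `reference` and every period lasts `length` days.
fn pay_period_containing(reference: i64, length: i64, day: i64) -> (i64, i64) {
    assert!(length > 0, "pay period length must be positive");
    // rem_euclid keeps the offset non-negative for days before the reference.
    let offset = (day - reference).rem_euclid(length);
    let start = day - offset;
    (start, start + length)
}

/// The pay period immediately following the one that contains `today`.
fn next_pay_period(reference: i64, length: i64, today: i64) -> (i64, i64) {
    let (_, end) = pay_period_containing(reference, length, today);
    (end, end + length)
}
```

Because the periods tile the timeline, the reference date can lie in the past or the future without changing the result, which is why any pay period's start date will do.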

Ambiguous Year Formats

two_timer will try various year-month-day permutations until one of them parses given that days are in the range 1-31 and months, 1-12. This is the order in which it tries these permutations:

  1. year/month/day
  2. year/day/month
  3. month/day/year
  4. day/month/year

The potential unit separators are /, ., and -. Whitespace is optional.
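
The permutation order above can be sketched as follows. The functions `interpret_date` and `parse_date` are hypothetical, apply only the range checks named in the text (months 1-12, days 1-31), and leave any two-digit year unexpanded.

```rust
/// Try the four orderings in the documented sequence and return the first
/// (year, month, day) whose month and day are in range.
fn interpret_date(a: i32, b: i32, c: i32) -> Option<(i32, u32, u32)> {
    let valid_month = |m: i32| (1..=12).contains(&m);
    let valid_day = |d: i32| (1..=31).contains(&d);
    // Each candidate is already rearranged into (year, month, day).
    let candidates = [
        (a, b, c), // year/month/day
        (a, c, b), // year/day/month
        (c, a, b), // month/day/year
        (c, b, a), // day/month/year
    ];
    candidates
        .iter()
        .copied()
        .find(|&(_, m, d)| valid_month(m) && valid_day(d))
        .map(|(y, m, d)| (y, m as u32, d as u32))
}

/// Split on '/', '.', or '-' with optional surrounding whitespace.
fn parse_date(text: &str) -> Option<(i32, u32, u32)> {
    let parts = text
        .split(|ch: char| ch == '/' || ch == '.' || ch == '-')
        .map(|p| p.trim().parse::<i32>().ok())
        .collect::<Option<Vec<i32>>>()?;
    match parts.as_slice() {
        [a, b, c] => interpret_date(*a, *b, *c),
        _ => None,
    }
}
```

Note that the first ordering that passes the range checks wins, so an input like “5/6/7” is read as year 5, month 6, day 7 under this scheme.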

Timezones

At the moment two_timer only produces “naive” times. Sorry about that.

Optional Features

The regular expression used by two-timer is extremely efficient once compiled but extremely slow to compile. This means that the first use of the regular expression will occasion a perceptible delay. I wrote two-timer as a component of a Rust re-write of a Perl command line application I also wrote, App::JobLog. Compiling the full time grammar required by two-timer makes the common use cases for the Rust version of the application slower than the Perl version. To address this I added an optional feature to two-timer that one can enable like so:

[dependencies.two_timer]
version = "~2.2"
features = ["small_grammar"]

This will cause two-timer to attempt to parse a time expression initially with a simplified grammar containing only the typical expressions used with JobLog, falling back on the full grammar if this fails. The simplified grammar covers:

  1. Days of the week, optionally abbreviated
    • Tuesday
    • tue
    • tu
  2. Month names
    • June
    • Jun
  3. Days, months, or fixed periods of time modified by “this” or “last”
    • this month
    • last week
    • this year
    • this pay period
    • last Monday
  4. Certain temporal adverbs
    • now
    • today
    • yesterday
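
The try-small-then-fall-back control flow described above can be sketched with two stand-in matchers. Everything here is hypothetical: `small_grammar` and `full_grammar` are toy placeholders that only report which tier accepted the input, whereas the real crate compiles actual grammars.

```rust
#[derive(Debug, PartialEq)]
enum Grammar {
    Small,
    Full,
}

/// Stand-in for the cheap grammar: accepts only a few temporal adverbs.
fn small_grammar(expr: &str) -> Option<Grammar> {
    match expr.trim().to_ascii_lowercase().as_str() {
        "now" | "today" | "yesterday" => Some(Grammar::Small),
        _ => None,
    }
}

/// Stand-in for the expensive grammar: accepts any non-empty input.
fn full_grammar(expr: &str) -> Option<Grammar> {
    if expr.trim().is_empty() {
        None
    } else {
        Some(Grammar::Full)
    }
}

/// Try the small grammar first; `or_else` only evaluates the full grammar
/// when the small one fails, so common inputs never pay for it.
fn parse_tiered(expr: &str) -> Option<Grammar> {
    small_grammar(expr).or_else(|| full_grammar(expr))
}
```

The point of the lazy `or_else` is that the costly path runs only on a miss, which is what keeps the common JobLog expressions fast.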

Structs

  • A collection of parameters that can influence the interpretation of time expressions.

Enums

  • A simple categorization of things that could go wrong.

Functions

  • The moment regarded as the beginning of time.
  • The moment regarded as the end of time.
  • Simply returns whether the given phrase is parsable as a time expression. This is slightly more efficient than parse(expression, None).is_ok() as no parse tree is generated.
  • Converts a time expression into a pair of timestamps and a boolean indicating whether the expression was literally a range, such as “9 to 11”, as opposed to “9 AM”, say.