jiff_tzdb/lib.rs
/*!
A crate that embeds data from the [IANA Time Zone Database].

This crate is meant to be a "raw data" library. That is, it primarily exposes
one routine that permits looking up the raw [TZif] data given a time zone name.
The data returned is embedded into the compiled library. In order to actually
use the data, you'll need a TZif parser, such as the one found in [Jiff] via
`TimeZone::tzif`.
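
As a rough illustration of the first thing a TZif parser has to do, the sketch
below reads the fixed 44-byte header described by RFC 8536. The `Header` type
and the `parse_header` function are hypothetical names used only for this
example. They are not part of this crate or of Jiff, and real code should
prefer an actual parser such as `TimeZone::tzif`.

```rust
/// The fixed-size header at the start of TZif data (RFC 8536, section 3.1).
///
/// Hypothetical type for illustration only; not part of this crate.
#[derive(Debug, PartialEq, Eq)]
struct Header {
    /// The version byte: `0` for version 1, otherwise ASCII `b'2'`, `b'3'`, ...
    version: u8,
    isutcnt: u32,
    isstdcnt: u32,
    leapcnt: u32,
    timecnt: u32,
    typecnt: u32,
    charcnt: u32,
}

/// Parses the 44-byte TZif header, or returns `None` if the data is too
/// short or does not start with the `TZif` magic bytes.
fn parse_header(data: &[u8]) -> Option<Header> {
    let header = data.get(..44)?;
    if !header.starts_with(b"TZif") {
        return None;
    }
    // Bytes 5..20 are reserved. The six counts follow as big-endian u32s.
    let count = |i: usize| {
        let start = 20 + 4 * i;
        u32::from_be_bytes([
            header[start],
            header[start + 1],
            header[start + 2],
            header[start + 3],
        ])
    };
    Some(Header {
        version: header[4],
        isutcnt: count(0),
        isstdcnt: count(1),
        leapcnt: count(2),
        timecnt: count(3),
        typecnt: count(4),
        charcnt: count(5),
    })
}
```

The sketch stops at the header on purpose. Everything after it (transition
times, local time type records, the footer) depends on the counts read here.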

This crate also exposes another routine, [`available`], for iterating over the
names of all time zones embedded into this crate.

# Should I use this crate?

In general, no. It's first and foremost an implementation detail of Jiff, but
if you 1) need raw access to the TZif data and 2) need to bundle it in your
binary, then it's plausible that using this crate is appropriate.

With that said, the _preferred_ way to read TZif data is from your system's
copy of the Time Zone Database. On macOS and most Linux installations, a copy
of this data can be found at `/usr/share/zoneinfo`. Indeed, Jiff will use this
system copy whenever possible, and not use this crate at all. The system copy
is preferred because the Time Zone Database is occasionally updated (perhaps a
few times per year), and it is usually better to rely on your system updates
for such things than some random Rust library.

However, some popular environments, like Windows, do not have a standard
system copy of the Time Zone Database. In those circumstances, Jiff will depend
on this crate and bundle the time zone data into the binary. This is not an
ideal solution, but it makes Most Things Just Work Most of the Time on all
major platforms.

# Data generation

The data in this crate comes from the [IANA Time Zone Database] "data only"
distribution. [`jiff-cli`] first compiles the release into binary TZif data
using the `zic` compiler, and then converts the binary data into a flattened
and de-duplicated representation that is embedded into this crate's source
code.
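
To make "flattened and de-duplicated" concrete, here is a minimal sketch of one
way such a representation could be built. Each distinct TZif payload is
appended once to a single byte blob, and each time zone name maps to a byte
range in that blob. The `flatten` and `lookup` functions are hypothetical, and
the real generator in `jiff-cli` may well work differently. The sketch sorts
names without regard to ASCII case so that a case-insensitive binary search is
valid, which mirrors how this crate looks names up.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;
use std::ops::Range;

/// Compares two strings while ignoring ASCII case.
fn cmp_ignore_ascii_case(s1: &str, s2: &str) -> Ordering {
    let it1 = s1.bytes().map(|b| b.to_ascii_lowercase());
    let it2 = s2.bytes().map(|b| b.to_ascii_lowercase());
    it1.cmp(it2)
}

/// Concatenates all payloads into one blob, storing each distinct payload
/// only once, and returns a name table sorted case-insensitively.
///
/// Hypothetical helper for illustration only.
fn flatten<'a>(
    zones: &[(&'a str, &[u8])],
) -> (Vec<u8>, Vec<(&'a str, Range<usize>)>) {
    let mut blob = Vec::new();
    let mut seen: HashMap<&[u8], Range<usize>> = HashMap::new();
    let mut table = Vec::new();
    for &(name, data) in zones {
        // Only append the payload the first time it is seen. Aliases with
        // identical data share the same range.
        let range = seen
            .entry(data)
            .or_insert_with(|| {
                let start = blob.len();
                blob.extend_from_slice(data);
                start..blob.len()
            })
            .clone();
        table.push((name, range));
    }
    table.sort_by(|(n1, _), (n2, _)| cmp_ignore_ascii_case(n1, n2));
    (blob, table)
}

/// Finds the payload for `name`, ignoring ASCII case.
fn lookup<'b>(
    blob: &'b [u8],
    table: &[(&str, Range<usize>)],
    name: &str,
) -> Option<&'b [u8]> {
    let i = table
        .binary_search_by(|(n, _)| cmp_ignore_ascii_case(n, name))
        .ok()?;
    Some(&blob[table[i].1.clone()])
}
```

With this layout, two names that are links to the same zone cost one table
entry each but only one copy of the underlying bytes.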

The conversion into the TZif binary data uses the following settings:

* The "rearguard" data is used (see below).
* The binary data itself is compiled using the "slim" format, which
  effectively means that the TZif data primarily uses explicit
  time zone transitions for historical data and POSIX time zones for
  current time zone transition rules. This doesn't have any impact
  on the actual results. The reason that there are "slim" and "fat"
  formats is to support legacy applications that can't deal with
  POSIX time zones. For example, `/usr/share/zoneinfo` on my modern
  Arch Linux installation (2025-02-27) is in the "fat" format.
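
To show where the POSIX time zone lives in "slim" data, here is a small sketch
that pulls out the footer of version 2+ TZif data. Per RFC 8536, the footer is
a TZ string enclosed by newline bytes at the very end of the data, for example
something like `EST5EDT,M3.2.0,M11.1.0`. The `footer_tz_string` helper is a
hypothetical name written for this example, and it deliberately skips all
other validation that a real parser would perform.

```rust
/// Returns the POSIX TZ string stored in the footer of TZif data.
///
/// Hypothetical helper for illustration only. It assumes version 2+ data,
/// where the footer has the form `\n<TZ string>\n`.
fn footer_tz_string(data: &[u8]) -> Option<&str> {
    // The data must end with the newline that closes the footer.
    let (&last, body) = data.split_last()?;
    if last != b'\n' {
        return None;
    }
    // The TZ string starts right after the newline that opens the footer.
    let start = body.iter().rposition(|&b| b == b'\n')? + 1;
    std::str::from_utf8(&body[start..]).ok()
}
```

An empty footer string is legal and means there is no rule for times after the
last explicit transition, so callers should not treat `Some("")` as an error.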

The reason that rearguard data is used is a bit more subtle and has to do with
a difference in how the IANA Time Zone Database treats its internal "daylight
saving time" flag and what people in the "real world" consider "daylight
saving time." For example, in the standard distribution of the IANA Time Zone
Database, `Europe/Dublin` has its daylight saving time flag set to _true_
during Winter and set to _false_ during Summer. The actual time shifts are the
same as, e.g., `Europe/London`, but which one is actually labeled "daylight
saving time" is not.

The IANA Time Zone Database does this for `Europe/Dublin`, presumably, because
_legally_, time during the Summer in Ireland is called `Irish Standard Time`,
and time during the Winter is called `Greenwich Mean Time`. These legal names
are reversed from what is typically the case, where "standard" time is during
the Winter and daylight saving time is during the Summer. The IANA Time Zone
Database implements this tweak in legal language via a "negative daylight
saving time offset." This is somewhat odd, and some consumers of the IANA Time
Zone Database cannot handle it. Thus, the rearguard format was born for,
seemingly, legacy programs.
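
The difference between the two encodings can be sketched with a toy model. The
`Rule` type below is hypothetical and far simpler than anything in the Time
Zone Database. It only shows that both encodings of `Europe/Dublin` yield the
same offsets from UTC while disagreeing on which half of the year is flagged
as daylight saving time.

```rust
/// A toy model of a time zone rule. Hypothetical, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Rule {
    /// The zone's "standard" offset from UTC, in hours.
    std_offset: i8,
    /// The adjustment applied on top of the standard offset, in hours.
    save: i8,
}

impl Rule {
    /// The total offset from UTC that clocks actually observe.
    fn utc_offset(self) -> i8 {
        self.std_offset + self.save
    }

    /// Whether this rule is flagged as daylight saving time.
    fn is_dst(self) -> bool {
        self.save != 0
    }
}

// Standard distribution: "standard" time is Irish Standard Time (UTC+1),
// and Winter applies a negative one hour adjustment.
const MAIN_SUMMER: Rule = Rule { std_offset: 1, save: 0 };
const MAIN_WINTER: Rule = Rule { std_offset: 1, save: -1 };

// Rearguard: "standard" time is Greenwich Mean Time (UTC+0), and Summer
// applies a positive one hour adjustment.
const REARGUARD_SUMMER: Rule = Rule { std_offset: 0, save: 1 };
const REARGUARD_WINTER: Rule = Rule { std_offset: 0, save: 0 };
```

Comparing `utc_offset()` across the two encodings gives the same value for
each season, while `is_dst()` is inverted.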

Jiff can handle negative daylight saving time offsets just fine, but we use the
rearguard format anyway so that the underlying data more accurately reflects
on-the-ground reality for humans living in `Europe/Dublin`. In particular,
using the rearguard data enables [localization of time zone names] to be done
correctly.

[IANA Time Zone Database]: https://www.iana.org/time-zones
[TZif]: https://datatracker.ietf.org/doc/html/rfc8536
[Jiff]: https://docs.rs/jiff
[`jiff-cli`]: https://github.com/BurntSushi/jiff/tree/master/crates/jiff-cli
[localization of time zone names]: https://github.com/BurntSushi/jiff/issues/258
*/

#![no_std]

mod tzname;

static TZIF_DATA: &[u8] = include_bytes!("concatenated-zoneinfo.dat");

/// The version of the IANA Time Zone Database that was bundled.
///
/// If this bundled database was generated from a pre-existing system copy
/// of the Time Zone Database, then it's possible no version information was
/// available.
pub static VERSION: Option<&str> = tzname::VERSION;

/// Returns the binary TZif data for the time zone name given.
///
/// This also returns the canonical name for the time zone. Namely, since this
/// lookup is performed without regard to ASCII case, the given name may not be
/// the canonical capitalization of the time zone.
///
/// If no matching time zone data exists, then `None` is returned.
///
/// In order to use the data returned, it must be fed to a TZif parser. For
/// example, if you're using [`jiff`](https://docs.rs/jiff), then this would
/// be the `TimeZone::tzif` constructor.
///
/// # Example
///
/// Some basic examples of time zones that exist:
///
/// ```
/// assert!(jiff_tzdb::get("America/New_York").is_some());
/// assert!(jiff_tzdb::get("america/new_york").is_some());
/// assert!(jiff_tzdb::get("America/NewYork").is_none());
/// ```
///
/// And an example of how the canonical name might differ from the name given:
///
/// ```
/// let (canonical_name, data) = jiff_tzdb::get("america/new_york").unwrap();
/// assert_eq!(canonical_name, "America/New_York");
/// // All TZif data starts with the `TZif` header.
/// assert_eq!(&data[..4], b"TZif");
/// ```
pub fn get(name: &str) -> Option<(&'static str, &'static [u8])> {
    let index = index(name)?;
    let (canonical_name, ref range) = tzname::TZNAME_TO_OFFSET[index];
    Some((canonical_name, &TZIF_DATA[range.clone()]))
}

/// Returns an iterator over all available time zone names bundled into this
/// crate.
///
/// There are no API guarantees on the order of the sequence returned.
///
/// # Example
///
/// This example shows how to determine the total number of time zone names
/// available:
///
/// ```
/// assert_eq!(jiff_tzdb::available().count(), 598);
/// ```
///
/// Note that this number may change in subsequent releases of the Time Zone
/// Database.
pub fn available() -> TimeZoneNameIter {
    TimeZoneNameIter { it: tzname::TZNAME_TO_OFFSET.iter() }
}

/// An iterator over all time zone names embedded into this crate.
///
/// There are no API guarantees on the order of this iterator.
///
/// This iterator is created by the [`available`] function.
#[derive(Clone, Debug)]
pub struct TimeZoneNameIter {
    it: core::slice::Iter<'static, (&'static str, core::ops::Range<usize>)>,
}

impl Iterator for TimeZoneNameIter {
    type Item = &'static str;

    fn next(&mut self) -> Option<&'static str> {
        self.it.next().map(|&(name, _)| name)
    }
}

/// Finds the index of a matching entry in `TZNAME_TO_OFFSET`.
///
/// If the given time zone doesn't exist, then `None` is returned.
fn index(query_name: &str) -> Option<usize> {
    tzname::TZNAME_TO_OFFSET
        .binary_search_by(|(name, _)| cmp_ignore_ascii_case(name, query_name))
        .ok()
}

/// Like std's `eq_ignore_ascii_case`, but returns a full `Ordering`.
fn cmp_ignore_ascii_case(s1: &str, s2: &str) -> core::cmp::Ordering {
    let it1 = s1.as_bytes().iter().map(|&b| b.to_ascii_lowercase());
    let it2 = s2.as_bytes().iter().map(|&b| b.to_ascii_lowercase());
    it1.cmp(it2)
}

#[cfg(test)]
mod tests {
    use core::cmp::Ordering;

    use crate::tzname::TZNAME_TO_OFFSET;

    use super::*;

    /// This is a regression test where TZ names were sorted lexicographically
    /// but case sensitively, and this could subtly break binary search.
    #[test]
    fn sorted_ascii_case_insensitive() {
        for window in TZNAME_TO_OFFSET.windows(2) {
            let (name1, _) = window[0];
            let (name2, _) = window[1];
            assert_eq!(
                Ordering::Less,
                cmp_ignore_ascii_case(name1, name2),
                "{name1} should be less than {name2}",
            );
        }
    }
}