arrow-cast-guess-precision
Cast integer to timestamp with precision guessing options.
Just replace arrow::compute::cast with arrow_cast_guess_precision::cast and you are done.
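use arrow::array::{Int64Array, TimestampNanosecondArray};
use arrow::datatypes::{DataType, TimeUnit};

// Sample value is illustrative: a Unix timestamp in milliseconds.
let data = vec![1701325744956_i64];
let array = Int64Array::from(data);
// Request nanoseconds; the precision of the input is guessed from its value.
let array = arrow_cast_guess_precision::cast(&array, &DataType::Timestamp(TimeUnit::Nanosecond, None))
    .unwrap();
let nanos = array
    .as_any()
    .downcast_ref::<TimestampNanosecondArray>()
    .unwrap();
assert_eq!(nanos.value(0), 1701325744956 * 1000 * 1000);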
The differences from the official arrow::compute::cast are:
- arrow v49 casts integers directly to timestamps, but this crate (arrow-cast-guess-precision = "0.3.0") tries to guess the precision from the value.
- arrow v48 does not support casting from integers to timestamps (use arrow-cast-guess-precision = "0.2.0" with that version).
The guessing method is:
use TimeUnit;
const GUESSING_BOUND_YEARS: i64 = 10000;
const LOWER_BOUND_MILLIS: i64 = 86400 * 365 * GUESSING_BOUND_YEARS;
const LOWER_BOUND_MICROS: i64 = 1000 * 86400 * 365 * GUESSING_BOUND_YEARS;
const LOWER_BOUND_NANOS: i64 = 1000 * 1000 * 86400 * 365 * GUESSING_BOUND_YEARS;
const
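The idea behind these constants can be sketched without any dependencies. The block below is a minimal illustration, not the crate's actual code: the `Unit` enum stands in for arrow's `TimeUnit`, `guess_unit` is a hypothetical name, the bound is passed as a parameter instead of being a build-time constant, and two details are assumptions (thresholds are inclusive, and negative values are classified by their absolute value).

```rust
/// Timestamp precision. A stand-in for arrow's `TimeUnit` so the sketch
/// compiles with the standard library only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Unit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Seconds in a 365-day year, matching the `86400 * 365` factor above.
const SECONDS_PER_YEAR: u64 = 86_400 * 365;

/// Guess the unit of a raw integer timestamp from its magnitude.
///
/// Assumptions (not confirmed by the crate): comparisons are inclusive and
/// negative inputs are judged by absolute value. `bound_years` is expected
/// to be small (for example 100 to 10000), so the products fit in a `u64`.
fn guess_unit(value: i64, bound_years: u64) -> Unit {
    let lower_bound_millis = SECONDS_PER_YEAR * bound_years;
    let lower_bound_micros = 1_000 * lower_bound_millis;
    let lower_bound_nanos = 1_000 * lower_bound_micros;

    let magnitude = value.unsigned_abs();
    if magnitude >= lower_bound_nanos {
        Unit::Nanosecond
    } else if magnitude >= lower_bound_micros {
        Unit::Microsecond
    } else if magnitude >= lower_bound_millis {
        Unit::Millisecond
    } else {
        Unit::Second
    }
}
```

With a bound of 1000 years, `guess_unit(1_701_325_744, 1000)` yields `Unit::Second`, while the same instant written as `1_701_325_744_956` yields `Unit::Millisecond`.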
Set the ARROW_CAST_GUESSING_BOUND_YEARS environment variable at build time to control the guessing bound.
The table below lists the resulting bounds for several values:
| Value | Lower bound | Upper bound |
|---|---|---|
| 100 | 1970-02-06T12:00:00 | 2069-12-07T00:00:00 |
| 200 | 1970-03-15T00:00:00 | 2169-11-13T00:00:00 |
| 500 | 1970-07-02T12:00:00 | 2469-09-01T00:00:00 |
| 1000 | 1971-01-01T00:00:00 | 2969-05-03T00:00:00 |
| 2000 | 1972-01-01T00:00:00 | 3968-09-03T00:00:00 |
| 5000 | 1974-12-31T00:00:00 | 6966-09-06T00:00:00 |
| 10000 | 1979-12-30T00:00:00 | +11963-05-13T00:00:00 |
The default is ARROW_CAST_GUESSING_BOUND_YEARS=1000. Because 1 second is 1000 milliseconds, this places the lower bound at 1971-01-01T00:00:00, exactly one year after the Unix epoch, and the upper bound is far enough in the future (even 100 years would be enough).
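The bounds in the table follow from simple arithmetic on the bound value: the threshold `86400 * 365 * years` is the largest value still read as seconds (the upper bound), and the same number read as milliseconds is the earliest instant that can be written in milliseconds (the lower bound). The sketch below reproduces that calculation with the standard library only. The function names are hypothetical, and the day-to-date conversion is the well-known civil-from-days algorithm for the proleptic Gregorian calendar, not something taken from this crate.

```rust
/// Convert days since 1970-01-01 into (year, month, day) in the proleptic
/// Gregorian calendar, using the standard civil-from-days algorithm.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + (month <= 2) as i64;
    (year, month, day)
}

/// Format a Unix timestamp in seconds as `YYYY-MM-DDTHH:MM:SS` (UTC).
fn format_unix_seconds(secs: i64) -> String {
    let days = secs.div_euclid(86_400);
    let rem = secs.rem_euclid(86_400);
    let (y, m, d) = civil_from_days(days);
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}",
        y, m, d, rem / 3_600, rem % 3_600 / 60, rem % 60
    )
}

/// Return the (lower bound, upper bound) pair for a given bound in years.
fn guessing_window(years: i64) -> (String, String) {
    // Threshold between "seconds" and "milliseconds" inputs.
    let threshold = 86_400 * 365 * years;
    // Read as milliseconds, the threshold is this many whole seconds
    // (exact, because 86400 * 365 is divisible by 1000).
    let lower_secs = threshold / 1_000;
    // Read as seconds, the threshold itself is the upper bound.
    let upper_secs = threshold;
    (format_unix_seconds(lower_secs), format_unix_seconds(upper_secs))
}
```

Calling `guessing_window(1000)` gives the default row of the table: a lower bound of 1971-01-01T00:00:00 and an upper bound of 2969-05-03T00:00:00.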
Like arrow::compute::cast, this crate also supports casting with specific options; see CastOptions.
License: MIT