Crate geoarrow_array

§geoarrow-array

The central type in Apache Arrow is the array: a known-length sequence of values that all share the same type. This crate provides concrete implementations of each array type defined in the GeoArrow specification, as well as a GeoArrowArray trait that can be used for type erasure.

To minimize the overhead of dynamic downcasting, the array types in this crate are defined “natively”. Converting between a GeoArrow array type and an arrow array type is therefore an explicit step, but it is an O(1) operation.

§Building a GeoArrow Array

Use builders to construct GeoArrow arrays. These builders offer a push-based interface to construct arrays from a series of objects that implement geo-traits.

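// Import paths assume the modules listed under Modules, plus the companion
// geoarrow-schema, geo-traits, and geo-types crates.
use geo_traits::{CoordTrait, PointTrait};
use geoarrow_array::GeoArrowArrayAccessor;
use geoarrow_array::array::PointArray;
use geoarrow_array::builder::PointBuilder;
use geoarrow_array::scalar::Point;
use geoarrow_schema::{CoordType, Dimension, PointType};

// Describe the array: separated x/y buffers, two dimensions, default metadata.
let point_type = PointType::new(CoordType::Separated, Dimension::XY, Default::default());
let mut builder = PointBuilder::new(point_type);

builder.push_point(Some(&geo_types::point!(x: 0., y: 1.)));
builder.push_point(Some(&geo_types::point!(x: 2., y: 3.)));
builder.push_point(Some(&geo_types::point!(x: 4., y: 5.)));

let array: PointArray = builder.finish();

// `get` returns a Result wrapping an Option, hence the two unwraps.
let point_0: Point<'_> = array.get(0).unwrap().unwrap();
assert_eq!(point_0.coord().unwrap().x_y(), (0., 1.));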

Converting a builder to an array via finish() is always O(1).
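
To make the push-then-finish pattern concrete, here is a minimal standalone sketch that uses only the Rust standard library. ToyPointBuilder and ToyPointArray are hypothetical stand-ins, not types from this crate. The sketch assumes a separated layout (one buffer per dimension) with a validity mask for nulls, and only illustrates why a finish() step can hand over its buffers without per-element work.

```rust
/// Hypothetical stand-in for a point array with separated coordinates:
/// one buffer per dimension plus a validity mask.
#[derive(Debug)]
pub struct ToyPointArray {
    xs: Vec<f64>,
    ys: Vec<f64>,
    validity: Vec<bool>,
}

impl ToyPointArray {
    /// Number of slots, including nulls.
    pub fn len(&self) -> usize {
        self.validity.len()
    }

    /// Returns `None` for a null or out-of-bounds slot, otherwise `(x, y)`.
    pub fn get(&self, i: usize) -> Option<(f64, f64)> {
        if *self.validity.get(i)? {
            Some((self.xs[i], self.ys[i]))
        } else {
            None
        }
    }
}

/// Hypothetical push-based builder that accumulates into growable buffers.
#[derive(Default)]
pub struct ToyPointBuilder {
    xs: Vec<f64>,
    ys: Vec<f64>,
    validity: Vec<bool>,
}

impl ToyPointBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Appends one point, or a null slot when `point` is `None`.
    pub fn push_point(&mut self, point: Option<(f64, f64)>) {
        match point {
            Some((x, y)) => {
                self.xs.push(x);
                self.ys.push(y);
                self.validity.push(true);
            }
            None => {
                // Placeholder values keep all three buffers the same length.
                self.xs.push(f64::NAN);
                self.ys.push(f64::NAN);
                self.validity.push(false);
            }
        }
    }

    /// Moves the buffers into the array; no element is copied or revisited.
    pub fn finish(self) -> ToyPointArray {
        ToyPointArray {
            xs: self.xs,
            ys: self.ys,
            validity: self.validity,
        }
    }
}
```

In this model, pushing Some((0.0, 1.0)), then None, then Some((4.0, 5.0)) and calling finish() yields a three-slot array whose middle slot reads back as None. The real builders accept any object implementing the relevant geo-traits rather than plain tuples.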

§Converting to and from arrow Arrays

The geoarrow crates depend on and are designed to be used in combination with the upstream Arrow crates. As such, it is easy to convert between the representations used by each crate.

Note that an Array or ArrayRef only maintains information about the physical DataType and will lose any extension type information. Because of this, it’s imperative to store an Array and Field together since the Field persists the Arrow extension metadata. A RecordBatch holds an Array and Field together for each column, so a RecordBatch will persist extension metadata.
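
The following standalone sketch, using only the Rust standard library, illustrates why the Field matters. ToyField and both helper functions are hypothetical simplifications, not arrow or geoarrow APIs. The sketch relies on two facts: Arrow records an extension type's name under the ARROW:extension:name metadata key, and GeoArrow extension names start with "geoarrow." (for example geoarrow.wkb).

```rust
use std::collections::HashMap;

/// Metadata key under which Arrow records an extension type's name.
pub const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// Hypothetical, simplified stand-in for an Arrow field:
/// a column name plus string key/value metadata.
pub struct ToyField {
    pub name: String,
    pub metadata: HashMap<String, String>,
}

/// Builds a field, optionally tagging it with an extension type name.
pub fn toy_field(name: &str, extension_name: Option<&str>) -> ToyField {
    let mut metadata = HashMap::new();
    if let Some(ext) = extension_name {
        metadata.insert(EXTENSION_NAME_KEY.to_string(), ext.to_string());
    }
    ToyField {
        name: name.to_string(),
        metadata,
    }
}

/// Returns the GeoArrow extension name if the field declares one.
pub fn geoarrow_extension_name(field: &ToyField) -> Option<&str> {
    field
        .metadata
        .get(EXTENSION_NAME_KEY)
        .map(String::as_str)
        .filter(|name| name.starts_with("geoarrow."))
}
```

Two binary columns can share the same physical data type while only one of them holds WKB geometries. In this model, a field built with Some("geoarrow.wkb") is recognized as a geometry column and a field built with None is not, even though the underlying buffers could be byte-for-byte identical.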

§Converting to GeoArrow Arrays

If you have an Array and Field but don’t know the geometry type of the array, you can use from_arrow_array:

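use std::sync::Arc;

use arrow_array::Array;
use arrow_schema::Field;
use geoarrow_array::array::{PointArray, from_arrow_array};
use geoarrow_array::cast::AsGeoArrowArray;
use geoarrow_array::{GeoArrowArray, GeoArrowType};

fn use_from_arrow_array(array: &dyn Array, field: &Field) {
    let geoarrow_array: Arc<dyn GeoArrowArray> = from_arrow_array(array, field).unwrap();
    // Dispatch on the logical GeoArrow type, then downcast to the concrete array.
    match geoarrow_array.data_type() {
        GeoArrowType::Point(_) => {
            let array: &PointArray = geoarrow_array.as_point();
        }
        _ => todo!("handle other geometry types"),
    }
}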

If you know the geometry type of your array, you can use one of its TryFrom implementations to convert directly to that type. This means you don’t have to downcast on the GeoArrow side from an Arc<dyn GeoArrowArray>.

fn convert_to_point_array(array: &dyn Array, field: &Field) {
    let point_array = PointArray::try_from((array, field)).unwrap();
}

§Converting to arrow Arrays

You can use the to_array_ref or into_array_ref methods on GeoArrowArray to convert to an ArrayRef.

Alternatively, if you have a concrete GeoArrow array type, you can use IntoArrow to convert to a concrete arrow array type.

The easiest way today to access an arrow Field is to use IntoArrow::ext_type and then call to_field on the result. We would like to make this process simpler in the future.

§Downcasting a GeoArrow array

Arrays are often passed around as a dynamically typed &dyn GeoArrowArray or Arc<dyn GeoArrowArray>.

While these arrays can be passed directly to compute functions, you will often want to work with the concrete arrays directly.

This requires downcasting to the concrete type of the array. Use the cast::AsGeoArrowArray extension trait to do this ergonomically.

use geoarrow_array::cast::AsGeoArrowArray;
use geoarrow_array::{GeoArrowArrayAccessor, GeoArrowArray};

fn iter_line_string_array(array: &dyn GeoArrowArray) {
    for row in array.as_line_string().iter() {
        // do something with each row
    }
}
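
For readers unfamiliar with the pattern, here is a minimal standalone sketch of downcasting through an extension trait, using only the Rust standard library. Every name in it (ToyGeoArray, ToyLineStringColumn, ToyPointColumn, AsToyArray, total_vertices) is hypothetical, and it is not how this crate is implemented. It only shows the general technique of exposing `as_any` on a type-erased trait and wrapping `downcast_ref` in ergonomic helper methods.

```rust
use std::any::Any;

/// Hypothetical type-erased array trait.
pub trait ToyGeoArray {
    fn len(&self) -> usize;
    /// Exposes the value as `Any` so callers can recover the concrete type.
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical concrete array: each row is a list of (x, y) vertices.
pub struct ToyLineStringColumn {
    pub rows: Vec<Vec<(f64, f64)>>,
}

/// Hypothetical concrete array: each row is a single (x, y) point.
pub struct ToyPointColumn {
    pub coords: Vec<(f64, f64)>,
}

impl ToyGeoArray for ToyLineStringColumn {
    fn len(&self) -> usize {
        self.rows.len()
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl ToyGeoArray for ToyPointColumn {
    fn len(&self) -> usize {
        self.coords.len()
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Extension trait that adds downcasting helpers to the trait object.
pub trait AsToyArray {
    /// Returns `None` when the array is not a line string column.
    fn as_line_string_opt(&self) -> Option<&ToyLineStringColumn>;

    /// Panics when the array is not a line string column.
    fn as_line_string(&self) -> &ToyLineStringColumn {
        self.as_line_string_opt().expect("not a line string column")
    }
}

impl AsToyArray for dyn ToyGeoArray + '_ {
    fn as_line_string_opt(&self) -> Option<&ToyLineStringColumn> {
        self.as_any().downcast_ref::<ToyLineStringColumn>()
    }
}

/// Counts vertices across all rows after downcasting to the concrete type.
pub fn total_vertices(array: &dyn ToyGeoArray) -> usize {
    array.as_line_string().rows.iter().map(Vec::len).sum()
}
```

Offering both a panicking helper and an Option-returning one lets callers choose between brevity, when the type has already been checked, and explicit handling of a mismatch.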

Modules§

array
The concrete array definitions.
builder
Push-based APIs for constructing arrays.
capacity
Counters for managing buffer lengths for each geometry array type.
cast
Helper functions for downcasting dyn GeoArrowArray to concrete types and for converting between GeoArrow array representations.
crs
Defines CRS transforms used for writing GeoArrow data to file formats that require different CRS representations.
error
Defines GeoArrowError, representing all errors returned by this crate.
scalar
Scalar references onto a parent GeoArrow array.

Macros§

downcast_geoarrow_array
Downcast a GeoArrowArray to a concrete-typed array based on its GeoArrowType.

Enums§

GeoArrowType
A type enum representing all possible GeoArrow geometry types, including both “native” and “serialized” encodings.

Traits§

GeoArrowArray
A base trait for all GeoArrow arrays.
GeoArrowArrayAccessor
A trait for accessing the values of a GeoArrowArray.
IntoArrow
Convert GeoArrow arrays into their respective arrow arrays.