photom
Rust library for loading, structuring, and querying astronomical observation datasets — with trajectory grouping, multi-observer support, and efficient lookups.
Features
- Serialisation / deserialisation (`serde` feature): persist an [ObsDataset] to JSON (or any other `serde`-compatible format) and restore it without losing observations or custom observers. Runtime-only state (MPC network cache) is automatically re-initialised on deserialisation.
- Polars ingestion (`polars` feature): load observations from a `DataFrame` or `LazyFrame` with full schema validation.
- Parallel iteration (`parallel` feature): iterate over observations, nights, and trajectories in parallel via rayon, with zero data copying.
- ADES ingestion (`ades` feature): load observations directly from MPC ADES XML files, with automatic MPC observer resolution.
- MPC 80-column ingestion (`mpc_80_col` feature): load observations from the classic MPC fixed-width 80-column ASCII format.
- Parquet ingestion via DataFusion (`datafusion` feature): load observations from any Parquet file reachable by URI (`file://`, `http://`, `https://`, `hdfs://`) using Apache Arrow / DataFusion.
- Multi-observer support: MPC observatory codes (resolved lazily from the MPC website), custom geodetic sites (interned and deduplicated), or unknown observer.
- Trajectory grouping: group observations by a `traj_id` column; supports both integer (`UInt32`) and string (`String`) identifiers.
- Three astrometric error models: FCCT14, CBM10, and VFCC17, used to assign measurement accuracies to MPC-coded observatories.
Installation
Add photom to your Cargo.toml. Without any optional features:
[dependencies]
photom = "0.1"
Enable individual features as needed:
[dependencies]
photom = { version = "0.1", features = ["polars", "parallel", "ades", "mpc_80_col", "datafusion", "serde"] }
All features are independent and can be combined freely.
Quick Start
Serialise and deserialise a dataset (serde feature)
ObsDataset implements the standard serde::Serialize / serde::Deserialize
traits and works with any serde-compatible format (JSON, MessagePack, …).
use ObsDataset;
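// Serialise: format-agnostic (use any serde serializer). `dataset` is an existing ObsDataset.
let json = serde_json::to_string(&dataset)?;
// "dataset.json" is a placeholder path.
std::fs::write("dataset.json", &json)?;
// Deserialise with the default index layout (Split, always safe).
let json = std::fs::read_to_string("dataset.json")?;
let restored: ObsDataset = serde_json::from_str(&json)?;
// Binary format (rmp-serde / MessagePack).
let bytes: Vec<u8> = rmp_serde::to_vec(&dataset)?;
let restored: ObsDataset = rmp_serde::from_slice(&bytes)?;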
Choosing the index layout at deserialisation
For potentially faster look-ups you can request a contiguous index layout via
[ObsDatasetSeed] (a [serde::de::DeserializeSeed] implementation).
Any format that exposes its Deserializer struct publicly works — both
serde_json and rmp-serde do:
use ;
use DeserializeSeed as _;
// JSON
let mut de = from_str;
let restored = ObsDatasetSeed
.deserialize?;
// MessagePack (rmp-serde — compact binary)
let mut de = new;
let restored = ObsDatasetSeed
.deserialize?;
TryContiguous falls back to Split automatically for any index group whose
observations are not stored contiguously.
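The per-group fallback can be pictured with a small standalone sketch. It is not photom's implementation: `GroupIndex` and `build_index` are hypothetical names, and the only assumption is that each observation carries an optional group id (night or trajectory). A group whose members occupy one unbroken run of positions gets a range; any other group keeps an explicit position list.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// How the observations of one night or trajectory are located.
/// Hypothetical stand-in, not photom's internal type.
#[derive(Debug, PartialEq)]
enum GroupIndex {
    /// All members sit next to each other: a single range suffices.
    Contiguous(Range<usize>),
    /// Members are scattered: keep every position explicitly.
    Split(Vec<usize>),
}

/// Build one index entry per group id, preferring the contiguous form.
/// `None` ids mark observations that belong to no group.
fn build_index(group_ids: &[Option<u32>]) -> BTreeMap<u32, GroupIndex> {
    let mut members: BTreeMap<u32, Vec<usize>> = BTreeMap::new();
    for (pos, id) in group_ids.iter().enumerate() {
        if let Some(id) = id {
            members.entry(*id).or_default().push(pos);
        }
    }
    members
        .into_iter()
        .map(|(id, positions)| {
            // Positions were pushed in ascending order, so the group is
            // contiguous exactly when its span equals its length.
            let first = positions[0];
            let last = positions[positions.len() - 1];
            let index = if last - first + 1 == positions.len() {
                GroupIndex::Contiguous(first..last + 1)
            } else {
                GroupIndex::Split(positions)
            };
            (id, index)
        })
        .collect()
}
```

A range lookup avoids one level of indirection per observation, which is the kind of saving a contiguous layout can offer when the data happens to be sorted by group.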
What is persisted
| State | Persisted? | Notes |
|---|---|---|
| Observations | Yes | Full list in insertion order |
| Custom geodetic observers | Yes | All sites and their coordinates |
| Astrometric error model | Yes | FCCT14, CBM10, VFCC17, or None |
| MPC network cache | No | Fetched lazily on first use |
| Trajectory aliases | Yes | Fully round-tripped |
| Night / trajectory indices | Yes | Membership stored per-observation; rebuilt on load |
Load observations from a Polars DataFrame
use ObsDataset;
use ;
let dataset = from_polars?;
for obs in dataset.iter_observations
Load from a LazyFrame
use ObsDataset;
use FromPolarsArgs;
let dataset = from_lazy?;
Load from a Parquet file (DataFusion)
use ObsDataset;
use LoadObsArgs;
let dataset = from_parquet_uri?;
println!;
Load from an ADES XML file
use ObsDataset;
// error_ra and error_dec are optional fallback uncertainties in arcseconds.
let dataset = from_ades?;
Load from an MPC 80-column file
use ObsDataset;
let dataset = from_mpc_80_col?;
Parallel iteration
use ObsDataset;
use ParallelIterator;
let count = dataset.par_iter_observations.count;
if let Some = dataset.par_iter_full_night
Coordinate and astrometric utilities
EquCoord bundles a sky position (RA, Dec) with its 1-σ uncertainties.
All values are stored internally in radians; use from_degrees to supply
degrees.
use EquCoord;
use CartesianCoord;
// Construct from degrees — converted to radians internally.
let a = from_degrees;
let b = from_degrees;
// Great-circle separation via the Vincenty formula (result in radians).
let sep = a.angular_separation;
// Vector-averaging midpoint on the sphere.
let mid = a.spherical_midpoint;
// Lossless projection onto the unit sphere (uncertainties discarded).
let cart = from;
// Recover equatorial angles (errors set to zero).
let back: EquCoord = cart.into;
// Propagate astrometric covariance through the spherical → Cartesian mapping.
// Returns CartesianCoordCov with the full 3×3 covariance matrix.
let cov = a.to_cartesian_cov;
// Inverse: propagate back to equatorial marginal 1-σ errors.
let recovered = cov.to_equatorial;
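For readers who want to see the underlying geometry, here is a minimal sketch of the Vincenty separation formula on a sphere and of the equatorial to unit-vector mapping, written with plain `f64` values. The function names are illustrative and are not photom's API; all angles are assumed to be in radians.

```rust
/// Great-circle separation between two (ra, dec) points, in radians.
/// Vincenty's formula restricted to a sphere; it stays well conditioned
/// for both tiny and near-antipodal separations.
fn angular_separation(ra1: f64, dec1: f64, ra2: f64, dec2: f64) -> f64 {
    let (sin_d1, cos_d1) = dec1.sin_cos();
    let (sin_d2, cos_d2) = dec2.sin_cos();
    let (sin_dra, cos_dra) = (ra2 - ra1).sin_cos();

    let a = cos_d2 * sin_dra;
    let b = cos_d1 * sin_d2 - sin_d1 * cos_d2 * cos_dra;
    let numerator = a.hypot(b);
    let denominator = sin_d1 * sin_d2 + cos_d1 * cos_d2 * cos_dra;
    numerator.atan2(denominator)
}

/// Unit vector on the celestial sphere for (ra, dec) in radians.
fn to_unit_vector(ra: f64, dec: f64) -> [f64; 3] {
    let (sin_ra, cos_ra) = ra.sin_cos();
    let (sin_dec, cos_dec) = dec.sin_cos();
    [cos_dec * cos_ra, cos_dec * sin_ra, sin_dec]
}

/// Inverse mapping: (ra, dec) in radians, with RA wrapped into [0, 2π).
fn from_unit_vector(v: [f64; 3]) -> (f64, f64) {
    let ra = v[1].atan2(v[0]).rem_euclid(std::f64::consts::TAU);
    let dec = v[2].atan2(v[0].hypot(v[1]));
    (ra, dec)
}
```

The `atan2` form is preferred over `acos` of a dot product because `acos` loses precision for separations close to zero, which is the typical regime for astrometric residuals.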
2-D covariance on the tangent plane
Cov2 is a compact symmetric 2×2 covariance matrix for astrometric error
ellipses expressed in a local tangent-plane frame.
use Cov2;
use EquCoord;
// Build a diagonal covariance from the marginal errors of an EquCoord.
let coord = from_degrees;
let cov = from_equ;
// Semi-axes of the 1-σ confidence ellipse.
let sigma_major = cov.lambda_max.max.sqrt;
let sigma_minor = cov.lambda_min.max.sqrt;
// Mahalanobis distance for an offset vector (radians).
let offset = ;
if let Some = cov.mahalanobis_sq
// Add isotropic process noise q·I (Kalman-style inflation).
let inflated = cov.inflate_isotropic;
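The operations above reduce to a few closed-form expressions for a symmetric 2×2 matrix. The following sketch uses a hypothetical stand-in type `Sym2` (not photom's `Cov2`) to show the eigenvalue, Mahalanobis, and inflation formulas; the singularity handling is an assumption about reasonable behaviour, not a description of the crate.

```rust
/// Symmetric 2×2 covariance [[xx, xy], [xy, yy]].
/// Hypothetical stand-in used only to illustrate the formulas.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Sym2 {
    xx: f64,
    xy: f64,
    yy: f64,
}

impl Sym2 {
    /// Eigenvalues (λ_min, λ_max), from the half-trace and the radius
    /// sqrt(((xx - yy) / 2)² + xy²).
    fn eigenvalues(&self) -> (f64, f64) {
        let half_trace = 0.5 * (self.xx + self.yy);
        let radius = (0.5 * (self.xx - self.yy)).hypot(self.xy);
        (half_trace - radius, half_trace + radius)
    }

    /// Squared Mahalanobis distance dᵀ Σ⁻¹ d, or `None` when the
    /// determinant is not strictly positive and finite.
    fn mahalanobis_sq(&self, d: [f64; 2]) -> Option<f64> {
        let det = self.xx * self.yy - self.xy * self.xy;
        if !(det > 0.0 && det.is_finite()) {
            return None;
        }
        let [dx, dy] = d;
        Some((self.yy * dx * dx - 2.0 * self.xy * dx * dy + self.xx * dy * dy) / det)
    }

    /// Add isotropic noise q·I to the diagonal.
    fn inflate_isotropic(&self, q: f64) -> Sym2 {
        Sym2 { xx: self.xx + q, xy: self.xy, yy: self.yy + q }
    }
}
```

Rounding can push the smaller eigenvalue of a nearly degenerate matrix slightly below zero, so clamping it at zero before taking the square root is a sensible guard when computing ellipse semi-axes.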
Gnomonic (tangent-plane) projection
TangentPlane projects sky positions near a chosen tangent point onto a local
2-D Cartesian frame. Great circles project to straight lines, making this ideal
for short-arc astrometry and kinematic linking.
use EquCoord;
use ;
// Define the tangent point (degrees, converted internally to radians).
let ref_coord = from_degrees;
let plane = new;
// Forward projection: sky → tangent plane.
let target = from_degrees;
let tp = plane.project;
// Inverse projection: tangent plane → sky.
let sky = tp.unproject;
// Squared Euclidean distance between two projected points (radians²).
let other = plane.project;
let d2 = tp.dist2;
// Translate a projected point by a displacement vector.
let v = TangentVec ;
let shifted = tp + v;
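As a reference for what a gnomonic projection computes, here is a standalone sketch of the forward and inverse mappings with plain `f64` angles in radians. The function names and the `Option` return for points outside the visible hemisphere are illustrative choices, not photom's API.

```rust
/// Forward gnomonic projection of (ra, dec) about the tangent point
/// (ra0, dec0). Returns (xi, eta) in radians, with xi pointing east and
/// eta north, or `None` when the point is 90° or more from the tangent
/// point and therefore has no image on the plane.
fn gnomonic_project(ra0: f64, dec0: f64, ra: f64, dec: f64) -> Option<(f64, f64)> {
    let (sin_d0, cos_d0) = dec0.sin_cos();
    let (sin_d, cos_d) = dec.sin_cos();
    let (sin_dra, cos_dra) = (ra - ra0).sin_cos();

    // Cosine of the angular distance to the tangent point.
    let cos_c = sin_d0 * sin_d + cos_d0 * cos_d * cos_dra;
    if cos_c <= 0.0 {
        return None;
    }
    let xi = cos_d * sin_dra / cos_c;
    let eta = (cos_d0 * sin_d - sin_d0 * cos_d * cos_dra) / cos_c;
    Some((xi, eta))
}

/// Inverse projection: tangent-plane (xi, eta) back to (ra, dec),
/// with RA wrapped into [0, 2π).
fn gnomonic_unproject(ra0: f64, dec0: f64, xi: f64, eta: f64) -> (f64, f64) {
    let (sin_d0, cos_d0) = dec0.sin_cos();
    // Components of the sky vector in a frame where ra0 = 0.
    let x = cos_d0 - eta * sin_d0;
    let z = sin_d0 + eta * cos_d0;
    let ra = (ra0 + xi.atan2(x)).rem_euclid(std::f64::consts::TAU);
    let dec = z.atan2(xi.hypot(x));
    (ra, dec)
}
```

Because the projection is central, distances on the plane approximate angular distances only near the tangent point, so the tangent point is usually chosen close to the observations being compared.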
DataFrame / Parquet Schema
All column values for ra, ra_err, dec, dec_err, obs_lon, obs_lat, obs_ra_acc, and obs_dec_acc must be supplied in radians. No unit conversion is performed during ingestion.
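Since catalogues commonly quote positions in degrees and uncertainties in arcseconds, a conversion step before building the frame is usually needed. A minimal sketch with hypothetical helper names:

```rust
/// Convert an angle in degrees to radians.
fn degrees_to_radians(deg: f64) -> f64 {
    deg.to_radians()
}

/// Convert an angle in arcseconds to radians (3600 arcsec per degree).
fn arcsec_to_radians(arcsec: f64) -> f64 {
    (arcsec / 3600.0).to_radians()
}
```

For example, a 0.5 arcsecond uncertainty becomes `arcsec_to_radians(0.5)`, roughly 2.4e-6 rad, which is the value expected in `ra_err` or `dec_err`.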
Mandatory base columns (non-nullable)
| Column | Polars type | Arrow type | Unit | Description |
|---|---|---|---|---|
| `id` | `UInt64` | `UInt64` | - | Unique observation identifier |
| `ra` | `Float64` | `Float64` | rad | Right ascension |
| `ra_err` | `Float64` | `Float64` | rad | 1-σ right ascension uncertainty |
| `dec` | `Float64` | `Float64` | rad | Declination |
| `dec_err` | `Float64` | `Float64` | rad | 1-σ declination uncertainty |
| `magnitude` | `Float64` | `Float64` | mag | Apparent magnitude |
| `mag_err` | `Float64` | `Float64` | mag | 1-σ magnitude uncertainty |
| `filter` | `String` | `Utf8` / `UInt8` / `UInt16` / `UInt32` | - | Photometric filter label or code |
| `mjd_tt` | `Float64` | `Float64` | MJD (TT) | Epoch (Modified Julian Date, Terrestrial Time) |
Optional observer columns (nullable; column may be absent)
| Column | Polars type | Arrow type | Unit | Description |
|---|---|---|---|---|
| `obs_lon` | `Float64` | `Float64` | rad | Geodetic longitude, east of Greenwich |
| `obs_lat` | `Float64` | `Float64` | rad | Geodetic latitude |
| `obs_alt` | `Float64` | `Float64` | m | Altitude above the reference ellipsoid |
| `obs_ra_acc` | `Float64` | `Float64` | rad | 1-σ RA measurement accuracy; required when the geodetic triplet is set |
| `obs_dec_acc` | `Float64` | `Float64` | rad | 1-σ Dec measurement accuracy; required when the geodetic triplet is set |
| `mpc_code_obs` | `String` | `Utf8` | - | Three-byte ASCII MPC code (takes precedence over geodetic columns) |
Optional grouping / index columns
| Column | Polars type | Arrow type | Description |
|---|---|---|---|
| `traj_id` | `UInt32` or `String` | `UInt32` or `Utf8` | Trajectory identifier; nullable (null rows are loaded but not assigned to any trajectory) |
| `night_id` | `UInt32` | `UInt32` | Night identifier; nullable (null rows are included but not assigned to any night) |
Observer Resolution
Each row's observer is resolved in the following order of precedence:
- `mpc_code_obs` non-null → `ObserverId::MpcCode` (MPC site, resolved lazily from the MPC website).
- `obs_lon`, `obs_lat`, and `obs_alt` all non-null → `ObserverId::IntId` (custom geodetic site). `obs_ra_acc` and `obs_dec_acc` must also be non-null.
- Otherwise → no observer (`None`).
A partially-null geodetic triplet (one or two of the three columns non-null) is always an ingestion error. A complete triplet without accuracy values is also an error.
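The precedence and error rules can be expressed as a small decision function. This is a sketch with hypothetical types (`ObserverColumns`, `Observer`, `ResolveError`), not photom's ingestion code. It assumes, following the word "always", that a partial triplet is rejected even when an MPC code is present, and that the accuracy check applies only when the geodetic site is actually used.

```rust
/// Observer-related cells of one input row (hypothetical helper type).
#[derive(Debug, Default)]
struct ObserverColumns<'a> {
    mpc_code_obs: Option<&'a str>,
    obs_lon: Option<f64>,
    obs_lat: Option<f64>,
    obs_alt: Option<f64>,
    obs_ra_acc: Option<f64>,
    obs_dec_acc: Option<f64>,
}

#[derive(Debug, PartialEq)]
enum Observer {
    MpcCode(String),
    Geodetic { lon: f64, lat: f64, alt: f64, ra_acc: f64, dec_acc: f64 },
}

#[derive(Debug, PartialEq)]
enum ResolveError {
    PartialGeodeticTriplet,
    MissingAccuracy,
}

fn resolve_observer(row: &ObserverColumns<'_>) -> Result<Option<Observer>, ResolveError> {
    // A partially filled (lon, lat, alt) triplet is rejected up front.
    let site = match (row.obs_lon, row.obs_lat, row.obs_alt) {
        (Some(lon), Some(lat), Some(alt)) => Some((lon, lat, alt)),
        (None, None, None) => None,
        _ => return Err(ResolveError::PartialGeodeticTriplet),
    };
    // 1. An MPC code takes precedence over the geodetic columns.
    if let Some(code) = row.mpc_code_obs {
        return Ok(Some(Observer::MpcCode(code.to_owned())));
    }
    // 2. A complete triplet also needs both accuracy values.
    if let Some((lon, lat, alt)) = site {
        return match (row.obs_ra_acc, row.obs_dec_acc) {
            (Some(ra_acc), Some(dec_acc)) => {
                Ok(Some(Observer::Geodetic { lon, lat, alt, ra_acc, dec_acc }))
            }
            _ => Err(ResolveError::MissingAccuracy),
        };
    }
    // 3. Nothing supplied: the observation has no observer.
    Ok(None)
}
```

Returning `Result<Option<_>, _>` keeps the "no observer" outcome distinct from malformed input, mirroring the distinction the rules draw between an absent observer and an ingestion error.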
Ingestion Arguments
FromPolarsArgs (Polars feature)
| Field | Type | Default | Description |
|---|---|---|---|
| `error_model` | `Option<ObsErrorModel>` | `None` | Astrometric error model for MPC-coded observatories |
| `do_rechunk` | `Option<bool>` | `Some(false)` | Force single-chunk layout before ingestion |
| `contiguous_choice` | `Option<ContiguousChoice>` | `Some(ContiguousNight)` | Sort by night or trajectory for compact index ranges |
LoadObsArgs (DataFusion feature)
| Field | Type | Default | Description |
|---|---|---|---|
| `error_model` | `Option<ObsErrorModel>` | `None` | Astrometric error model for MPC-coded observatories |
| `contiguous_choice` | `Option<ContiguousChoice>` | `Some(ContiguousNight)` | Sort by night or trajectory for compact index ranges |
Type Aliases
| Alias | Underlying type | Unit |
|---|---|---|
| `Arcseconds` | `f64` | Angle in arcseconds |
| `Radians` | `f64` | Angle in radians |
| `Degrees` | `f64` | Angle in degrees |
| `MJDTT` | `f64` | Modified Julian Date (Terrestrial Time) |
| `Meters` | `f64` | Distance in metres |
Error Types
| Error type | Feature | Description |
|---|---|---|
| `PolarsError` | `polars` | Schema validation, type mismatch, null in required column, partial geodetic triplet, missing accuracy, invalid MPC code |
| `LoadObsError` | `datafusion` | URI resolution failure, resource not found, DataFusion I/O error, Arrow column error |
| `AdesError` | `ades` | XML parse error, missing mandatory field, unresolvable observatory |
| `Mpc80ColError` | `mpc_80_col` | Parse error in the fixed-width 80-column format |
| `ObserverError` | (none) | Invalid float value, MPC code not found or malformed |
Documentation
To compile the documentation locally, run the following command in the terminal:
RUSTDOCFLAGS="--html-in-header /katex-header.html"
Testing Notes
The DataFusion tests require the large-test-fixtures feature to run. The large Parquet fixtures have been excluded from the crates.io package and are gated behind this feature.
To run the full test suite including DataFusion, enable large-test-fixtures when invoking cargo test, for example with cargo test --all-features.
All other tests are gated behind their associated features and do not require this additional flag.
Minimum Supported Rust Version
photom requires Rust 1.94.0 or later.
License
This project is licensed under the CeCILL-C Free Software License Agreement.