rss_core 0.6.0

Raster Source Service core library for querying, downloading, and processing remote sensing imagery.

# Changelog

All notable changes to this project will be documented in this file.

## [Unreleased]

## [0.5.0] - 2026-05-12

### Security
- **SQL Injection Prevention**: Added `escape_sql_literal()` helper to escape user-supplied scene names in SQL queries (`query.rs`)
- **Externalized Database Credentials**: Replaced hardcoded DB credentials with `option_env!()` compile-time environment variable injection (`query.rs`)
  - `RSS_DB_HOST` (default: `localhost`)
  - `RSS_DB_USER` (default: `postgres`)
  - `RSS_DB_PASS` (default: `""`)
  - `RSS_DB_NAME` (default: `rss`)
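
As a rough illustration of the two security changes above, the sketch below shows how compile-time credential injection with defaults and quote escaping might look. The `DbConfig` struct and the body of `escape_sql_literal()` are assumptions made for illustration (the real helper may escape more than single quotes); only the environment variable names and their defaults come from this changelog.

```rust
/// Hypothetical holder for the connection settings (not the crate's actual type).
struct DbConfig {
    host: &'static str,
    user: &'static str,
    pass: &'static str,
    name: &'static str,
}

impl DbConfig {
    /// `option_env!` is expanded at compile time and yields `Option<&'static str>`,
    /// so each value falls back to its documented default when the variable
    /// was not set during the build.
    fn from_build_env() -> Self {
        DbConfig {
            host: option_env!("RSS_DB_HOST").unwrap_or("localhost"),
            user: option_env!("RSS_DB_USER").unwrap_or("postgres"),
            pass: option_env!("RSS_DB_PASS").unwrap_or(""),
            name: option_env!("RSS_DB_NAME").unwrap_or("rss"),
        }
    }
}

/// Assumed behaviour: double every single quote so the value can sit safely
/// inside a single-quoted SQL string literal.
#[must_use]
fn escape_sql_literal(input: &str) -> String {
    input.replace('\'', "''")
}
```

Because `option_env!()` is evaluated when the crate is compiled, the variables have to be present in the build environment; setting them only at run time has no effect.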

### Added
- **`get_vs_get_remote` benchmark**: Compares full pipeline (query → fetch → read) for `.get()` (download to disk via `gdal_translate`) vs `.get_remote()` (VSI direct read from S3/HTTP). Gated behind `bench_live` feature flag for CI compatibility.
- **`async_tiff_vs_vsi` benchmark**: Compares GDAL VSI vs [async-tiff](https://github.com/developmentseed/async-tiff) for reading COG blocks from S3. Key findings:
  - Cold-start: comparable (~192ms GDAL vs ~200ms async-tiff)
  - Cached reads: GDAL ~0.6ms (internal block cache), async-tiff ~40ms (connection pooling)
  - Concurrent (5×): async-tiff ~31ms/read (parallel async I/O), GDAL blocks threads
- **FileCache**: TTL-based file caching layer for downloaded imagery — files cached by SHA-256 hash of source URL, output files symlinked to cache entries (saves disk space)
- **Cache Environment Variables**: `RSS_CACHE_DIR` (default: `~/.rss_cache`) and `RSS_CACHE_TTL` (default: 604800s = 7 days) for configuring cache behavior
- **Cache API**: `FileCache` provides `get()`, `download_and_cache()`, `symlink()`, `invalidate()`, `cleanup()`, and `stats()` methods
- **Cache Integration in io.rs**: Both `get()` and `get_async()` now check cache before downloading — cache hit creates symlink, cache miss downloads to cache then symlinks, with graceful fallback to direct download on failure
- **Product Definitions System**: Declarative YAML manifests in `data/products/` replacing hardcoded `COLLECTION_MAPPINGS` — each product defines measurements, cloud mask configuration, STAC collection IDs, and output format preferences
- **Canonical Band Names**: Cross-provider band resolution via `.canonical_bands(["red", "nir"])` — resolve semantic names to provider-specific measurements at query time using `PRODUCT_REGISTRY`
- **Automatic Band Resolution**: `ImageQueryBuilder` now resolves bands internally — users specify bands via `.bands()` (explicit) or `.canonical_bands()` (semantic); if neither is called, all available bands are selected automatically
- **`From<&ImagerySource>` impl**: `lazy_static` sources (`DEA`, `APOLLO`, etc.) can now be passed directly to `ImageQueryBuilder::new()` via `Into` coercion — eliminates `.deref().clone()` pattern
- **Pixel Mask Decoder Helpers**: `cloud_mask` module with `decode_qa_mask()` and `apply_mask()` for decoding QA/classification bands (Sentinel-2 SCL, Landsat QA_PIXEL) into boolean masks inside eorst worker functions
- **`MaskClasses` bitflags**: Configurable pixel classes (CLOUD, CLOUD_SHADOW, WATER, SNOW) with sensible defaults (all masked)
- **`QaType` enum**: Platform selector (S2SCL, LandsatPixelQA) for the decoder dispatch
- **Examples**: `query_canonical_dea.rs` and `query_canonical_pc.rs` demonstrating canonical band resolution across providers
- `by_source_and_collection()` method on `ProductRegistry` for looking up products by source name and collection
- `#[must_use]` annotations on critical functions: `is_landsat()`, `is_sentinel()`, `exist_on_filestore()`, `ImageQueryBuilder::build()`, `escape_sql_literal()`
- 8 unit tests for `escape_sql_literal()` helper function
- 6 unit tests for `BandSpec` behavior (explicit pass-through, canonical resolution, error collection, Apollo restrictions)
- 11 unit tests for `FileCache` (hit/miss, TTL expiry, invalidation, cleanup, stats, env vars, hash uniqueness, symlink creation)
- Root `.gitignore` file to exclude `target/`, `.venv/`, `*.pyc`, `*.so` from git tracking
- `docs/diagnostic-and-plan.md` — Comprehensive diagnostic plan documenting all 7 critical bugs and their fixes
- `From<QvfDate>` implementation for `NaiveDateTime` with proper date/time conversion
- Download tests for DEA, Element84, and PlanetaryComputer using tempfile crate (`io.rs`)
- `configure_gdal_s3_defaults()` function for setting GDAL AWS defaults (`io.rs`)
- `ImagerySourceClap::PlanetaryComputer` variant for CLI support
- Docstrings with examples for public functions in `io.rs`, `qvf.rs`, `stac.rs`, `utils.rs`
- Distributed EORST implementation specification (`docs/eorst-implementation-spec.md`)
- RSS vs ODC comparison documentation (`docs/rss-vs-odc.md`)
- Temporal filter support for STAC collections
- Asset filtering capabilities for STAC queries
- Planetary Computer source support via CLI
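
The cache settings listed above (`RSS_CACHE_DIR`, `RSS_CACHE_TTL`) could be resolved roughly as follows. This is a minimal sketch using only the standard library; the function names, the fallback on unparsable values, and the explicit `home` argument are assumptions, not the `FileCache` implementation.

```rust
use std::path::PathBuf;
use std::time::Duration;

/// Default TTL from the changelog: 604800 s = 7 days.
const DEFAULT_TTL_SECS: u64 = 604_800;

/// Parse the raw `RSS_CACHE_TTL` value (seconds). Missing or unparsable
/// input falls back to the default (assumed behaviour).
fn cache_ttl(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_TTL_SECS);
    Duration::from_secs(secs)
}

/// Resolve the cache directory: use `RSS_CACHE_DIR` when set and non-empty,
/// otherwise `<home>/.rss_cache`.
fn cache_dir(raw: Option<&str>, home: &str) -> PathBuf {
    match raw {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home).join(".rss_cache"),
    }
}

/// An entry is stale once its age exceeds the TTL.
fn is_expired(age: Duration, ttl: Duration) -> bool {
    age > ttl
}
```

At run time the raw values would typically come from `std::env::var("RSS_CACHE_TTL").ok().as_deref()`, and an entry's age from the file's modification time.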

### Dependencies
- Added `sha2` (0.11) for SHA-256 URL hashing in FileCache
- Added `hex` (0.4) for hash encoding in FileCache
- Added `bitflags` (2.x) for `MaskClasses` bitmask
- Moved `tokio-test` from dependencies to dev-dependencies
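
To show what the `MaskClasses` bitmask is for, here is a dependency-free sketch of decoding a Sentinel-2 SCL band into a boolean mask. It uses plain `u8` constants in place of the `bitflags` crate, and the function signatures are assumptions; the real `decode_qa_mask()` and `apply_mask()` may differ. The SCL class values (3 = cloud shadow, 6 = water, 8/9 = cloud medium/high probability, 10 = thin cirrus, 11 = snow/ice) follow the Sentinel-2 Level-2A scene classification; treating thin cirrus as `CLOUD` is a choice made here.

```rust
// Stand-ins for the `MaskClasses` flags, one bit per pixel class.
const CLOUD: u8 = 0b0001;
const CLOUD_SHADOW: u8 = 0b0010;
const WATER: u8 = 0b0100;
const SNOW: u8 = 0b1000;
/// Default in the changelog: every class is masked.
const ALL: u8 = CLOUD | CLOUD_SHADOW | WATER | SNOW;

/// Map one SCL value to the class bit it belongs to (0 = no maskable class).
fn scl_class(value: u8) -> u8 {
    match value {
        3 => CLOUD_SHADOW,
        6 => WATER,
        8 | 9 | 10 => CLOUD,
        11 => SNOW,
        _ => 0,
    }
}

/// Return `true` for every pixel whose class is among the selected `classes`.
fn decode_scl_mask(scl: &[u8], classes: u8) -> Vec<bool> {
    scl.iter().map(|&v| (scl_class(v) & classes) != 0).collect()
}

/// Overwrite masked pixels with a nodata value, in place.
fn apply_mask(data: &mut [f32], mask: &[bool], nodata: f32) {
    for (px, &masked) in data.iter_mut().zip(mask) {
        if masked {
            *px = nodata;
        }
    }
}
```

Passing `CLOUD | CLOUD_SHADOW` instead of `ALL` keeps water and snow pixels, which is the kind of per-class selection a bitmask makes cheap to express.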

### Changed
- **Breaking**: `ImageQueryBuilder::new()` no longer takes `layers` parameter — use `.bands()` or `.canonical_bands()` instead
- **Breaking**: `ImageQueryBuilder::from_qvf()` signature modernized to accept `impl IntoIterator<Item = impl Into<String>>` for bands parameter
- `is_landsat()` / `is_sentinel()` now use `matches!` macro for correct satellite classification (`qvf.rs`)
- `ImageQueryBuilder::new()` and `from_qvf()` now accept `impl Into<ImagerySource>` for source parameter
- `DEA.deref().clone()` pattern replaced with `DEA.clone()` or direct `DEA` pass throughout codebase
- DEA source changed from `/vsicurl/https://data.dea.ga.gov.au` to `/vsis3/dea-public-data` (`io.rs`)
- Planetary Computer URL corrected to `/vsicurl/https://landsateuwest.blob.core.windows.net` (`io.rs`)
- Element84 landsat collection now uses `/vsis3/usgs-landsat` (`io.rs`)
- Updated pyo3 to use new `Bound<'_, PyModule>` API in `masks.rs`
- Updated all dependencies to latest compatible versions
- Code formatting and cleanup across multiple modules
- Replaced bitwise `&`/`|` with logical `&&`/`||` for proper short-circuit evaluation (`io.rs`)
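
The last item matters because `&` and `|` on `bool` always evaluate both operands, whereas `&&` and `||` stop as soon as the result is known. A generic illustration follows; it is not code taken from `io.rs`, and the function name is hypothetical.

```rust
/// Returns true when the slice is non-empty and its first element is positive.
fn first_is_positive(values: &[i32]) -> bool {
    // `&&` short-circuits: `values[0]` is only evaluated when the slice is
    // non-empty. Writing `(!values.is_empty()) & (values[0] > 0)` instead
    // would evaluate the index expression unconditionally and panic on an
    // empty slice.
    !values.is_empty() && values[0] > 0
}
```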

### Fixed
- **Critical**: `is_landsat()` / `is_sentinel()` always returned `true` due to an incorrect use of `!=` with the `|` operator; both now use the `matches!` macro (`qvf.rs`)
- **Critical**: `/vsicrul/` typo corrected to `/vsicurl/` in GDAL virtual file prefix — all Planetary Computer async downloads now work (`io.rs`)
- **Critical**: Replaced 3 production `todo!()` panics with proper implementations:
  - `qvf.rs`: `From<QvfDate>` for `NaiveDateTime` conversion
  - `query.rs`: `Intersects::Polygon` for Apollo now returns an error via `bail!`
  - `main.rs`: PlanetaryComputer source uses proper constant
- **Build**: Added missing `bail!` macro import (`query.rs`)
- Fixed AsyncCommand import naming conflict with `std::process::Command` (`io.rs`)
- Removed trailing whitespace and fixed code style issues
- Addressed all clippy warnings
- Fixed Cargo.toml typo
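
The `is_landsat()` fix above can be illustrated with a small reconstruction. The enum and the exact shape of the buggy expression are assumptions (the original code is not shown in this changelog); the point is that OR-ing two `!=` comparisons against different values is true for every input, while `matches!` tests membership directly.

```rust
/// Hypothetical satellite enum, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Satellite {
    Landsat5,
    Landsat8,
    Sentinel2,
}

/// Assumed shape of the bug: no value can equal both variants at once,
/// so at least one `!=` is always true and the OR is always true.
fn is_landsat_buggy(sat: Satellite) -> bool {
    (sat != Satellite::Landsat5) | (sat != Satellite::Landsat8)
}

/// Fixed version: `matches!` checks whether `sat` is one of the listed variants.
#[must_use]
fn is_landsat(sat: Satellite) -> bool {
    matches!(sat, Satellite::Landsat5 | Satellite::Landsat8)
}
```

Inside `matches!`, the `|` is a pattern alternative rather than a bitwise operator, which is why the fixed form reads almost the same yet behaves correctly.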

### Refactored
- Extracted `item_with_assets()` from 8 duplicated Item copy blocks in `io.rs` and `query.rs` — replaces the `Item::new()` + field-by-field copy pattern
- Extracted `crop_args()` from 4 duplicated crop window argument blocks in `io.rs` — replaces the `cmd.args([("-projwin"), ...])` pattern
- Extracted `root_vsi_path()` from 3 duplicated root URL resolution match blocks in `io.rs`
- Extracted `build_landsat_sql()` from 5 duplicated Landsat5/7/8/9/All SQL format! blocks in `query.rs` — removes ~40 lines of duplication
- 3 unit tests for `build_landsat_sql()` covering single satellite, all satellites, and each individual Landsat collection
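
A helper like `crop_args()` might look like the sketch below. The signature is an assumption; the only external fact relied on is that `gdal_translate` accepts `-projwin <ulx> <uly> <lrx> <lry>` to select a window in georeferenced coordinates.

```rust
/// Build the `-projwin` argument list for `gdal_translate`.
/// Order follows GDAL's convention: upper-left x, upper-left y,
/// lower-right x, lower-right y.
fn crop_args(ulx: f64, uly: f64, lrx: f64, lry: f64) -> Vec<String> {
    vec![
        "-projwin".to_string(),
        ulx.to_string(),
        uly.to_string(),
        lrx.to_string(),
        lry.to_string(),
    ]
}
```

The returned vector can be handed straight to `std::process::Command::args`, so each call site shrinks to a single line instead of repeating the five-argument block.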

### Removed
- Hardcoded `COLLECTION_MAPPINGS` — replaced by `ProductRegistry` loaded from YAML manifests
- `output_format` field from product definitions (not needed in current workflow)
- `layers` parameter from `ImageQueryBuilder::new()` — replaced by `.bands()` / `.canonical_bands()` builder methods
- `std::ops::Deref` imports from all files (no longer needed after `From<&ImagerySource>` impl)
- Unused imports across codebase
- Unused variables
- Unused test imports in `masks.rs`
- `target/` directory from git tracking (now properly ignored)

### Ignored
- `test_element84_landsat_download` marked as ignored (requires request-pays credentials)
- `test_planetary_computer_landsat_download` marked as ignored (network-dependent integration test)