wblidar
wblidar is the core LiDAR I/O engine for the Whitebox project. It provides fast, standards-focused, pure-Rust read/write support for common point-cloud formats so Whitebox tools can reliably ingest and emit LiDAR data.
Table of Contents
- Mission
- The Whitebox Project
- Is wblidar Only for Whitebox?
- What wblidar Is Not
- Supported Formats
- Design Goals
- Installation
- Compilation Features
- API Overview
- Architecture
- Performance Notes
- Known Limitations
- Validation and Interoperability
- Suggested Additional Sections (Optional)
- License
Mission
- Provide robust LiDAR format I/O for Whitebox applications and workflows.
- Keep codec logic in Rust with minimal external dependencies.
- Prioritize standards compliance, interoperability, and predictable behavior.
The Whitebox Project
Whitebox is a suite of open-source geospatial data analysis software with roots at the University of Guelph, Canada, where Dr. John Lindsay began the project in 2009. Over more than fifteen years it has grown into a widely used platform for geomorphometry, spatial hydrology, LiDAR processing, and remote sensing research. In 2021 Dr. Lindsay and Anthony Francioni founded Whitebox Geospatial Inc. to ensure the project's long-term, sustainable development. Whitebox Next Gen is the current major iteration of that work, and this crate is part of that larger effort.
Whitebox Next Gen is a ground-up redesign that improves on its predecessor in nearly every dimension:
- CRS & reprojection: Full read/write of coordinate reference system metadata across raster, vector, and LiDAR data, with multiple resampling methods for raster reprojection.
- Raster I/O: More robust GeoTIFF handling (including Cloud-Optimized GeoTIFFs), plus newly supported formats such as GeoPackage Raster and JPEG2000.
- Vector I/O: Expanded from Esri Shapefile-only to 11 formats, including GeoPackage, FlatGeobuf, GeoParquet, and other modern interchange formats.
- Vector topology: A new, dedicated topology engine (wbtopology) enabling robust overlay, buffering, and related operations.
- LiDAR I/O: Full support for LAS 1.0–1.5, LAZ, COPC, E57, and PLY via wblidar, a high-performance, modern LiDAR I/O engine.
- Frontends: Whitebox Workflows for Python (WbW-Python), Whitebox Workflows for R (WbW-R), and a QGIS 4-compliant plugin are in active development.
Is wblidar Only for Whitebox?
No. wblidar is developed primarily to support Whitebox, but it is not restricted to Whitebox projects.
- Whitebox-first: API and roadmap decisions prioritize Whitebox I/O needs.
- General-purpose: the crate is usable as a standalone LiDAR I/O engine in other Rust applications.
- Interop-focused: standards-compliant LAS/LAZ/COPC/PLY/E57 support makes it suitable for broader tooling and data pipelines.
What wblidar Is Not
wblidar is an I/O and format layer. It is not intended to be a full LiDAR processing framework.
- Not a filtering/classification framework.
- Not a replacement for Whitebox analysis/processing tools.
- Not a pipeline engine for arbitrary geospatial transformations.
Point-cloud processing, filtering, segmentation, and analysis belong in the Whitebox frontend/tooling layer.
Supported Formats
| Format | Read | Write | Notes |
|---|---|---|---|
| LAS | yes | yes | LAS 1.1-1.5, PDRF 0-15 |
| LAZ | yes | yes | Standards-compliant LASzip v2/v3 Point10/Point14 codecs |
| COPC | yes | yes | COPC 1.0 hierarchy with Point14-family payloads |
| PLY | yes | yes | ASCII, binary little-endian, binary big-endian |
| E57 | yes | yes | ASTM E2807 with CRC-32 page validation |
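The container families in this table can be told apart by their leading signature bytes. The sketch below illustrates that idea only; it is not wblidar's detection logic, and `SniffedFormat` and `sniff_format` are hypothetical names introduced for this example. LAS, LAZ, and COPC share the same `LASF` signature, so separating them would require a further look at the header and VLRs, which this sketch does not attempt.

```rust
/// Container families recognisable from the first bytes of a file.
/// Hypothetical type for illustration; not part of wblidar's API.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SniffedFormat {
    /// LAS, LAZ, and COPC all begin with the same "LASF" signature.
    LasFamily,
    /// PLY files begin with the text "ply".
    Ply,
    /// E57 files begin with the text "ASTM-E57".
    E57,
}

/// Guess the container family from the leading bytes of a file.
/// Returns `None` when no known signature matches.
pub fn sniff_format(head: &[u8]) -> Option<SniffedFormat> {
    if head.starts_with(b"LASF") {
        Some(SniffedFormat::LasFamily)
    } else if head.starts_with(b"ASTM-E57") {
        Some(SniffedFormat::E57)
    } else if head.starts_with(b"ply") {
        Some(SniffedFormat::Ply)
    } else {
        None
    }
}
```

Reading only the first eight bytes of a file is enough to feed this function.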
Design Goals
- Standards first: prefer interoperable, standards-compliant encoding/decoding paths.
- Pure Rust codecs: avoid native/C++ LASzip dependency by implementing core codecs in Rust.
- Streaming I/O APIs: expose incremental read/write interfaces for large files.
- Minimal dependencies: keep dependency surface tight and auditable.
- Whitebox integration: maintain a stable API for Whitebox ingestion/export workflows.
- Predictable behavior: deterministic output where applicable and explicit error modes.
Installation
Crates.io dependency:
[dependencies]
wblidar = "0.1"
Enable optional features only when needed:
[dependencies]
wblidar = { version = "0.1", features = ["copc-http", "parallel"] }
Local workspace/path dependency:
[dependencies]
wblidar = { path = "../wblidar" }
Feature notes:
- copc-http enables HTTP range fetching for remote COPC access.
- copc-parallel enables Rayon-backed parallel work in COPC writing paths.
- laz-parallel enables optional parallel LAZ chunk decoding.
- parallel enables both copc-parallel and laz-parallel.
Compilation Features
wblidar uses optional Cargo features for specific capabilities.
| Feature | Default | Purpose |
|---|---|---|
| copc-http | no | Enables HTTP range fetching support for COPC access (reqwest). |
| parallel | no | Convenience umbrella feature enabling all current parallel paths. |
| copc-parallel | no | Enables Rayon-based parallel work in COPC writing paths (node encoding/sorting thresholds). |
| laz-parallel | no | Enables optional parallel LAZ chunk decode paths where beneficial. |
Example:
Use copc-parallel or laz-parallel individually when you want narrower benchmarking or regression isolation.
API Overview
wblidar exposes two main usage styles:
- Low-level streaming APIs via format-specific readers/writers and PointReader/PointWriter traits.
- Unified frontend API via PointCloud for format-agnostic workflows.
1) Stream LAS -> LAS
This example shows minimal-memory, record-by-record conversion between LAS files using the streaming reader/writer traits.
use File;
use ;
use ;
2) Format-Agnostic Read/Write
This example shows the high-level PointCloud API auto-detecting input format and writing multiple output formats.
use ;
3) Read With Diagnostics
This example shows ingest diagnostics for observability, including partial Point14 recovery counters.
use read_with_diagnostics;
4) Reproject a PointCloud
This example shows a straightforward end-to-end reprojection workflow using PointCloud convenience methods.
use PointCloud;
5) Write COPC with Explicit Spatial Ordering
This example shows COPC writing with explicit root geometry and node point ordering configuration.
use File;
use BufWriter;
use ;
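For readers unfamiliar with Morton (Z-order) ordering, the sketch below shows the general technique: interleave the bits of quantized x, y, and z grid coordinates into one key, then sort by that key so spatially nearby cells tend to sit near each other. It is a generic illustration, not wblidar's implementation. The 21-bit-per-axis width, the pre-quantized integer inputs, and the names `morton3` and `sort_by_morton` are assumptions made for this example.

```rust
/// Spread the low 21 bits of `v` so that bit i lands on bit 3*i of a u64.
fn spread_bits_3d(v: u32) -> u64 {
    let mut out = 0u64;
    for bit in 0..21 {
        out |= (((v >> bit) & 1) as u64) << (3 * bit);
    }
    out
}

/// Interleave three 21-bit grid coordinates into one 63-bit Morton key.
/// x occupies bits 0, 3, 6, ...; y bits 1, 4, 7, ...; z bits 2, 5, 8, ...
pub fn morton3(x: u32, y: u32, z: u32) -> u64 {
    spread_bits_3d(x) | (spread_bits_3d(y) << 1) | (spread_bits_3d(z) << 2)
}

/// Sort quantized grid cells into Z-order (stable sort by Morton key).
pub fn sort_by_morton(cells: &mut [(u32, u32, u32)]) {
    cells.sort_by_key(|&(x, y, z)| morton3(x, y, z));
}
```

A Hilbert ordering follows the same sort-by-key pattern with a different key function that preserves locality better at the cost of a more involved encoding.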
6) Optional Parallel LAZ Decode (Feature-Gated)
This example shows feature-gated parallel LAZ decode for high-volume workloads where chunk-level parallelism can improve throughput.
// Requires Cargo feature: parallel or laz-parallel
use File;
use BufReader;
use LazReader;
Architecture
At a high level:
- Common model: PointRecord is the central in-memory point representation.
- Traits: PointReader and PointWriter provide streaming semantics.
- Format modules: las, laz, copc, ply, e57 encapsulate format-specific details.
- Frontend: PointCloud and helper functions provide a unified API for common workflows.
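The layering above (one shared point type, a reader trait, a writer trait) can be illustrated with a self-contained toy model. Everything in this sketch is hypothetical: `DemoPoint`, `DemoPointReader`, `DemoPointWriter`, and `copy_points` are stand-ins invented for the example, and their signatures are not those of wblidar's PointRecord, PointReader, or PointWriter. A real I/O trait would also return a `Result` to surface errors, which this sketch omits for brevity.

```rust
/// Hypothetical stand-in for a shared in-memory point type.
#[derive(Debug, Clone, PartialEq)]
pub struct DemoPoint {
    pub x: f64,
    pub y: f64,
    pub z: f64,
    pub classification: u8,
}

/// Hypothetical streaming reader: yields one point per call, `None` at end.
pub trait DemoPointReader {
    fn read_point(&mut self) -> Option<DemoPoint>;
}

/// Hypothetical streaming writer: accepts one point per call.
pub trait DemoPointWriter {
    fn write_point(&mut self, point: &DemoPoint);
}

/// In-memory reader used here in place of a format-specific reader.
pub struct VecReader {
    points: std::vec::IntoIter<DemoPoint>,
}

impl VecReader {
    pub fn new(points: Vec<DemoPoint>) -> Self {
        Self { points: points.into_iter() }
    }
}

impl DemoPointReader for VecReader {
    fn read_point(&mut self) -> Option<DemoPoint> {
        self.points.next()
    }
}

/// In-memory writer used here in place of a format-specific writer.
#[derive(Default)]
pub struct VecWriter {
    pub points: Vec<DemoPoint>,
}

impl DemoPointWriter for VecWriter {
    fn write_point(&mut self, point: &DemoPoint) {
        self.points.push(point.clone());
    }
}

/// Copy every point from reader to writer, holding one record at a time.
/// Returns the number of points copied.
pub fn copy_points<R: DemoPointReader, W: DemoPointWriter>(reader: &mut R, writer: &mut W) -> usize {
    let mut copied = 0;
    while let Some(point) = reader.read_point() {
        writer.write_point(&point);
        copied += 1;
    }
    copied
}
```

Because `copy_points` depends only on the two traits, any pair of format modules that implement them can be combined without format-specific glue code.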
Format notes:
- LAS: direct structured read/write with VLR/CRS support.
- LAZ: in-house LASzip-compatible codecs for Point10/Point14 families.
- COPC: LAZ-backed octree hierarchy with COPC metadata/hierarchy pages.
- PLY: ASCII and binary interchange for general point cloud exchange.
- E57: standards-oriented reader/writer with integrity checks.
Performance Notes
- wblidar uses SIMD in hot numeric paths where safe and beneficial.
- Optional parallelism is feature-gated and thresholded to avoid regressions on small jobs.
- Streaming APIs are the default path for low-memory workflows.
- Some decode/encode paths intentionally trade memory for correctness and interoperability.
Point14 compression_level Behavior
LazWriterConfig::compression_level is now effective for Point14-family LAZ writes.
It tunes the effective chunk target size used during encoding:
- Lower levels favor smaller chunks (often faster writes, sometimes slightly larger files).
- Higher levels favor larger chunks (often slightly better compression, potentially more memory/latency per flush).
Current mapping (base chunk_size = configured chunk_size):
| Level | Effective chunk target |
|---|---|
| 0 | chunk_size / 2 |
| 1 | 2 * chunk_size / 3 |
| 2 | 3 * chunk_size / 4 |
| 3-6 | chunk_size |
| 7 | 5 * chunk_size / 4 |
| 8 | 3 * chunk_size / 2 |
| 9 | 2 * chunk_size |
Notes:
- This behavior currently applies to Point14-family LAZ writes.
- Point10 paths continue to use the configured chunk size directly.
- COPC compression_level remains independent of this LAZ chunk-size tuning.
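The level-to-chunk mapping in the table can be written as a small function. This is a sketch of the table only, not wblidar's source: the function name `effective_chunk_target` is hypothetical, and truncating integer division and treating levels above 9 like level 9 are assumptions the table does not confirm.

```rust
/// Map a Point14 compression level to an effective chunk target,
/// following the documented table. Hypothetical helper for illustration.
pub fn effective_chunk_target(level: u8, chunk_size: u32) -> u32 {
    // Widen to u64 so the multiplications cannot overflow.
    let c = chunk_size as u64;
    let target = match level {
        0 => c / 2,
        1 => 2 * c / 3,
        2 => 3 * c / 4,
        3..=6 => c,
        7 => 5 * c / 4,
        8 => 3 * c / 2,
        // Level 9; higher values are treated the same (assumption).
        _ => 2 * c,
    };
    // Saturate rather than wrap if the doubled size exceeds u32.
    target.min(u32::MAX as u64) as u32
}
```

With a configured chunk_size of 50,000 points (an example value only), level 0 yields 25,000, levels 3 to 6 yield 50,000, and level 9 yields 100,000.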
Useful environment knobs:
- WBLIDAR_COPC_PARALLEL_MIN_NODES (default: 16, requires parallel or copc-parallel): Minimum number of COPC nodes required before parallel node encoding is considered. Effective threshold is max(WBLIDAR_COPC_PARALLEL_MIN_NODES, 2 * rayon_thread_count). Increase to reduce thread overhead on smaller jobs; decrease to parallelize sooner.
- WBLIDAR_COPC_PARALLEL_MIN_POINTS (default: 400000, requires parallel or copc-parallel): Minimum total points across candidate COPC nodes before parallel node encoding is used. Increase to keep more workloads serial; decrease to enable parallel encoding for smaller datasets.
- WBLIDAR_COPC_PARALLEL_SORT_MIN_POINTS (default: 80000, requires parallel or copc-parallel): Minimum per-node point count before Morton/Hilbert code sorting switches to parallel sort. Increase to favor serial sort on medium nodes; decrease to parallelize sort earlier.
- WBLIDAR_LAZ_PARALLEL_MIN_CHUNKS (default: 4, requires parallel or laz-parallel): Minimum non-empty LAZ chunks required before read_all_points_parallel() uses parallel decode. Increase to avoid parallel overhead on files with few chunks; decrease to parallelize earlier.
- WBLIDAR_LAZ_PARALLEL_MIN_POINTS (default: 200000, requires parallel or laz-parallel): Minimum total points required before read_all_points_parallel() uses parallel decode. Increase to keep more files on serial fallback; decrease to use parallel decode more aggressively.
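The COPC gating described above can be sketched as follows. This illustrates the documented rule, not wblidar's code: all function names are hypothetical, falling back to the default on an unparsable value is an assumption, and so is the reading that both the node threshold and the point threshold must be met (with `>=`) before parallel encoding is chosen.

```rust
/// Parse a knob value, falling back to `default` when the variable is
/// unset or not a valid unsigned integer (fallback policy is an assumption).
pub fn parse_knob(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .unwrap_or(default)
}

/// Read a knob from the process environment.
pub fn read_knob(name: &str, default: usize) -> usize {
    parse_knob(std::env::var(name).ok().as_deref(), default)
}

/// Documented rule: max(WBLIDAR_COPC_PARALLEL_MIN_NODES, 2 * rayon_thread_count).
pub fn effective_min_nodes(configured: usize, rayon_thread_count: usize) -> usize {
    configured.max(2 * rayon_thread_count)
}

/// Decide whether node encoding should run in parallel.
/// Assumes both thresholds must be satisfied.
pub fn should_encode_nodes_in_parallel(
    node_count: usize,
    total_points: usize,
    min_nodes: usize,
    min_points: usize,
    rayon_thread_count: usize,
) -> bool {
    node_count >= effective_min_nodes(min_nodes, rayon_thread_count)
        && total_points >= min_points
}
```

On a 12-thread pool the default node threshold of 16 is raised to 24, so a 20-node job stays serial even when it exceeds the point threshold.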
Using the Environment Knobs
Set knobs inline for a single command:
WBLIDAR_COPC_PARALLEL_MIN_NODES=24 \
WBLIDAR_COPC_PARALLEL_MIN_POINTS=600000 \
WBLIDAR_COPC_PARALLEL_SORT_MIN_POINTS=120000 \
For normal builds, prefer --features "parallel"; keep copc-parallel for COPC-only benchmarking or regression isolation.
Set LAZ knobs for a single parallel-decode run:
WBLIDAR_LAZ_PARALLEL_MIN_CHUNKS=8 \
WBLIDAR_LAZ_PARALLEL_MIN_POINTS=500000 \
For normal builds, prefer --features "parallel"; keep laz-parallel for LAZ-only benchmarking or regression isolation.
Export knobs for the current shell session:
Quick starting presets:
| Preset | COPC Min Nodes | COPC Min Points | COPC Sort Min Points | LAZ Min Chunks | LAZ Min Points | When to Use |
|---|---|---|---|---|---|---|
| Conservative | 32 | 1000000 | 160000 | 12 | 1000000 | Prioritize predictable serial behavior on mixed or smaller jobs |
| Balanced (default-like) | 16 | 400000 | 80000 | 4 | 200000 | Good first choice for most workloads |
| Aggressive | 8 | 150000 | 40000 | 2 | 100000 | Favor parallelism earlier on large multi-core systems |
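For programs that launch wblidar-based tools as child processes, the presets in the table can be kept as constants and expanded into environment assignments. The struct and constant names below are hypothetical and exist only for this sketch; the numeric values and variable names are taken from the table and the knob list.

```rust
/// One row of the preset table. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct KnobPreset {
    pub copc_min_nodes: usize,
    pub copc_min_points: usize,
    pub copc_sort_min_points: usize,
    pub laz_min_chunks: usize,
    pub laz_min_points: usize,
}

pub const CONSERVATIVE: KnobPreset = KnobPreset {
    copc_min_nodes: 32,
    copc_min_points: 1_000_000,
    copc_sort_min_points: 160_000,
    laz_min_chunks: 12,
    laz_min_points: 1_000_000,
};

pub const BALANCED: KnobPreset = KnobPreset {
    copc_min_nodes: 16,
    copc_min_points: 400_000,
    copc_sort_min_points: 80_000,
    laz_min_chunks: 4,
    laz_min_points: 200_000,
};

pub const AGGRESSIVE: KnobPreset = KnobPreset {
    copc_min_nodes: 8,
    copc_min_points: 150_000,
    copc_sort_min_points: 40_000,
    laz_min_chunks: 2,
    laz_min_points: 100_000,
};

impl KnobPreset {
    /// Expand the preset into (variable name, value) pairs.
    pub fn env_pairs(&self) -> [(&'static str, String); 5] {
        [
            ("WBLIDAR_COPC_PARALLEL_MIN_NODES", self.copc_min_nodes.to_string()),
            ("WBLIDAR_COPC_PARALLEL_MIN_POINTS", self.copc_min_points.to_string()),
            ("WBLIDAR_COPC_PARALLEL_SORT_MIN_POINTS", self.copc_sort_min_points.to_string()),
            ("WBLIDAR_LAZ_PARALLEL_MIN_CHUNKS", self.laz_min_chunks.to_string()),
            ("WBLIDAR_LAZ_PARALLEL_MIN_POINTS", self.laz_min_points.to_string()),
        ]
    }
}
```

The returned pairs can be handed to `std::process::Command::envs` when spawning the child process, which keeps the values scoped to that one run.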
Notes:
- Knobs are read once per process startup; restart your process to apply changed values.
- Knobs only affect feature-enabled code paths (parallel, copc-parallel, and laz-parallel).
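The read-once behavior explains why changing a variable mid-process has no effect. One common way to get that behavior in Rust is to cache the parsed value in a `std::sync::OnceLock`, as sketched below. Whether wblidar uses this exact mechanism is not stated here; `cached_knob` and `laz_parallel_min_chunks` are hypothetical names, and the fallback to the documented default of 4 on a missing or invalid value is an assumption.

```rust
use std::sync::OnceLock;

/// Return the cached value, running `read` only on the first call.
/// Later calls ignore `read` entirely, so a changed environment is not seen.
pub fn cached_knob(cell: &OnceLock<usize>, read: impl FnOnce() -> usize) -> usize {
    *cell.get_or_init(read)
}

/// Example accessor: the environment is consulted once per process.
pub fn laz_parallel_min_chunks() -> usize {
    static CELL: OnceLock<usize> = OnceLock::new();
    cached_knob(&CELL, || {
        std::env::var("WBLIDAR_LAZ_PARALLEL_MIN_CHUNKS")
            .ok()
            .and_then(|s| s.trim().parse::<usize>().ok())
            .unwrap_or(4) // documented default
    })
}
```

Caching avoids repeated environment lookups in hot paths, at the cost of requiring a process restart to pick up new values.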
Known Limitations
- wblidar focuses on I/O and format correctness, not higher-level LiDAR processing algorithms.
- COPC payloads are Point14-family; some LAS 1.5-specific fields are promoted or omitted when mapping to COPC-compatible formats.
- Legacy wb-native LAZ DEFLATE paths are intentionally out of scope; standards LASzip-compatible paths are used.
- Some Point14-heavy paths can require substantial memory because layered decode/encode may materialize large in-memory buffers.
- COPC writing is batch-oriented; appending incremental updates to an existing COPC file is not currently supported.
- COPC node ordering is configurable (Auto, Morton, Hilbert) but not yet auto-tuned per dataset.
- Partial Point14 handling defaults to lenient recovery; strict failure mode is opt-in via WBLIDAR_FAIL_ON_PARTIAL_POINT14.
- LAZ parallel decode tuning knobs apply to read_all_points_parallel(); regular streaming read_point() remains serial.
- External interoperability validation is strong but still benefits from broader real-world fixture coverage across toolchains.
- Some advanced paths are feature-gated (copc-http, parallel, and the granular parallel modes) and are not enabled by default.
- Performance characteristics vary by file structure (for example, chunking strategy can limit parallel speedups on some LAZ datasets).
Validation and Interoperability
Internal validation checklists and QA procedures are maintained in docs/internal/. These cover external interoperability workflows (PDAL, LAStools, validate.copc.io) and are intended for maintainers rather than library users.
Suggested Additional Sections (Optional)
If you want to expand this README further, the highest-value additions would be:
- Versioning/compatibility policy (what constitutes a breaking API change).
- Error-handling guide (common Error variants and recovery guidance).
- Benchmark methodology (how performance claims are measured).
- Contributing guide pointer (coding conventions, tests, fixture requirements).
License
Licensed under either of Apache License 2.0 or MIT License at your option.