# Contributing to quack-rs
Thank you for contributing! Please read this document before opening a PR.
## Table of Contents
- [Development Prerequisites](#development-prerequisites)
- [Building](#building)
- [Coding Standards](#coding-standards)
- [Quality Gates](#quality-gates)
- [Test Strategy](#test-strategy)
- [Mutation Testing](#mutation-testing)
- [Code Standards](#code-standards)
- [Repository Structure](#repository-structure)
- [PR Checklist](#pr-checklist)
- [Releasing](#releasing)
---
## Development Prerequisites
| Rust | ≥ 1.84.1 (MSRV) | Compiler |
| `rustfmt` | stable | Formatting |
| `clippy` | stable | Linting |
| `cargo-deny` | latest | License/advisory checks |
| DuckDB CLI | 1.4.4, 1.5.0, or 1.5.1 | Live extension testing (required) |
Install the Rust toolchain via [rustup](https://rustup.rs/).
Install DuckDB 1.5.1 (or 1.4.4/1.5.0) via `curl` (no system package manager needed):
```bash
curl -fsSL https://github.com/duckdb/duckdb/releases/download/v1.5.1/duckdb_cli-linux-amd64.zip \
-o /tmp/duckdb.zip \
&& unzip -o /tmp/duckdb.zip -d /tmp/ \
&& chmod +x /tmp/duckdb \
&& /tmp/duckdb --version
# → v1.5.1
```
---
## Building
```bash
# Build the library
cargo build
# Build in release mode (enables LTO + strip)
cargo build --release
# Build the hello-ext example extension
cargo build --release --manifest-path examples/hello-ext/Cargo.toml
```
---
## Coding Standards
### Every file starts with the SPDX header
```rust
// SPDX-License-Identifier: MIT
// Copyright 2026 Tom F. <tomf@tomtomtech.net> (https://github.com/tomtom215)
```
Markdown / TOML / YAML files use the appropriate comment syntax.
### 500-line maximum per file
Source files (`.rs`) should generally stay under 500 lines. If your implementation
is growing beyond this limit, consider splitting it into focused sub-modules with
a thin `mod.rs` that only re-exports. Some files exceed this guideline where
splitting would harm cohesion.
### Thin `mod.rs` files
`mod.rs` files should primarily contain `mod` declarations and `pub use`
re-exports. Shared types that are tightly coupled to a module's children
may live in the parent `mod.rs` when splitting them out would add indirection
without value.
### No `unwrap()` in library code
Use `?`, `map_err`, `ok_or_else`, or explicit `match`. `expect()` is also
forbidden unless the message explains an invariant that is *impossible* to
violate at runtime (documented with `// SAFETY:` style comment).
---
## Quality Gates
**All of the following must pass before merging any pull request:**
```bash
# 1. Tests — zero failures, zero ignored
cargo test
# 2. Integration tests
cargo test --test integration_test
# 3. Linting — zero warnings (warnings are treated as errors)
cargo clippy --all-targets -- -D warnings
# 4. Formatting
cargo fmt -- --check
# 5. Documentation — zero broken links or missing docs
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps
# 6. MSRV — must compile on Rust 1.84.1 (matches CI; excludes benches which use criterion >=1.86)
cargo +1.84.1 check
# 7. Live extension test — build hello-ext, package it, load in DuckDB 1.4.4 or 1.5.0
cargo build --release --manifest-path examples/hello-ext/Cargo.toml
cargo run --bin append_metadata -- \
examples/hello-ext/target/release/libhello_ext.so \
/tmp/hello_ext.duckdb_extension \
--abi-type C_STRUCT --extension-version v0.1.0 \
--duckdb-version v1.2.0 --platform linux_amd64
/tmp/duckdb -unsigned -c "
SET allow_extensions_metadata_mismatch=true;
LOAD '/tmp/hello_ext.duckdb_extension';
SELECT word_count('hello world foo'); -- 3
SELECT first_word('hello world'); -- hello
SELECT list(value ORDER BY value) FROM generate_series_ext(5); -- [0,1,2,3,4]
SELECT CAST('42' AS INTEGER); -- 42
SELECT TRY_CAST('bad' AS INTEGER); -- NULL
"
```
```bash
# 8. Mutation testing — zero surviving mutants in changed files
cargo mutants --file <changed-files>
```
These same checks run in CI (`.github/workflows/ci.yml`) on every push and pull request.
Coverage and mutation testing run in separate workflows.
---
## Test Strategy
### Unit tests
Unit tests live in `#[cfg(test)]` modules within each source file. They test
pure-Rust logic that does not require a live DuckDB instance.
**Constraint**: `libduckdb-sys` with `features = ["loadable-extension"]` makes
every DuckDB C API function go through lazy `AtomicPtr` dispatch. These pointers
are only initialized when `duckdb_rs_extension_api_init` is called from within a
real DuckDB extension load. Calling any DuckDB API function in a unit test will
panic. Move such tests to integration tests or example-extension tests.
### Integration tests (`tests/integration_test.rs`)
Pure-Rust tests that cross module boundaries — e.g., testing `interval` with
`AggregateTestHarness`, or verifying `FfiState` lifecycle across module boundaries.
These still cannot call `duckdb_*` functions, for the same reason as unit tests.
### Property-based tests
Selected modules include `proptest`-based tests for mathematical properties:
- `interval.rs` — overflow edge cases across the full `i32`/`i64` range
- `testing/harness.rs` — sum associativity, identity element for `AggregateState`
### Example-extension tests (`examples/hello-ext/`)
The `hello-ext` example compiles as a `cdylib` and contains `#[cfg(test)]` unit
tests for all pure-Rust logic (`count_words`, `first_word`, `parse_varchar_to_int`,
aggregate state transitions). **Full end-to-end testing against a live DuckDB 1.4.4 or 1.5.0
instance is required** — not left to consumers. This means building the `.so`,
appending the extension metadata footer with `append_metadata`, and running all 19
SQL tests via the DuckDB CLI. See the [Quality Gates](#quality-gates) section for
the exact commands and `examples/hello-ext/README.md` for the full test listing.
### Mutation testing
Mutation testing verifies that your tests actually detect code changes. A mutant
is a small, deliberate modification to the source (e.g., replacing `+` with `-`,
flipping a boolean, returning a default value). If a mutant compiles and all
tests still pass, the test suite has a gap.
```bash
# Install cargo-mutants
cargo install cargo-mutants
# Run mutation tests on all library source
cargo mutants
# Run on a specific file
cargo mutants --file src/interval.rs
# List mutants without running (dry-run)
cargo mutants --list
```
Configuration is in `mutants.toml` at the repository root.
### Test naming convention
Tests follow the pattern: `{component}_{scenario}_{expected_outcome}`
Examples:
- `interval_to_micros_overflow_saturates`
- `error_from_string_preserves_message`
- `aggregate_state_combine_propagates_config`
---
## Code Standards
### Safety documentation
Every `unsafe` block must have a `// SAFETY:` comment that explains:
1. Which invariant the caller guarantees
2. Why the operation is valid given that invariant
Example:
```rust
// SAFETY: `states` is a valid array of `count` pointers, each initialized
// by `init_callback`. We are the only owner of `inner` at this point.
unsafe { drop(Box::from_raw(ffi.inner)) };
```
### No panics across FFI
`unwrap()`, `expect()`, and `panic!()` are forbidden inside any function that
may be called by DuckDB (callbacks and entry points). Use `Option`/`Result` and
the `?` operator throughout. See `entry_point::init_extension` for the canonical
pattern.
### Clippy lint policy
The crate enables `pedantic`, `nursery`, and `cargo` lint groups. Specific lints
are suppressed only where they produce false positives for SDK API patterns:
```toml
[lints.clippy]
module_name_repetitions = "allow" # e.g., AggregateFunctionBuilder
must_use_candidate = "allow" # builder methods
missing_errors_doc = "allow" # unsafe extern "C" callbacks
return_self_not_must_use = "allow" # builder pattern
```
All other warnings are errors in CI.
### Documentation
Every public item must have a doc comment. Private items with non-obvious
semantics should also be documented. Doc comments follow these conventions:
- First line: short summary (noun phrase, no trailing period)
- `# Safety`: mandatory on every `unsafe fn`
- `# Panics`: mandatory if the function can panic in any reachable code path
- `# Errors`: mandatory on functions returning `Result`
- `# Example`: encouraged on public types and key methods
---
## Repository Structure
```
quack-rs/
├── src/
│ ├── lib.rs # Crate root; module declarations; DUCKDB_API_VERSION
│ ├── entry_point.rs # init_extension() / init_extension_v2() + entry_point! / entry_point_v2! macros
│ ├── connection.rs # Connection facade + Registrar trait (version-agnostic registration)
│ ├── config.rs # DbConfig — RAII wrapper for duckdb_config
│ ├── error.rs # ExtensionError, ExtResult<T>
│ ├── interval.rs # DuckInterval, interval_to_micros (checked + saturating)
│ ├── prelude.rs # Convenience re-exports for extension authors
│ ├── sql_macro.rs # SQL macro registration (CREATE MACRO, no FFI)
│ ├── aggregate/
│ │ ├── mod.rs # Re-exports
│ │ ├── builder/ # Builder types for aggregate function registration
│ │ │ ├── mod.rs # Module doc + re-exports
│ │ │ ├── single.rs # AggregateFunctionBuilder (single-signature)
│ │ │ ├── set.rs # AggregateFunctionSetBuilder, OverloadBuilder
│ │ │ └── tests.rs # Unit tests (14 tests)
│ │ ├── callbacks.rs # Type aliases for the 6 callback signatures
│ │ └── state.rs # AggregateState trait, FfiState<T>
│ ├── scalar/
│ │ ├── mod.rs # Re-exports
│ │ └── builder/ # Builder types for scalar function registration
│ │ ├── mod.rs # Module doc + re-exports
│ │ ├── single.rs # ScalarFn type alias, ScalarFunctionBuilder
│ │ ├── set.rs # ScalarFunctionSetBuilder, ScalarOverloadBuilder
│ │ └── tests.rs # Unit tests (13 tests)
│ ├── catalog.rs # Catalog access helpers (requires `duckdb-1-5`)
│ ├── cast/
│ │ ├── mod.rs # Re-exports
│ │ └── builder.rs # CastFunctionBuilder, CastFunctionInfo, CastMode
│ ├── client_context.rs # ClientContext wrapper (requires `duckdb-1-5`)
│ ├── config_option.rs # ConfigOption registration (requires `duckdb-1-5`)
│ ├── copy_function.rs # Copy function registration (requires `duckdb-1-5`)
│ ├── replacement_scan/
│ │ └── mod.rs # ReplacementScanBuilder — SELECT * FROM 'file.xyz' patterns
│ ├── types/
│ │ ├── mod.rs
│ │ ├── type_id.rs # TypeId enum (all DuckDB column types)
│ │ └── logical_type.rs # LogicalType — RAII wrapper for duckdb_logical_type
│ ├── vector/
│ │ ├── mod.rs
│ │ ├── reader.rs # VectorReader — typed reads from a DuckDB data chunk
│ │ ├── writer.rs # VectorWriter — typed writes to a DuckDB result vector
│ │ ├── validity.rs # ValidityBitmap — NULL flag management
│ │ └── string.rs # DuckStringView, read_duck_string (16-byte string format)
│ ├── validate/
│ │ ├── mod.rs # Extension compliance validators + re-exports
│ │ ├── description_yml/ # Parse and validate description.yml metadata
│ │ │ ├── mod.rs # Module doc + re-exports
│ │ │ ├── model.rs # DescriptionYml struct (11 fields)
│ │ │ ├── parser.rs # parse_description_yml, parse_kv, strip_inline_comment
│ │ │ ├── validator.rs # validate_description_yml_str, validate_rust_extension
│ │ │ └── tests.rs # Unit tests (20 tests)
│ │ ├── extension_name.rs # Extension name validation (^[a-z][a-z0-9_-]*$)
│ │ ├── function_name.rs # SQL function name validation
│ │ ├── platform.rs # DuckDB build platform validation
│ │ ├── release_profile.rs # Cargo release profile validation
│ │ ├── semver.rs # Semantic versioning + extension version tiers
│ │ └── spdx.rs # SPDX license identifier validation
│ ├── scaffold/
│ │ ├── mod.rs # ScaffoldConfig, GeneratedFile, generate_scaffold
│ │ ├── templates.rs # Template generators for all 11 scaffold files (pub(super))
│ │ └── tests.rs # Unit tests (29 tests)
│ ├── table_description.rs # TableDescription wrapper (requires `duckdb-1-5`)
│ ├── table/
│ │ ├── mod.rs # Re-exports
│ │ ├── builder.rs # TableFunctionBuilder, type aliases (BindFn, InitFn, ScanFn)
│ │ ├── info.rs # BindInfo, InitInfo, FunctionInfo — callback info wrappers
│ │ ├── bind_data.rs # FfiBindData<T> — type-safe bind-phase data
│ │ └── init_data.rs # FfiInitData<T>, FfiLocalInitData<T>
│ └── testing/
│ ├── mod.rs
│ └── harness.rs # AggregateTestHarness<S> — unit-test aggregate logic
├── tests/
│ └── integration_test.rs # Cross-module pure-Rust integration tests
├── benches/
│ └── interval_bench.rs # Criterion benchmarks for interval conversion
├── examples/
│ └── hello-ext/ # Complete word_count aggregate extension example
│ ├── Cargo.toml
│ └── src/lib.rs
├── book/ # mdBook documentation source
├── .github/workflows/
│ ├── ci.yml # CI: check, test, clippy, fmt, doc, msrv, bench-compile, nightly
│ ├── release.yml # Release pipeline: CI gate, package, publish
│ ├── docs.yml # mdBook build & deploy to GitHub Pages
│ ├── coverage.yml # Test coverage (cargo-llvm-cov → Codecov)
│ ├── mutants.yml # Mutation testing (cargo-mutants)
│ ├── benchmarks.yml # Criterion benchmark execution
│ └── README.md # Workflow overview and quality gate summary
├── CONTRIBUTING.md # This file
├── LESSONS.md # The 16 DuckDB Rust FFI pitfalls, documented in full
└── README.md # Quick start, SDK overview, badge table
```
---
## PR Checklist
- [ ] SPDX header on every new file
- [ ] No file exceeds 500 lines
- [ ] `cargo fmt` passes
- [ ] `cargo clippy --all-targets -- -D warnings` passes
- [ ] `cargo test --all-targets` passes
- [ ] `cargo doc --no-deps` passes without warnings
- [ ] New public types/functions have doc comments
- [ ] New code has tests
- [ ] All `unsafe` blocks have a `// SAFETY:` comment
- [ ] `CHANGELOG.md` updated under `[Unreleased]` (for user-facing changes)
- [ ] Book (`book/src/`) updated if the change affects extension authors
- [ ] New FFI pitfall discovered → added to `LESSONS.md` and `book/src/reference/pitfalls.md`
- [ ] `cargo mutants --file <changed-files>` shows zero surviving mutants for changed files
---
## Releasing
This crate supports `libduckdb-sys = ">=1.4.4, <2"` (DuckDB 1.4.x and 1.5.x).
The bounded range is intentional: the C API (`v1.2.0`) is stable across these releases,
and the `<2` upper bound prevents silent adoption of a future major band.
Before broadening the range to a new major band:
1. Read the DuckDB changelog for C API changes.
2. Check the new C API version string (used in `duckdb_rs_extension_api_init`).
3. Update `DUCKDB_API_VERSION` in `src/lib.rs` if the C API version changed.
4. Audit all callback signatures against the new `bindgen.rs` output.
5. Update the range bounds in `Cargo.toml` (both runtime and dev-deps).
Versions follow [Semantic Versioning](https://semver.org/). Breaking changes to
public API require a major version bump.