Status: v0.11.0 — Core reading, CLI, output writers, Python & Node.js bindings, and write support (with STCS compaction) are production-ready. See CHANGELOG.md.
CQLite provides SQLite-like local access to Apache Cassandra SSTables, letting developers read and write Cassandra 5.0+ data files without a running cluster. It is built in Rust for performance and safety.
Documentation
Full documentation is at https://pmcfadin.github.io/cqlite/:
| Section | URL |
|---|---|
| User Docs — install, quick start, CLI, Python, Node.js | /cqlite/user-docs/ |
| SSTable Format Guide — binary format deep-dive | /cqlite/sstable-format/ |
| For Agents: Using CQLite — LLM/agent integration | /cqlite/agents-using/ |
| For Agents: Developing CQLite — contributor doctrine, gate contract | /cqlite/agents-developing/ |
Vision
CQLite aims to become the standard tool for Cassandra SSTable manipulation outside of the main Apache Cassandra project, enabling new workflows for data analytics, migration, testing, and edge computing.
Project Leadership
CQLite is designed by Patrick McFadin, an Apache Cassandra PMC member with 13 years of Cassandra experience. The project embodies Apache Cassandra community values, and the goal is to contribute it to the Apache Cassandra project as it matures.
Install
CLI (from crates.io — requires Rust 1.85+)
CLI (prebuilt binaries — no Rust toolchain required)
Each GitHub release attaches a
prebuilt cqlite CLI binary for the common platforms, each with a .sha256
checksum sidecar:
| Platform | Asset |
|---|---|
| macOS (Apple Silicon) | cqlite-aarch64-apple-darwin.tar.gz |
| macOS (Intel) | cqlite-x86_64-apple-darwin.tar.gz |
| Linux x86_64 (glibc) | cqlite-x86_64-unknown-linux-gnu.tar.gz |
| Linux x86_64 (static musl) | cqlite-x86_64-unknown-linux-musl.tar.gz |
| Linux arm64 (glibc) | cqlite-aarch64-unknown-linux-gnu.tar.gz |
| Windows x86_64 | cqlite-x86_64-pc-windows-gnu.zip |
# Example: macOS Apple Silicon
TARGET=aarch64-apple-darwin
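Since every release asset ships with a .sha256 sidecar, it is worth checking a download before unpacking it. The following is a minimal sketch, not an official CQLite tool: it assumes the sidecar begins with the hex digest (optionally followed by a file name, as `sha256sum` writes it), which the release notes should confirm. The function names `sha256_of` and `verify_sidecar` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(archive: Path, sidecar: Path) -> bool:
    """Compare the archive's digest with the first token of the sidecar.

    Assumption: the sidecar starts with the hex digest, optionally followed
    by a file name. An empty sidecar is treated as a failed check.
    """
    tokens = sidecar.read_text().split()
    if not tokens:
        return False
    expected = tokens[0].lower()
    return hmac.compare_digest(sha256_of(archive), expected)
```

A typical call would be `verify_sidecar(Path("cqlite-aarch64-apple-darwin.tar.gz"), Path("cqlite-aarch64-apple-darwin.tar.gz.sha256"))`, where the sidecar file name is itself an assumption based on the ".sha256 sidecar" wording above.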
Rust library
See Using cqlite-core as a dependency and the API docs.
Language bindings
Quick Start
# Clone the repository
# Build the project
# Run the CLI tool
Python
Node.js
import { Database } from '@cqlite/node';
const db = await Database.open('path/to/sstables', { schema: 'schema.cql' });
const result = await db.execute('SELECT * FROM keyspace.table LIMIT 5');
for (const row of result.rows) {
  console.log(row.name);
}
await db.close();
Write Support
Write support shipped in CQLite v0.9.0 (M5) and covers all interfaces: Rust core, Python,
Node.js, and CLI. Written data is flushed to portable Cassandra 5.0 SSTables that
Cassandra can load directly via nodetool refresh.
The schema file below is included in the repository at
test-data/schemas/write-test.cql.
Python
# Open in writable mode — write_dir stores the WAL and flushed SSTables
=
Node.js
const = require;
const db = await ;
await db.;
const path = await db.;
console.log;
await db.;
CLI
# Build with write support
# Write via CQL INSERT
# Flush memtable to SSTable
See docs/write-support.md for the full write guide,
including the Cassandra export workflow and known limitations. To embed
cqlite-core in your own Rust project (dependency line, feature flags, and a
compiling write example), see
docs/using-cqlite-core-as-a-dependency.md.
Feature Flags
cqlite-core gates optional functionality behind Cargo features. The table below
maps the public API you're likely to reach for to the feature that enables it.
| Want… | Enable feature | In defaults? |
|---|---|---|
| Read / query path (Database::open, execute, scan, get) | state_machine | ✅ yes |
| Compression (LZ4 / Snappy / Deflate / Zstd) | all-compression | ✅ yes |
| Write path (WriteEngine, Mutation, WriteEngine::write/flush) | write-support | ✅ yes |
| Database::flush / Database::compact (high-level convenience) | experimental | ❌ opt-in |
| CLI ingestion / REPL helpers (cqlite-cli) | cli-helpers | ❌ opt-in |
| Performance metrics collection | metrics | ❌ opt-in |
Default features are ["all-compression", "state_machine", "write-support"]
(see cqlite-core/Cargo.toml). write-support was folded into the defaults in
#558 — it gates only first-party
code and adds no extra dependencies, so read-only consumers pay nothing for it.
flush/compact on the high-level Database type remain behind experimental;
the equivalent engine-level WriteEngine::flush is part of write-support.
Building with Custom Features
# Default build (read + write + compression)
# Read-only consumer: drop the write path (still zero-cost to keep it, but explicit)
# Opt into high-level Database::flush / compact
# Minimal build (no compression, no query engine)
Features
✅ Complete (M1/M2)
- Cassandra 5+ SSTable format parsing (100% of test tables)
- All CQL types including collections and UDTs
- All compression codecs (LZ4, Snappy, Deflate, Zstd)
- CLI tool with REPL and one-shot query modes
- SELECT with WHERE clause (partition/clustering key equality)
- Output formats: Table, JSON, CSV
✅ M3 Complete (Jan 2026)
- Parquet output format with Snappy compression
- Export command (cqlite export)
- Streaming export for large datasets
- Output formats: CSV, JSON, Parquet, CQL
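"Streaming export" here means rows are written as they are produced instead of being collected in memory first. The sketch below illustrates that idea with the Python standard library only. It is not the cqlite exporter: `stream_to_csv`, `stream_to_jsonl`, and `fake_rows` are hypothetical names, rows are assumed to arrive as an iterator of dicts, and JSON Lines is used as an assumed streaming-friendly stand-in for JSON output.

```python
import csv
import json
from typing import Dict, Iterable, Iterator, TextIO


def stream_to_csv(rows: Iterable[Dict[str, object]], out: TextIO) -> int:
    """Write rows one at a time; the header comes from the first row's keys."""
    writer = None
    count = 0
    for row in rows:
        if writer is None:
            writer = csv.DictWriter(out, fieldnames=list(row.keys()))
            writer.writeheader()
        writer.writerow(row)
        count += 1
    return count


def stream_to_jsonl(rows: Iterable[Dict[str, object]], out: TextIO) -> int:
    """Write one JSON object per line so no row list is ever materialized."""
    count = 0
    for row in rows:
        out.write(json.dumps(row) + "\n")
        count += 1
    return count


def fake_rows(n: int) -> Iterator[Dict[str, object]]:
    """Stand-in row source (hypothetical); yields rows lazily."""
    for i in range(n):
        yield {"id": i, "name": f"user{i}"}
```

Because both writers consume an iterator, peak memory depends on the size of a single row rather than on the size of the dataset.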
✅ M4 Complete (Jan 2026)
- Python bindings with full CQL type support
- Node.js bindings with TypeScript definitions
- Streaming API for memory-efficient queries
- pip/npm installable packages (5 platform builds each)
- Type stubs for IDE support (Python mypy, TypeScript)
✅ M5 Complete — v0.9.0 (May 2026)
- Write support: WAL + memtable + flush to Cassandra SSTables
- STCS compaction via maintenance_step()
- Write API in Python, Node.js, and CLI
- Full type coverage: Inet, Varint, Duration, Tuple, Frozen
- E2E readback gate: write → flush → Cassandra nodetool refresh → verify
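For readers unfamiliar with STCS (size-tiered compaction strategy), the core idea is to group SSTables of similar size into buckets and compact a bucket once it holds enough tables. The sketch below is a simplified illustration of that bucketing, not CQLite's implementation. The default values mirror Apache Cassandra's documented STCS options (min_threshold 4, max_threshold 32, bucket_low 0.5, bucket_high 1.5, min_sstable_size 50 MiB), and picking the fullest bucket is a simplifying assumption. `bucket_by_size` and `pick_compaction` are hypothetical names.

```python
from typing import List, Sequence


def bucket_by_size(
    sizes: Sequence[int],
    bucket_low: float = 0.5,
    bucket_high: float = 1.5,
    min_sstable_size: int = 50 * 1024 * 1024,
) -> List[List[int]]:
    """Group SSTable sizes into buckets of roughly similar size."""
    buckets: List[List[int]] = []
    averages: List[float] = []
    for size in sorted(sizes):
        for i, avg in enumerate(averages):
            similar = avg * bucket_low < size < avg * bucket_high
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                buckets[i].append(size)
                averages[i] = sum(buckets[i]) / len(buckets[i])
                break
        else:
            # No existing bucket fits, so this table starts a new one.
            buckets.append([size])
            averages.append(float(size))
    return buckets


def pick_compaction(
    sizes: Sequence[int],
    min_threshold: int = 4,
    max_threshold: int = 32,
    **bucket_options,
) -> List[int]:
    """Return the sizes of the tables to compact, or [] if no bucket is ready.

    Simplification: the fullest eligible bucket wins, capped at max_threshold.
    """
    candidates = [
        b for b in bucket_by_size(sizes, **bucket_options) if len(b) >= min_threshold
    ]
    if not candidates:
        return []
    return max(candidates, key=len)[:max_threshold]
```

With four tables of roughly 100 units and one outlier of 1000, only the four similar tables are selected; with three similar tables nothing is compacted yet.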
✅ Since v0.9.0 (v0.10 → v0.11.0, Jun 2026)
- Embeddable Parquet writer in cqlite-core (behind a parquet feature) + export_parquet in Python/Node
- Version-gated reads for the Cassandra 5.0 oa format; graceful handling of da (BTI)
- Real BTI trie node-type dispatch and schema-typed query result columns
- Published documentation site at pmcfadin.github.io/cqlite
- See CHANGELOG.md for the full per-release detail
📋 Roadmap
- M6: WASM bindings for browser deployment
- M7: Performance validation + v1.0 release
Architecture Highlights
Design Philosophy:
- No cluster dependency - Read and write SSTables directly, with no running Cassandra node
- CQL parser - Native CQL support using an Antlr4 grammar
- Cassandra 5+ focus - Modern 'oa' format with BTI support
- Memory efficient - <128MB usage target for large files
- Self-contained engine - Pure-Rust parsing and writing, including STCS compaction
Getting Involved
CQLite is developed in the open as an Apache-licensed project. We welcome contributions from the Cassandra community!
Development Setup
# Prerequisites
# - Rust 1.85+
# Clone and build
# Fetch test data (JSONL reference files are in git, SSTable binaries fetched separately)
# Run tests
Contributing
- Check Issues: Look for good-first-issue labels
- Discuss: Join our community discussions
- Code: Follow Rust best practices and include tests
- Test: Ensure compatibility with real Cassandra data
- Document: Update docs for user-facing changes
Current Status
✅ M1 Complete (Dec 2025)
- All SSTable components parsed (Data.db, Index.db, Summary.db, Statistics.db, TOC)
- 33/33 test tables passing (100% validation)
- All 21 CQL primitive types + collections + UDTs + frozen types
- All compression algorithms working
- Tiered test coverage targets (see PRD Section 5.1)
✅ M2 Complete (Jan 2026)
- CLI with one-shot and REPL modes
- SELECT queries with WHERE clause support
- Multiple output formats (Table, JSON, CSV)
✅ M3 Complete (Jan 2026)
- Parquet output format with Snappy compression
- Export command with CSV, JSON, Parquet, CQL formats
- Streaming export for memory-efficient large dataset handling
- Progress bar and statistics for exports
✅ M4 Complete (Jan 2026)
- Python bindings via PyO3 with sync-first API
- Node.js bindings via napi-rs with Promise-based API
- Full CQL type system (20+ types including collections, UDTs)
- Thread-safe database handles
- 500+ tests with 98%+ pass rate across both bindings
✅ M5 Complete — v0.9.0 (May 2026)
- Write support: WAL-backed memtable + flush to portable Cassandra 5.0 SSTables
- STCS compaction (maintenance_step())
- Write API exposed in Python (flush_run, maintenance_step, write_stats), Node.js (flushRun, maintenanceStep, writeStats), and CLI (--writable, --write-dir, --flush, maintenance, write-stats, export-sstable)
- Type roundtrips verified for all major types including Inet, Varint, Duration, Tuple, Frozen
- E2E validation against live Cassandra 5.0 (write → flush → nodetool refresh → cqlsh)
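The "WAL-backed memtable + flush" design in the M5 list can be pictured with a toy model: each write is appended to a log before it touches the in-memory table, a restart replays the log, and a flush writes the table out as a key-sorted run. The sketch below is purely conceptual and is not the cqlite API. `ToyWriteEngine`, its JSON log format, and the `run-NNNN.jsonl` file names are all hypothetical, and real SSTables are a binary format rather than JSON.

```python
import json
import os
from pathlib import Path
from typing import Dict


class ToyWriteEngine:
    """Illustrative only: WAL append, memtable update, sorted flush."""

    def __init__(self, write_dir: Path) -> None:
        self.write_dir = write_dir
        self.write_dir.mkdir(parents=True, exist_ok=True)
        self.wal_path = write_dir / "wal.log"
        self.memtable: Dict[str, str] = {}
        self._replay()

    def _replay(self) -> None:
        """Rebuild the memtable from the WAL after a restart."""
        if self.wal_path.exists():
            with self.wal_path.open() as fh:
                for line in fh:
                    record = json.loads(line)
                    self.memtable[record["key"]] = record["value"]

    def write(self, key: str, value: str) -> None:
        # Durability first: the record reaches the WAL before the memtable.
        with self.wal_path.open("a") as fh:
            fh.write(json.dumps({"key": key, "value": value}) + "\n")
            fh.flush()
            os.fsync(fh.fileno())
        self.memtable[key] = value

    def flush(self) -> Path:
        """Write the memtable as a key-sorted run, then reset WAL and memtable."""
        index = len(list(self.write_dir.glob("run-*.jsonl")))
        run = self.write_dir / f"run-{index:04d}.jsonl"
        with run.open("w") as fh:
            for key in sorted(self.memtable):
                fh.write(json.dumps({"key": key, "value": self.memtable[key]}) + "\n")
        self.memtable.clear()
        self.wal_path.write_text("")
        return run
```

Writing keys in sorted order at flush time is what makes the resulting runs cheap to merge later, which is the property compaction relies on.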
See docs/development/PRD.md for milestone details.
Technical Details
Supported Formats
- Cassandra 5.0+: 'oa' format with BTI support
- File Types: Data.db, Index.db, Summary.db, Statistics.db
- Compression: LZ4, Snappy, Deflate, Zstd
Performance Targets
- Parse Speed: 1GB files in <10 seconds
- Memory Usage: <128MB for large SSTables
- Query Latency: Sub-millisecond partition lookups
Language Bindings
- Python: Production-ready sync API (see Python README)
- Node.js: Production-ready Promise API (see Node.js README)
- WASM: Planned (M6+)
Resources
- Documentation site: https://pmcfadin.github.io/cqlite/ — user docs, SSTable format guide, agent integration docs
- API docs (rustdoc): latest tag · published per release tag at https://pmcfadin.github.io/cqlite/api/<tag>/
- Changelog: CHANGELOG.md (what each tagged release contains)
- Performance: Methodology, local repro, and CI gate policy
- CQL Grammar: Patrick's Antlr4 CQL Grammar
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Community
- Questions & ideas: GitHub Discussions
- Bugs & feature requests: GitHub Issues
- Contributing: see CONTRIBUTING.md and our Code of Conduct
CQLite is an independent open-source project, not an Apache Software Foundation project. It is built in the spirit of the Apache Cassandra community, with the goal of contributing it upstream as it matures.
License
Licensed under the Apache License, Version 2.0. See LICENSE for details.
Acknowledgments
Special thanks to the Apache Cassandra community and the many contributors who make projects like this possible. CQLite builds on decades of database engineering innovation from the Cassandra project.
Note: M1 through M5 milestones are complete and the project is at v0.11.0. Core SSTable reading, CLI, output writers (including Parquet), Python and Node.js bindings, and write support with STCS compaction are production-ready. Next: M6 (WASM bindings) and M7 (performance validation + v1.0).