<p align="center">
<img src="website/src/assets/cqlite.png" alt="CQLite" width="480">
</p>
<p align="center"><strong>A high-performance Rust library for local Apache Cassandra SSTable access</strong></p>
<p align="center">
<a href="https://github.com/pmcfadin/cqlite/actions/workflows/ci.yml"><img src="https://github.com/pmcfadin/cqlite/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
<a href="https://crates.io/crates/cqlite-cli"><img src="https://img.shields.io/crates/v/cqlite-cli.svg?label=crates.io%20cqlite-cli" alt="crates.io"></a>
<a href="https://docs.rs/cqlite-core"><img src="https://img.shields.io/docsrs/cqlite-core.svg?label=docs.rs" alt="docs.rs"></a>
<a href="https://pypi.org/project/cqlite-py/"><img src="https://img.shields.io/pypi/v/cqlite-py.svg?label=pypi%20cqlite-py" alt="PyPI"></a>
<a href="https://www.npmjs.com/package/@cqlite/node"><img src="https://img.shields.io/npm/v/@cqlite/node.svg?label=npm%20%40cqlite%2Fnode" alt="npm"></a>
<a href="https://pmcfadin.github.io/cqlite/"><img src="https://img.shields.io/badge/docs-pmcfadin.github.io%2Fcqlite-blue.svg" alt="Docs"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="Apache License"></a>
<a href="https://www.rust-lang.org"><img src="https://img.shields.io/badge/rust-1.85+-red.svg" alt="Rust"></a>
<a href="https://cassandra.apache.org"><img src="https://img.shields.io/badge/cassandra-5.0+-green.svg" alt="Cassandra"></a>
</p>
> **Status**: v0.11.0. Core reading, CLI, output writers, Python & Node.js bindings, and write support (with STCS compaction) are production-ready. See [CHANGELOG.md](CHANGELOG.md).
CQLite provides SQLite-like local access to Apache Cassandra SSTables, enabling developers to read Cassandra 5.0+ data files without cluster dependencies. Built in Rust for performance and safety.
## Documentation
Full documentation is at **[https://pmcfadin.github.io/cqlite/](https://pmcfadin.github.io/cqlite/)**:
| Section | Link |
|---|---|
| User Docs: install, quick start, CLI, Python, Node.js | [/cqlite/user-docs/](https://pmcfadin.github.io/cqlite/user-docs/) |
| SSTable Format Guide: binary format deep-dive | [/cqlite/sstable-format/](https://pmcfadin.github.io/cqlite/sstable-format/) |
| For Agents: Using CQLite (LLM/agent integration) | [/cqlite/agents-using/](https://pmcfadin.github.io/cqlite/agents-using/) |
| For Agents: Developing CQLite (contributor doctrine, gate contract) | [/cqlite/agents-developing/](https://pmcfadin.github.io/cqlite/agents-developing/) |
## Vision
CQLite aims to become the standard tool for Cassandra SSTable manipulation outside of the main Apache Cassandra project, enabling new workflows for data analytics, migration, testing, and edge computing.
## Project Leadership
CQLite is designed by **Patrick McFadin**, Apache Cassandra PMC member with 13 years of Cassandra experience. The project embodies Apache Cassandra community values and will be donated to the Apache Cassandra project upon maturity.
## Install
### CLI (from crates.io, requires Rust 1.85+)
```bash
cargo install cqlite-cli # installs the `cqlite` binary
cqlite --help
```
### CLI (prebuilt binaries, no Rust toolchain required)
Each [GitHub release](https://github.com/pmcfadin/cqlite/releases) attaches a
prebuilt `cqlite` CLI binary for the common platforms, each with a `.sha256`
checksum sidecar:
| Platform | Release asset |
|---|---|
| macOS (Apple Silicon) | `cqlite-aarch64-apple-darwin.tar.gz` |
| macOS (Intel) | `cqlite-x86_64-apple-darwin.tar.gz` |
| Linux x86_64 (glibc) | `cqlite-x86_64-unknown-linux-gnu.tar.gz` |
| Linux x86_64 (static musl) | `cqlite-x86_64-unknown-linux-musl.tar.gz` |
| Linux arm64 (glibc) | `cqlite-aarch64-unknown-linux-gnu.tar.gz` |
| Windows x86_64 | `cqlite-x86_64-pc-windows-gnu.zip` |
```bash
# Example: macOS Apple Silicon
TARGET=aarch64-apple-darwin
curl -fsSLO https://github.com/pmcfadin/cqlite/releases/latest/download/cqlite-$TARGET.tar.gz
curl -fsSLO https://github.com/pmcfadin/cqlite/releases/latest/download/cqlite-$TARGET.tar.gz.sha256
shasum -a 256 -c cqlite-$TARGET.tar.gz.sha256 # verify (use sha256sum -c on Linux)
tar xzf cqlite-$TARGET.tar.gz
./cqlite --help
```
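The checksum step can also be scripted, for example in an install script that should not depend on `shasum` or `sha256sum` being present. The following is a minimal Python sketch, assuming the `.sha256` sidecar uses the line format that `shasum -c` expects (hex digest first, then the file name). The helper names `sha256_of` and `verify_sidecar` are illustrative and not part of CQLite.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(archive: Path, sidecar: Path) -> bool:
    """Compare an archive against the digest stored in its .sha256 sidecar.

    Assumption: the first whitespace-separated field of the sidecar
    is the expected hex digest.
    """
    expected = sidecar.read_text().split()[0].lower()
    return sha256_of(archive) == expected
```

Calling `verify_sidecar(Path("cqlite-aarch64-apple-darwin.tar.gz"), Path("cqlite-aarch64-apple-darwin.tar.gz.sha256"))` after the two `curl` downloads should return `True` for an intact archive.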
### Rust library
```bash
cargo add cqlite-core # use cqlite-core as a dependency
```
See [Using cqlite-core as a dependency](docs/using-cqlite-core-as-a-dependency.md) and the [API docs](https://docs.rs/cqlite-core).
### Language bindings
```bash
pip install cqlite-py # Python
npm install @cqlite/node # Node.js
```
## Quick Start
```bash
# Clone the repository
git clone https://github.com/pmcfadin/cqlite.git
cd cqlite
# Build the project
cargo build --release
# Run the CLI tool
cargo run --package cqlite-cli -- \
--schema test-data/schemas/basic-types.cql \
--data-dir test-data/datasets/sstables \
--query "SELECT * FROM test_basic.simple_table LIMIT 5" \
--out json
```
### Python
```bash
pip install cqlite-py
```
```python
import cqlite
with cqlite.open('path/to/sstables', schema='schema.cql') as db:
    for row in db.execute('SELECT * FROM keyspace.table LIMIT 5'):
        print(row.to_dict())
```
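Query results can be passed straight to the standard library. The sketch below relies only on what the example above shows, namely that each row exposes `to_dict()`. `rows_to_csv` is a hypothetical helper, not part of `cqlite-py`, and it assumes every row has the same columns as the first one.

```python
import csv
from typing import Iterable, TextIO


def rows_to_csv(rows: Iterable, out: TextIO) -> int:
    """Write rows (objects exposing to_dict()) as CSV and return the row count.

    The header is taken from the keys of the first row.
    """
    writer = None
    count = 0
    for row in rows:
        record = row.to_dict()
        if writer is None:
            # Create the writer lazily so the header matches the first row.
            writer = csv.DictWriter(out, fieldnames=list(record))
            writer.writeheader()
        writer.writerow(record)
        count += 1
    return count
```

With the `db` handle from the example above, `rows_to_csv(db.execute('SELECT * FROM keyspace.table LIMIT 5'), sys.stdout)` would print the result as CSV. For large tables, the CLI's `cqlite export` command is probably the better fit.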
### Node.js
```bash
npm install @cqlite/node
```
```typescript
import { Database } from '@cqlite/node';
const db = await Database.open('path/to/sstables', { schema: 'schema.cql' });
const result = await db.execute('SELECT * FROM keyspace.table LIMIT 5');
for (const row of result.rows) {
  console.log(row.name);
}
await db.close();
```
## Write Support
Write support shipped in CQLite v0.9.0 (M5) and is available across all interfaces:
Rust core, Python, Node.js, and CLI. Written data flushes to portable Cassandra 5.0
SSTables that Cassandra can load directly via `nodetool refresh`.
The schema file below is included in the repository at
`test-data/schemas/write-test.cql`.
### Python
```python
import cqlite
# Open in writable mode; write_dir stores the WAL and flushed SSTables
with cqlite.open(
    'test-data/datasets/sstables',
    schema='test-data/schemas/write-test.cql',
    writable=True,
    write_dir='/tmp/my-writes',
) as db:
    db.execute(
        "INSERT INTO test_basic.simple_table (id, name, age) "
        "VALUES (11111111-1111-1111-1111-111111111111, 'Alice', 30)"
    )
    path = db.flush_run()
    print(f'Flushed SSTable: {path}')
```
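When inserting many rows, the statement text has to be built somewhere. This README does not say whether `execute` accepts bound parameters, so the sketch below simply renders values as CQL literals. `cql_literal` and `build_insert` are hypothetical helpers, not part of `cqlite-py`, and they cover only the types used in the example (UUID, text, integer, boolean).

```python
import uuid


def cql_literal(value) -> str:
    """Render a Python value as a CQL literal (uuid, bool, int, str only)."""
    if isinstance(value, uuid.UUID):
        return str(value)  # UUID literals are unquoted in CQL
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # CQL escapes a single quote inside a string by doubling it.
        return "'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported type for this sketch: {type(value).__name__}")


def build_insert(table: str, row: dict) -> str:
    """Build an INSERT statement from a column -> value mapping."""
    columns = ", ".join(row)
    values = ", ".join(cql_literal(v) for v in row.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values})"
```

For example, `db.execute(build_insert('test_basic.simple_table', {'id': uuid.uuid4(), 'name': 'Alice', 'age': 30}))` would issue the same kind of statement as the hand-written one above. Table and column names are interpolated as-is, so they should come from trusted code, not user input.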
### Node.js
```javascript
const { Database } = require('@cqlite/node');
// CommonJS modules cannot use top-level await, so run the calls inside an async function.
async function main() {
  const db = await Database.open('test-data/datasets/sstables', {
    schema: 'test-data/schemas/write-test.cql',
    writable: true,
    writeDir: '/tmp/my-writes',
  });
  await db.execute(
    "INSERT INTO test_basic.simple_table (id, name, age) " +
      "VALUES (22222222-2222-2222-2222-222222222222, 'Bob', 25)"
  );
  const path = await db.flushRun();
  console.log('Flushed SSTable:', path);
  await db.close();
}
main().catch(console.error);
```
### CLI
```bash
# Build with write support
cargo build --package cqlite-cli --features write-support
# Write via CQL INSERT
cargo run --package cqlite-cli --features write-support -- \
--writable --write-dir /tmp/my-writes \
--schema test-data/schemas/write-test.cql \
--execute "INSERT INTO test_basic.simple_table (id, name, age) \
VALUES (33333333-3333-3333-3333-333333333333, 'Carol', 28)"
# Flush memtable to SSTable
cargo run --package cqlite-cli --features write-support -- \
--writable --write-dir /tmp/my-writes \
--schema test-data/schemas/write-test.cql \
--flush
```
See [docs/write-support.md](docs/write-support.md) for the full write guide,
including the Cassandra export workflow and known limitations. To embed
`cqlite-core` in your own Rust project (dependency line, feature flags, and a
compiling write example), see
[docs/using-cqlite-core-as-a-dependency.md](docs/using-cqlite-core-as-a-dependency.md).
## Feature Flags
`cqlite-core` gates optional functionality behind Cargo features. The table below
maps the public API you're likely to reach for to the feature that enables it.
| API surface | Feature | Default |
|---|---|---|
| Read / query path (`Database::open`, `execute`, `scan`, `get`) | `state_machine` | ✅ yes |
| Compression (LZ4 / Snappy / Deflate / Zstd) | `all-compression` | ✅ yes |
| Write path (`WriteEngine`, `Mutation`, `WriteEngine::write`/`flush`) | `write-support` | ✅ yes |
| `Database::flush` / `Database::compact` (high-level convenience) | `experimental` | opt-in |
| CLI ingestion / REPL helpers (`cqlite-cli`) | `cli-helpers` | opt-in |
| Performance metrics collection | `metrics` | opt-in |
Default features are `["all-compression", "state_machine", "write-support"]`
(see `cqlite-core/Cargo.toml`). `write-support` was folded into the defaults in
[#558](https://github.com/pmcfadin/cqlite/issues/558). It gates only first-party
code and adds **no extra dependencies**, so read-only consumers pay nothing for it.
`flush`/`compact` on the high-level `Database` type remain behind `experimental`;
the equivalent engine-level `WriteEngine::flush` is part of `write-support`.
### Building with Custom Features
```bash
# Default build (read + write + compression)
cargo build
# Read-only consumer: drop the write path (still zero-cost to keep it, but explicit)
cargo build -p cqlite-core --no-default-features --features all-compression,state_machine
# Opt into high-level Database::flush / compact
cargo build -p cqlite-core --features experimental
# Minimal build (no compression, no query engine)
cargo build -p cqlite-core --no-default-features
```
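Build scripts that need different feature sets per target can derive the `cargo build` arguments from the feature table instead of hard-coding them. The sketch below is one way to do that in Python. The capability keys on the left of `FEATURE_FOR` are illustrative names chosen for this example; only the feature names on the right come from the table. It also assumes the features can be enabled independently, which this README does not state.

```python
# Capability -> Cargo feature, transcribed from the feature table.
# The keys are hypothetical labels, not CQLite identifiers.
FEATURE_FOR = {
    "read": "state_machine",
    "compression": "all-compression",
    "write": "write-support",
    "database-flush-compact": "experimental",
    "cli-helpers": "cli-helpers",
    "metrics": "metrics",
}


def cargo_build_args(capabilities):
    """Return a `cargo build` argv that enables only the requested capabilities."""
    features = sorted({FEATURE_FOR[c] for c in capabilities})
    args = ["cargo", "build", "-p", "cqlite-core", "--no-default-features"]
    if features:
        args += ["--features", ",".join(features)]
    return args
```

`cargo_build_args(["read", "compression"])` reproduces the read-only command shown above, and an empty list gives the minimal build.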
## Features
### ✅ Complete (M1/M2)
- [x] Cassandra 5+ SSTable format parsing (100% of test tables)
- [x] All CQL types including collections and UDTs
- [x] All compression codecs (LZ4, Snappy, Deflate, Zstd)
- [x] CLI tool with REPL and one-shot query modes
- [x] SELECT with WHERE clause (partition/clustering key equality)
- [x] Output formats: Table, JSON, CSV
### ✅ M3 Complete (Jan 2026)
- [x] Parquet output format with Snappy compression
- [x] Export command (`cqlite export`)
- [x] Streaming export for large datasets
- [x] Output formats: CSV, JSON, Parquet, CQL
### ✅ M4 Complete (Jan 2026)
- [x] Python bindings with full CQL type support
- [x] Node.js bindings with TypeScript definitions
- [x] Streaming API for memory-efficient queries
- [x] pip/npm installable packages (5 platform builds each)
- [x] Type stubs for IDE support (Python mypy, TypeScript)
### ✅ M5 Complete: v0.9.0 (May 2026)
- [x] Write support: WAL + memtable + flush to Cassandra SSTables
- [x] STCS compaction via `maintenance_step()`
- [x] Write API in Python, Node.js, and CLI
- [x] Full type coverage: Inet, Varint, Duration, Tuple, Frozen
- [x] E2E readback gate: write → flush → Cassandra `nodetool refresh` → verify
### ✅ Since v0.9.0 (v0.10 to v0.11.0, Jun 2026)
- [x] Embeddable Parquet writer in `cqlite-core` (behind a `parquet` feature) + `export_parquet` in Python/Node
- [x] Version-gated reads for the Cassandra 5.0 `oa` format; graceful handling of `da` (BTI)
- [x] Real BTI trie node-type dispatch and schema-typed query result columns
- [x] Published documentation site at [pmcfadin.github.io/cqlite](https://pmcfadin.github.io/cqlite/)
- See [CHANGELOG.md](CHANGELOG.md) for the full per-release detail
### Roadmap
- [ ] M6: WASM bindings for browser deployment
- [ ] M7: Performance validation + v1.0 release
## Architecture Highlights
**Design Philosophy:**
- **No cluster dependency** - Read and write SSTables directly, with no running Cassandra node
- **CQL parser** - Native CQL support using an Antlr4 grammar
- **Cassandra 5+ focus** - Modern 'oa' format with BTI support
- **Memory efficient** - <128MB usage target for large files
- **Self-contained engine** - Pure-Rust parsing and writing, including STCS compaction
## Getting Involved
CQLite is developed in the open as an Apache-licensed project. We welcome contributions from the Cassandra community!
### Development Setup
```bash
# Prerequisites
# - Rust 1.85+
# Clone and build
git clone https://github.com/pmcfadin/cqlite.git
cd cqlite
cargo build
# Fetch test data (JSONL reference files are in git, SSTable binaries fetched separately)
bash test-data/scripts/fetch-datasets.sh
# Run tests
env CQLITE_DATASETS_ROOT=$PWD/test-data/datasets cargo test --package cqlite-core
```
### Contributing
1. **Check Issues**: Look for `good-first-issue` labels
2. **Discuss**: Join our community discussions
3. **Code**: Follow Rust best practices and include tests
4. **Test**: Ensure compatibility with real Cassandra data
5. **Document**: Update docs for user-facing changes
## Current Status
### ✅ M1 Complete (Dec 2025)
- All SSTable components parsed (Data.db, Index.db, Summary.db, Statistics.db, TOC)
- 33/33 test tables passing (100% validation)
- All 21 CQL primitive types + collections + UDTs + frozen types
- All compression algorithms working
- Tiered test coverage targets (see [PRD Section 5.1](docs/development/PRD.md#51--tiered-coverage-targets))
### ✅ M2 Complete (Jan 2026)
- CLI with one-shot and REPL modes
- SELECT queries with WHERE clause support
- Multiple output formats (Table, JSON, CSV)
### ✅ M3 Complete (Jan 2026)
- Parquet output format with Snappy compression
- Export command with CSV, JSON, Parquet, CQL formats
- Streaming export for memory-efficient large dataset handling
- Progress bar and statistics for exports
### ✅ M4 Complete (Jan 2026)
- Python bindings via PyO3 with sync-first API
- Node.js bindings via napi-rs with Promise-based API
- Full CQL type system (20+ types including collections, UDTs)
- Thread-safe database handles
- 500+ tests with 98%+ pass rate across both bindings
### ✅ M5 Complete: v0.9.0 (May 2026)
- Write support: WAL-backed memtable + flush to portable Cassandra 5.0 SSTables
- STCS compaction (`maintenance_step()`)
- Write API exposed in Python (`flush_run`, `maintenance_step`, `write_stats`),
Node.js (`flushRun`, `maintenanceStep`, `writeStats`), and CLI (`--writable`,
`--write-dir`, `--flush`, `maintenance`, `write-stats`, `export-sstable`)
- Type roundtrips verified for all major types including Inet, Varint, Duration, Tuple, Frozen
- E2E validation against live Cassandra 5.0 (write → flush → `nodetool refresh` → `cqlsh`)
See [docs/development/PRD.md](docs/development/PRD.md) for milestone details.
## Technical Details
### Supported Formats
- **Cassandra 5.0+**: 'oa' format with BTI support
- **File Types**: Data.db, Index.db, Summary.db, Statistics.db
- **Compression**: LZ4, Snappy, Deflate, Zstd
### Performance Targets
- **Parse Speed**: 1GB files in <10 seconds
- **Memory Usage**: <128MB for large SSTables
- **Query Latency**: Sub-millisecond partition lookups
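
The parse-speed target implies a throughput floor: 1 GB in under 10 seconds means sustaining more than 100 MB/s. That figure takes 1 GB as 10^9 bytes, which is an assumption, since the target does not say whether GB or GiB is meant. A small sketch for checking a measured run against that floor follows; the function names are illustrative and not part of CQLite's benchmark tooling.

```python
TARGET_BYTES = 1_000_000_000  # 1 GB, assumed to be decimal
TARGET_SECONDS = 10.0


def throughput_mb_s(num_bytes: int, seconds: float) -> float:
    """Throughput in MB/s, with 1 MB = 10**6 bytes."""
    return num_bytes / seconds / 1_000_000


def meets_parse_target(num_bytes: int, seconds: float) -> bool:
    """True if the measured run is faster than the 1 GB / 10 s target rate."""
    floor = throughput_mb_s(TARGET_BYTES, TARGET_SECONDS)
    return throughput_mb_s(num_bytes, seconds) > floor
```

A 500 MB SSTable parsed in 4 seconds (125 MB/s) would pass this check, while the same file parsed in 6 seconds (about 83 MB/s) would not.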
### Language Bindings
- **Python**: Production-ready sync API (see [Python README](bindings/python/README.md))
- **Node.js**: Production-ready Promise API (see [Node.js README](bindings/node/README.md))
- **WASM**: Planned (M6+)
## Resources
- **Documentation site**: [https://pmcfadin.github.io/cqlite/](https://pmcfadin.github.io/cqlite/) (user docs, SSTable format guide, agent integration docs)
- **API docs (rustdoc)**: [latest tag](https://pmcfadin.github.io/cqlite/api/latest/) · published per release tag at `https://pmcfadin.github.io/cqlite/api/<tag>/`
- **Changelog**: [CHANGELOG.md](CHANGELOG.md) (what each tagged release contains)
- **Performance**: [Methodology, local repro, and CI gate policy](docs/performance.md)
- **CQL Grammar**: [Patrick's Antlr4 CQL Grammar](https://github.com/pmcfadin/cassandra-antlr4-grammar)
- **Issues**: [GitHub Issues](https://github.com/pmcfadin/cqlite/issues)
- **Discussions**: [GitHub Discussions](https://github.com/pmcfadin/cqlite/discussions)
## Community
- **Questions & ideas**: [GitHub Discussions](https://github.com/pmcfadin/cqlite/discussions)
- **Bugs & feature requests**: [GitHub Issues](https://github.com/pmcfadin/cqlite/issues)
- **Contributing**: see [CONTRIBUTING.md](CONTRIBUTING.md) and our [Code of Conduct](CODE_OF_CONDUCT.md)
CQLite is an independent open-source project, not an Apache Software Foundation
project. It is built in the spirit of the Apache Cassandra community, with the
goal of contributing it upstream as it matures.
## License
Licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for details.
## Acknowledgments
Special thanks to the Apache Cassandra community and the many contributors who make projects like this possible. CQLite builds on decades of database engineering innovation from the Cassandra project.
---
**Note**: M1 through M5 milestones are complete and the project is at **v0.11.0**. Core SSTable reading, CLI, output writers (including Parquet), Python and Node.js bindings, and write support with STCS compaction are production-ready. Next: M6 (WASM bindings) and M7 (performance validation + v1.0).