sqlrite 1.0.2

RAG-oriented SQLite wrapper for AI agent workloads
Documentation
# Submission Notes

## Current framing

The manuscript is now strongest as:

- an arXiv systems paper
- a software and engineering paper
- a reproducibility-oriented report on an embedded-first retrieval engine

## What the paper now argues

The current argument is narrower and stronger than the earlier draft.

It does **not** argue that SQLRite is universally the fastest retrieval engine.
It argues that:

- SQLRite is an embedded-first retrieval system built around SQL-native access.
- SQLRite is currently competitive on its target filtered embedded workload.
- SQLRite's hybrid retrieval path can produce a better quality-throughput tradeoff on a judged public dataset than the pgvector baseline used here.

## Evidence the paper depends on

Primary:

- `embedded_competitive_snapshot.json`
- `public_dataset_results.json`

Supporting:

- `embedded_competitive_snapshot.md`
- `public_dataset_results.md`
- `run_public_dataset_eval.py`

## Claim discipline

Keep these boundaries explicit in the final submission:

- benchmark leadership is workload-specific
- embedded mode is SQLRite's strongest deployment path
- compact HTTP is a lower-overhead served path, not a replacement for the embedded claim
- public evaluation is still limited to SciFact in the current revision
- deterministic local embeddings improve reproducibility but do not replace a broader production-model evaluation

## What would strengthen the next revision

1. Add at least one more BEIR dataset.
2. Repeat the public benchmark on a second machine class.
3. Add a tighter apples-to-apples service benchmark that compares compact HTTP against equivalent low-overhead served paths.
4. Add one chart showing the embedded-versus-served SQLRite gap directly.