citenexus-core
CiteNexus's Rust engine — one core, FFI for all languages
(SPEC-PORTS-v1 §3.4). Ships alongside the Python
library in this repo; the Python extractors remain the behavior reference and
tests/core/test_rust_parity.py proves byte-identical output through the
real C ABI.
What's in (and coming)
| Area | Status |
|---|---|
| extract — txt · csv · md · html · docx · pptx (OOXML-direct) | ✅ implemented, parity-tested |
| extract — pdf (pdfium, runtime-bound) | behind the pdf feature |
| store — Lance (upsert/search/scan/drop, merge-insert by eu_id) | ✅ implemented; tests/core/test_rust_store_parity.py proves Rust-written tables are read (scan + search) by Python's LanceVectorStore and vice versa (same URI, same bytes) |
| detect — fastText lid.176 (pure-Rust fasttext crate) | ✅ implemented, dense lid.176.bin only: the crate's quantized (.ftz) inference diverges from upstream in 0.8.0, so quantized models are refused with an error (see src/detect.rs) |
The core is the engine, not the brain: orchestration, cite-or-abstain, hooks, and model IO stay in each host language. Boundary: JSON in/out, no callbacks.
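As an illustration of the "JSON in/out" boundary, the sketch below shows one way an internal result could be collapsed into the single JSON string that crosses the FFI line, using the {"error": ...} shape listed under C ABI. The helper names `json_escape` and `to_boundary_json` are hypothetical and are not taken from the crate. A real implementation would more likely rely on a JSON library than on hand-written escaping.

```rust
/// Escape a string so it can sit inside a JSON string literal.
/// (Hypothetical helper, hand-rolled to keep the sketch dependency-free.)
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Collapse a result into one JSON string.
/// Assumption: the `Ok` payload is already valid JSON produced by the engine.
fn to_boundary_json(result: Result<String, String>) -> String {
    match result {
        Ok(json) => json,
        Err(msg) => format!("{{\"error\":\"{}\"}}", json_escape(&msg)),
    }
}
```

With this shape a host language only has to parse one string and check for an "error" key, which is what makes a callback-free boundary workable.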
C ABI
char* ; // -> ExtractedDoc JSON or {"error": ...}
// store — opaque handle, JSON rows, {"error": ...} on failure
void* ; // NULL on failure
char* ; // {"ok":true}
char* ; // rows + _distance
char* ; // limit < 0 = all
char* ; // {"ok":true}
void ;
// detect — fastText lid.176 (dense .bin; caller supplies the model path)
void* ; // NULL on failure
char* ; // {"language":"fr","confidence":0.98}
void ;
void ; // releases every char* above
const char* ;
Bindings: cgo (Go, required) · napi-rs (TS, parity path) · pyo3/ctypes (Python).
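The ownership rules implied by the listing above (opaque `void*` handles that are NULL on failure, `char*` results released by a single free function) can be sketched as follows. Every name here (`Demo`, `demo_open`, `demo_describe`, `demo_close`, `demo_free_string`) is hypothetical and chosen only for illustration; none of them are the crate's real exports. The `#[unsafe(no_mangle)]` spelling assumes Rust 1.82 or newer; older toolchains spell it `#[no_mangle]`.

```rust
use std::ffi::{c_char, c_void, CStr, CString};
use std::ptr;

/// Hypothetical state hidden behind the opaque handle.
struct Demo {
    uri: String,
}

/// Open a handle. Returns NULL on failure (here: NULL or non-UTF-8 input).
#[unsafe(no_mangle)]
pub extern "C" fn demo_open(uri: *const c_char) -> *mut c_void {
    if uri.is_null() {
        return ptr::null_mut();
    }
    // SAFETY: the caller promises `uri` is a valid NUL-terminated string.
    match unsafe { CStr::from_ptr(uri) }.to_str() {
        Ok(s) => Box::into_raw(Box::new(Demo { uri: s.to_owned() })) as *mut c_void,
        Err(_) => ptr::null_mut(),
    }
}

/// Return a heap-allocated JSON string owned by the caller,
/// who must hand it back to `demo_free_string`.
#[unsafe(no_mangle)]
pub extern "C" fn demo_describe(handle: *mut c_void) -> *mut c_char {
    let json = if handle.is_null() {
        String::from("{\"error\":\"null handle\"}")
    } else {
        // SAFETY: non-null handles come from `demo_open` and are not yet closed.
        let demo = unsafe { &*(handle as *const Demo) };
        format!("{{\"ok\":true,\"uri_len\":{}}}", demo.uri.len())
    };
    // The JSON built above never contains an interior NUL byte.
    CString::new(json).map(CString::into_raw).unwrap_or(ptr::null_mut())
}

/// Destroy a handle created by `demo_open`. NULL is ignored.
#[unsafe(no_mangle)]
pub extern "C" fn demo_close(handle: *mut c_void) {
    if !handle.is_null() {
        // SAFETY: reclaims the Box allocated in `demo_open`.
        drop(unsafe { Box::from_raw(handle as *mut Demo) });
    }
}

/// Release a string returned by this library. NULL is ignored.
#[unsafe(no_mangle)]
pub extern "C" fn demo_free_string(s: *mut c_char) {
    if !s.is_null() {
        // SAFETY: `s` was produced by `CString::into_raw` in this library.
        drop(unsafe { CString::from_raw(s) });
    }
}
```

Strings go back through the library's own free function because they were allocated by Rust's allocator; releasing them with the host's `free` would be undefined behavior.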
Develop
Build prerequisite: protoc (lance's build scripts generate protobuf code) —
brew install protobuf on macOS. The lid.176 real-model tests skip unless
assets/models/lid.176.bin exists (or CITENEXUS_LID176_PATH points at it);
nothing is downloaded at test time.
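The skip rule described above could look roughly like this in a test helper. This is a sketch, not the crate's actual test code: the function names are hypothetical, and checking the environment variable before the default path is an assumption, since the text does not state a precedence.

```rust
use std::env;
use std::path::PathBuf;

/// Return the first candidate that is an existing file: the override
/// (if any) is tried first, then the default location.
fn resolve_model(env_override: Option<PathBuf>, default: PathBuf) -> Option<PathBuf> {
    env_override
        .into_iter()
        .chain(std::iter::once(default))
        .find(|p| p.is_file())
}

/// Look for the lid.176 model using the two locations named in this README.
fn lid176_model_path() -> Option<PathBuf> {
    resolve_model(
        env::var_os("CITENEXUS_LID176_PATH").map(PathBuf::from),
        PathBuf::from("assets/models/lid.176.bin"),
    )
}

#[test]
fn real_model_smoke_test() {
    // Skip quietly when the model is absent; nothing is downloaded.
    let path = match lid176_model_path() {
        Some(p) => p,
        None => {
            eprintln!("lid.176.bin not found; skipping");
            return;
        }
    };
    // A real test would load the model from `path` here.
    assert!(path.is_file());
}
```

Keeping the lookup in a pure function that takes the override as a parameter makes the skip logic testable without mutating the process environment.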