recoco-core 0.2.1

recoco-core is the core library of ReCoco. It is nearly identical to the main recoco crate, which is a thin wrapper around recoco-core and the other sub-crates.
recoco

This is the core package for ReCoco. It provides direct access to ReCoco's complete functionality. ReCoco is a Rust-only fork of CocoIndex.

When to use this crate

Use recoco when you want:

  • ✅ To use multiple ReCoco components together
  • ✅ Feature parity with the full ReCoco ecosystem
  • ✅ Easy dependency management

Use individual crates (recoco-core, recoco-utils, recoco-splitters) when you want:

  • ⚡ Fine-grained dependency control
  • ⚡ Minimal compile times
  • ⚡ Only specific functionality (e.g., just utils)
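
If only the core library is needed, a manifest along these lines may be enough. This is a minimal sketch, not a verified configuration: it assumes recoco-core accepts the same feature names as recoco (the Feature Flags section says recoco mirrors them), and the feature chosen here is only an example.

```toml
[dependencies]
# Depend on the core crate directly instead of the recoco wrapper.
# Assumption: recoco-core exposes the same feature names as recoco.
recoco-core = { version = "0.2", default-features = false, features = [
    "function-split",
] }
```

Setting `default-features = false` keeps the dependency tree limited to what is listed explicitly.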

Installation

[dependencies]
recoco = { version = "0.2", features = ["function-split", "source-postgres"] }

📦 Feature Flags

This crate mirrors all features from recoco-core. Enable only what you need to keep dependencies minimal.
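
One way to keep dependencies minimal in a downstream crate is to forward its own Cargo features to ReCoco's, so heavy connectors are compiled only on request. The sketch below uses standard Cargo `dependency/feature` syntax. The downstream feature names `local-files` and `postgres` are hypothetical, and the ReCoco feature names are taken from the tables in this README.

```toml
[dependencies]
# Start from no ReCoco features; the features below opt in.
recoco = { version = "0.2", default-features = false }

[features]
# Hypothetical feature names for the downstream crate.
default = ["local-files"]
local-files = ["recoco/source-local-file", "recoco/function-split"]
postgres = ["recoco/source-postgres", "recoco/target-postgres"]
```

With this layout, `cargo build --no-default-features --features postgres` would compile only the PostgreSQL source and target connectors.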

🎯 Default Features

recoco = "0.2"  # Includes: persistence, server, source-local-file

📦 Feature Bundles (Convenience)

Feature | Description | Use When
full | Everything (⚠️ very heavy) | You need all functionality
all-sources | All data source connectors | Working with multiple data sources
all-targets | All data target connectors | Writing to multiple databases
all-functions | All transformation functions | Need all data processing capabilities
all-llm-providers | All LLM provider integrations | Working with multiple AI APIs
all-splitter-languages | All Tree-sitter grammars | Processing many programming languages

📥 Sources (Data Ingestion)

Feature | Description
source-local-file | Local filesystem (✅ default)
source-postgres | PostgreSQL with CDC
source-s3 | Amazon S3
source-azure | Azure Blob Storage
source-gdrive | Google Drive

📤 Targets (Data Persistence)

Feature | Description
target-postgres | PostgreSQL database
target-qdrant | Qdrant vector database
target-neo4j | Neo4j graph database
target-kuzu | Kùzu embedded graph database

⚙️ Functions (Data Transformations)

Feature | Description
function-split | Text splitting (recursive, semantic)
function-embed | Generate text embeddings
function-extract-llm | LLM-based data extraction
function-detect-lang | Programming language detection
function-json | JSON/JSON5 parsing

🤖 LLM Providers

Feature | Provider
provider-openai | OpenAI (GPT-4, etc.)
provider-anthropic | Anthropic (Claude)
provider-azure | Azure OpenAI
provider-gemini | Google Gemini
provider-bedrock | AWS Bedrock
provider-ollama | Ollama (local LLMs)
provider-voyage | Voyage AI (embeddings)
provider-litellm | LiteLLM (unified gateway)
provider-openrouter | OpenRouter (multi-provider)
provider-vllm | vLLM (inference server)

🔤 Splitter Languages (Tree-sitter Grammars)

Enable specific programming language support for code splitting:

Feature | Language
splitter-language-c | C
splitter-language-c-sharp | C#
splitter-language-cpp | C++
splitter-language-css | CSS
splitter-language-fortran | Fortran
splitter-language-go | Go
splitter-language-html | HTML
splitter-language-java | Java
splitter-language-javascript | JavaScript
splitter-language-json | JSON
splitter-language-kotlin | Kotlin
splitter-language-markdown | Markdown
splitter-language-php | PHP
splitter-language-python | Python
splitter-language-r | R
splitter-language-ruby | Ruby
splitter-language-rust | Rust
splitter-language-scala | Scala
splitter-language-sql | SQL
splitter-language-swift | Swift
splitter-language-toml | TOML
splitter-language-typescript | TypeScript
splitter-language-xml | XML
splitter-language-yaml | YAML

🏗️ Core Features

Feature | Description
persistence | Database-backed state tracking (✅ default)
server | HTTP server components (✅ default)
json-schema | JSON Schema support

🎯 Common Use Cases

Local File Processing

recoco = { version = "0.2", default-features = false, features = [
    "source-local-file",
    "function-split",
    "splitter-language-rust"
]}

RAG Pipeline with OpenAI

recoco = { version = "0.2", features = [
    "source-s3",
    "function-split",
    "function-embed",
    "provider-openai",
    "target-qdrant",
    "all-splitter-languages"
]}

Database ETL

recoco = { version = "0.2", features = [
    "source-postgres",
    "target-postgres",
    "function-json"
]}

Multi-Cloud Data Sync

recoco = { version = "0.2", features = [
    "all-sources",
    "all-targets",
    "batching"
]}

Lightweight Transient Processing (No Database)

recoco = { version = "0.2", default-features = false, features = [
    "function-split",
    "function-json"
]}

🔧 Development

This crate is part of the ReCoco workspace:

# Build with specific features
cargo build -p recoco --features "function-split,source-postgres"

# Test with all features
cargo test -p recoco --features full

# Run examples
cargo run -p recoco --example transient --features function-split

📄 License

Apache-2.0. See the main repository for details.