recoco
This is the core package for ReCoco; it provides direct access to ReCoco's complete functionality. ReCoco is a Rust-only fork of CocoIndex.
When to use this crate
Use recoco when you want:
- ✅ To use multiple ReCoco components together
- ✅ Feature parity with the full ReCoco ecosystem
- ✅ Easy dependency management
Use individual crates (recoco-core, recoco-utils, recoco-splitters) when you want:
- ⚡ Fine-grained dependency control
- ⚡ Minimal compile times
- ⚡ Only specific functionality (e.g., just utils)
Installation
[dependencies]
recoco = { version = "0.2", features = ["function-split", "source-postgres"] }
📦 Feature Flags
This crate mirrors all features from recoco-core. Enable only what you need to keep dependencies minimal.
🎯 Default Features
recoco = "0.2"  # Includes: persistence, server, source-local-file
📦 Feature Bundles (Convenience)
| Feature | Description | Use When |
|---|---|---|
| `full` | Everything (⚠️ very heavy) | You need all functionality |
| `all-sources` | All data source connectors | Working with multiple data sources |
| `all-targets` | All data target connectors | Writing to multiple databases |
| `all-functions` | All transformation functions | Need all data processing capabilities |
| `all-llm-providers` | All LLM provider integrations | Working with multiple AI APIs |
| `all-splitter-languages` | All Tree-sitter grammars | Processing many programming languages |
📥 Sources (Data Ingestion)
| Feature | Description |
|---|---|
| `source-local-file` | Local filesystem (✅ default) |
| `source-postgres` | PostgreSQL with CDC |
| `source-s3` | Amazon S3 |
| `source-azure` | Azure Blob Storage |
| `source-gdrive` | Google Drive |
📤 Targets (Data Persistence)
| Feature | Description |
|---|---|
| `target-postgres` | PostgreSQL database |
| `target-qdrant` | Qdrant vector database |
| `target-neo4j` | Neo4j graph database |
| `target-kuzu` | Kùzu embedded graph database |
⚙️ Functions (Data Transformations)
| Feature | Description |
|---|---|
| `function-split` | Text splitting (recursive, semantic) |
| `function-embed` | Generate text embeddings |
| `function-extract-llm` | LLM-based data extraction |
| `function-detect-lang` | Programming language detection |
| `function-json` | JSON/JSON5 parsing |
🤖 LLM Providers
| Feature | Provider |
|---|---|
| `provider-openai` | OpenAI (GPT-4, etc.) |
| `provider-anthropic` | Anthropic (Claude) |
| `provider-azure` | Azure OpenAI |
| `provider-gemini` | Google Gemini |
| `provider-bedrock` | AWS Bedrock |
| `provider-ollama` | Ollama (local LLMs) |
| `provider-voyage` | Voyage AI (embeddings) |
| `provider-litellm` | LiteLLM (unified gateway) |
| `provider-openrouter` | OpenRouter (multi-provider) |
| `provider-vllm` | vLLM (inference server) |
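As a sketch of how a provider feature might be paired with a function feature, the manifest below enables LLM-based extraction with a single provider. It assumes that `function-extract-llm` needs at least one `provider-*` feature to be useful and that the two compose like ordinary Cargo features; check the crate's `Cargo.toml` if a combination fails to build.

```toml
[dependencies]
# Assumption: one provider feature is sufficient for function-extract-llm.
recoco = { version = "0.2", features = [
    "function-extract-llm",
    "provider-anthropic"
]}
```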
🔤 Splitter Languages (Tree-sitter Grammars)
Enable specific programming language support for code splitting:
| Feature | Language |
|---|---|
| `splitter-language-c` | C |
| `splitter-language-c-sharp` | C# |
| `splitter-language-cpp` | C++ |
| `splitter-language-css` | CSS |
| `splitter-language-fortran` | Fortran |
| `splitter-language-go` | Go |
| `splitter-language-html` | HTML |
| `splitter-language-java` | Java |
| `splitter-language-javascript` | JavaScript |
| `splitter-language-json` | JSON |
| `splitter-language-kotlin` | Kotlin |
| `splitter-language-markdown` | Markdown |
| `splitter-language-php` | PHP |
| `splitter-language-python` | Python |
| `splitter-language-r` | R |
| `splitter-language-ruby` | Ruby |
| `splitter-language-rust` | Rust |
| `splitter-language-scala` | Scala |
| `splitter-language-sql` | SQL |
| `splitter-language-swift` | Swift |
| `splitter-language-toml` | TOML |
| `splitter-language-typescript` | TypeScript |
| `splitter-language-xml` | XML |
| `splitter-language-yaml` | YAML |
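For instance, a project that only splits Python and TypeScript sources could enable just those two grammars rather than `all-splitter-languages`. This is a sketch built from the feature names in the table above; that the grammars are consumed through `function-split` is an assumption.

```toml
[dependencies]
# Assumption: splitter-language-* features are used together with function-split.
recoco = { version = "0.2", features = [
    "function-split",
    "splitter-language-python",
    "splitter-language-typescript"
]}
```

Enabling grammars one at a time should keep compile times lower than pulling in every Tree-sitter grammar.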
🏗️ Core Features
| Feature | Description |
|---|---|
| `persistence` | Database-backed state tracking (✅ default) |
| `server` | HTTP server components (✅ default) |
| `json-schema` | JSON Schema support |
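Because `persistence` and `server` are both enabled by default, dropping one of them means turning off default features and re-adding the ones you still want. A minimal sketch, assuming the default features can be enabled independently of one another:

```toml
[dependencies]
# Keeps persistence and the local-file source; omits the default server feature.
recoco = { version = "0.2", default-features = false, features = [
    "persistence",
    "source-local-file"
]}
```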
🎯 Common Use Cases
Local File Processing
recoco = { version = "0.2", default-features = false, features = [
"source-local-file",
"function-split",
"splitter-language-rust"
]}
RAG Pipeline with OpenAI
recoco = { version = "0.2", features = [
"source-s3",
"function-split",
"function-embed",
"provider-openai",
"target-qdrant",
"all-splitter-languages"
]}
Database ETL
recoco = { version = "0.2", features = [
"source-postgres",
"target-postgres",
"function-json"
]}
Multi-Cloud Data Sync
recoco = { version = "0.2", features = [
"all-sources",
"all-targets",
"batching"
]}
Lightweight Transient Processing (No Database)
recoco = { version = "0.2", default-features = false, features = [
"function-split",
"function-json"
]}
📚 Documentation
- Main README: ../../README.md
- API Docs: docs.rs/recoco
- Examples: examples/
- recoco-utils: ../recoco-utils/README.md
🔧 Development
This crate is part of the ReCoco workspace:
# Build with specific features
# Test with all features
# Run examples
📄 License
Apache-2.0. See the main repository for details.