Crate vectorless

§Vectorless

A hierarchical, reasoning-native document intelligence engine.

Replace your vector database with LLM-powered tree navigation. No embeddings. No vector search. Just reasoning.

§Overview

Traditional RAG systems chunk documents into flat vectors, losing structure. Vectorless preserves your document’s hierarchy and uses an LLM to navigate it — like a human skimming a table of contents, then drilling into relevant sections.
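The navigation idea can be sketched in a few lines. This is a minimal illustration, not the crate's implementation: `Node`, `score`, `navigate`, and `sample_tree` are hypothetical names, and a simple keyword count stands in for the LLM's judgment about which section to open next.

```rust
// Hypothetical stand-in type for illustration; not the crate's DocumentTree or TreeNode.
struct Node {
    title: String,
    content: String,
    children: Vec<Node>,
}

impl Node {
    fn leaf(title: &str, content: &str) -> Node {
        Node { title: title.to_string(), content: content.to_string(), children: Vec::new() }
    }

    fn branch(title: &str, children: Vec<Node>) -> Node {
        Node { title: title.to_string(), content: String::new(), children }
    }
}

// Assumption: counting query words found in the title replaces the LLM's relevance judgment.
fn score(title: &str, query: &str) -> usize {
    let title = title.to_lowercase();
    let query = query.to_lowercase();
    query.split_whitespace().filter(|word| title.contains(*word)).count()
}

// Start at the root and repeatedly descend into the best-scoring child until a leaf is reached.
// Returns the titles visited along the way and the content of the final node.
fn navigate<'a>(root: &'a Node, query: &str) -> (Vec<&'a str>, &'a str) {
    let mut path = vec![root.title.as_str()];
    let mut current = root;
    while let Some(best) = current
        .children
        .iter()
        .max_by_key(|child| score(&child.title, query))
    {
        path.push(best.title.as_str());
        current = best;
    }
    (path, current.content.as_str())
}

// A small two-level document used to exercise the walk.
fn sample_tree() -> Node {
    Node::branch("Handbook", vec![
        Node::branch("Installation", vec![
            Node::leaf("Linux", "Use the package manager."),
            Node::leaf("Windows", "Run the installer."),
        ]),
        Node::branch("Billing", vec![
            Node::leaf("Invoices", "Invoices are sent monthly."),
            Node::leaf("Refund policy", "Refunds are issued within 30 days."),
        ]),
    ])
}

fn main() {
    let tree = sample_tree();
    let (path, content) = navigate(&tree, "billing refund policy");
    println!("{}", path.join(" > "));
    println!("{content}");
}
```

Only the titles along one root-to-leaf path are inspected, which is why the hierarchy has to be preserved at indexing time. A real navigator would presumably also need to handle ties, backtracking, and answers that span several sections.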

§Architecture

┌─────────────────────────────────────────────────────────────────┐
│                          client                                  │
│                     (Engine, EngineBuilder)                      │
└────────────────────────────┬────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          ▼                  ▼                  ▼
    ┌──────────┐       ┌───────────┐      ┌──────────┐
    │  index   │       │ retrieval │      │ storage  │
    │ (write)  │       │  (read)   │      │ (persist)│
    └────┬─────┘       └─────┬─────┘      └────┬─────┘
         │                   │                 │
         └───────────┬───────┘                 │
                     ▼                         │
               ┌───────────┐                   │
               │  domain   │                   │
               │(Tree/Node)│                   │
               └─────┬─────┘                   │
                     │                         │
      ┌──────────────┼──────────────┐          │
      ▼              ▼              ▼          │
 ┌────────┐    ┌──────────┐   ┌────────┐      │
 │ parser │    │   llm    │   │ config │◄─────┘
 └────────┘    └──────────┘   └────────┘

§Features

  • 🌳 Tree-Based Indexing — Documents as hierarchical trees, not flat chunks
  • 🧠 LLM Navigation — Reasoning-based traversal to find relevant content
  • 🚀 Zero Infrastructure — No vector database, no embedding models
  • 📄 Multi-Format — Markdown, PDF, DOCX support
  • 💾 Persistent Workspace — LRU-cached storage with lazy loading
  • 🔄 Retry & Fallback — Resilient LLM calls with automatic recovery
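As a rough illustration of the Retry & Fallback feature, the sketch below retries each model a fixed number of times with exponential backoff, then moves on to the next model. It is an assumption about the general pattern, not the crate's `LlmClient` or `RetryConfig` API: `call_with_fallback`, its parameters, and the model names are hypothetical, and the closure stands in for a real LLM request.

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical helper: try each model in order, retrying up to `max_attempts` times
// per model before falling back to the next one. Returns the last error if all fail.
fn call_with_fallback(
    models: &[&str],
    max_attempts: u32,
    base_delay: Duration,
    mut call: impl FnMut(&str) -> Result<String, String>,
) -> Result<String, String> {
    let mut last_error = String::from("no models configured");
    for &model in models {
        for attempt in 0..max_attempts {
            match call(model) {
                Ok(reply) => return Ok(reply),
                Err(err) => last_error = format!("{model}: {err}"),
            }
            // Exponential backoff between retries of the same model: base, 2x base, 4x base...
            if attempt + 1 < max_attempts {
                sleep(base_delay * 2u32.pow(attempt));
            }
        }
    }
    Err(last_error)
}

fn main() {
    // Simulated provider: the primary model always times out, the backup answers.
    let result = call_with_fallback(
        &["primary", "backup"],
        2,
        Duration::from_millis(10),
        |model| {
            if model == "backup" {
                Ok(format!("answer from {model}"))
            } else {
                Err("timeout".to_string())
            }
        },
    );
    println!("{result:?}");
}
```

Keeping the request behind a closure makes the policy testable without network access. An async version would replace `sleep` with the runtime's timer.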

§Quick Start

use vectorless::EngineBuilder;
use vectorless::client::IndexContext;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create client
    let client = EngineBuilder::new()
        .with_workspace("./workspace")
        .build()
        .await?;

    // Index a document
    let doc_id = client.index(IndexContext::from_path("./document.md")).await?;

    // Query with natural language
    let result = client.query(&doc_id, "What is this about?").await?;
    println!("{}", result.content);

    Ok(())
}

§Modules

Module     Description
client     High-level API (Engine, EngineBuilder)
document   Core domain types (DocumentTree, TreeNode, NodeId)
index      Document indexing pipeline
retrieval  Retrieval strategies and search algorithms
config     Configuration management
llm        LLM client with retry & fallback
parser     Document parsers (Markdown, PDF, DOCX)
storage    Workspace persistence
throttle   Rate limiting

Re-exports§

pub use client::BuildError;
pub use client::DocumentInfo;
pub use client::Engine;
pub use client::EngineBuilder;
pub use client::IndexContext;
pub use client::IndexMode;
pub use client::IndexOptions;
pub use client::IndexSource;
pub use client::IndexedDocument;
pub use error::Error;
pub use error::Result;
pub use document::DocumentStructure;
pub use document::DocumentTree;
pub use document::NodeId;
pub use document::StructureNode;
pub use document::TocConfig;
pub use document::TocEntry;
pub use document::TocNode;
pub use document::TocView;
pub use document::TreeNode;
pub use util::estimate_tokens;
pub use util::estimate_tokens_fast;
pub use config::Config;
pub use config::ConfigLoader;
pub use config::RetrievalConfig;
pub use config::SummaryConfig;
pub use llm::LlmClient;
pub use llm::LlmConfig;
pub use llm::LlmConfigs;
pub use llm::LlmError;
pub use llm::LlmPool;
pub use llm::RetryConfig;
pub use parser::DocumentFormat;
pub use parser::DocumentParser;
pub use parser::DocxParser;
pub use parser::MarkdownParser;
pub use parser::ParseResult;
pub use parser::PdfParser;
pub use parser::RawNode;
pub use index::pipeline::CustomStageBuilder;
pub use index::pipeline::PipelineOrchestrator;
pub use index::ChangeDetector;
pub use index::ChangeSet;
pub use index::IndexContext as PipelineIndexContext;
pub use index::IndexInput;
pub use index::IndexMetrics;
pub use index::IndexMode as PipelineIndexMode;
pub use index::IndexResult;
pub use index::IndexStage;
pub use index::PartialUpdater;
pub use index::PipelineExecutor;
pub use index::PipelineOptions;
pub use index::SummaryStrategy;
pub use retrieval::ContextBuilder;
pub use retrieval::NavigationDecision;
pub use retrieval::NavigationStep;
pub use retrieval::PipelineRetriever;
pub use retrieval::PruningStrategy;
pub use retrieval::QueryComplexity;
pub use retrieval::RetrievalContext;
pub use retrieval::RetrievalResult;
pub use retrieval::RetrieveOptions;
pub use retrieval::RetrieveResponse;
pub use retrieval::Retriever;
pub use retrieval::RetrieverError;
pub use retrieval::RetrieverResult;
pub use retrieval::SearchPath;
pub use retrieval::StrategyPreference;
pub use retrieval::SufficiencyLevel;
pub use retrieval::TokenEstimation;
pub use retrieval::format_for_llm;
pub use retrieval::format_for_llm_async;
pub use retrieval::format_tree_for_llm;
pub use retrieval::format_tree_for_llm_async;
pub use storage::DocumentMeta as StorageDocumentMeta;
pub use storage::PersistedDocument;
pub use storage::Workspace;
pub use throttle::ConcurrencyConfig;
pub use throttle::ConcurrencyController;
pub use throttle::RateLimiter;

Modules§

client     High-level client API for document indexing and retrieval.
config     Configuration management for vectorless.
document   Document types - pure data structures for document tree representation.
error      Error types for the vectorless library.
index      Index Pipeline module.
llm        Unified LLM client module.
metrics    Unified metrics collection for Vectorless.
parser     Document parsing module.
retrieval  Retrieval system for Vectorless document trees.
storage    Storage module for persisting document indices.
throttle   Concurrency control for LLM API calls.
util       Utility functions and helpers.