Crate swiftide

§Swiftide

Swiftide is a data indexing and processing library, tailored for Retrieval Augmented Generation (RAG). When building applications with large language models (LLMs), these models need access to external resources. Data needs to be transformed, enriched, split up, embedded, and persisted. Swiftide is built in Rust, uses parallel, asynchronous streams, and is blazingly fast.

Swiftide is part of the bosun.ai project, an upcoming platform for autonomous code improvement.

We <3 feedback: project ideas, suggestions, and complaints are very welcome. Feel free to open an issue.

Read more about the project on the swiftide website.

§Features

  • Extremely fast streaming indexing pipeline with async, parallel processing
  • Integrations with OpenAI, Redis, Qdrant, FastEmbed, Treesitter and more
  • A variety of loaders, transformers, embedders, and other common, generic tools
  • Bring your own transformers by extending straightforward traits
  • Splitting and merging pipelines
  • Jinja-like templating for prompts
  • Store into multiple backends
  • Support for logging and tracing through the tracing crate; see /examples and the tracing crate for more information
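
To give a rough idea of what "bring your own transformers by extending straightforward traits" can look like, here is a minimal, self-contained sketch. The `Node` struct, the `Transformer` trait, and the `run_steps` helper are hypothetical stand-ins written for this illustration; they are not Swiftide's actual types, and the real traits are asynchronous. Consult the traits module for the real signatures.

```rust
// Hypothetical stand-ins for illustration only; NOT Swiftide's real API.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    chunk: String,
    metadata: Vec<(String, String)>,
}

/// A single processing step that maps one node to another.
trait Transformer {
    fn transform(&self, node: Node) -> Result<Node, String>;
}

/// Custom step: records the word count of the chunk as metadata.
struct WordCount;

impl Transformer for WordCount {
    fn transform(&self, mut node: Node) -> Result<Node, String> {
        let count = node.chunk.split_whitespace().count();
        node.metadata
            .push(("word_count".to_string(), count.to_string()));
        Ok(node)
    }
}

/// Custom step: rejects chunks that contain only whitespace.
struct RejectEmpty;

impl Transformer for RejectEmpty {
    fn transform(&self, node: Node) -> Result<Node, String> {
        if node.chunk.trim().is_empty() {
            Err("empty chunk".to_string())
        } else {
            Ok(node)
        }
    }
}

/// Runs every step in order, stopping at the first error.
fn run_steps(steps: &[Box<dyn Transformer>], node: Node) -> Result<Node, String> {
    steps.iter().try_fold(node, |n, step| step.transform(n))
}
```

Because each step only depends on the trait, new steps can be added to the list without touching the code that drives them.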

§Example

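use swiftide::indexing::Pipeline;
use swiftide::integrations::qdrant::Qdrant;
use swiftide::loaders::FileLoader;
use swiftide::transformers::{ChunkMarkdown, Embed, MetadataQAText};

// `openai_client` and `qdrant_url` are not defined in this snippet; they are
// assumed to be set up by the caller. The `?` operators and `.await` mean the
// snippet must run inside an async function that returns a `Result`.
Pipeline::from_loader(FileLoader::new(".").with_extensions(&["md"]))
    // Split each Markdown file into chunks within the given size range.
    .then_chunk(ChunkMarkdown::from_chunk_range(10..512))
    // Enrich each chunk with generated question-and-answer metadata.
    .then(MetadataQAText::new(openai_client.clone()))
    // Embed the chunks in batches of 10.
    .then_in_batch(10, Embed::new(openai_client.clone()))
    // Persist the embedded chunks in Qdrant.
    .then_store_with(
        Qdrant::try_from_url(qdrant_url)?
            .batch_size(50)
            .vector_size(1536)
            .collection_name("swiftide-examples".to_string())
            .build()?,
    )
    .run()
    .await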
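
The `then_in_batch(10, ...)` step above groups nodes before embedding them. The idea behind batching can be sketched in plain Rust as follows. This is only an illustration under stated assumptions: `embed_batch` is a hypothetical stand-in for one request to an embedding provider, `embed_in_batches` is a made-up helper, and neither reflects how Swiftide implements batching internally.

```rust
/// Hypothetical stand-in for one request to an embedding provider.
/// Here each "embedding" is just a one-element vector holding the text length.
fn embed_batch(texts: &[String]) -> Vec<Vec<f32>> {
    texts.iter().map(|t| vec![t.len() as f32]).collect()
}

/// Splits `texts` into groups of at most `batch_size` and embeds each group.
/// Returns the embeddings in input order, plus the number of batch calls made.
/// Note: `chunks` panics if `batch_size` is 0.
fn embed_in_batches(texts: &[String], batch_size: usize) -> (Vec<Vec<f32>>, usize) {
    let mut embeddings = Vec::with_capacity(texts.len());
    let mut calls = 0;
    for batch in texts.chunks(batch_size) {
        embeddings.extend(embed_batch(batch));
        calls += 1;
    }
    (embeddings, calls)
}
```

With 25 chunks and a batch size of 10, this sketch makes three calls (10, 10, and 5 items) instead of 25 separate ones, which is the usual motivation for batching requests to a remote service.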

§Feature flags

Swiftide has few features enabled by default, because some integrations are dependency-heavy.

Either use the ‘all’ feature flag (not recommended), or enable the integrations that you need. Each integration has a similarly named feature flag.
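
As a sketch, enabling only the integrations used in the example above might look like the following Cargo.toml fragment. The feature names `openai` and `qdrant` are assumptions inferred from the statement that each integration has a similarly named feature flag, and the version is a placeholder; check the crate's feature list and pin the version you actually use.

```toml
[dependencies]
# "*" is a placeholder; replace it with the version you depend on.
# Feature names are assumed to mirror the integration names.
swiftide = { version = "*", features = ["openai", "qdrant"] }
```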

Re-exports§

Modules§

  • This module serves as the main entry point for the indexing components in the Swiftide project. It re-exports the essential structs and types from the indexing_node, indexing_pipeline, and indexing_stream modules, providing a unified interface for the indexing functionality.
  • ingestion (Deprecated)
    Deprecated re-export of indexing; use that instead.
  • Integrations with various platforms and external services.
  • The loaders module provides functionality for loading files from a specified directory. It includes the FileLoader struct which is used to filter and stream files based on their extensions.
  • Storage implementations for persisting data
  • Prompt templating and management
  • Traits in Swiftide allow for easy extensibility
  • Various transformers for chunking, embedding and transforming data