Crate xapian_rs


§xapian-rs


xapian-rs provides a set of low-level, mostly-ergonomic Rust bindings for the Xapian search library.

The bindings are produced by a mix of auto-generation (via autocxx) and hand-written bridges (via cxx). Where necessary, small C++ shims work around incompatibilities between these tools and the Xapian codebase.

§Status / Stability

xapian-rs is currently immature, untested and incomplete. During the 0.x version series, the API carries no stability guarantees and may change or break at any time. One limited real-world use case has been implemented in pantry; it exercises a small but interesting subset of Xapian's capabilities:

  • Indexing
  • Searching
  • Faceting

Some functionality is not provided at this time, including (but not limited to):

  • KeyMaker
  • Custom RangeProcessor implementations

§Design

Where possible, xapian-rs tries to provide simple and ergonomic interactions with idiomatic Rust code. However, Xapian is a C++ codebase that uses C++ idioms, and this has consequences for the current design (as do the limitations of autocxx and cxx):

  • Xapian primarily uses exceptions for error handling. autocxx does not currently support catching exceptions (though cxx does). In the current version, any Xapian exception will trigger a panic in Rust code. This will improve as the library evolves.
  • Xapian uses C++ strings very heavily. C++ strings provide no encoding guarantees, while Rust strings are guaranteed to be valid UTF-8. These bindings currently handle this mismatch inconsistently (though at times conveniently). The behavior will become better defined as the library evolves.
  • Several Xapian types are exposed in a way that allows implementation via Rust traits. At present, these traits are generally implemented via &self references, and therefore interior mutability is often needed to implement interesting functionality.
  • Some of these traits intentionally leak memory when passed to FFI today. This will improve as the library evolves.
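
Two of the points above, `&self`-only trait methods and byte strings with no encoding guarantee, can be illustrated without touching Xapian at all. The following is a minimal sketch in plain Rust using only the standard library. The trait `ValueDecider`, the struct `CountingDecider`, and the helper `decode_lossy` are hypothetical names invented for this illustration; they are not part of the xapian-rs API, and the real traits may have different method signatures.

```rust
use std::borrow::Cow;
use std::cell::{Cell, RefCell};
use std::collections::BTreeMap;

/// Hypothetical stand-in for a decider-style trait (NOT the xapian-rs API).
/// The property mirrored here is that the callback only receives `&self`.
pub trait ValueDecider {
    /// `raw` models a C++ string: arbitrary bytes with no encoding guarantee.
    fn accept(&self, raw: &[u8]) -> bool;
}

/// Decode bytes explicitly instead of assuming they are UTF-8.
/// Invalid sequences are replaced with U+FFFD rather than causing a panic.
pub fn decode_lossy(raw: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(raw)
}

/// A decider that keeps running tallies even though it is only ever
/// borrowed immutably; `Cell` and `RefCell` supply the interior mutability.
#[derive(Default)]
pub struct CountingDecider {
    seen: Cell<usize>,
    counts: RefCell<BTreeMap<String, usize>>,
}

impl ValueDecider for CountingDecider {
    fn accept(&self, raw: &[u8]) -> bool {
        // Update a plain counter through `&self`.
        self.seen.set(self.seen.get() + 1);

        // Reject empty values; count every accepted value by its decoded text.
        let value = decode_lossy(raw).into_owned();
        let accepted = !value.is_empty();
        if accepted {
            *self.counts.borrow_mut().entry(value).or_insert(0) += 1;
        }
        accepted
    }
}

impl CountingDecider {
    /// Total number of values offered to `accept`, accepted or not.
    pub fn seen(&self) -> usize {
        self.seen.get()
    }

    /// Number of times `value` was accepted.
    pub fn count_for(&self, value: &str) -> usize {
        self.counts.borrow().get(value).copied().unwrap_or(0)
    }
}
```

`Cell` is enough for a `Copy` counter, while `RefCell` is needed for the map because updating it requires a mutable borrow. Neither type is `Sync`, so a decider that must be shared across threads would need `Mutex` or atomics instead.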

§Examples

Several examples are provided in the examples directory. The integration tests in the tests directory are also useful as references.

Structs§

  • A read-only Xapian database
  • Various flags to modify writable database behavior
  • A newtype wrapper representing a valid (non-zero) Xapian document ID
  • A document in a Xapian database
  • An ESet represents a set of terms that may be useful for expanding the current query
  • The primary interface to retrieve information from Xapian.
  • An individual expansion term from an ESet, with access to position and frequency information
  • A list of search results with associated metadata
  • An individual match item from the iterator yielded by MSet::matches
  • A newtype wrapper representing a valid document position
  • A parsed query, ready for use in a search
  • A type for building Query objects from strings
  • An RSet is used to hold documents marked as explicitly relevant to the current search
  • A bitflag representation of flags supported by a RangeProcessor
  • A newtype wrapper representing a valid Xapian slot number (aka valueno)
  • An instance of a Stemming algorithm
  • An individual term, with access to position and frequency information
  • An instance of a Xapian TermGenerator, which can be used to index text with optional stemming
  • A Xapian database that can be read or written to

Enums§

  • A flag indicating how to handle the database already existing (or not)
  • The type of backend to use for the database
  • A newtype wrapper for the three primary built-in RangeProcessors.
  • A strategy to apply to a Stem instance

Traits§

  • An ExpandDecider can be used to reject terms from an ESet
  • A FieldProcessor can be used to customize the handling of query fields
  • A trait representing the ability to be loaded from a Xapian document value.
  • A MatchDecider can be used to reject documents from an MSet
  • A MatchSpy can be used to accumulate information seen during the match.
  • Determines whether a given term matches a stopword. Stopwords are not typically indexed or included in parsed queries.
  • A trait representing the ability to be stored as a Xapian document value. Useful for features such as faceting and other forms of advanced field-level filtering.