Struct tantivy::query::QueryParser

pub struct QueryParser { /* private fields */ }

Tantivy’s Query parser

The language covered by the current parser is extremely simple.

  • simple terms: e.g., Barack Obama will be seen as a sequence of two tokens, Barack and Obama. By default, the query parser will interpret this as a disjunction (see .set_conjunction_by_default()) and will match all documents that contain either “Barack” or “Obama” or both. Since we did not target a specific field, the query parser will look into the so-called default fields (as set up in the constructor).

    Assuming that the default fields are body and title, and the query parser is set with conjunction as a default, our query will be interpreted as (body:Barack OR title:Barack) AND (title:Obama OR body:Obama). By default, all tokenized and indexed fields are default fields.

    It is possible to explicitly target a field by prefixing the text with the field name followed by a colon (fieldname:). Note this only applies to the term directly following. For instance, assuming the query parser is configured to use conjunction by default and the default fields are body and title, body:Barack Obama is not interpreted as body:Barack AND body:Obama but as body:Barack AND (body:Obama OR title:Obama).

  • boolean operators AND, OR. AND takes precedence over OR, so that a AND b OR c is interpreted as (a AND b) OR c.

  • In addition to the boolean operators, the - and + operators are available. They are sufficient to express all queries using boolean operators. For instance x AND y OR z can be written ((+x +y) z). In addition, these operators can help define “required optional” queries. (+x y) matches the same document set as simply x, but y will help refine the score.

  • negative terms: By prepending a - to a term, the term can be excluded from the search. This is useful for disambiguating a query, e.g. apple -fruit.

  • must terms: By prepending a + to a term, the term can be made required for the search.

  • phrase terms: Quoted terms become phrase searches on fields that have positions indexed. e.g., title:"Barack Obama" will only find documents that have “barack” immediately followed by “obama”. Single quotes can also be used. If the text to be searched contains a quotation mark, it is possible to escape it with a \.

  • range terms: Range searches can be done by specifying the start and end bound. These can be inclusive or exclusive. e.g., title:[a TO c} will find all documents whose title contains a word lexicographically between a and c (inclusive lower bound, exclusive upper bound). Inclusive bounds are [], exclusive are {}.

  • set terms: Using the IN operator, a field can be matched against a set of literals, e.g. title: IN [a b cd] will match documents where title is either a, b or cd, but do so more efficiently than the alternative query title:a OR title:b OR title:cd does.

  • date values: The query parser supports dates formatted according to RFC 3339. For example "2002-10-02T15:00:00.05Z" or some_date_field:[2002-10-02T15:00:00Z TO 2002-10-02T18:00:00Z}.

  • all docs query: A plain * will match all documents in the index.

Parts of the queries can be boosted by appending ^boostfactor. For instance, "SRE"^2.0 OR devops^0.4 gives matches on SRE a higher weight than matches on devops, so documents containing SRE tend to rank higher. Negative boosts are not allowed.

It is also possible to define a boost for a specific field at the query parser level (see set_field_boost(...)). Typically you may want to boost a title field.

Additionally, specific fields can be marked to use fuzzy term queries for each literal via the QueryParser::set_field_fuzzy method.

Phrase terms support the ~ slop operator, which sets the phrase’s matching distance in words. For example, "big wolf"~1 will return documents containing the phrase "big bad wolf".

Phrase terms also support the * prefix operator which switches the phrase’s matching to consider all documents which contain the last term as a prefix, e.g. "big bad wo"* will match "big bad wolf".

Implementations

impl QueryParser

pub fn new( schema: Schema, default_fields: Vec<Field>, tokenizer_manager: TokenizerManager ) -> QueryParser

Creates a QueryParser, given

  • schema - index Schema
  • default_fields - fields used to search if no field is specifically defined in the query.
  • tokenizer_manager - the TokenizerManager used to look up the tokenizer configured for each field.

pub fn for_index(index: &Index, default_fields: Vec<Field>) -> QueryParser

Creates a QueryParser, given

  • an index
  • a set of default fields used to search if no field is specifically defined in the query.

pub fn set_conjunction_by_default(&mut self)

Set the default way to compose queries to a conjunction.

By default, the query happy tax payer is equivalent to the query happy OR tax OR payer. After calling .set_conjunction_by_default() happy tax payer will be interpreted by the parser as happy AND tax AND payer.


pub fn set_field_boost(&mut self, field: Field, boost: Score)

Sets a boost for a specific field.

The parsed query will automatically boost this field.

If the query defines a query boost through the query language (e.g: country:France^3.0), the two boosts (the one defined in the query, and the one defined in the QueryParser) are multiplied together.


pub fn set_field_fuzzy( &mut self, field: Field, prefix: bool, distance: u8, transpose_cost_one: bool )

Sets the given field to use fuzzy term queries.

If set, the parser will produce queries using fuzzy term queries with the given parameters for each literal matched against the given field.

See the FuzzyTermQuery::new and FuzzyTermQuery::new_prefix methods for the meaning of the individual parameters.


pub fn parse_query( &self, query: &str ) -> Result<Box<dyn Query>, QueryParserError>

Parse a query

Note that parse_query returns an error if the input is not a valid query.


pub fn parse_query_lenient( &self, query: &str ) -> (Box<dyn Query>, Vec<QueryParserError>)

Parse a query leniently

This variant parses an invalid query on a best-effort basis. If some part of the query can’t reasonably be executed (range query without a field, searching on a non-existent field, searching without specifying a field when no default field is provided…), that part may get turned into a “match-nothing” subquery.

In case it encountered such issues, they are reported as a Vec of errors.


pub fn build_query_from_user_input_ast( &self, user_input_ast: UserInputAst ) -> Result<Box<dyn Query>, QueryParserError>

Build a query from an already parsed user input AST

This can be useful if the user input AST parsed using query_grammar needs to be inspected before the query is re-interpreted w.r.t. index specifics like field names and tokenizers.


pub fn build_query_from_user_input_ast_lenient( &self, user_input_ast: UserInputAst ) -> (Box<dyn Query>, Vec<QueryParserError>)

Leniently build a query from an already parsed user input AST.

See also QueryParser::build_query_from_user_input_ast

Trait Implementations

impl Clone for QueryParser

fn clone(&self) -> QueryParser

Returns a copy of the value.

Stable since 1.0.0:

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,


fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,


fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,


fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> Downcast for T
where T: Any,


fn into_any(self: Box<T>) -> Box<dyn Any>

Convert Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>. Box<dyn Any> can then be further downcast into Box<ConcreteType> where ConcreteType implements Trait.

fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>

Convert Rc<Trait> (where Trait: Downcast) to Rc<Any>. Rc<Any> can then be further downcast into Rc<ConcreteType> where ConcreteType implements Trait.

fn as_any(&self) -> &(dyn Any + 'static)

Convert &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot generate &Any’s vtable from &Trait’s.

fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)

Convert &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot generate &mut Any’s vtable from &mut Trait’s.

impl<T> DowncastSync for T
where T: Any + Send + Sync,


fn into_any_arc(self: Arc<T>) -> Arc<dyn Any + Sync + Send>

Convert Arc<Trait> (where Trait: Downcast) to Arc<Any>. Arc<Any> can then be further downcast into Arc<ConcreteType> where ConcreteType implements Trait.

impl<T> From<T> for T


fn from(t: T) -> T

Returns the argument unchanged.


impl<T, U> Into<U> for T
where U: From<T>,


fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.


impl<T> Pointable for T


const ALIGN: usize = _

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes an object with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> ToOwned for T
where T: Clone,


type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,


type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,


type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> Fruit for T
where T: Send + Downcast,