Struct Query

pub struct Query<'a> {
    pub text: String,
    pub tokens: Vec<TokenId>,
    pub line_by_pos: Vec<usize>,
    pub unknowns_by_pos: HashMap<Option<i32>, usize>,
    pub stopwords_by_pos: HashMap<Option<i32>, usize>,
    pub shorts_and_digits_pos: HashSet<usize>,
    pub high_matchables: BitSet,
    pub low_matchables: BitSet,
    pub is_binary: bool,
    pub spdx_lines: Vec<(String, usize, usize)>,
    pub index: &'a LicenseIndex,
    /* private fields */
}

Query holds:

Known token IDs (tokens existing in the index dictionary)
Token positions and their corresponding line numbers (line_by_pos)
Unknown tokens (tokens not in dictionary) tracked per position
Stopwords tracked per position
Positions with short/digit-only tokens
High and low matchable token positions (for tracking what’s been matched)

Based on Python Query class at: reference/scancode-toolkit/src/licensedcode/query.py (lines 155-295)

Fields§

§text: String

The original input text.

Corresponds to Python: self.query_string (line 215)

§tokens: Vec<TokenId>

Token IDs for known tokens (tokens found in the index dictionary)

Corresponds to Python: self.tokens = [] (line 228)

§line_by_pos: Vec<usize>

Mapping from token position to line number (1-based)

Each token position in self.tokens maps to the line number where it appears. This is used for match position reporting.

Corresponds to Python: self.line_by_pos = [] (line 231)
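
As a rough illustration of this mapping, the sketch below builds a line_by_pos vector while tokenizing and then looks a position up in it. It is not the crate's tokenizer: the whitespace split, the `is_known` predicate, the helper name `build_line_by_pos`, and the out-of-range behaviour of the free function `line_for_pos` are all assumptions made for the example.

```rust
/// Hypothetical helper: record the 1-based line number of every known token.
/// `is_known` stands in for a dictionary lookup in the license index.
fn build_line_by_pos(text: &str, is_known: impl Fn(&str) -> bool) -> Vec<usize> {
    let mut line_by_pos = Vec::new();
    for (idx, line) in text.lines().enumerate() {
        for word in line.split_whitespace() {
            if is_known(word) {
                // One entry per known token, so the index into this
                // vector is the token position.
                line_by_pos.push(idx + 1);
            }
        }
    }
    line_by_pos
}

/// Hypothetical free-function stand-in for a position lookup:
/// `None` when `pos` is past the last known token.
fn line_for_pos(line_by_pos: &[usize], pos: usize) -> Option<usize> {
    line_by_pos.get(pos).copied()
}
```

Unknown words never get an entry here, which is why positions in this vector line up with positions in tokens rather than with raw word offsets.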

§unknowns_by_pos: HashMap<Option<i32>, usize>

Mapping from token position to count of unknown tokens after that position

Unknown tokens are those not found in the dictionary. We track them by counting how many unknown tokens appear after each known position. Unknown tokens before the first known token are tracked at position -1 (using the key None in Rust).

Corresponds to Python: self.unknowns_by_pos = {} (line 236)
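
The counting scheme described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the crate's code: `tokenize_with_unknowns` is a hypothetical name, the `dictionary` map stands in for the index dictionary, and `u16` stands in for `TokenId`.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: split `words` into known token IDs and a count of
/// unknown words that follow each known position.
fn tokenize_with_unknowns(
    words: &[&str],
    dictionary: &HashMap<&str, u16>,
) -> (Vec<u16>, HashMap<Option<i32>, usize>) {
    let mut tokens = Vec::new();
    let mut unknowns_by_pos: HashMap<Option<i32>, usize> = HashMap::new();
    // `None` plays the role of Python's position -1: no known token seen yet.
    let mut last_known: Option<i32> = None;

    for word in words {
        match dictionary.get(word) {
            Some(&id) => {
                tokens.push(id);
                last_known = Some(tokens.len() as i32 - 1);
            }
            // Attribute the unknown word to the most recent known position.
            None => *unknowns_by_pos.entry(last_known).or_insert(0) += 1,
        }
    }
    (tokens, unknowns_by_pos)
}
```

With the words `foo mit bar baz license` and a dictionary containing only `mit` and `license`, this sketch records one unknown under `None` and two under `Some(0)`; position `Some(1)` has no entry because nothing follows it.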

§stopwords_by_pos: HashMap<Option<i32>, usize>

Mapping from token position to count of stopwords after that position

Similar to unknowns_by_pos, but for stopwords.

Corresponds to Python: self.stopwords_by_pos = {} (line 244)

§shorts_and_digits_pos: HashSet<usize>

Set of positions with single-character or digit-only tokens

These tokens have special handling in matching.

Corresponds to Python: self.shorts_and_digits_pos = set() (line 249)
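
One plausible way to collect such positions is sketched below. The function name `shorts_and_digits_positions` is hypothetical, and the exact test the crate applies (for example, how it treats non-ASCII digits) is an assumption here.

```rust
use std::collections::HashSet;

/// Hypothetical sketch: collect positions whose token text is a single
/// character or consists only of ASCII digits.
fn shorts_and_digits_positions(words: &[&str]) -> HashSet<usize> {
    words
        .iter()
        .enumerate()
        .filter(|(_, w)| {
            let single_char = w.chars().count() == 1;
            let digits_only = !w.is_empty() && w.chars().all(|c| c.is_ascii_digit());
            single_char || digits_only
        })
        .map(|(pos, _)| pos)
        .collect()
}
```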

§high_matchables: BitSet

High-value matchable token positions (legalese tokens)

These are tokens with ID < len_legalese.

Corresponds to Python: self.high_matchables (line 293)

§low_matchables: BitSet

Low-value matchable token positions (non-legalese tokens)

These are tokens with ID >= len_legalese.

Corresponds to Python: self.low_matchables (line 294)
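
The high/low split follows directly from the `len_legalese` threshold. The sketch below shows the idea with standard-library types only: `BTreeSet<usize>` stands in for the crate's `BitSet`, `u16` stands in for `TokenId`, and `split_matchables` is a hypothetical name.

```rust
use std::collections::BTreeSet;

/// Hypothetical sketch: split token positions into high (legalese) and low
/// (non-legalese) sets based on the token ID at each position.
fn split_matchables(tokens: &[u16], len_legalese: u16) -> (BTreeSet<usize>, BTreeSet<usize>) {
    let mut high = BTreeSet::new();
    let mut low = BTreeSet::new();
    for (pos, &id) in tokens.iter().enumerate() {
        if id < len_legalese {
            high.insert(pos); // legalese token: ID below the threshold
        } else {
            low.insert(pos); // any other known token
        }
    }
    (high, low)
}
```

Note that the sets hold positions, not token IDs, so the same ID appearing twice contributes two entries.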

§is_binary: bool

True if the query is detected as binary content

Corresponds to Python: self.is_binary = False (line 225)

§spdx_lines: Vec<(String, usize, usize)>

SPDX-License-Identifier lines found during tokenization.

Each tuple is (spdx_text, start_token_pos, end_token_pos). Used for creating LicenseMatches with correct token positions.

Corresponds to Python: self.spdx_lines = [] (line 507)

§index: &'a LicenseIndex

Reference to the license index for dictionary access and metadata

Implementations§

impl<'a> Query<'a>

pub fn from_extracted_text(text: &str, index: &'a LicenseIndex, binary_derived: bool) -> Result<Self, Error>

pub fn query_runs(&self) -> Vec<QueryRun<'_>>

Iterate over query runs.

Corresponds to Python: query.query_runs property iteration

pub fn line_for_pos(&self, pos: usize) -> Option<usize>

Get the line number for a token position.

§Arguments

pos - The token position

§Returns

The line number (1-based)

pub fn is_empty(&self) -> bool

Check if the query is empty (no known tokens).

pub fn whole_query_run(&self) -> QueryRun<'a>

Get a query run covering the entire query.

Corresponds to Python: whole_query_run() method (lines 306-317)

pub fn subtract(&mut self, span: &PositionSpan)

Subtract matched span positions from matchables.

This removes the positions from both high and low matchables.

§Arguments

span - The span of positions to subtract

Corresponds to Python: subtract() method (lines 328-334)
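
To make the effect of subtract concrete, here is a small sketch using stand-in types. `Matchables` is a hypothetical struct holding only the two sets, `BTreeSet<usize>` stands in for `BitSet`, and `RangeInclusive<usize>` stands in for `PositionSpan`, whose real shape is not shown on this page.

```rust
use std::collections::BTreeSet;
use std::ops::RangeInclusive;

/// Hypothetical stand-in for the matchable sets held by `Query`.
struct Matchables {
    high: BTreeSet<usize>,
    low: BTreeSet<usize>,
}

impl Matchables {
    /// Remove every position in `span` from both sets, so that later
    /// matching passes no longer consider those positions.
    fn subtract(&mut self, span: RangeInclusive<usize>) {
        for pos in span {
            self.high.remove(&pos);
            self.low.remove(&pos);
        }
    }
}
```

Removing a position that is absent from a set is a no-op, so the span does not need to be filtered against either set first.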

pub fn matched_text(&self, start_line: usize, end_line: usize) -> String

Extract matched text for a given line range.

Returns the text from the original input between start_line and end_line (both inclusive, 1-indexed).

§Arguments

start_line - Starting line number (1-indexed)
end_line - Ending line number (1-indexed)

§Returns

The matched text, or empty string if lines are out of range

Corresponds to Python: matched_text() method in match.py (lines 757-795)
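
A line-range extraction of this kind can be sketched as a free function over the input text. The version below is an assumption-laden illustration, not the method's actual body: it treats any range that is not fully inside the text (including a start of 0 or a start past the end) as out of range, and it joins lines with `\n`.

```rust
/// Hypothetical sketch: return lines `start_line..=end_line` (1-indexed,
/// inclusive) of `text`, or an empty string when the range is invalid.
fn matched_text(text: &str, start_line: usize, end_line: usize) -> String {
    let line_count = text.lines().count();
    if start_line == 0 || start_line > end_line || end_line > line_count {
        return String::new();
    }
    text.lines()
        .skip(start_line - 1) // convert 1-indexed start to a 0-indexed skip
        .take(end_line - start_line + 1) // inclusive range length
        .collect::<Vec<_>>()
        .join("\n")
}
```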

Trait Implementations§

impl<'a> Debug for Query<'a>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

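Auto Trait Implementations§

impl<'a> Freeze for Query<'a>

impl<'a> RefUnwindSafe for Query<'a>

impl<'a> Send for Query<'a>

impl<'a> Sync for Query<'a>

impl<'a> Unpin for Query<'a>

impl<'a> UnsafeUnpin for Query<'a>

impl<'a> UnwindSafe for Query<'a>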

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T, U> ExactFrom<T> for U
where U: TryFrom<T>,

fn exact_from(value: T) -> U

impl<T, U> ExactInto<U> for T
where U: ExactFrom<T>,

fn exact_into(self) -> U

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T, U> OverflowingInto<U> for T
where U: OverflowingFrom<T>,

fn overflowing_into(self) -> (U, bool)

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T, U> RoundingInto<U> for T
where U: RoundingFrom<T>,

fn rounding_into(self, rm: RoundingMode) -> (U, Ordering)

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> SaturatingInto<U> for T
where U: SaturatingFrom<T>,

fn saturating_into(self) -> U

impl<T> ToDebugString for T
where T: Debug,

fn to_debug_string(&self) -> String

Returns the String produced by T's Debug implementation.

§Examples

use malachite_base::strings::ToDebugString;

assert_eq!([1, 2, 3].to_debug_string(), "[1, 2, 3]");
assert_eq!(
    [vec![2, 3], vec![], vec![4]].to_debug_string(),
    "[[2, 3], [], [4]]"
);
assert_eq!(Some(5).to_debug_string(), "Some(5)");

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T, U> WrappingInto<U> for T
where U: WrappingFrom<T>,

fn wrapping_into(self) -> U