Cylon

Struct Cylon 

Source
pub struct Cylon { /* private fields */ }

A Cylon is an NFA that recognizes rules compiled from a robots.txt file. Given a URL path, it decides whether the robots.txt file it was compiled from allows or disallows that path.

The performance is on average O(n ^ k), where n is the length of the path and k is the average number of transitions from one prefix. This runtime, which is exponential in k, is acceptable in most cases because k tends to be very small.

Contrast that with the naive approach of matching each rule individually. If you can match a rule in O(n) time and there are p rules of length q, then the performance will be O(n * p * q). The NFA is likely more efficient, because it avoids matching the same prefix multiple times. If there are x prefixes and each prefix is used y times, then the naive approach must make O(x * y) comparisons, whereas the NFA only makes O(y) comparisons.
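The shared-prefix argument can be illustrated with a small, self-contained sketch. It is not Cylon's implementation: it uses a plain byte trie with no wildcard support, and the names `naive_match`, `TrieNode`, `build_trie`, and `trie_match` are hypothetical helpers introduced only for this comparison. Both functions return whether any rule prefix matched, plus a count of the work done.

```rust
use std::collections::HashMap;

/// Naive approach: compare the path against every rule prefix in turn.
/// Returns (matched, number of byte comparisons performed).
fn naive_match(rules: &[&str], path: &str) -> (bool, usize) {
    let path = path.as_bytes();
    let mut comparisons = 0;
    let mut matched = false;
    for rule in rules {
        let rule = rule.as_bytes();
        let mut i = 0;
        while i < rule.len() && i < path.len() {
            comparisons += 1;
            if rule[i] != path[i] {
                break;
            }
            i += 1;
        }
        // The rule matches only if every one of its bytes was consumed.
        if i == rule.len() {
            matched = true;
        }
    }
    (matched, comparisons)
}

/// One node of a byte trie; rules that share a prefix share nodes.
#[derive(Default)]
struct TrieNode {
    children: HashMap<u8, TrieNode>,
    is_rule_end: bool,
}

/// Build the trie once, up front (the "compile" step in this sketch).
fn build_trie(rules: &[&str]) -> TrieNode {
    let mut root = TrieNode::default();
    for rule in rules {
        let mut node = &mut root;
        for b in rule.bytes() {
            node = node.children.entry(b).or_default();
        }
        node.is_rule_end = true;
    }
    root
}

/// Walk the path through the trie, visiting each path byte at most once.
/// Returns (matched, number of steps taken).
fn trie_match(root: &TrieNode, path: &str) -> (bool, usize) {
    let mut node = root;
    let mut steps = 0;
    let mut matched = root.is_rule_end;
    for b in path.bytes() {
        steps += 1;
        match node.children.get(&b) {
            Some(next) => {
                node = next;
                if node.is_rule_end {
                    matched = true;
                }
            }
            None => break,
        }
    }
    (matched, steps)
}
```

For the rules `/shop/cart`, `/shop/checkout`, and `/shop/account` with the path `/shop/account/settings`, the naive loop performs 27 byte comparisons while the trie walk takes 14 steps, because the shared `/shop/` prefix is examined once rather than three times.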

In general, robots.txt files have a lot of shared prefixes due to the nature of URLs, which is why the pre-compiled NFA will be faster in most cases. However, there is an upfront cost of compiling the NFA that is not present when doing naive matching. That cost can be amortized by caching the compiled Cylon for subsequent uses.
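One way to amortize the compile cost is a per-host cache that compiles on first use and reuses the result afterwards. The sketch below assumes a hypothetical stand-in type, `CompiledRules`, in place of a real `Cylon`, and a hypothetical `RulesCache` wrapper; neither is part of this crate. Because `Cylon` implements `Clone`, `Serialize`, and `Deserialize`, a real cache could also hand out copies or persist compiled machines, which this sketch does not attempt.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a compiled matcher such as `Cylon`.
/// It only supports plain "disallow" prefixes.
#[derive(Clone, Debug)]
struct CompiledRules {
    disallowed: Vec<String>,
}

impl CompiledRules {
    /// Stand-in for the (comparatively expensive) compile step.
    fn compile(disallowed: &[&str]) -> Self {
        CompiledRules {
            disallowed: disallowed.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// A path is allowed unless it starts with a disallowed prefix.
    fn allow(&self, path: &str) -> bool {
        !self.disallowed.iter().any(|p| path.starts_with(p.as_str()))
    }
}

/// Caches one compiled matcher per host so that compilation
/// happens at most once per host.
struct RulesCache {
    by_host: HashMap<String, CompiledRules>,
    compile_count: usize,
}

impl RulesCache {
    fn new() -> Self {
        RulesCache {
            by_host: HashMap::new(),
            compile_count: 0,
        }
    }

    /// Return the cached matcher for `host`, compiling it on first use.
    fn get_or_compile(&mut self, host: &str, disallowed: &[&str]) -> &CompiledRules {
        if !self.by_host.contains_key(host) {
            self.compile_count += 1;
            self.by_host
                .insert(host.to_string(), CompiledRules::compile(disallowed));
        }
        &self.by_host[host]
    }
}
```

Calling `get_or_compile` twice for the same host runs the compile step only once; the second call is a plain map lookup.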

Implementations§

Source§

impl Cylon

Source

pub fn allow<T: AsRef<[u8]>>(&self, path: T) -> bool

Returns true if the compiled rules allow the target path, and false if they disallow it.
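The `T: AsRef<[u8]>` bound means the path can be passed as any type that can be viewed as a byte slice. The sketch below demonstrates this with a hypothetical stand-in, `PrefixBlocker`, whose `allow` method has the same generic signature; it does simple prefix checks and is not how Cylon matches internally.

```rust
/// Hypothetical stand-in whose `allow` method mirrors the generic
/// signature of `Cylon::allow`.
struct PrefixBlocker {
    disallowed: Vec<Vec<u8>>,
}

impl PrefixBlocker {
    /// Accepts `&str`, `String`, `&[u8]`, `Vec<u8>`, and any other
    /// type that implements `AsRef<[u8]>`.
    fn allow<T: AsRef<[u8]>>(&self, path: T) -> bool {
        let path: &[u8] = path.as_ref();
        !self.disallowed.iter().any(|rule| path.starts_with(rule))
    }
}
```

With such a signature, callers holding a `String` from a URL parser or raw bytes from a network buffer can pass them directly, without converting to a particular string type first.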

Source

pub fn compile(rules: Vec<Rule<'_>>) -> Self

Trait Implementations§

Source§

impl Clone for Cylon

Source§

fn clone(&self) -> Cylon

Returns a duplicate of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for Cylon

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl<'de> Deserialize<'de> for Cylon

Source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.
Source§

impl Serialize for Cylon

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

§

impl Freeze for Cylon

§

impl RefUnwindSafe for Cylon

§

impl Send for Cylon

§

impl Sync for Cylon

§

impl Unpin for Cylon

§

impl UnwindSafe for Cylon

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,