Struct Cylon

pub struct Cylon { /* private fields */ }

A Cylon is an NFA that recognizes rules from a compiled robots.txt file. Given a URL path, it decides whether the robots.txt file it was compiled from allows or disallows that path.

The performance is on average O(n ^ k), where n is the length of the path and k is the average number of transitions from one prefix. This exponential runtime is acceptable in most cases because k tends to be very small.

Contrast that with the naive approach of matching each rule individually. If you can match a rule in O(n) time and there are p rules of length q, then the performance will be O(n * p * q). However, the NFA is likely more efficient because it avoids matching the same prefix multiple times. If there are x prefixes and each prefix is used y times, then the naive approach must make O(x * y) comparisons, whereas the NFA only makes O(y) comparisons.
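For comparison, the naive approach can be sketched as follows. This is only an illustration under simplifying assumptions: `SketchRule` and `naive_allow` are hypothetical names that are not part of this crate, rules are treated as plain path prefixes (no wildcard handling), the longest matching prefix wins, and a path with no matching rule is allowed.

```rust
/// Hypothetical rule type for this sketch only (not the crate's `Rule`).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SketchRule<'a> {
    Allow(&'a str),
    Disallow(&'a str),
}

impl<'a> SketchRule<'a> {
    /// The path prefix this rule applies to.
    fn pattern(&self) -> &'a str {
        match *self {
            SketchRule::Allow(p) | SketchRule::Disallow(p) => p,
        }
    }
}

/// Naive matching: compare every rule against the path and keep the rule
/// with the longest matching prefix. Shared prefixes such as "/private/"
/// are re-compared once per rule that contains them.
pub fn naive_allow(rules: &[SketchRule<'_>], path: &str) -> bool {
    let mut best: Option<SketchRule<'_>> = None;
    for rule in rules {
        let pattern = rule.pattern();
        // Assumption: an empty pattern matches nothing.
        if pattern.is_empty() || !path.starts_with(pattern) {
            continue;
        }
        let is_longer = match best {
            Some(current) => pattern.len() > current.pattern().len(),
            None => true,
        };
        if is_longer {
            best = Some(*rule);
        }
    }
    // No matching rule means the path is allowed.
    !matches!(best, Some(SketchRule::Disallow(_)))
}
```

Every call walks the full rule list, so the cost grows with the number of rules even when most of them begin with the same characters.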

In general, robots.txt files have a lot of shared prefixes due to the nature of URLs. That is why the pre-compiled NFA will be faster in most cases. However, there is an upfront cost of compiling the NFA which is not present when doing naive matching. That cost can be amortized by caching the compiled Cylon for subsequent uses.
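The effect of shared prefixes and of caching can be illustrated with a much simpler structure than the NFA used here. The sketch below swaps in a plain byte-level prefix trie (no wildcard support) plus a per-host cache. `PrefixTrie`, `MatcherCache`, `get_or_compile`, and `compile_count` are hypothetical names chosen for illustration and do not reflect this crate's internals.

```rust
use std::collections::HashMap;

/// One trie state: outgoing byte transitions and an optional verdict.
#[derive(Debug, Default, Clone)]
struct Node {
    next: HashMap<u8, usize>,
    verdict: Option<bool>, // Some(true) = allow, Some(false) = disallow
}

/// Hypothetical compiled matcher: rules sharing a prefix share trie states.
#[derive(Debug, Default, Clone)]
pub struct PrefixTrie {
    nodes: Vec<Node>,
}

impl PrefixTrie {
    /// Compile `(pattern, allowed)` pairs. This is the upfront cost.
    pub fn compile(rules: &[(&str, bool)]) -> Self {
        let mut trie = PrefixTrie { nodes: vec![Node::default()] };
        for &(pattern, allowed) in rules {
            if pattern.is_empty() {
                continue; // assumption: empty patterns are ignored
            }
            let mut state = 0;
            for &byte in pattern.as_bytes() {
                let found = trie.nodes[state].next.get(&byte).copied();
                state = match found {
                    Some(existing) => existing, // shared prefix: reuse state
                    None => {
                        let created = trie.nodes.len();
                        trie.nodes.push(Node::default());
                        trie.nodes[state].next.insert(byte, created);
                        created
                    }
                };
            }
            trie.nodes[state].verdict = Some(allowed);
        }
        trie
    }

    /// Walk the path once; the deepest verdict seen (longest prefix) wins.
    pub fn allow(&self, path: &str) -> bool {
        let mut state = 0;
        let mut verdict = true; // no matching rule means allowed
        for byte in path.as_bytes() {
            match self.nodes[state].next.get(byte) {
                Some(&next) => state = next,
                None => break,
            }
            if let Some(v) = self.nodes[state].verdict {
                verdict = v;
            }
        }
        verdict
    }
}

/// Hypothetical cache that compiles each host's rules at most once.
#[derive(Default)]
pub struct MatcherCache {
    compiled: HashMap<String, PrefixTrie>,
    pub compile_count: usize,
}

impl MatcherCache {
    pub fn get_or_compile(&mut self, host: &str, rules: &[(&str, bool)]) -> &PrefixTrie {
        if !self.compiled.contains_key(host) {
            self.compile_count += 1;
            self.compiled.insert(host.to_string(), PrefixTrie::compile(rules));
        }
        &self.compiled[host]
    }
}
```

In this sketch a lookup touches each byte of the path at most once regardless of how many rules share a prefix, and repeated lookups for the same host reuse the compiled matcher instead of paying the compile cost again.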

Implementations

impl Cylon

pub fn allow<T: AsRef<[u8]>>(&self, path: T) -> bool

pub fn compile(rules: Vec<Rule<'_>>) -> Self

Trait Implementations

impl Clone for Cylon

fn clone(&self) -> Cylon

fn clone_from(&mut self, source: &Self)

impl Debug for Cylon

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl<'de> Deserialize<'de> for Cylon

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where __D: Deserializer<'de>,

impl Serialize for Cylon

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error> where __S: Serializer,

Auto Trait Implementations

impl Freeze for Cylon

impl RefUnwindSafe for Cylon

impl Send for Cylon

impl Sync for Cylon

impl Unpin for Cylon

impl UnwindSafe for Cylon

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

impl<T> DeserializeOwned for T where T: for<'de> Deserialize<'de>,