ContentHashedIndex

Struct ContentHashedIndex 

pub struct ContentHashedIndex {
    pub duplicates_found: usize,
    pub unique_items: usize,
    /* private fields */
}

Content-hashed index for deduplication of code units.

Used to avoid indexing identical code more than once. The index normalizes whitespace before hashing, so code units that differ only in whitespace are detected as duplicates.
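The normalize-then-hash idea can be sketched with the standard library alone. This is an illustration, not the crate's implementation: the helper names `normalize_whitespace` and `content_hash` are hypothetical, the normalization rule (collapse each whitespace run to one space and trim the ends) is an assumption, and the crate may use a different hash function than `DefaultHasher`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical normalization rule: collapse every run of whitespace into a
/// single space and trim both ends. The crate's actual rule may differ.
fn normalize_whitespace(content: &str) -> String {
    content.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Hash the normalized text, so inputs that differ only in whitespace
/// produce the same key.
fn content_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    normalize_whitespace(content).hash(&mut hasher);
    hasher.finish()
}
```

Under this rule, `"def foo():\n    pass"` and `"def foo(): pass"` share one key, while a renamed function hashes differently.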

§Examples

use go_brrr::semantic::ContentHashedIndex;

let mut index = ContentHashedIndex::new();

// First occurrence is added
assert!(index.add("def foo(): pass", "src/a.py", "foo", 10));

// Identical content is detected as duplicate
assert!(!index.add("def foo(): pass", "src/b.py", "foo", 20));

// Check stats
let (unique, duplicates) = index.stats();
assert_eq!(unique, 1);
assert_eq!(duplicates, 1);

Fields§

§duplicates_found: usize

Number of duplicate items detected

§unique_items: usize

Number of unique items indexed

Implementations§

impl ContentHashedIndex

pub fn new() -> Self

Create a new empty content-hashed index.

pub fn check_duplicate(&self, content: &str) -> Option<&CodeLocation>

Check if content is a duplicate, returning the original location if so.

§Arguments
  • content - Code content to check
§Returns

Some(&CodeLocation) if this content was already seen, None otherwise.
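A read-only lookup of this shape might work as in the stand-in below. It is a sketch under stated assumptions: `SketchIndex`, its `record` helper, and the fields of this `CodeLocation` are hypothetical, since the page does not show the real `CodeLocation` layout or the private fields of `ContentHashedIndex`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's `CodeLocation`; the field names
/// are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct CodeLocation {
    file: String,
    function_name: String,
    line: usize,
}

/// Minimal stand-in index keyed by whitespace-normalized content.
#[derive(Debug, Default)]
struct SketchIndex {
    first_seen: HashMap<String, CodeLocation>,
}

impl SketchIndex {
    /// Assumed normalization: collapse whitespace runs to single spaces.
    fn key(content: &str) -> String {
        content.split_whitespace().collect::<Vec<_>>().join(" ")
    }

    /// Remember where content was first seen; later sightings keep the
    /// original location.
    fn record(&mut self, content: &str, file: &str, function_name: &str, line: usize) {
        self.first_seen
            .entry(Self::key(content))
            .or_insert_with(|| CodeLocation {
                file: file.to_string(),
                function_name: function_name.to_string(),
                line,
            });
    }

    /// Lookup that does not modify the index: returns the first recorded
    /// location, or `None` if the content is new.
    fn check_duplicate(&self, content: &str) -> Option<&CodeLocation> {
        self.first_seen.get(&Self::key(content))
    }
}
```

Because the lookup takes `&self`, a caller can probe for a duplicate and report the original location without changing what the index holds.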

pub fn add( &mut self, content: &str, file: &str, function_name: &str, line: usize, ) -> bool

Add content to the index.

§Arguments
  • content - Code content to add
  • file - Source file path
  • function_name - Name of the function or code unit
  • line - Line number (1-indexed)
§Returns

true if this is new content (was added), false if duplicate (was not added).
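The `true`/`false` contract and its effect on the two public counters can be modeled with a `HashSet`. This is a simplified stand-in, not the crate's code: `SketchIndex` is hypothetical, it ignores the location arguments, and it assumes each rejected duplicate increments `duplicates_found`, which matches the statistics shown in the Examples section.

```rust
use std::collections::HashSet;

/// Simplified stand-in that tracks only the counters and the set of
/// normalized contents seen so far.
#[derive(Debug, Default)]
struct SketchIndex {
    unique_items: usize,
    duplicates_found: usize,
    seen: HashSet<String>,
}

impl SketchIndex {
    /// Returns `true` when the content is new, `false` when it duplicates
    /// earlier content. Location arguments are ignored in this sketch.
    fn add(&mut self, content: &str, _file: &str, _function_name: &str, _line: usize) -> bool {
        // Assumed normalization: collapse whitespace runs to single spaces.
        let key = content.split_whitespace().collect::<Vec<_>>().join(" ");
        if self.seen.insert(key) {
            self.unique_items += 1;
            true
        } else {
            self.duplicates_found += 1;
            false
        }
    }

    /// Mirrors the documented tuple order: (unique_items, duplicates_found).
    fn stats(&self) -> (usize, usize) {
        (self.unique_items, self.duplicates_found)
    }
}
```

`HashSet::insert` already reports whether the value was new, so the sketch needs no separate lookup before inserting.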

pub fn stats(&self) -> (usize, usize)

Get deduplication statistics.

§Returns

Tuple of (unique_items, duplicates_found).

pub fn len(&self) -> usize

Get the number of unique items in the index.

pub fn is_empty(&self) -> bool

Check if the index is empty.

pub fn clear(&mut self)

Clear the index and reset statistics.

pub fn dedup_ratio(&self) -> f64

Get the deduplication ratio (duplicates / total).

Returns 0.0 if no items have been processed.
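As a worked form of the ratio, the free function below computes duplicates / total from the two public counters. It assumes that "total" means `unique_items + duplicates_found`, which the page does not state explicitly; the function is a hypothetical helper, not the crate's method.

```rust
/// Ratio of duplicates to all processed items, assuming
/// total = unique_items + duplicates_found.
fn dedup_ratio(unique_items: usize, duplicates_found: usize) -> f64 {
    let total = unique_items + duplicates_found;
    if total == 0 {
        // Guard against 0 / 0: nothing has been processed yet.
        0.0
    } else {
        duplicates_found as f64 / total as f64
    }
}
```

With one unique item and one duplicate, as in the Examples section, this yields 0.5; the zero-total guard keeps an empty index from producing NaN.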

Trait Implementations§

impl Clone for ContentHashedIndex

fn clone(&self) -> ContentHashedIndex

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ContentHashedIndex

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for ContentHashedIndex

fn default() -> ContentHashedIndex

Returns the “default value” for a type.
