pub enum CorpusClass {
    Code,
    Mixed,
    Docs,
}
Index-time classification of the corpus by file mix.
Drives the corpus-aware rerank gate: docs and mixed corpora run the L-12 cross-encoder (when the query is NL-shaped); pure code corpora skip it because the ms-marco-trained model is out-of-domain for code, regardless of implementation quality.
Variants§
Code
Less than 30% of chunks are in prose files. Pure or near-pure code corpora — rerank skipped.
Mixed
Between 30% and 70% prose chunks. Mixed corpora — rerank fires on NL queries to recover the prose-dominant relevance signal.
Docs
At least 70% prose chunks. Documentation, book sets, knowledge bases — rerank fires by default.
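The thresholds above can be sketched as a small mapping from chunk counts to a class. This is a minimal illustration, not the crate's implementation: the enum is a local mirror of CorpusClass, class_from_counts is a hypothetical helper, and treating exactly 30% as Mixed is inferred from the "less than 30%" wording for Code.

```rust
/// Local mirror of the documented enum so the sketch compiles on its own.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CorpusClass {
    Code,
    Mixed,
    Docs,
}

/// Hypothetical helper (not part of the crate): map a prose-chunk count
/// to a class using the documented 30% / 70% cut-offs.
pub fn class_from_counts(prose_chunks: usize, total_chunks: usize) -> CorpusClass {
    // Empty input is defined as Code.
    if total_chunks == 0 {
        return CorpusClass::Code;
    }
    // Integer comparison avoids floating-point edge cases at the boundaries.
    if prose_chunks * 100 >= total_chunks * 70 {
        CorpusClass::Docs // at least 70% prose
    } else if prose_chunks * 100 >= total_chunks * 30 {
        CorpusClass::Mixed // 30% up to, but not including, 70% (assumed boundary)
    } else {
        CorpusClass::Code // less than 30% prose
    }
}
```

Under these assumptions, a corpus with 3 prose chunks out of 10 lands in Mixed, and one with 7 out of 10 lands in Docs.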
Implementations§
impl CorpusClass
pub fn classify(chunks: &[CodeChunk]) -> Self
Classify a chunk set by the fraction of chunks dominated by prose.
A chunk counts as “prose” when either of the following holds:
- its file extension is recognized by crate::encoder::ripvec::ranking::is_prose_path (e.g. .md, .rst, .txt), OR
- its content is dominated by docstring/comment text per [chunk_is_prose_dominated] (the I#64 / B-0028 path: docstring-heavy Python, JS-doc-heavy code, etc.).
The second branch matters: a Mnemosyne-class Python corpus where
every class has a substantial docstring would be classified as Code
by the file-extension test alone, even though >50% of its bytes
are prose. The within-chunk content test catches this case, so
Auto-policy rerank fires on NL queries against such corpora.
Empty input is classified as Code (degenerate but defined).
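The two-branch prose test can be illustrated with a rough sketch. Everything here is an assumption made for illustration: Chunk is a stand-in for CodeChunk with only two fields, has_prose_extension stands in for is_prose_path, and is_comment_dominated is a crude line-counting heuristic standing in for chunk_is_prose_dominated, whose real criteria are not shown on this page.

```rust
use std::path::Path;

/// Hypothetical stand-in for the crate's CodeChunk (only what this sketch needs).
pub struct Chunk<'a> {
    pub path: &'a str,
    pub content: &'a str,
}

/// Assumed extension test, standing in for is_prose_path.
fn has_prose_extension(path: &str) -> bool {
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| matches!(ext.to_ascii_lowercase().as_str(), "md" | "rst" | "txt"))
        .unwrap_or(false)
}

/// Assumed content test: more than half of the non-blank lines are comment
/// lines or belong to a triple-quoted docstring. A deliberately crude heuristic.
fn is_comment_dominated(content: &str) -> bool {
    let mut in_docstring = false;
    let (mut prose, mut total) = (0usize, 0usize);
    for line in content.lines().map(str::trim).filter(|l| !l.is_empty()) {
        total += 1;
        let quotes = line.matches("\"\"\"").count();
        let is_comment = line.starts_with('#')
            || line.starts_with("//")
            || line.starts_with("/*")
            || line.starts_with('*');
        if in_docstring || quotes > 0 || is_comment {
            prose += 1;
        }
        // An odd number of triple quotes opens or closes a docstring.
        if quotes % 2 == 1 {
            in_docstring = !in_docstring;
        }
    }
    total > 0 && prose * 2 > total
}

/// A chunk is prose if either branch holds.
pub fn is_prose_chunk(chunk: &Chunk<'_>) -> bool {
    has_prose_extension(chunk.path) || is_comment_dominated(chunk.content)
}

/// Count the prose chunks in a chunk set; the result feeds the class thresholds.
pub fn count_prose_chunks(chunks: &[Chunk<'_>]) -> usize {
    chunks.iter().filter(|c| is_prose_chunk(c)).count()
}
```

With this heuristic, a .py chunk whose body is mostly docstring counts as prose even though its extension does not, which is the case the second branch exists to catch.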
pub fn rerank_eligible(self) -> bool
Whether the cross-encoder rerank should run on this corpus for a non-symbol NL query. Pure code corpora skip rerank; mixed and docs corpora enable it.
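A caller would presumably combine this flag with a query-shape check. The sketch below mirrors the enum locally and reimplements rerank_eligible from the description above; should_rerank and its query_is_natural_language parameter are hypothetical names, and how the crate actually detects NL-shaped queries is not shown on this page.

```rust
/// Local mirror of the documented enum so the sketch compiles on its own.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CorpusClass {
    Code,
    Mixed,
    Docs,
}

impl CorpusClass {
    /// Reimplemented from the prose description: mixed and docs corpora
    /// enable rerank, pure code corpora skip it.
    pub fn rerank_eligible(self) -> bool {
        matches!(self, CorpusClass::Mixed | CorpusClass::Docs)
    }
}

/// Hypothetical gate: run the cross-encoder only when the corpus is
/// eligible and the query looks like natural language.
pub fn should_rerank(class: CorpusClass, query_is_natural_language: bool) -> bool {
    query_is_natural_language && class.rerank_eligible()
}
```

Because the enum is Copy, taking self by value costs nothing and the class can be stored at index time and consulted on every query.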
Trait Implementations§
impl Clone for CorpusClass
fn clone(&self) -> CorpusClass
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Copy for CorpusClass
impl Debug for CorpusClass
impl<'de> Deserialize<'de> for CorpusClass
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
impl Eq for CorpusClass
impl PartialEq for CorpusClass
fn eq(&self, other: &CorpusClass) -> bool
Tests for self and other values to be equal, and is used by ==.
impl Serialize for CorpusClass
impl StructuralPartialEq for CorpusClass
Auto Trait Implementations§
impl Freeze for CorpusClass
impl RefUnwindSafe for CorpusClass
impl Send for CorpusClass
impl Sync for CorpusClass
impl Unpin for CorpusClass
impl UnsafeUnpin for CorpusClass
impl UnwindSafe for CorpusClass
Blanket Implementations§
impl<T> ArchivePointee for T
type ArchivedMetadata = ()
fn pointer_metadata(
    _: &<T as ArchivePointee>::ArchivedMetadata,
) -> <T as Pointee>::Metadata
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> DeserializeOwned for T
where
    T: for<'de> Deserialize<'de>,
impl<T> Downcast for T
where
    T: Any,
fn into_any(self: Box<T>) -> Box<dyn Any>
Converts Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>, which can then be downcast into Box<dyn ConcreteType> where ConcreteType implements Trait.
fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>
Converts Rc<Trait> (where Trait: Downcast) to Rc<Any>, which can then be further downcast into Rc<ConcreteType> where ConcreteType implements Trait.
fn as_any(&self) -> &(dyn Any + 'static)
Converts &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot generate &Any’s vtable from &Trait’s.
fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
Converts &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot generate &mut Any’s vtable from &mut Trait’s.
impl<T> DowncastSend for T
impl<T> DowncastSync for T
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.
impl<Q, K> Equivalent<K> for Q
impl<Q, K> Equivalent<K> for Q
impl<T> Fruit for T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> LayoutRaw for T
fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>
impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool
fn resolve_niched(out: Place<NichedOption<T, N1>>)
Writes data to out indicating that a T is niched.