§Scribe Scanner
High-performance file system scanning and indexing capabilities for the Scribe library. This crate provides efficient tools for discovering, filtering, and analyzing files in large codebases with git integration and parallel processing.
§Features
- Fast Repository Traversal: Efficient file discovery using `walkdir` and `ignore`
- Git Integration: Prefer `git ls-files` when available, with fallback to filesystem walk
- Language Detection: Automatic detection for 25+ programming languages
- Parallel Processing: Memory-efficient parallel file processing using Rayon
- Binary Detection: Libmagic-compatible content detection to skip non-text files
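The "prefer `git ls-files`, fall back to a filesystem walk" strategy from the list above can be sketched with the standard library alone. This is an illustrative sketch, not the crate's implementation: the function names `git_tracked_files`, `walk_files`, and `discover_files` are hypothetical, and the real crate is described as using `walkdir` and `ignore` rather than a hand-written walk.

```rust
use std::ffi::OsStr;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Ask git for tracked files. `None` means git is unavailable or `root`
/// is not inside a repository, so the caller should fall back to a walk.
fn git_tracked_files(root: &Path) -> Option<Vec<PathBuf>> {
    let output = Command::new("git")
        .arg("-C")
        .arg(root)
        .args(["ls-files", "-z"]) // -z: NUL-separated, unquoted paths
        .output()
        .ok()?; // the git binary could not be spawned
    if !output.status.success() {
        return None; // e.g. not a git work tree
    }
    let files = output
        .stdout
        .split(|byte| *byte == 0)
        .filter(|entry| !entry.is_empty())
        .map(|entry| root.join(String::from_utf8_lossy(entry).into_owned()))
        .collect();
    Some(files)
}

/// Fallback: plain recursive walk. Skips `.git` directories and symlinks.
fn walk_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let file_type = entry.file_type()?; // does not follow symlinks
        if file_type.is_dir() {
            if path.file_name() != Some(OsStr::new(".git")) {
                walk_files(&path, out)?;
            }
        } else if file_type.is_file() {
            out.push(path);
        }
    }
    Ok(())
}

/// Prefer git's view of the tree; otherwise walk the filesystem.
fn discover_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    if let Some(files) = git_tracked_files(root) {
        return Ok(files);
    }
    let mut files = Vec::new();
    walk_files(root, &mut files)?;
    Ok(files)
}
```

Note that the two paths are not equivalent: `git ls-files` returns only tracked files, while the walk returns everything on disk, so a real scanner would likely still apply ignore rules in the fallback.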
§Usage
use scribe_scanner::{Scanner, ScanOptions};
use std::path::Path;

// Note: `.await?` below must run inside an async function that returns a `Result`.
let scanner = Scanner::new();
let options = ScanOptions::default()
    .with_git_integration(true)
    .with_parallel_processing(true);

let results = scanner.scan(Path::new("."), options).await?;
println!("Scanned {} files", results.len());
Re-exports§
pub use git_integration::GitCommitInfo;
pub use git_integration::GitFileInfo;
pub use git_integration::GitIntegrator;
pub use language_detection::DetectionStrategy;
pub use language_detection::LanguageDetector;
pub use language_detection::LanguageHints;
pub use metadata::FileMetadata;
pub use metadata::MetadataExtractor;
pub use metadata::SizeStats;
pub use scanner::ScanOptions;
pub use scanner::ScanProgress;
pub use scanner::ScanResult;
pub use scanner::Scanner;
pub use aho_corasick_reference_index::AhoCorasickReferenceIndex;
pub use aho_corasick_reference_index::IndexConfig;
pub use aho_corasick_reference_index::IndexMetrics;
pub use filtering::DirectoryFilter;
pub use filtering::FileFilter;
pub use filtering::FilterReason;
pub use filtering::FilterResult;
pub use parallel::ParallelConfig;
pub use parallel::ParallelController;
pub use parallel::ParallelMetrics;
pub use parallel::WorkItem;
pub use performance::ErrorType;
pub use performance::PerfTimer;
pub use performance::PerformanceMonitor;
pub use performance::PerformanceReport;
pub use performance::PerformanceSnapshot;
pub use performance::PERF_MONITOR;
Modules§
- aho_corasick_reference_index
- High-performance file reference indexing using Aho-Corasick multi-pattern search.
- filtering
- High-performance file filtering with early content reads and strict pre-filtering.
- git_integration
- Git integration for enhanced file discovery and status tracking.
- language_detection
- Advanced programming language detection for 25+ languages.
- metadata
- File metadata extraction and analysis.
- parallel
- Bounded parallelism with backpressure control and adaptive batching.
- performance
- Performance instrumentation and monitoring for the scanning system.
- scanner
- Core scanning functionality for efficient file system traversal.
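To make "bounded parallelism with backpressure" from the `parallel` entry concrete, here is a minimal standard-library sketch. It is an assumption-laden illustration, not the module's actual API: `process_bounded` and its parameters are hypothetical names, and it uses plain threads with a bounded channel rather than Rayon so that it stays self-contained.

```rust
use std::sync::mpsc::{channel, sync_channel};
use std::sync::{Arc, Mutex};
use std::thread;

/// Apply `work` to every item using `workers` threads and a queue that holds
/// at most `queue_capacity` pending items. Results arrive in completion
/// order, not input order.
fn process_bounded<T, R, F>(
    items: Vec<T>,
    workers: usize,
    queue_capacity: usize,
    work: F,
) -> Vec<R>
where
    T: Send + 'static,
    R: Send + 'static,
    F: Fn(T) -> R + Send + Sync + 'static,
{
    // Bounded work queue: `send` blocks once `queue_capacity` items are waiting.
    let (work_tx, work_rx) = sync_channel::<T>(queue_capacity);
    // Unbounded result queue, so workers never block on reporting results.
    let (result_tx, result_rx) = channel::<R>();
    let work_rx = Arc::new(Mutex::new(work_rx));
    let work = Arc::new(work);

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let work_rx = Arc::clone(&work_rx);
            let result_tx = result_tx.clone();
            let work = Arc::clone(&work);
            thread::spawn(move || loop {
                // The lock is held only while receiving, not while working.
                let next = work_rx.lock().unwrap().recv();
                match next {
                    Ok(item) => {
                        let _ = result_tx.send((*work)(item));
                    }
                    Err(_) => break, // producer hung up and the queue is drained
                }
            })
        })
        .collect();
    drop(result_tx); // only worker clones remain, so `result_rx` can finish

    for item in items {
        // Blocks while the queue is full: this is the backpressure.
        work_tx.send(item).expect("all workers exited early");
    }
    drop(work_tx); // signal workers that no more items are coming

    let results: Vec<R> = result_rx.iter().collect();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    results
}
```

The bounded queue caps how many pending items sit in memory at once; the producer slows down to the pace of the workers instead of buffering the whole input.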
Macros§
- perf_timer
- Macro for easy performance timing
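This page does not show the signature of `perf_timer`, so the following is only a sketch of what a timing macro of this kind can look like. The macro `timed!` and the helper `sum_to` are hypothetical names introduced for illustration and are not part of the crate.

```rust
/// Hypothetical timing macro: evaluates the expression and returns
/// `(value, elapsed)`, where `elapsed` is a `std::time::Duration`.
macro_rules! timed {
    ($body:expr) => {{
        let start = std::time::Instant::now();
        let value = $body;
        (value, start.elapsed())
    }};
}

/// Small workload to measure: the sum 1 + 2 + ... + n.
fn sum_to(n: u64) -> u64 {
    (1..=n).sum()
}
```

A call such as `let (value, elapsed) = timed!(sum_to(1000));` yields the result together with the wall-clock time spent computing it.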
Structs§
- FileScanner
- High-level scanner facade providing convenient access to all scanning functionality
- ScannerStats
- Statistics about the scanning process
Constants§
- VERSION
- Current version of the scanner crate