dircat is a library and command-line tool for recursively concatenating
directory contents into a single, well-formatted Markdown file.
It is designed for speed, developer convenience, and seamless integration with tools that consume Markdown, such as Large Language Models (LLMs) or documentation systems.
As a library, it provides a modular, three-stage pipeline:
- Discover: Find all relevant files based on a rich set of filtering criteria.
- Process: Read and transform file content (e.g., removing comments).
- Format: Generate the final Markdown output.
This design allows programmatic use of its components, such as using the file discovery logic or content filters independently.
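The three stages listed above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not dircat's implementation: the function names discover_stage, process_stage, and format_stage are hypothetical, the only filter is an extension allow-list, and the only content transformation is blank-line removal.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Stage 1 (Discover): walk `root` and keep files with an allowed extension.
/// Hypothetical helper; dircat's real discovery supports far richer filters.
fn discover_stage(root: &Path, allowed_exts: &[&str]) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path
                .extension()
                .and_then(|e| e.to_str())
                .map_or(false, |e| allowed_exts.contains(&e))
            {
                found.push(path);
            }
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}

/// Stage 2 (Process): read a file and drop blank lines.
fn process_stage(path: &Path) -> io::Result<String> {
    let text = fs::read_to_string(path)?;
    let kept: Vec<&str> = text.lines().filter(|l| !l.trim().is_empty()).collect();
    Ok(kept.join("\n"))
}

/// Stage 3 (Format): write one Markdown section per file.
fn format_stage(
    root: &Path,
    files: &[(PathBuf, String)],
    out: &mut impl Write,
) -> io::Result<()> {
    // Build the code-fence marker at runtime (three backticks).
    let fence = "`".repeat(3);
    for (path, content) in files {
        let rel = path.strip_prefix(root).unwrap_or(path);
        let ext = path.extension().and_then(|e| e.to_str()).unwrap_or("");
        writeln!(out, "## File: {}", rel.display())?;
        writeln!(out, "{}{}", fence, ext)?;
        writeln!(out, "{}", content)?;
        writeln!(out, "{}", fence)?;
        writeln!(out)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Build a small throwaway tree under the system temp directory.
    let root = std::env::temp_dir().join(format!("pipeline_sketch_{}", std::process::id()));
    fs::create_dir_all(root.join("src"))?;
    fs::write(root.join("notes.txt"), "Hello, world!\n\n")?;
    fs::write(root.join("src").join("main.rs"), "fn main() {}\n")?;

    // Run the three stages in order.
    let files = discover_stage(&root, &["txt", "rs"])?;
    let mut processed = Vec::new();
    for path in files {
        let content = process_stage(&path)?;
        processed.push((path, content));
    }
    format_stage(&root, &processed, &mut io::stdout())?;
    fs::remove_dir_all(&root)
}
```

Because each stage only depends on the output of the previous one, any stage can be called on its own, which is the property the library design relies on.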
§Example: Library Usage
The following example demonstrates how to use the dircat library to
discover, process, and format files from a temporary directory.
use dircat::execute;
use dircat::cancellation::CancellationToken;
use dircat::config::{self, ConfigBuilder};
use dircat::progress::ProgressReporter;
use std::fs;
use std::sync::Arc;
use tempfile::tempdir;
// 1. Set up a temporary directory with some files.
let temp_dir = tempdir()?;
let input_path_str = temp_dir.path().to_str().unwrap();
fs::write(temp_dir.path().join("file1.txt"), "Hello, world!")?;
fs::write(temp_dir.path().join("file2.rs"), "fn main() { /* comment */ }")?;
// 2. Create a Config object programmatically using the builder.
let config = ConfigBuilder::new()
    .input_path(input_path_str)
    .remove_comments(true)
    .summary(true)
    .build()?;
// 3. Set up a cancellation token for graceful interruption.
let token = CancellationToken::new();
// 4. Execute the main logic to get the processed file data. This is the recommended
// approach for library use, as it guarantees a deterministically sorted result.
let progress: Option<Arc<dyn ProgressReporter>> = None; // No progress bar in this example
let dircat_result = dircat::execute(&config, &token, progress)?;
// For more granular control, you could also use the individual stages.
// Note: `process` does not preserve order, so you would need to collect and sort
// the results yourself to match the output of `execute`.
// let resolved = config::resolve_input(&config.input_path, &config.git_branch, config.git_depth, &config.git_cache_path, progress)?;
// let discovered_files = dircat::discover(&config, &resolved, &token)?;
// let mut processed_files: Vec<_> = dircat::process(discovered_files, &config, &token)?.collect::<Result<_,_>>()?;
// processed_files.sort_by_key(|fi| (fi.is_process_last, fi.process_last_order, fi.relative_path.clone()));
// 5. Format the output into a buffer.
let formatter = dircat::output::MarkdownFormatter; // The default formatter
let mut output_buffer: Vec<u8> = Vec::new();
let output_opts = dircat::OutputConfig::from(&config);
dircat_result.format_with(&formatter, &output_opts, &mut output_buffer)?;
// 6. Print the result.
let output_string = String::from_utf8(output_buffer)?;
println!("{}", output_string);
// The output would look something like this:
// ## File: file1.txt
// ```txt
// Hello, world!
// ```
//
// ## File: file2.rs
// ```rs
// fn main() { }
// ```
//
// ---
// Processed Files: (2)
// - file1.txt
// - file2.rs
Re-exports§
pub use cancellation::CancellationToken;
pub use config::Config;
pub use config::ConfigBuilder;
pub use config::DiscoveryConfig;
pub use config::OutputConfig;
pub use config::OutputDestination;
pub use config::ProcessingConfig;
pub use core_types::FileCounts;
pub use core_types::FileInfo;
pub use discovery::discover_files;
pub use processing::process_content;
pub use processing::process_files;
pub use filtering::check_process_last;
pub use filtering::is_file_type;
pub use filtering::is_likely_text;
pub use filtering::is_likely_text_from_buffer;
pub use filtering::is_lockfile;
pub use filtering::passes_extension_filters;
pub use filtering::passes_size_filter;
pub use output::MarkdownFormatter;
pub use output::OutputFormatter;
pub use processing::calculate_counts;
pub use processing::filters::remove_comments;
pub use processing::filters::remove_empty_lines;
pub use processing::filters::ContentFilter;
pub use processing::filters::RemoveCommentsFilter;
pub use processing::filters::RemoveEmptyLinesFilter;
pub use git::download_directory_via_api;
pub use git::get_repo;
pub use git::is_git_url;
pub use git::parse_clone_url;
pub use git::parse_github_folder_url;
pub use git::parse_github_folder_url_with_hint;
pub use git::ParsedGitUrl;
Modules§
- cancellation
- Provides a token-based mechanism for graceful cancellation.
- cli
- config
- Defines the core `Config` struct and related types for application configuration.
- constants
- Defines global constants used throughout the application.
- core_types
- Defines core data structures used throughout the application pipeline.
- discovery
- Discovers files based on configuration, applying filters in parallel.
- errors
- Defines application-specific error types.
- filtering
- Provides standalone functions for file filtering logic.
- git
- Handles cloning, caching, and updating of git repositories.
- output
- Handles the formatting and writing of the final output.
- prelude
- A prelude for conveniently importing the most common types. The `dircat` prelude for convenient library use.
- processing
- Handles the processing stage of the `dircat` pipeline.
- progress
- Defines a trait for reporting progress of long-running operations.
- signal
- Provides signal handling for graceful shutdown.
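The cancellation module above is described as a token-based mechanism for graceful cancellation. One common way to build such a token is a shared atomic flag that workers poll between units of work. The sketch below illustrates that general pattern only: SketchToken and process_until_cancelled are hypothetical names, and nothing here is taken from dircat's actual CancellationToken.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a cancellation token (not dircat's type).
/// Clones share the same underlying flag.
#[derive(Clone, Default)]
struct SketchToken {
    cancelled: Arc<AtomicBool>,
}

impl SketchToken {
    fn new() -> Self {
        Self::default()
    }

    /// Requests cancellation; visible to every clone of this token.
    fn cancel(&self) {
        self.cancelled.store(true, Ordering::SeqCst);
    }

    /// Returns true once any clone has called `cancel`.
    fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::SeqCst)
    }
}

/// Handles items one by one, checking the token before each item.
/// Returns how many items were handled before stopping.
fn process_until_cancelled(items: &[u32], token: &SketchToken) -> usize {
    let mut handled = 0;
    for _item in items {
        if token.is_cancelled() {
            break; // stop gracefully instead of aborting mid-item
        }
        handled += 1;
    }
    handled
}

fn main() {
    let token = SketchToken::new();
    let handle_for_signal = token.clone();

    // Nothing has requested cancellation yet, so every item is handled.
    let first = process_until_cancelled(&[1, 2, 3], &token);

    // A signal handler (or any other owner of a clone) requests cancellation.
    handle_for_signal.cancel();
    let second = process_until_cancelled(&[1, 2, 3], &token);

    println!("handled before cancel: {first}, after cancel: {second}");
}
```

Polling a flag between items keeps shutdown cooperative: work in progress finishes cleanly, and no new work starts once cancellation is requested.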
Structs§
- DircatResult
- Represents the successful result of a dircat execution.