Crate dircat

dircat is a library and command-line tool for recursively concatenating directory contents into a single, well-formatted Markdown file.

It is designed for speed, developer convenience, and seamless integration with tools that consume Markdown, such as Large Language Models (LLMs) or documentation systems.

As a library, it provides a modular, three-stage pipeline:

  1. Discover: Find all relevant files based on a rich set of filtering criteria.
  2. Process: Read and transform file content (e.g., removing comments).
  3. Format: Generate the final Markdown output.

This design allows programmatic use of its components, such as using the file discovery logic or content filters independently.
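To make the three-stage shape concrete, here is a minimal, std-only sketch of a discover, process, and format pipeline. It is not dircat's implementation and does not use dircat's API: the helpers `discover_paths`, `strip_blank_lines`, `format_markdown`, and `concat_dir` are hypothetical names chosen for illustration, and the only filters shown are an extension allow-list and blank-line removal.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Stage 1 (discover): walk `root` and keep files whose extension is in `exts`.
/// Hypothetical helper, not part of dircat.
fn discover_paths(root: &Path, exts: &[&str]) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path
                .extension()
                .and_then(|ext| ext.to_str())
                .is_some_and(|ext| exts.contains(&ext))
            {
                found.push(path);
            }
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}

/// Stage 2 (process): read a file and drop blank lines.
fn strip_blank_lines(path: &Path) -> io::Result<String> {
    let text = fs::read_to_string(path)?;
    let kept: Vec<&str> = text
        .lines()
        .filter(|line| !line.trim().is_empty())
        .collect();
    Ok(kept.join("\n"))
}

/// Stage 3 (format): wrap content in a Markdown section with a fenced block.
fn format_markdown(root: &Path, path: &Path, content: &str) -> String {
    let relative = path.strip_prefix(root).unwrap_or(path).display().to_string();
    let fence = "`".repeat(3);
    format!("## File: {relative}\n{fence}\n{content}\n{fence}\n")
}

/// Runs the three stages in order and concatenates the resulting sections.
fn concat_dir(root: &Path, exts: &[&str]) -> io::Result<String> {
    let mut out = String::new();
    for path in discover_paths(root, exts)? {
        let content = strip_blank_lines(&path)?;
        out.push_str(&format_markdown(root, &path, &content));
        out.push('\n');
    }
    Ok(out)
}
```

Because each stage is a plain function, a caller can stop after any of them, for example calling only `discover_paths` to list candidate files. This is the kind of independent reuse that the staged design is meant to allow.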

Example: Library Usage

The following example demonstrates how to use the dircat library to discover, process, and format files from a temporary directory.

use dircat::execute;
use dircat::cancellation::CancellationToken;
use dircat::config::{self, ConfigBuilder};
use dircat::progress::ProgressReporter;
use std::fs;
use std::sync::Arc;
use tempfile::tempdir;

// 1. Set up a temporary directory with some files.
let temp_dir = tempdir()?;
let input_path_str = temp_dir.path().to_str().unwrap();
fs::write(temp_dir.path().join("file1.txt"), "Hello, world!")?;
fs::write(temp_dir.path().join("file2.rs"), "fn main() { /* comment */ }")?;

// 2. Create a Config object programmatically using the builder.
let config = ConfigBuilder::new()
    .input_path(input_path_str)
    .remove_comments(true)
    .summary(true)
    .build()?;

// 3. Set up a cancellation token for graceful interruption.
let token = CancellationToken::new();

// 4. Execute the main logic to get the processed file data. This is the recommended
//    approach for library use, as it guarantees a deterministically sorted result.
let progress: Option<Arc<dyn ProgressReporter>> = None; // No progress bar in this example
let dircat_result = dircat::execute(&config, &token, progress)?;

// For more granular control, you could also use the individual stages.
// Note: `process` does not preserve order, so you would need to collect and sort
// the results yourself to match the output of `execute`.
// let resolved = config::resolve_input(&config.input_path, &config.git_branch, config.git_depth, &config.git_cache_path, progress)?;
// let discovered_files = dircat::discover(&config, &resolved, &token)?;
// let mut processed_files: Vec<_> = dircat::process(discovered_files, &config, &token)?.collect::<Result<_,_>>()?;
// processed_files.sort_by_key(|fi| (fi.is_process_last, fi.process_last_order, fi.relative_path.clone()));

// 5. Format the output into a buffer.
let formatter = dircat::output::MarkdownFormatter; // The default formatter
let mut output_buffer: Vec<u8> = Vec::new();
let output_opts = dircat::OutputConfig::from(&config);
dircat_result.format_with(&formatter, &output_opts, &mut output_buffer)?;

// 6. Print the result.
let output_string = String::from_utf8(output_buffer)?;
println!("{}", output_string);

// The output would look something like this:
// ## File: file1.txt
// ```txt
// Hello, world!
// ```
//
// ## File: file2.rs
// ```rs
// fn main() {  }
// ```
//
// ---
// Processed Files: (2)
// - file1.txt
// - file2.rs
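The sort key in the commented-out snippet of the example, `(is_process_last, process_last_order, relative_path)`, relies on Rust's lexicographic tuple ordering. The sketch below shows that ordering on a hypothetical `Entry` struct that only mimics those three field names. The field types (`bool`, `Option<usize>`, `PathBuf`) are assumptions and may differ from the real `FileInfo`.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in carrying only the fields used by the sort key.
/// The field types are assumptions, not taken from dircat's `FileInfo`.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    is_process_last: bool,
    process_last_order: Option<usize>,
    relative_path: PathBuf,
}

/// Convenience constructor: an entry is "process last" when it has an order.
fn entry(path: &str, last_order: Option<usize>) -> Entry {
    Entry {
        is_process_last: last_order.is_some(),
        process_last_order: last_order,
        relative_path: PathBuf::from(path),
    }
}

/// Sorts by the tuple key: regular files first (false < true), then
/// "process last" files by their order, with the path as the tie-breaker.
fn sort_entries(entries: &mut [Entry]) {
    entries.sort_by_key(|e| {
        (
            e.is_process_last,
            e.process_last_order,
            e.relative_path.clone(),
        )
    });
}
```

Under these assumptions, regular files come out alphabetically by path, followed by the "process last" files in their configured order, which gives a deterministic result regardless of the order in which files were processed.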

Re-exports

pub use cancellation::CancellationToken;
pub use config::Config;
pub use config::ConfigBuilder;
pub use config::DiscoveryConfig;
pub use config::OutputConfig;
pub use config::OutputDestination;
pub use config::ProcessingConfig;
pub use core_types::FileCounts;
pub use core_types::FileInfo;
pub use discovery::discover_files;
pub use processing::process_content;
pub use processing::process_files;
pub use filtering::check_process_last;
pub use filtering::is_file_type;
pub use filtering::is_likely_text;
pub use filtering::is_likely_text_from_buffer;
pub use filtering::is_lockfile;
pub use filtering::passes_extension_filters;
pub use filtering::passes_size_filter;
pub use output::MarkdownFormatter;
pub use output::OutputFormatter;
pub use processing::calculate_counts;
pub use processing::filters::remove_comments;
pub use processing::filters::remove_empty_lines;
pub use processing::filters::ContentFilter;
pub use processing::filters::RemoveCommentsFilter;
pub use processing::filters::RemoveEmptyLinesFilter;
pub use git::download_directory_via_api;
pub use git::get_repo;
pub use git::is_git_url;
pub use git::parse_clone_url;
pub use git::parse_github_folder_url;
pub use git::parse_github_folder_url_with_hint;
pub use git::ParsedGitUrl;

Modules

cancellation
Provides a token-based mechanism for graceful cancellation.
cli
config
Defines the core Config struct and related types for application configuration.
constants
Defines global constants used throughout the application.
core_types
Defines core data structures used throughout the application pipeline.
discovery
Discovers files based on configuration, applying filters in parallel.
errors
Defines application-specific error types.
filtering
Provides standalone functions for file filtering logic.
git
Handles cloning, caching, and updating of git repositories.
output
Handles the formatting and writing of the final output.
prelude
A prelude for conveniently importing the most common types for library use.
processing
Handles the processing stage of the dircat pipeline.
progress
Defines a trait for reporting progress of long-running operations.
signal
Provides signal handling for graceful shutdown.

Structs

DircatResult
Represents the successful result of a dircat execution.

Functions

discover
Discovers files based on the provided configuration.
execute
Executes the discovery and processing stages of the dircat pipeline.
process
Processes a list of discovered files.
run
Executes the complete dircat pipeline: discover, process, and format.