file_declutter/lib.rs
//! # File Declutter
//!
//! [![badge github]][url github]
//! [![badge crates.io]][url crates.io]
//! [![badge docs.rs]][url docs.rs]
//! [![badge license]][url license]
//!
//! [badge github]: https://img.shields.io/badge/github-FloGa%2Ffile--declutter-green
//! [badge crates.io]: https://img.shields.io/crates/v/file-declutter
//! [badge docs.rs]: https://img.shields.io/docsrs/file-declutter
//! [badge license]: https://img.shields.io/crates/l/file-declutter
//!
//! [url github]: https://github.com/FloGa/file-declutter
//! [url crates.io]: https://crates.io/crates/file-declutter
//! [url docs.rs]: https://docs.rs/file-declutter
//! [url license]: https://github.com/FloGa/file-declutter/blob/develop/LICENSE
//!
//! > Reorganizes files into nested folders based on their filenames.
//!
//! *File Declutter* is a little command line tool that helps you bring order to
//! your large directories. It can "declutter" a flat list of files by
//! redistributing them into nested subdirectories based on their filenames. It's
//! particularly useful for organizing large numbers of files in a single
//! directory (for example images, documents, etc.) into a more manageable
//! structure.
//!
//! This crate is split into an [Application](#application) part and a
//! [Library](#library) part.
//!
//! ## Motivation
//!
//! The need for this little tool derived from a situation where I was confronted
//! with a flat directory of 500k files and more. Well, to be frank, it was one of
//! my other creations, namely [*DedupeFS*][dedupefs github], which presents files
//! as 1MB chunks, named after their checksums. By using this over my media hard
//! drive, I ended up with so many files in one directory that my file manager
//! just refused to work with it.
//!
//! Since the file names are SHA-1 hashes, they consist of a somewhat evenly
//! distributed sequence of the hexadecimal numbers 0 to f. So, by putting all
//! files that start with 0 into a separate folder, and all that start with 1 into
//! a different folder, and so on, I can already split the number of files per
//! subdirectory by 16. If I repeat this step in each subdirectory, I can split
//! them again by 16.
//!
//! This is one possible scenario where *File Declutter* really comes in handy.
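//!
/*!
The prefix splitting described above can be sketched in a few lines of
standard-library Rust. This is an illustrative sketch, not the crate's
implementation: `prefix_path` and `files_per_leaf` are hypothetical helper
names, and the estimate assumes hexadecimal file names that are spread evenly
across all 16 leading digits.

```rust
use std::path::PathBuf;

/// Hypothetical helper: nests `file_name` under one directory per leading
/// character, using at most `levels` characters.
fn prefix_path(file_name: &str, levels: usize) -> PathBuf {
    let mut target = PathBuf::new();
    for c in file_name.chars().take(levels) {
        target.push(c.to_string());
    }
    target.push(file_name);
    target
}

/// Hypothetical helper: average number of files per innermost directory,
/// assuming each level splits the files evenly into 16 groups.
fn files_per_leaf(total_files: u64, levels: u32) -> u64 {
    total_files / 16u64.pow(levels)
}
```

Under these assumptions, 500,000 files split over three levels leave roughly
122 files in each innermost directory.
*/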
//!
//! [dedupefs github]: https://github.com/FloGa/dedupefs
//!
//! ## Application
//!
//! ### Installation
//!
//! This tool can be installed easily through Cargo via `crates.io`:
//!
//! ```shell
//! cargo install --locked file-declutter
//! ```
//!
//! Please note that the `--locked` flag is necessary here to have the exact same
//! dependencies as when the application was tagged and tested. Without it, you
//! might get more up-to-date versions of dependencies, but you risk undefined
//! and unexpected behavior if the dependencies changed some functionality. The
//! application might even fail to build if the public API of a dependency
//! changed too much.
//!
//! Alternatively, pre-built binaries can be downloaded from the [GitHub
//! releases][gh-releases] page.
//!
//! [gh-releases]: https://github.com/FloGa/file-declutter/releases
//!
//! ### Usage
//!
//! ```text
//! Usage: file-declutter [OPTIONS] <PATH>
//!
//! Arguments:
//!   <PATH>  Directory to declutter
//!
//! Options:
//!   -l, --levels <LEVELS>           Number of nested subdirectory levels [default: 3]
//!   -r, --remove-empty-directories  Remove empty directories after moving files
//!   -h, --help                      Print help
//!   -V, --version                   Print version
//! ```
//!
//! To declutter a directory into three levels, you would go with:
//!
//! ```shell
//! file-declutter --levels 3 path/to/large-directory
//! ```
//!
//! To "restore" the directory, or rather, flatten a list of files in
//! subdirectories, you can use:
//!
//! ```shell
//! file-declutter --levels 0 --remove-empty-directories path/to/decluttered-directory
//! ```
//!
//! **Warning:** Please note that flattening a directory tree in this way will
//! result in **data loss** when there are files with the same names! So if you
//! have two files `dir/a/123.txt` and `dir/b/123.txt` and you run the above
//! command over `dir`, you will end up with `dir/123.txt`, where the last file
//! move has overwritten the previous ones. The file listing could be in arbitrary
//! order, so you cannot even tell beforehand which file will "win".
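//!
/*!
Before flattening, it may be worth checking for such name clashes. The
following is a minimal sketch, assuming the candidate paths have already been
collected into a slice. `flatten_collisions` is a hypothetical helper and not
part of this crate.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Hypothetical helper: groups paths by file name and keeps only the names
/// that occur more than once, i.e. the files that would overwrite each other
/// when moved into a single directory.
fn flatten_collisions(paths: &[PathBuf]) -> BTreeMap<String, Vec<PathBuf>> {
    let mut by_name: BTreeMap<String, Vec<PathBuf>> = BTreeMap::new();
    for path in paths {
        // Paths without a file name (for example "..") are skipped.
        if let Some(name) = path.file_name() {
            by_name
                .entry(name.to_string_lossy().into_owned())
                .or_default()
                .push(path.clone());
        }
    }
    // Drop every name that belongs to exactly one path.
    by_name.retain(|_, sources| sources.len() > 1);
    by_name
}
```

An empty result suggests that flattening would not overwrite anything among
the paths that were checked.
*/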
//!
//! ## Library
//!
//! ### Installation
//!
//! To add the `file-declutter` library to your project, you can use:
//!
//! ```shell
//! cargo add file-declutter
//! ```
//!
//! ### Usage
//!
//! The following is a short summary of how this library is intended to be used.
//! The actual functionality lies in a custom Iterator object
//! `FileDeclutterIterator`, but it is not intended to be instantiated directly.
//! Instead, the wrapper `FileDeclutter` should be used.
//!
//! Once the Iterator is created, you can either iterate over its items to receive
//! a list of `(source, target)` tuples, with `source` being the original filename
//! and `target` the "decluttered" one, and apply your own logic to them, or you
//! can use the `declutter_files` method to do the actual moving of the files.
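//!
/*!
As an example of such custom logic, the sketch below counts how many files
would end up in each target directory before anything is moved.
`files_per_target_dir` is a hypothetical helper that is not part of this
crate. It only assumes an iterator of `(source, target)` pairs of `PathBuf`s,
which is the item type `FileDeclutterIterator` yields.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Hypothetical helper: counts how many files would land in each target
/// directory, without touching the file system.
fn files_per_target_dir(
    pairs: impl Iterator<Item = (PathBuf, PathBuf)>,
) -> BTreeMap<PathBuf, usize> {
    let mut counts = BTreeMap::new();
    for (_source, target) in pairs {
        // A target without a parent is counted under the empty path.
        let dir = target
            .parent()
            .map(|parent| parent.to_path_buf())
            .unwrap_or_default();
        *counts.entry(dir).or_insert(0) += 1;
    }
    counts
}
```

Because the helper accepts any iterator of path pairs, it can be tried out
with a hand-written list first.
*/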
//!
//! #### Create Iterator from Path
//!
//! This method can be used if you have an actual directory that you want to
//! completely declutter recursively.
//!
//! ```rust no_run
//! use std::path::PathBuf;
//!
//! fn main() {
//!     let mut files_decluttered = file_declutter::FileDeclutter::new_from_path("/tmp/path")
//!         .levels(1)
//!         .collect::<Vec<_>>();
//!
//!     // The directory is listed in arbitrary order, so sort for a stable comparison.
//!     files_decluttered.sort();
//!
//!     // If the specified directory contains the files 13.txt and 23.txt, the following tuples
//!     // will be produced. Both paths include the base directory:
//!     let files_expected = vec![
//!         (PathBuf::from("/tmp/path/13.txt"), PathBuf::from("/tmp/path/1/13.txt")),
//!         (PathBuf::from("/tmp/path/23.txt"), PathBuf::from("/tmp/path/2/23.txt")),
//!     ];
//!
//!     assert_eq!(files_expected, files_decluttered);
//! }
//! ```
//!
//! #### Create Iterator from Existing Iterator
//!
//! This method can be used if you have a specific list of files you want to
//! process.
//!
//! ```rust
//! use std::path::PathBuf;
//!
//! fn main() {
//!     let files = vec!["13.txt", "23.txt"];
//!     let files_decluttered = file_declutter::FileDeclutter::new_from_iter(files.into_iter())
//!         .levels(1)
//!         .collect::<Vec<_>>();
//!
//!     let files_expected = vec![
//!         (PathBuf::from("13.txt"), PathBuf::from("1/13.txt")),
//!         (PathBuf::from("23.txt"), PathBuf::from("2/23.txt")),
//!     ];
//!
//!     assert_eq!(files_expected, files_decluttered);
//! }
//! ```
//!
//! #### Oneshot
//!
//! This method can be used if you have a single file which you want to have a
//! decluttered name for.
//!
//! ```rust
//! use std::path::PathBuf;
//!
//! fn main() {
//!     let file = "123456.txt";
//!     let file_decluttered = file_declutter::FileDeclutter::oneshot(file, 3);
//!
//!     let file_expected = PathBuf::from("1/2/3/123456.txt");
//!
//!     assert_eq!(file_expected, file_decluttered);
//! }
//! ```

use std::path::PathBuf;

/// An iterator that transforms a list of file paths into (source, target) pairs, where the target
/// path is a decluttered version based on the filename's characters.
pub struct FileDeclutterIterator<I> {
    inner: I,
    base: PathBuf,
    levels: usize,
}

impl<I> FileDeclutterIterator<I>
where
    I: Iterator<Item = PathBuf>,
{
    /// Sets the base directory into which files will be moved.
    pub fn base<P: Into<PathBuf>>(mut self, base: P) -> Self {
        self.base = base.into();
        self
    }

    /// Sets the number of directory levels to create based on the filename.
    ///
    /// For example, with `levels = 2` and a file named `abcdef.txt`, the target path would include
    /// two subdirectories: `a/b/abcdef.txt`.
    pub fn levels(mut self, levels: usize) -> Self {
        self.levels = levels;
        self
    }

    /// Moves all files to their decluttered target paths.
    ///
    /// If `remove_empty_directories` is `true`, the function will attempt to remove any now-empty
    /// directories after the move operation.
    ///
    /// # Errors
    ///
    /// Returns an error if directory creation or file renaming fails, or if a directory cannot be
    /// read while looking for empty directories. Failures to remove an empty directory are
    /// ignored.
    pub fn declutter_files(self, remove_empty_directories: bool) -> anyhow::Result<()> {
        let base = self.base.clone();

        for (source, target) in self {
            std::fs::create_dir_all(&target.parent().unwrap())?;
            std::fs::rename(source, target)?;
        }

        if remove_empty_directories {
            for dir in walkdir::WalkDir::new(base)
                .min_depth(1)
                .contents_first(true)
                .into_iter()
                .filter_entry(|f| f.file_type().is_dir())
                .flatten()
            {
                let dir = dir.into_path();

                if dir.read_dir()?.count() == 0 {
                    // Ignore result, it doesn't matter if deletion fails.
                    let _ = std::fs::remove_dir(dir);
                }
            }
        }

        Ok(())
    }
}

impl<I> Iterator for FileDeclutterIterator<I>
where
    I: Iterator<Item = PathBuf>,
{
    type Item = (PathBuf, PathBuf);

    /// Returns the next `(source, target)` file path pair.
    ///
    /// The target path is derived from the file name by taking the first `levels` characters and
    /// using them as nested directories.
    ///
    /// # Panics
    ///
    /// Panics if a source path has no file name, for example if it ends in `..`.
    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next().map(move |entry| {
            let sub_dirs = entry.file_name().unwrap().to_string_lossy();
            let sub_dirs = sub_dirs.chars().take(self.levels).map(String::from);

            let mut target_path = self.base.clone();
            for sub_dir in sub_dirs {
                target_path.push(sub_dir);
            }
            target_path.push(entry.file_name().unwrap());

            (entry, target_path)
        })
    }
}

/// Entry point for creating decluttering iterators or computing decluttered paths.
pub struct FileDeclutter;

impl FileDeclutter {
    /// Creates a `FileDeclutterIterator` from an arbitrary iterator over file paths.
    ///
    /// # Examples
    ///
    /// ```rust
    /// # use std::path::PathBuf;
    /// let files = vec!["13.txt", "23.txt"];
    /// let files_decluttered = file_declutter::FileDeclutter::new_from_iter(files.into_iter())
    ///     .levels(1)
    ///     .collect::<Vec<_>>();
    ///
    /// let files_expected = vec![
    ///     (PathBuf::from("13.txt"), PathBuf::from("1/13.txt")),
    ///     (PathBuf::from("23.txt"), PathBuf::from("2/23.txt")),
    /// ];
    ///
    /// assert_eq!(files_expected, files_decluttered);
    /// ```
    pub fn new_from_iter(
        iter: impl Iterator<Item = impl Into<PathBuf>>,
    ) -> FileDeclutterIterator<impl Iterator<Item = PathBuf>> {
        FileDeclutterIterator {
            inner: iter.map(Into::into),
            base: Default::default(),
            levels: Default::default(),
        }
    }

    /// Creates a `FileDeclutterIterator` by recursively collecting all files under a given
    /// directory and setting this directory as the base.
    ///
    /// # Examples
    ///
    /// ```rust no_run
    /// # use std::path::PathBuf;
    /// let mut files_decluttered = file_declutter::FileDeclutter::new_from_path("/tmp/path")
    ///     .levels(1)
    ///     .collect::<Vec<_>>();
    ///
    /// // The directory is listed in arbitrary order, so sort for a stable comparison.
    /// files_decluttered.sort();
    ///
    /// // If the specified directory contains the files 13.txt and 23.txt, the following tuples
    /// // will be produced. Both paths include the base directory:
    /// let files_expected = vec![
    ///     (PathBuf::from("/tmp/path/13.txt"), PathBuf::from("/tmp/path/1/13.txt")),
    ///     (PathBuf::from("/tmp/path/23.txt"), PathBuf::from("/tmp/path/2/23.txt")),
    /// ];
    ///
    /// assert_eq!(files_expected, files_decluttered);
    /// ```
    pub fn new_from_path(
        base: impl Into<PathBuf>,
    ) -> FileDeclutterIterator<impl Iterator<Item = PathBuf>> {
        let base = base.into();

        let iter = walkdir::WalkDir::new(&base)
            .min_depth(1)
            .into_iter()
            .flatten()
            .filter(|f| f.file_type().is_file())
            .map(|entry| entry.into_path());

        FileDeclutter::new_from_iter(iter).base(base)
    }

    /// Computes the decluttered path of a single file without moving it.
    ///
    /// # Arguments
    ///
    /// - `file`: Path to the input file.
    /// - `levels`: Number of subdirectory levels to use.
    ///
    /// # Returns
    ///
    /// A `PathBuf` representing the target decluttered location.
    ///
    /// # Examples
    ///
    /// ```rust
    /// # use std::path::PathBuf;
    /// let file = "123456.txt";
    /// let file_decluttered = file_declutter::FileDeclutter::oneshot(file, 3);
    ///
    /// let file_expected = PathBuf::from("1/2/3/123456.txt");
    ///
    /// assert_eq!(file_expected, file_decluttered);
    /// ```
    pub fn oneshot(file: impl Into<PathBuf>, levels: usize) -> PathBuf {
        let iter = std::iter::once(file.into());
        FileDeclutter::new_from_iter(iter)
            .levels(levels)
            .next()
            .unwrap()
            .1
    }
}

#[cfg(test)]
mod tests {
    use assert_fs::TempDir;
    use assert_fs::prelude::*;
    use rand::Rng;

    use super::*;

    #[test]
    fn decluttered_from_path_file_names_same() -> anyhow::Result<()> {
        let temp_dir = TempDir::new()?;

        let mut rng = rand::rng();
        for _ in 0..100 {
            let mut file_name = rng
                .random_range(1_000_000_000u64..10_000_000_000u64)
                .to_string();

            if rng.random_bool(0.25) {
                file_name = format!("subdir/{file_name}");
            }

            let child = temp_dir.child(file_name);
            child.touch()?;
        }

        for (source, target) in FileDeclutter::new_from_path(temp_dir.to_path_buf()).levels(1) {
            assert_ne!(source.parent(), target.parent());
            assert_eq!(source.file_name(), target.file_name());
        }

        Ok(())
    }

    #[test]
    fn decluttered_from_iter_file_names_same() -> anyhow::Result<()> {
        let mut rng = rand::rng();
        let files = (0..100).map(move |_| {
            let mut file_name = rng
                .random_range(1_000_000_000u64..10_000_000_000u64)
                .to_string();

            if rng.random_bool(0.25) {
                file_name = format!("subdir/{file_name}");
            }

            file_name
        });

        for (source, target) in FileDeclutter::new_from_iter(files).levels(1) {
            assert_ne!(source.parent(), target.parent());
            assert_eq!(source.file_name(), target.file_name());
        }

        Ok(())
    }

    #[test]
    fn oneshot() -> anyhow::Result<()> {
        let source = PathBuf::from("123456");
        let target_expected = PathBuf::from("1/2/3/123456");

        let target_actual = FileDeclutter::oneshot(&source, 3);

        assert_eq!(target_actual, target_expected);

        Ok(())
    }
}