remove_dir_all/lib.rs
//! Reliably remove a directory and all of its children.
//!
//! This library provides an alternative implementation of
//! [`std::fs::remove_dir_all`] from the Rust std library. It differs in the
//! following ways:
//! - The `parallel` feature parallelises the deletion. This is useful when
//!   syscall latency is high, such as on Windows (deletion IO accrues to the
//!   process) or on network file systems of any kind. This feature is off by
//!   default. When enabled, it will disable itself on macOS because of the bug
//!   reported in this [blog
//!   post](https://gregoryszorc.com/blog/2018/10/29/global-kernel-locks-in-apfs/).
//!   Use [`RemoverBuilder`] to override this behaviour and force enable/disable
//!   parallelism at runtime.
//! - It tolerates files not being deleted atomically (this is a
//!   Windows-specific behaviour).
//! - It resets the readonly flag on Windows as needed.
//!
//! Like [`std::fs::remove_dir_all`], it assumes both that the caller has
//! permission to delete the files, and that they don't have permission to
//! change permissions to be able to delete the files: no ACL or chmod changes
//! are made during deletion. This is because hardlinks can cause such changes
//! to show up and affect the filesystem outside of the directory tree being
//! deleted.
//!
//! The extension trait [`RemoveDir`] can be used to invoke `remove_dir_all` on
//! an open [`File`](std::fs::File), where it will error if the file is not a
//! directory, and otherwise delete the contents. This allows callers to be more
//! confident that what is deleted is what was requested even in the presence of
//! malicious actors changing the filesystem concurrently.
//!
//! The functions [`remove_dir_all`], [`remove_dir_contents`], and
//! [`ensure_empty_dir`] are intrinsically sensitive to file system races, as
//! the path to the directory to delete can be substituted by an attacker
//! inserting a symlink along that path. Relative paths with one path component
//! are the least fragile, but using [`RemoveDir::remove_dir_contents`] is
//! recommended.
//!
//! ## Features
//!
//! - parallel: When enabled, deletion of directories is parallelised.
//!   [more details](#parallel)
//! - log: Include some log messages about the deletion taking place.
//!
//! About the implementation: it prioritises security, then robustness (e.g.
//! low resource situations), and finally performance.
//!
//! ## Security
//!
//! On all platforms directory-related race conditions are avoided by opening
//! paths and then iterating directory contents and deleting names in the
//! directory with `*at`-style syscalls (such as `openat` and `unlinkat`). This
//! does not entirely address possible races on unix-style operating systems
//! (but see the `funlinkat` call on FreeBSD, which could if more widely
//! adopted). It does prevent attackers from replacing intermediary directories
//! with symlinks in order to fool privileged code into traversing outside the
//! intended directory tree. This is the same as the standard library
//! implementation.
//!
//! This function is not designed to succeed in the presence of concurrent
//! actors in the tree being deleted - for instance, adding files to a directory
//! being deleted can prevent the directory being deleted for an arbitrary
//! period by extending the directory iterator indefinitely.
//!
//! Directory traversal only ever happens downwards. In future, to accommodate
//! very large directory trees (deeper than the file descriptor limit) the same
//! path may be traversed multiple times, and the quadratic nature of that will
//! be mitigated by a cache of open directories. See [Future
//! Plans](#future-plans).
//!
//! ## Robustness
//!
//! Every opened file has its type checked through the file handle, and then
//! unlinked or scanned as appropriate. Syscall overheads are minimised by
//! trust-but-verify of the node type metadata returned from directory scanning:
//! only names that appear to be directories get their contents scanned. The
//! consequence is that if an attacker replaces a non-directory with a
//! directory, or vice versa, an error will occur - but `remove_dir_all` will
//! not escape from the directory tree. On Windows file deletion requires
//! obtaining a handle to the file, but again the kind metadata from the
//! directory scan is used to avoid re-querying the metadata. Symlinks are
//! detected by a failure to open a path with `O_NOFOLLOW`; they are unlinked
//! with no further processing.
//!
//! ## Serial deletion
//!
//! Serial deletion occurs recursively - open, read, delete
//! contents-except-for-directories, repeat.
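/*!
The recursion described above can be sketched with the standard library alone.
This is a simplified, path-based illustration and not this crate's
implementation: it has none of the handle-based race protections described
under Security, it assumes Unix-like semantics for unlinking symlinks, and the
name `remove_tree_serial` is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: serially delete `dir` and everything below it.
fn remove_tree_serial(dir: &Path) -> io::Result<()> {
    // Open and read the directory, deleting everything that is not a
    // directory and remembering the subdirectories for the next step.
    let mut subdirs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // `DirEntry::file_type` does not follow symlinks, so a symlink to a
        // directory is unlinked here rather than traversed.
        if entry.file_type()?.is_dir() {
            subdirs.push(entry.path());
        } else {
            fs::remove_file(entry.path())?;
        }
    }
    // Repeat for each subdirectory, then remove the now-empty directory.
    for subdir in &subdirs {
        remove_tree_serial(subdir)?;
    }
    fs::remove_dir(dir)
}
```

Because every step in this sketch goes through a path, it is exposed to the
file system races discussed earlier; it only illustrates the order of
operations.
*/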
//!
//! Parallel deletion builds on serial deletion by utilising a thread pool for
//! IO which can block:
//! - directory scanning
//! - calls to unlink and fstat
//! - file handle closing (yes, that can block)
//!
//! Parallel is usually a win, but some users may value compile time or size of
//! compiled code more, so the `parallel` feature is opt-in.
//!
//! We suggest permitting the end user to control this choice: when adding
//! `remove_dir_all` as a dependency to a library crate, expose a feature
//! "parallel" that sets `remove_dir_all/parallel`. This will permit the user of
//! your library to control the parallel feature inside `remove_dir_all`.
//!
//! e.g.
//!
//! ```Cargo.toml
//! [features]
//! default = []
//! parallel = ["remove_dir_all/parallel"]
//! ...
//! [dependencies]
//! remove_dir_all = {version = "0.8"}
//! ```
//!
//! ## Future Plans
//!
//! Open directory handles are kept in a lg-spaced cache after the first 10
//! levels: level10/skipped1/level12/skipped2/skipped3/skipped4/level16. If
//! EMFILE is encountered, no more handles are cached, and directories are
//! opened by re-traversing from the closest previously opened handle. Deletion
//! should succeed even if only 4 file descriptors are available: one to hold
//! the root, two to iterate individual directories, and one to open-and-delete
//! individual files, though that will be quadratic in the depth of the tree,
//! successfully deleting leaves only on each iteration.
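/*!
One possible reading of the cache spacing above, as a sketch only. It assumes
that every one of the first 10 levels keeps its handle, and that the gap
between cached levels doubles from there (10, 12, 16, 24, ...), which matches
the `level10/.../level16` example but is not confirmed beyond it. The name
`is_cached_depth` is hypothetical.

```rust
/// Hypothetical helper: would the handle for a directory at `depth`
/// (1-based) be kept in the cache under the assumed spacing?
fn is_cached_depth(depth: u32) -> bool {
    if depth <= 10 {
        // Assumption: every one of the first 10 levels keeps its handle open.
        return true;
    }
    // Beyond level 10 the gaps are assumed to double: 12, 16, 24, 40, ...
    // Those are exactly the depths where `depth - 8` is a power of two.
    (depth - 8).is_power_of_two()
}
```

Under these assumptions, a directory at depth 11 or at depths 13 to 15 would be
reopened by re-traversing from its nearest cached ancestor.
*/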
//!
//! IO Prioritisation:
//! 1) directory scanning when few paths are queued for deletion (to avoid
//!    ending up accidentally serial) - allowing keeping the other queues full.
//! 4) close/CloseHandle (free up file descriptors)
//! 2) rmdir (free up file descriptors)
//! 3) unlink/SetFileInformationByHandle (to free up directories so they can be
//!    rmdir'd)
//!
//! Scanning/unlinking/rmdiring is further biased by depth and lexicographic
//! order: this minimises the number of directories being worked on in parallel,
//! so very branchy trees are less likely to exhaust kernel resources or
//! application memory or thrash the open directory cache.

#![deny(missing_debug_implementations)]
#![deny(missing_docs)]
#![deny(rust_2018_idioms)]
// See under "known problems" https://rust-lang.github.io/rust-clippy/master/index.html#mutex_atomic
#![allow(clippy::mutex_atomic)]

use std::{io::Result, path::Path};

use normpath::PathExt;

#[cfg(doctest)]
#[macro_use]
extern crate doc_comment;

#[cfg(doctest)]
doctest!("../README.md");

mod _impl;

/// Extension trait adding `remove_dir_all` support to [`std::fs::File`].
pub trait RemoveDir {
    /// Remove the contents of the dir.
    ///
    /// `debug_root`: identifies the directory contents being removed
    fn remove_dir_contents(&mut self, debug_root: Option<&Path>) -> Result<()>;
}

/// Makes `path` an empty directory: if it does not exist, it is created as
/// an empty directory (as if with [`std::fs::create_dir`]); if it does exist, its
/// contents are deleted (as if with [`remove_dir_contents`]).
///
/// It is an error if `path` exists but is not a directory, including a symlink
/// to a directory.
///
/// This is subject to file system races: a privileged process could be attacked
/// by replacing parent directories of the supplied path with a link (e.g. to
/// `/etc`). Consider using [`RemoveDir::remove_dir_contents`] instead.
pub fn ensure_empty_dir<P: AsRef<Path>>(path: P) -> Result<()> {
    _impl::_ensure_empty_dir_path::<_impl::OsIo, _>(path)
}

/// Deletes the contents of `path`, but not the directory itself. It is an error
/// if `path` is not a directory.
///
/// This is subject to file system races: a privileged process could be attacked
/// by replacing parent directories of the supplied path with a link (e.g. to
/// `/etc`). Consider using [`RemoveDir::remove_dir_contents`] instead.
pub fn remove_dir_contents<P: AsRef<Path>>(path: P) -> Result<()> {
    _impl::_remove_dir_contents_path::<_impl::OsIo, P>(path)
}

/// Reliably removes a directory and all of its children.
///
/// ```rust
/// use std::fs;
/// use remove_dir_all::*;
///
/// fs::create_dir("./temp/").unwrap();
/// remove_dir_all("./temp/").unwrap();
/// ```
///
/// Note: calling this on a non-directory (e.g. a symlink to a directory) will
/// error.
///
/// This path-based version is subject to file system races when determining
/// what to delete: a privileged process could be attacked by replacing parent
/// directories of the supplied path with a link (e.g. to `/etc`).
/// [`RemoveDir::remove_dir_contents`] is somewhat safer and is recommended
/// instead.
pub fn remove_dir_all<P: AsRef<Path>>(path: P) -> Result<()> {
    let path = path.as_ref().normalize()?;
    _impl::remove_dir_all_path::<_impl::OsIo, _>(path, _impl::default_parallel_mode())
}

/// How to parallelise remove_dir_all().
#[derive(Debug, Clone, Copy)]
enum ParallelMode {
    /// No parallelism.
    Serial,
    /// Parallelise readdir and unlink operations - the default when the parallel feature is enabled.
    #[cfg(feature = "parallel")]
    Parallel,
}

/// Builder for configuring the parallelism of remove_dir_all.
#[derive(Debug, Clone, Copy)]
#[non_exhaustive]
pub struct RemoverBuilder {
    parallel: ParallelMode,
}

impl RemoverBuilder {
    /// Create a new RemoverBuilder.
    pub fn new() -> Self {
        Self {
            parallel: _impl::default_parallel_mode(),
        }
    }

    /// Serialise all IO operations.
    pub fn serial(mut self) -> Self {
        self.parallel = ParallelMode::Serial;
        self
    }

    /// Parallelise the removal of directories.
    #[cfg(feature = "parallel")]
    pub fn parallel(mut self) -> Self {
        self.parallel = ParallelMode::Parallel;
        self
    }

    /// Build the Remover.
    pub fn build(self) -> Remover {
        Remover {
            parallel: self.parallel,
        }
    }
}

impl Default for RemoverBuilder {
    fn default() -> Self {
        Self::new()
    }
}

/// Remover holds configuration for different ways of removing directories.
#[derive(Debug, Clone, Copy)]
#[non_exhaustive]
pub struct Remover {
    parallel: ParallelMode,
}

impl Remover {
    /// Remove the directory and all of its children.
    pub fn remove_dir_all<P: AsRef<Path>>(&self, path: P) -> Result<()> {
        let path = path.as_ref().normalize()?;
        _impl::remove_dir_all_path::<_impl::OsIo, _>(path, self.parallel)
    }
}

#[allow(deprecated)]
#[cfg(test)]
mod tests {
    //! functional tests for all platforms
    //!
    //! A note on safety: races are notoriously hard to secure merely via tests:
    //! these tests use a dedicated trait to allow sequencing attack operations,
    //! much the same as the test clock in Tokio programs. So these 'safe' tests
    //! are not actually attempting scheduling races, rather they are showing
    //! that the known attacks don't work. A fuzz based heuristic functional
    //! test would be a good addition to complement these tests.
    use super::Result;

    use std::fs::{self, File};
    use std::io;
    use std::path::PathBuf;

    use tempfile::TempDir;
    use test_log::test;

    use crate::ensure_empty_dir;
    use crate::remove_dir_all;
    use crate::remove_dir_contents;

    cfg_if::cfg_if! {
        if #[cfg(windows)] {
            const ENOTDIR: i32 = windows_sys::Win32::Foundation::ERROR_DIRECTORY as i32;
            const ENOENT: i32 = windows_sys::Win32::Foundation::ERROR_FILE_NOT_FOUND as i32;
            const INVALID_INPUT: i32 = windows_sys::Win32::Foundation::ERROR_INVALID_PARAMETER as i32;
        } else {
            const ENOTDIR: i32 = libc::ENOTDIR;
            const ENOENT: i32 = libc::ENOENT;
            const INVALID_INPUT: i32 = libc::EINVAL;
        }
    }

    /// Expect a particular sort of failure
    fn expect_failure<T>(n: &[i32], r: io::Result<T>) -> io::Result<()> {
        match r {
            // The raw OS error code is one of the expected codes.
            Err(e)
                if n.iter()
                    .map(|n| Option::Some(*n))
                    .any(|n| n == e.raw_os_error()) =>
            {
                Ok(())
            }
            // Some other error: report it and pass it on.
            Err(e) => {
                println!("{e} {:?}, {:?}, {:?}", e.raw_os_error(), e.kind(), n);
                Err(e)
            }
            Ok(_) => Err(io::Error::new(
                io::ErrorKind::Other,
                "unexpected success".to_string(),
            )),
        }
    }

    struct Prep {
        _tmp: TempDir,
        ours: PathBuf,
        file: PathBuf,
    }

    /// Create test setup: t.mkdir/file all in a tempdir.
    fn prep() -> Result<Prep> {
        let tmp = TempDir::new()?;
        let ours = tmp.path().join("t.mkdir");
        let file = ours.join("file");
        let nested = ours.join("another_dir");
        fs::create_dir(&ours)?;
        fs::create_dir(nested)?;
        File::create(&file)?;
        File::open(&file)?;
        Ok(Prep {
            _tmp: tmp,
            ours,
            file,
        })
    }

    #[test]
    fn mkdir_rm() -> Result<()> {
        let p = prep()?;

        expect_failure(&[ENOTDIR, INVALID_INPUT], remove_dir_contents(&p.file))?;

        remove_dir_contents(&p.ours)?;
        expect_failure(&[ENOENT], File::open(&p.file))?;

        remove_dir_contents(&p.ours)?;
        remove_dir_all(&p.ours)?;
        expect_failure(&[ENOENT], remove_dir_contents(&p.ours))?;
        Ok(())
    }

    #[test]
    fn ensure_rm() -> Result<()> {
        let p = prep()?;

        expect_failure(&[ENOTDIR, INVALID_INPUT], ensure_empty_dir(&p.file))?;

        ensure_empty_dir(&p.ours)?;
        expect_failure(&[ENOENT], File::open(&p.file))?;
        ensure_empty_dir(&p.ours)?;

        remove_dir_all(&p.ours)?;
        ensure_empty_dir(&p.ours)?;
        File::create(&p.file)?;

        Ok(())
    }
}