Struct Builder 

Source
pub struct Builder<W: Write> { /* private fields */ }

A structure for building archives

This structure has methods for building up an archive from scratch into any arbitrary writer.

Implementations§

Source§

impl<W: Write> Builder<W>

Source

pub fn new(obj: W) -> Builder<W>

Create a new archive builder with the underlying object as the destination of all data written. The builder will use HeaderMode::Complete by default.

Source

pub fn mode(&mut self, mode: HeaderMode)

Changes the HeaderMode that will be used when reading fs Metadata for methods that implicitly read metadata for an input Path. Notably, this does not apply to append(Header).

Source

pub fn preserve_absolute(&mut self, preserve: bool)

Preserve absolute paths while creating an archive.

Source

pub fn follow_symlinks(&mut self, follow: bool)

Control whether symlinks are followed when reading from the filesystem. Defaults to true (but see the note below — you almost certainly want to call follow_symlinks(false)).

When true, symlinks are dereferenced: the archive entry contains the contents of the symlink target rather than the symlink itself, equivalent to GNU tar --dereference (-h). When false (the default for all mainstream tar implementations), symlinks are stored as symlink entries in the archive.

§Why you should almost always use follow_symlinks(false)

Every mainstream tar implementation preserves symlinks by default. GNU tar requires the explicit --dereference (-h) flag to follow them. Go’s archive/tar stores whatever the underlying fs.FS reports and never dereferences on its own. BSD tar behaves the same way. This crate’s default of true is a historical quirk kept for compatibility but is wrong for most use-cases:

  • Symlinks in the source tree are part of its structure and should normally be preserved, not silently replaced by their targets.
  • When true, append_dir_all follows symlinks that point outside src_path just as readily as those inside it. If the archiving process has broader filesystem read access than whoever controls the source tree (e.g. a privileged backup service, a CI runner archiving user-submitted workspaces), an attacker can plant a symlink inside src_path to silently include arbitrary files from the host.

Call follow_symlinks(false) unless you have a specific reason to flatten symlinks into their targets. For the strongest guarantee, open src_path with cap-std and walk the tree with capability-safe I/O, which blocks symlink escapes at the OS level regardless of this setting.

Source

pub fn sparse(&mut self, sparse: bool)

Handle sparse files efficiently, if supported by the underlying filesystem. When true, sparse file information is read from disk and empty segments are omitted from the archive. Defaults to true.

Source

pub fn get_ref(&self) -> &W

Gets a shared reference to the underlying object.

Source

pub fn get_mut(&mut self) -> &mut W

Gets a mutable reference to the underlying object.

Take care when writing to the underlying object directly, since bytes written outside the builder can corrupt the archive. Calling get_mut().flush(), however, is considered safe, and is useful when you need to ensure that a tar entry has been flushed to disk.

Source

pub fn into_inner(self) -> Result<W>

Unwrap this archive, returning the underlying object.

This function will finish writing the archive if the finish function hasn’t yet been called, returning any I/O error which happens during that operation.

Source

pub fn append<R: Read>(&mut self, header: &Header, data: R) -> Result<()>

Adds a new entry to this archive.

This function will append the header specified, followed by contents of the stream specified by data. To produce a valid archive the size field of header must be the same as the length of the stream that’s being written. Additionally the checksum for the header should have been set via the set_cksum method.

Note that this will not attempt to seek the archive to a valid position, so if the archive is in the middle of a read or some other similar operation then this may corrupt the archive.

Also note that after all entries have been written to an archive the finish function needs to be called to finish writing the archive.

§Errors

This function will return an error for any I/O error that occurs while reading or writing.

§Examples
use tar::{Builder, Header};

let mut header = Header::new_gnu();
header.set_path("foo").unwrap();
header.set_size(4);
header.set_cksum();

let mut data: &[u8] = &[1, 2, 3, 4];

let mut ar = Builder::new(Vec::new());
ar.append(&header, data).unwrap();
let data = ar.into_inner().unwrap();
Source

pub fn append_data<P: AsRef<Path>, R: Read>( &mut self, header: &mut Header, path: P, data: R, ) -> Result<()>

Adds a new entry to this archive with the specified path.

This function will set the specified path in the given header, which may require appending a GNU long-name extension entry to the archive first. The checksum for the header will be automatically updated via the set_cksum method after setting the path. No other metadata in the header will be modified.

Then it will append the header, followed by contents of the stream specified by data. To produce a valid archive the size field of header must be the same as the length of the stream that’s being written.

Note that this will not attempt to seek the archive to a valid position, so if the archive is in the middle of a read or some other similar operation then this may corrupt the archive.

Also note that after all entries have been written to an archive the finish function needs to be called to finish writing the archive.

§Errors

This function will return an error for any I/O error that occurs while reading or writing.

§Examples
use tar::{Builder, Header};

let mut header = Header::new_gnu();
header.set_size(4);
header.set_cksum();

let mut data: &[u8] = &[1, 2, 3, 4];

let mut ar = Builder::new(Vec::new());
ar.append_data(&mut header, "really/long/path/to/foo", data).unwrap();
let data = ar.into_inner().unwrap();
Source

pub fn append_writer<'a, P: AsRef<Path>>( &'a mut self, header: &'a mut Header, path: P, ) -> Result<EntryWriter<'a>>
where W: Seek,

Adds a new entry to this archive and returns an EntryWriter for adding its contents.

This function is similar to Self::append_data but returns an io::Write implementation instead of taking data as a parameter.

Similar constraints around the position of the archive and completion apply as with Self::append_data. It requires the underlying writer to implement Seek to update the header after writing the data.

§Errors

This function will return an error for any I/O error that occurs while reading or writing.

§Examples
use std::io::Cursor;
use std::io::Write as _;
use tar::{Builder, Header};

let mut header = Header::new_gnu();

let mut ar = Builder::new(Cursor::new(Vec::new()));
let mut entry = ar.append_writer(&mut header, "hi.txt").unwrap();
entry.write_all(b"Hello, ").unwrap();
entry.write_all(b"world!\n").unwrap();
entry.finish().unwrap();
Source

pub fn append_link<P: AsRef<Path>, T: AsRef<Path>>( &mut self, header: &mut Header, path: P, target: T, ) -> Result<()>

Adds a new link (symbolic or hard) entry to this archive with the specified path and target.

This function is similar to Self::append_data which supports long filenames, but also supports long link targets using GNU extensions if necessary. You must set the entry type to either EntryType::Link or EntryType::Symlink. The set_cksum method will be invoked after setting the path. No other metadata in the header will be modified.

If you are intending to use GNU extensions, you must use this method over calling Header::set_link_name because that function will fail on long links.

Similar constraints around the position of the archive and completion apply as with Self::append_data.

§Errors

This function will return an error for any I/O error that occurs while reading or writing.

§Examples
use tar::{Builder, Header, EntryType};

let mut ar = Builder::new(Vec::new());
let mut header = Header::new_gnu();
header.set_username("foo").unwrap();
header.set_entry_type(EntryType::Symlink);
header.set_size(0);
ar.append_link(&mut header, "really/long/path/to/foo", "other/really/long/target").unwrap();
let data = ar.into_inner().unwrap();
Source

pub fn append_path<P: AsRef<Path>>(&mut self, path: P) -> Result<()>

Adds a file on the local filesystem to this archive.

This function will open the file specified by path and insert the file into the archive with the appropriate metadata set, returning any I/O error which occurs while writing. The path name for the file inside of this archive will be the same as path, and it is required that the path is a relative path.

Note that this will not attempt to seek the archive to a valid position, so if the archive is in the middle of a read or some other similar operation then this may corrupt the archive.

Also note that after all files have been written to an archive the finish function needs to be called to finish writing the archive.

§Examples
use tar::Builder;

let mut ar = Builder::new(Vec::new());

ar.append_path("foo/bar.txt").unwrap();
Source

pub fn append_path_with_name<P: AsRef<Path>, N: AsRef<Path>>( &mut self, path: P, name: N, ) -> Result<()>

Adds a file on the local filesystem to this archive under another name.

This function will open the file specified by path and insert the file into the archive as name with appropriate metadata set, returning any I/O error which occurs while writing. The path name for the file inside of this archive will be name, which is required to be a relative path.

Note that this will not attempt to seek the archive to a valid position, so if the archive is in the middle of a read or some other similar operation then this may corrupt the archive.

Note that if path is a directory, this will only add an entry for the directory itself to the archive, not the contents of the directory.

Also note that after all files have been written to an archive the finish function needs to be called to finish writing the archive.

§Examples
use tar::Builder;

let mut ar = Builder::new(Vec::new());

// Insert the local file "foo/bar.txt" in the archive but with the name
// "bar/foo.txt".
ar.append_path_with_name("foo/bar.txt", "bar/foo.txt").unwrap();
Source

pub fn append_file<P: AsRef<Path>>( &mut self, path: P, file: &mut File, ) -> Result<()>

Adds a file to this archive with the given path as the name of the file in the archive.

This will use the metadata of file to populate a Header, and it will then append the file to the archive with the name path.

Note that this will not attempt to seek the archive to a valid position, so if the archive is in the middle of a read or some other similar operation then this may corrupt the archive.

Also note that after all files have been written to an archive the finish function needs to be called to finish writing the archive.

§Examples
use std::fs::File;
use tar::Builder;

let mut ar = Builder::new(Vec::new());

// Open the file at one location, but insert it into the archive with a
// different name.
let mut f = File::open("foo/bar/baz.txt").unwrap();
ar.append_file("bar/baz.txt", &mut f).unwrap();
Source

pub fn append_dir<P, Q>(&mut self, path: P, src_path: Q) -> Result<()>
where P: AsRef<Path>, Q: AsRef<Path>,

Adds a directory to this archive with the given path as the name of the directory in the archive.

This will use stat to populate a Header, and it will then append the directory to the archive with the name path.

Note that this will not attempt to seek the archive to a valid position, so if the archive is in the middle of a read or some other similar operation then this may corrupt the archive.

Note this will not add the contents of the directory to the archive. See append_dir_all for recursively adding the contents of the directory.

Also note that after all files have been written to an archive the finish function needs to be called to finish writing the archive.

§Examples
use std::fs;
use tar::Builder;

let mut ar = Builder::new(Vec::new());

// Use the directory at one location, but insert it into the archive
// with a different name.
ar.append_dir("bardir", ".").unwrap();
Source

pub fn append_dir_all<P, Q>(&mut self, path: P, src_path: Q) -> Result<()>
where P: AsRef<Path>, Q: AsRef<Path>,

Adds a directory and all of its contents (recursively) to this archive with the given path as the name of the directory in the archive.

Note that this will not attempt to seek the archive to a valid position, so if the archive is in the middle of a read or some other similar operation then this may corrupt the archive.

Also note that after all files have been written to an archive the finish or into_inner function needs to be called to finish writing the archive.

§Security

Call follow_symlinks(false) before this method unless you have an explicit reason to dereference symlinks. All mainstream tar implementations (GNU tar, BSD tar, Go’s archive/tar) preserve symlinks by default; this crate’s default of true is a historical quirk.

When follow_symlinks is true (the current default), this method dereferences every symlink it encounters, including ones whose targets lie outside src_path. When the archiver runs with broader filesystem access than whoever controls the source tree (e.g. a privileged backup or export service), an attacker can plant a symlink inside src_path to silently include arbitrary files the archiver can read, with no indication in the archive that they came from outside the source root.

use tar::Builder;

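// Recommended: preserve symlinks as-is, matching GNU tar's default.
// `writer` (any `Write` destination) and `src_path` (the directory to
// archive) are placeholders supplied by the caller.
let mut ar = Builder::new(writer);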
ar.follow_symlinks(false);
ar.append_dir_all("", src_path).unwrap();
ar.finish().unwrap();

With follow_symlinks(false), symlinks inside the source tree are stored as symlink entries in the archive rather than being read through. Note that the resulting archive may then contain symlinks with absolute or ..-relative targets; validate or strip those on extraction if the archive consumer is also untrusted.
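
One way a consumer might validate link targets is a purely lexical check, sketched below using only the standard library. The function name link_stays_inside_root is hypothetical, and the check is an assumption about what "validate" could mean here: it rejects absolute targets and targets that climb above the archive root, but it does not resolve chains of symlinks or guard against races on disk.

```rust
use std::path::{Component, Path};

/// Hypothetical helper: returns true if a symlink stored at `entry_path`
/// (relative to the archive root) with target `target` stays inside the
/// root when resolved lexically.
fn link_stays_inside_root(entry_path: &Path, target: &Path) -> bool {
    // Relative targets are resolved from the directory containing the link.
    let parent = entry_path.parent().unwrap_or(Path::new(""));
    let mut depth: usize = 0;
    for component in parent.components().chain(target.components()) {
        match component {
            Component::Normal(_) => depth += 1,
            Component::CurDir => {}
            // Each ".." must have a directory to climb out of.
            Component::ParentDir => match depth.checked_sub(1) {
                Some(d) => depth = d,
                None => return false,
            },
            // Absolute paths (and Windows prefixes) always escape the root.
            Component::RootDir | Component::Prefix(_) => return false,
        }
    }
    true
}
```

For example, a link at a/b/link pointing to ../c passes, while a top-level link pointing to ../etc/passwd or to /etc/passwd is rejected.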

For the strongest available guarantee, open src_path using cap-std and walk the directory tree with capability-safe I/O. This prevents symlink escapes at the OS level and protects against TOCTOU races that a purely path-based check cannot close.

§Examples
use std::fs;
use tar::Builder;

let mut ar = Builder::new(Vec::new());

// Use the directory at one location ("."), but insert it into the archive
// with a different name ("bardir").
ar.append_dir_all("bardir", ".").unwrap();
ar.finish().unwrap();

Use append_dir_all with an empty string as the first path argument to create an archive from all files in a directory without renaming.

use std::fs;
use std::path::PathBuf;
use tar::{Archive, Builder};

let tmpdir = tempfile::tempdir().unwrap();
let path = tmpdir.path();
fs::write(path.join("a.txt"), b"hello").unwrap();
fs::write(path.join("b.txt"), b"world").unwrap();

// Create a tarball from the files in the directory
let mut ar = Builder::new(Vec::new());
ar.append_dir_all("", path).unwrap();

// List files in the archive
let archive = ar.into_inner().unwrap();
let archived_files = Archive::new(archive.as_slice())
    .entries()
    .unwrap()
    .map(|entry| entry.unwrap().path().unwrap().into_owned())
    .collect::<Vec<_>>();

assert!(archived_files.contains(&PathBuf::from("a.txt")));
assert!(archived_files.contains(&PathBuf::from("b.txt")));
Source

pub fn finish(&mut self) -> Result<()>

Finish writing this archive, emitting the termination sections.

This function should only be called once the archive has been written in its entirety. Use it instead of into_inner when the underlying object must remain accessible even if an I/O error occurs while finishing.

In most situations the into_inner method should be preferred.

Source§

impl<T: Write> Builder<T>

Extension trait for Builder to append PAX extended headers.

Source

pub fn append_pax_extensions<'key, 'value>( &mut self, headers: impl IntoIterator<Item = (&'key str, &'value [u8])>, ) -> Result<(), Error>

Append PAX extended headers to the archive.

Takes an iterator over the headers to add, as key/value pairs, and formats them as a PAX extended header set.

Returns an io::Error if an error occurs; otherwise returns ().

Trait Implementations§

Source§

impl<W: Write> Drop for Builder<W>

Source§

fn drop(&mut self)

Executes the destructor for this type.
Source§

fn pin_drop(self: Pin<&mut Self>)

🔬This is a nightly-only experimental API. (pin_ergonomics)
Execute the destructor for this type, but different to Drop::drop, it requires self to be pinned.

Auto Trait Implementations§

§

impl<W> Freeze for Builder<W>
where W: Freeze,

§

impl<W> RefUnwindSafe for Builder<W>
where W: RefUnwindSafe,

§

impl<W> Send for Builder<W>
where W: Send,

§

impl<W> Sync for Builder<W>
where W: Sync,

§

impl<W> Unpin for Builder<W>
where W: Unpin,

§

impl<W> UnsafeUnpin for Builder<W>
where W: UnsafeUnpin,

§

impl<W> UnwindSafe for Builder<W>
where W: UnwindSafe,
