Update tar-compress docs (#698)
- Call `finish` to finalize the archive, as [recommended by tar](https://docs.rs/tar/0.4.41/tar/struct.Builder.html#method.append_dir_all).
- Show an example of adding a directory's contents **without** renaming them.
- Call out differences between `tar::Builder` defaults and `tar(1)` defaults.

## Context

On seeing these docs, I was unsure whether there was another, better way to add all the contents of one directory to an archive. After researching, I see that people use this function by calling it with an empty string. I feel this is a common operation and would like to see it spelled out in the documentation.

In addition, I hit an edge case where `tar(1)` produced significantly smaller files than the `tar` crate (reproduction: https://github.com/schneems/tar_comparison/blob/bfd420a012b46e80435cf4e7c67ca1661357fde3/README.md). The cause turned out to be that the directory contained symlinks to other files in the same directory. The `follow_symlinks(true)` behavior is on by default in the crate and is the opposite of the `tar(1)` default. As a result, the program stored the same file contents multiple times, which significantly increased the archive size. I believe someone looking for "How to compress a directory into a tarball" is likely trying to replicate `tar czf` behavior and would want to know the main differences.
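To illustrate why following symlinks inflates the archive, here is a minimal, standard-library-only sketch. It does not use the `tar` crate; the helper `payload_bytes` and the file names are hypothetical, and it only approximates an archiver by summing file sizes in a single, non-recursive directory. It assumes a Unix platform, since it creates symlinks with `std::os::unix::fs::symlink`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: sum the file bytes an archiver would store for the
/// direct children of `dir`. With `follow` set, a symlink to a file counts
/// at the size of its target (as if the content were copied). Otherwise the
/// symlink itself contributes no file data.
fn payload_bytes(dir: &Path, follow: bool) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let meta = if follow {
            fs::metadata(&path)? // resolves symlinks
        } else {
            fs::symlink_metadata(&path)? // describes the link itself
        };
        if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Scratch directory with one 1 KiB file and two symlinks pointing at it.
    let dir = std::env::temp_dir().join(format!("symlink_demo_{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("data.bin"), vec![0u8; 1024])?;
    std::os::unix::fs::symlink("data.bin", dir.join("alias-1"))?;
    std::os::unix::fs::symlink("data.bin", dir.join("alias-2"))?;

    // prints "following symlinks: 3072 bytes"
    println!("following symlinks: {} bytes", payload_bytes(&dir, true)?);
    // prints "keeping symlinks as links: 1024 bytes"
    println!("keeping symlinks as links: {} bytes", payload_bytes(&dir, false)?);

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

With the `tar` crate itself, calling `follow_symlinks(false)` on the builder before appending the directory is the setting that corresponds to the `tar(1)` default.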
schneems authored Nov 28, 2024
1 parent a269da9 commit c3776f5
Showing 1 changed file with 25 additions and 1 deletion.
26 changes: 25 additions & 1 deletion src/compression/tar/tar-compress.md
@@ -6,7 +6,7 @@ Compress `/var/log` directory into `archive.tar.gz`.

Creates a [`File`] wrapped in [`GzEncoder`]
and [`tar::Builder`]. </br>Adds contents of `/var/log` directory recursively into the archive
under `backup/logs`path with [`Builder::append_dir_all`].
under `backup/logs` path with [`Builder::append_dir_all`].
[`GzEncoder`] is responsible for transparently compressing the
data prior to writing it into `archive.tar.gz`.

@@ -21,11 +21,35 @@ fn main() -> Result<(), std::io::Error> {
let enc = GzEncoder::new(tar_gz, Compression::default());
let mut tar = tar::Builder::new(enc);
tar.append_dir_all("backup/logs", "/var/log")?;
tar.finish()?;
Ok(())
}
```

To add the contents without renaming them, an empty string can be used as the first argument of [`Builder::append_dir_all`]:

```rust,edition2018,no_run
use std::fs::File;
use flate2::Compression;
use flate2::write::GzEncoder;
fn main() -> Result<(), std::io::Error> {
let tar_gz = File::create("archive.tar.gz")?;
let enc = GzEncoder::new(tar_gz, Compression::default());
let mut tar = tar::Builder::new(enc);
tar.append_dir_all("", "/var/log")?;
tar.finish()?;
Ok(())
}
```

The default behavior of [`tar::Builder`] differs from the GNU `tar` utility's defaults [tar(1)],
notably [`tar::Builder::follow_symlinks(true)`] is the equivalent of `tar --dereference`.

[tar(1)]: https://man7.org/linux/man-pages/man1/tar.1.html
[`Builder::append_dir_all`]: https://docs.rs/tar/*/tar/struct.Builder.html#method.append_dir_all
[`File`]: https://doc.rust-lang.org/std/fs/struct.File.html
[`GzEncoder`]: https://docs.rs/flate2/*/flate2/write/struct.GzEncoder.html
[`tar::Builder`]: https://docs.rs/tar/*/tar/struct.Builder.html
[`tar::Builder::follow_symlinks(true)`]: https://docs.rs/tar/latest/tar/struct.Builder.html#method.follow_symlinks
