Compressed backups

Backups take a lot of space, which can cost quite a lot of money. To reduce the space needed, the backups can be compressed. There are several ways of doing this. Some programs have support for compression built in; for example, the --gzip (-z) option for GNU tar pipes the whole backup through the gzip compression program before writing it to the backup medium.
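As a rough illustration of whole-archive compression, the sketch below uses Python's standard tarfile module rather than GNU tar itself. The mode "w:gz" runs the entire tar stream through gzip as one unit, which is comparable in effect to tar's -z option. The function names and the in-memory buffer are assumptions made for the example; a real backup would write to a file or device.

```python
import io
import tarfile

def make_compressed_archive(files):
    """Return the bytes of a gzip-compressed tar archive.

    `files` maps member names to their contents (bytes).
    """
    buffer = io.BytesIO()
    # "w:gz" compresses the whole tar stream as a single gzip stream.
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()

def list_members(archive_bytes):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        return archive.getnames()
```

Calling make_compressed_archive({"a.txt": b"..."}) yields one compressed blob; none of the members can be read without decompressing the stream that precedes them.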

Unfortunately, compressed backups can cause trouble. Because of the way compression works, if a single bit is wrong, all the rest of the compressed data will be unusable. Some backup programs have some built-in error correction, but no method can handle a large number of errors. This means that if the backup is compressed the way GNU tar does it, with the whole output compressed as a unit, a single error causes all the rest of the backup to be lost. Backups must be reliable, so this method of compression is not a good idea.
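The effect can be reproduced on a small scale. The sketch below is only an illustration using Python's zlib module, not a model of any particular backup program: it compresses three made-up "files" as one stream, damages a single byte in the middle, and tries to decompress the result. Note that zlib.decompress rejects the whole stream on any error; a tool that decompresses incrementally might still salvage the data that precedes the damage, but not what follows it.

```python
import zlib

def damage_byte(data, position):
    """Return a copy of `data` with the byte at `position` inverted."""
    damaged = bytearray(data)
    damaged[position] ^= 0xFF
    return bytes(damaged)

def try_decompress(stream):
    """Return the decompressed bytes, or None if the stream is unusable."""
    try:
        return zlib.decompress(stream)
    except zlib.error:
        return None

# Three hypothetical files, concatenated and compressed as a single unit.
files = [
    b"".join(b"file %d, line %d\n" % (n, i) for i in range(300))
    for n in range(3)
]
original = b"".join(files)
stream = zlib.compress(original)

# Simulate a media error in the middle of the compressed stream.
damaged = damage_byte(stream, len(stream) // 2)

print("intact stream readable: ", try_decompress(stream) is not None)
print("damaged stream readable:", try_decompress(damaged) is not None)
```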

An alternative is to compress each file separately. A single error still means that one file is lost, but all other files are unharmed. The lost file would have been corrupted anyway, so this situation is not much worse than not using compression at all. The afio program (a variant of cpio) can do this.
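A minimal sketch of the per-file idea follows, again using Python's zlib module. It does not reproduce afio's archive format; the dictionary standing in for the archive and the file names are assumptions made for the example. Each file is compressed independently, so damage to one compressed file leaves the others restorable.

```python
import zlib

def compress_each(files):
    """Compress every file on its own; returns name -> compressed bytes."""
    return {name: zlib.compress(data) for name, data in files.items()}

def restore(archive):
    """Decompress every member; return (restored files, names of lost files)."""
    restored, lost = {}, []
    for name, blob in archive.items():
        try:
            restored[name] = zlib.decompress(blob)
        except zlib.error:
            lost.append(name)
    return restored, lost

# Hypothetical file contents.
files = {
    name: b"".join(b"%s line %d\n" % (name.encode(), i) for i in range(300))
    for name in ("a.txt", "b.txt", "c.txt")
}
archive = compress_each(files)

# Simulate a media error: invert one byte inside b.txt's compressed data.
blob = bytearray(archive["b.txt"])
blob[len(blob) // 2] ^= 0xFF
archive["b.txt"] = bytes(blob)

restored, lost = restore(archive)
print("lost:", lost)
print("restored:", sorted(restored))
```

The cost of this design is a somewhat worse compression ratio, since redundancy shared between files can no longer be exploited.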

Compression takes some time, which may make the backup program unable to write data fast enough for a tape drive. [1] This can be avoided by buffering the output (either internally, if the backup program is smart enough, or by using another program), but even that might not work well enough. This should only be a problem on slow computers.
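The buffering idea can be sketched as a bounded queue between a compressing thread and a writing thread, so that the writer has data on hand whenever compression briefly falls behind. This is only an illustration under stated assumptions: the function name, the queue size, and the use of an in-memory sink are all invented for the example, and a real tape writer would also need to care about block sizes and about pre-filling the buffer before it starts writing.

```python
import io
import queue
import threading
import zlib

def buffered_compress(chunks, sink, max_buffered=64):
    """Compress `chunks` here while a second thread writes the output to `sink`."""
    pending = queue.Queue(maxsize=max_buffered)  # the buffer between the two
    errors = []

    def writer():
        while True:
            block = pending.get()
            if block is None:          # sentinel: no more data
                return
            if not errors:             # after a failure, just drain the queue
                try:
                    sink.write(block)
                except OSError as exc:
                    errors.append(exc)

    thread = threading.Thread(target=writer)
    thread.start()
    compressor = zlib.compressobj()
    try:
        for chunk in chunks:
            block = compressor.compress(chunk)
            if block:
                pending.put(block)     # blocks only when the buffer is full
        tail = compressor.flush()
        if tail:
            pending.put(tail)
    finally:
        pending.put(None)
        thread.join()
    if errors:
        raise errors[0]

# Usage with an in-memory sink standing in for the backup device.
sink = io.BytesIO()
buffered_compress([b"a" * 1000, b"b" * 1000], sink)
```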

Notes

[1]

If a tape drive doesn't get data fast enough, it has to stop; this makes backups even slower, and can be bad for the tape and the drive.