Multi-Tiered Linux Backup System – Part II

Overview of Backup System

Being able to verify critical backups is just as important as creating them. Many computer users, novice and experienced alike, discover that their backups are incomplete only after disaster strikes, because they never regularly verified them.

In this backup system, the existing backups on BK2,N are verified before a new backup is placed onto the drive. The reasoning behind this decision is to make corruption detection quick and an integral part of the backup process. By detecting changes on the backup drive, we can detect corruption not only in BK2,N but also in BK1, by comparing the archives to the snapshots they were created from. With a verified archive, one can detect corruption in the snapshot and correct it.

Note: This post is a continuation from Multi-Tiered Linux Backup System Part I

Verification of Disk Archives for corruption

Disk Archive (DAR) files can be checked for errors using “dar -t basename”, where the basename is the archive filename without the slice number and extension. For example, to verify the dar archive filename.20150316.N.dar, one would run “dar -t filename.20150316”. If corruption is detected in the archive, it can be repaired from the associated parity file using the par2 command “par2 r filename.20150316.1.par”.
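
The naming rule above is easy to get wrong in a script, so here is a minimal Python sketch of it. The helper names (dar_basename, dar_test_command, par2_repair_command) are hypothetical and not part of the original scripts; the only commands assumed are the “dar -t” and “par2 r” invocations quoted above.

```python
import re
from typing import List

# Matches slice names such as "filename.20150316.1.dar" (basename.slice.dar).
_SLICE_RE = re.compile(r"^(?P<base>.+)\.(?P<slice>\d+)\.dar$")


def dar_basename(slice_name: str) -> str:
    """Strip the slice number and .dar extension from a slice filename."""
    match = _SLICE_RE.match(slice_name)
    if match is None:
        raise ValueError(f"not a dar slice name: {slice_name!r}")
    return match.group("base")


def dar_test_command(slice_name: str) -> List[str]:
    """Build the 'dar -t basename' command that tests an archive."""
    return ["dar", "-t", dar_basename(slice_name)]


def par2_repair_command(parity_file: str) -> List[str]:
    """Build the 'par2 r parityfile' command that repairs from parity data."""
    return ["par2", "r", parity_file]
```

Each returned list can be handed to subprocess.run(); this sketch assumes a non-zero exit status is treated as "verification failed" and triggers the repair step.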

Verification of raw files for corruption

Data that is less important (such as multimedia) is stored in its native format (*.txt, *.doc, *.mp3, etc.). A basic file comparison can be performed using rsync’s checksum feature, so that if a file changes, the change is detected and brought to the user’s attention. The raw file format may not provide a method for recovering from corruption, but multiple drives or snapshots can be compared to determine which copy is the corrupted one and to correct it.
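
To illustrate the idea of comparing two copies by checksum, here is a small Python sketch. It is not the rsync-based check described above: it substitutes SHA-256 hashes from Python's hashlib, and the function names are hypothetical. It only reports files that exist in both copies with different contents; deciding which copy is the good one still needs a third copy or a human.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> Dict[str, str]:
    """Map each regular file's path (relative to root) to its digest."""
    return {
        str(path.relative_to(root)): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file() and not path.is_symlink()
    }


def changed_files(copy_a: Path, copy_b: Path) -> List[str]:
    """Return relative paths present in both copies whose contents differ."""
    a, b = tree_digests(copy_a), tree_digests(copy_b)
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])
```

For example, changed_files(Path("/mnt/snapshot"), Path("/mnt/backup")) would list the files to bring to the user's attention; both paths here are placeholders.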

Generation of new DAR and parity files

It is important to note that the DarBackup step contains multiple sets of dar images. Once the existing backup has been verified to be good, a new backup can be created and written to the BK2,N hard drive. First, the existing dar archives are compared against the newest snapshot to see whether a new dar needs to be created. If the files haven’t changed, the existing dar and par2 files are hard linked into the latest dar folder. If new archives are needed, they are generated from the latest stored snapshot.
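
The hard-link step for unchanged archives could look roughly like the Python sketch below. It assumes the comparison against the snapshot has already decided that nothing changed, and that the previous and latest dar folders live on the same filesystem (a requirement for hard links). The function name and folder layout are hypothetical.

```python
import os
from pathlib import Path
from typing import List


def link_unchanged_archives(previous_dir: Path, latest_dir: Path) -> List[str]:
    """Hard link every regular file in previous_dir into latest_dir."""
    latest_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for source in sorted(previous_dir.iterdir()):
        if not source.is_file():
            continue  # skip sub-folders and anything that is not a file
        target = latest_dir / source.name
        if not target.exists():
            os.link(source, target)  # same inode, so no extra data is stored
        linked.append(source.name)
    return linked
```

Because both names point at the same inode, the latest dar folder looks complete while the unchanged archives occupy disk space only once.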

Symlink remaining RAW and Dar folders

Since folders are manually specified to be backed up either using dar or in their raw format, a special folder called SymlinkedBackup is created. It contains symlinks, each pointing to either a folder in a specific snapshot or a folder containing a particular set of dar and par2 files. The aim of this folder is to provide a central place that can be rsync’ed to a BK2,N drive.
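
As a rough illustration of how such a folder could be assembled, here is a Python sketch. The function name, the mapping of names to targets, and any folder names passed in are hypothetical; only the idea of one symlink per backed-up folder comes from the description above.

```python
import os
from pathlib import Path
from typing import Dict


def build_symlinked_backup(link_root: Path, targets: Dict[str, Path]) -> None:
    """Create or refresh one symlink per entry under link_root."""
    link_root.mkdir(parents=True, exist_ok=True)
    for name, target in targets.items():
        link = link_root / name
        if link.is_symlink():
            link.unlink()  # replace a stale link left by an earlier run
        # Raises FileExistsError if a real file or folder already has this name.
        os.symlink(target, link, target_is_directory=True)
```

A call might map "Music" to a folder inside the newest snapshot and "Documents" to the latest dar folder, so that SymlinkedBackup always describes the current state of both kinds of backup.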

Copy SymlinkedBackup to BK2,N

The final phase is to rsync SymlinkedBackup to the new BK2,N drive. Since the entries in SymlinkedBackup are all symlinks to either raw files or dar archives, it is very tempting to use rsync’s --copy-links option and copy the entire SymlinkedBackup. The problem with that idea is that Wine creates symlinks to the root filesystem and to other places as well, and following them would copy far more than intended. Excluding .wine from the copy isn’t enough, since certificates in /etc are also symlinks, and following those would create unnecessary storage requirements.
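
One possible way around this, offered here as an assumption rather than as the method the scripts actually use, is to resolve only the top-level symlinks and leave every nested symlink untouched. Giving rsync a source path with a trailing slash makes it copy the contents of the linked folder, while the -a option keeps symlinks found inside that folder as symlinks. The Python sketch below only builds the command lines; the function name is hypothetical.

```python
from pathlib import Path
from typing import List


def rsync_commands(link_root: Path, destination: Path) -> List[List[str]]:
    """Build one rsync command per top-level symlink in link_root."""
    commands = []
    for link in sorted(link_root.iterdir()):
        if not link.is_symlink():
            continue  # only the top-level links are meant to be followed
        commands.append([
            "rsync", "-a",
            f"{link}/",                     # trailing slash: copy the linked folder's contents
            f"{destination / link.name}/",  # nested symlinks stay symlinks under -a
        ])
    return commands
```

Each command can then be run with subprocess.run(); the destination folder on the BK2,N drive is assumed to exist already.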

Eventually the individual scripts will be combined into a single backup management system, but for now they work well enough to perform the assigned backup and verification tasks.
