rsnapshot Backups
This guide will assist you in setting up an rsnapshot backup server on your network. It also briefly explains how to set up passwordless SSH logins for rsync using SSH keys. For quite some time, I used the utility Synbak to automate backing up my laptop at home to my desktop. Before that, a simple rsync script did the job. Although Synbak is a wonderful utility for simplifying backups at home, it's not quite as flexible as rsnapshot. (Synbak's power lies in its ease of installation and use.) This guide assumes RPM packages and extends the rsnapshot author's documentation with installation specifics for CentOS/Fedora.
1. System:
CentOS 4.x/5.x & Fedora 7 (Should work for any RHEL/Fedora OS)
2. References:
rsnapshot website
rsnapshot documentation
RPMForge on CentOS
3. System Setup
Installing rsnapshot with Yum and RPM differs only slightly between Fedora and CentOS.
3.1. 1) CentOS 4.x/5.x
For installation on CentOS 4.x/5.x, you'll have to install and enable RPMForge's third-party repository. I would advise setting up yum-priorities as well. Fortunately, the RPMForge guide on the CentOS wiki covers both in an easy-to-follow way. Once you get things set up, install rsnapshot from the command line.
# yum install rsnapshot
This should pull down a few Perl dependencies if you don't already have them installed. Some of the packages will come from the base repository and the rest from RPMForge's repository.
3.2. 2) Fedora
Installation on my Fedora 7 desktop did not require any special repository configurations. Simply install it from the command line.
# yum install rsnapshot
4. Setting up the configuration environment
After installation, an example configuration file can be found at /etc/rsnapshot.conf, but chances are that you will be backing up more than one host, so we won't use it. Instead, create a directory to house your host configuration files. Feel free to substitute paths to your liking.
# mkdir /etc/rsnapshot
You're also going to need a place to stow away these backups, so create a place to store the files.
# mkdir -p /srv/backups/snapshots
Last, you'll need a place to store the log files that rsnapshot generates once you set it up later on. Create the directory.
# mkdir /var/log/rsnapshot
5. Setting up SSH passwordless login
5.1. 1) rsnapshot server end
On the rsnapshot server, you'll need to create an SSH key pair. This can be accomplished using ssh-keygen.
# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
snip
5.2. 2) Installing keys on hosts
After you have created a key on the rsnapshot server, you can easily append the public key to the appropriate file remotely if you already have SSH access. Do not append the private key (the file without the .pub extension). Run the following from the rsnapshot server against the remote host you wish to back up.
# cat ~/.ssh/id_dsa.pub | ssh root@remote_host "cat >> ~/.ssh/authorized_keys2"
You may wish to turn off password logins via SSH on the remote host now, but that's for you to decide. If you do, edit /etc/ssh/sshd_config, set both PasswordAuthentication and PermitEmptyPasswords to no, and restart sshd for the change to take effect. Also, I'm not a security expert, but you should tighten the permissions on your .ssh directories and files to something like the following. Please correct me if I have the permissions listed incorrectly.
# chmod 700 .ssh; chmod 400 .ssh/authorized_keys2
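Before moving on, it may be worth confirming that the key login really works without a prompt. The following is a minimal sketch, not part of rsnapshot or OpenSSH: check_key_login is a hypothetical helper name, and remote_host is the same placeholder used above. It relies on ssh's BatchMode option, which makes ssh fail instead of asking for a password.

```shell
# check_key_login is a hypothetical helper (an assumption for illustration).
# BatchMode=yes disables password prompts, so the call only succeeds
# when key-based authentication works.
check_key_login() {
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true; then
        echo "key login to $1 works"
    else
        echo "key login to $1 failed" >&2
        return 1
    fi
}

# Example (remote_host is a placeholder):
#   check_key_login root@remote_host
```

Running this before disabling PasswordAuthentication is the safer order, since a failed key login would otherwise lock you out of the remote host.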
6. Configuring rsnapshot hosts
Now that you have a basic place to store configuration files, a place to store the actual backups, and access to the remote host across a passwordless login, host configuration files can be created. I'll use my backup examples for my laptop at home.
6.1. 1) Host .conf file
Create a file that will hold the rsnapshot specific configurations for the host that you want backed up. Substitute your hostname or name for the file as needed.
# vim /etc/rsnapshot/laptop.rsnapshot.conf
There are many options to explain, but here is my configuration file. Enter the following options and explanation will follow.
config_version	1.2
snapshot_root	/srv/backups/snapshots/laptop/
cmd_cp	/bin/cp
cmd_rm	/bin/rm
cmd_rsync	/usr/bin/rsync
cmd_ssh	/usr/bin/ssh
cmd_logger	/usr/bin/logger
cmd_du	/usr/bin/du
interval	daily	7
interval	weekly	4
interval	monthly	3
verbose	2
loglevel	4
logfile	/var/log/rsnapshot/laptop.log
exclude_file	/etc/rsnapshot/laptop.exclude
rsync_long_args	--delete --numeric-ids --delete-excluded
lockfile	/var/run/rsnapshot.pid
backup	root@laptop:/	laptop/
backup_script	/etc/rsnapshot/laptop.dump_databases.sh	laptop_databases/
NOTE: The whitespace between the configuration options and their values above must be tabs, not spaces. Also, a trailing slash is required on directories. Both rules are explained at the top of the example file /etc/rsnapshot.conf.
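Since a space typed in place of a tab is hard to spot by eye, a quick check can help. This is only a sketch: lines_without_tabs is a hypothetical helper name, and it simply prints (with line numbers) every line that is neither blank nor a comment and contains no tab at all. It does not validate the file in any other way.

```shell
# lines_without_tabs is a hypothetical helper (an assumption for illustration).
# It prints "N:line" for each non-blank, non-comment line with no tab in it.
lines_without_tabs() {
    tab=$(printf '\t')
    grep -n -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | grep -v "$tab"
}

# Example:
#   lines_without_tabs /etc/rsnapshot/laptop.rsnapshot.conf
```

No output means every option line contains at least one tab.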
All of the options can be viewed from the example configuration file. The argument explanations are as follows:
config_version 1.2 = Configuration file version
snapshot_root = Directory where the snapshots are stored
cmd_cp = Path to copy command
cmd_rm = Path to remove command
cmd_rsync = Path to rsync
cmd_ssh = Path to SSH
cmd_logger = Path to shell command interface to syslog
cmd_du = Path to disk usage command
interval hourly = How many hourly backups to keep.
interval daily = How many daily backups to keep.
interval weekly = How many weekly backups to keep.
interval monthly = How many monthly backups to keep.
verbose = Self-explanatory
loglevel = Self-explanatory
logfile = Path to logfile
ssh_args = Optional SSH arguments, such as a different port (-p )
exclude_file = Path to the exclude file (will be explained in more detail)
rsync_long_args = Long arguments to pass to rsync
lockfile = Self-explanatory
backup = Full path of what is to be backed up, followed by the relative path where it is placed inside the snapshot.
backup_script = Full path to an executable script, followed by the relative path where its output is placed inside the snapshot.
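Once the file is written, rsnapshot itself can check it. The sketch below wraps two rsnapshot features: the configtest command, which parses the file and reports syntax problems, and the -t flag, which prints the commands a run would execute without running them. The wrapper name check_host_config is hypothetical, and the daily interval is assumed from the configuration above.

```shell
# check_host_config is a hypothetical wrapper (an assumption for illustration).
#   configtest  parses the configuration file and reports problems
#   -t          test mode: shows what a run would do without doing it
check_host_config() {
    conf=$1
    rsnapshot -c "$conf" configtest || return 1
    rsnapshot -t -c "$conf" daily
}

# Example:
#   check_host_config /etc/rsnapshot/laptop.rsnapshot.conf
```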
Now that you have a working configuration host example, let's move on to the exclude file.
6.2. 2) Host .exclude file
I'm going to exclude everything, and then specifically allow only what I want backed up. The reason for this is to allow exactly what you want and automatically deny everything else. I know it sounds backwards, but it works.
Create the file:
# vim /etc/rsnapshot/laptop.exclude
Using exclude files can be tricky due to recursion. I don't pretend to fully understand rsync's use of recursion, so my example may not be what you're looking for. The idea is to explicitly include only what you want and then exclude everything else. Because rsync applies the patterns as it descends the tree, and because * matches differently than ** (a single * stops at a slash), you have to list a parent directory before anything beneath it.
Once you have listed what you want, add a final exclude rule ( - /* in my file) to exclude everything else. This will back up only what you listed. My exclude file looks like the following:
+ /boot
+ /etc
+ /home
+ /opt
+ /root
+ /usr
+ /usr/java
+ /usr/local
- /usr/*
- /var/cache
+ /var
+ /srv
- /*
A good reference for me was to read this blog post found at kurup.org. Also, read the rsync manual to better understand pattern matching and recursion. If you're already a recursion expert, please post a note explaining further or correcting any of my mistakes.
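One way to experiment with patterns without touching a real backup is an rsync dry run against a local directory. This is a sketch under assumptions: preview_excludes is a hypothetical helper name, and the source is a local tree rather than root@laptop:/. Patterns that start with a slash are anchored at the root of the transfer in both cases, so the results should carry over.

```shell
# preview_excludes is a hypothetical helper (an assumption for illustration).
# It lists what rsync would copy from a local directory when the given
# exclude file is applied. -n (dry run) means nothing is transferred.
preview_excludes() {
    src=$1
    rules=$2
    scratch=$(mktemp -d) || return 1
    rsync -a -n -v --exclude-from="$rules" "$src"/ "$scratch"/
    status=$?
    rmdir "$scratch"
    return $status
}

# Example:
#   preview_excludes /some/test/tree /etc/rsnapshot/laptop.exclude
```

Anything missing from the listing would also be missing from the snapshot, which makes it easy to spot a rule placed in the wrong order.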
6.3. 3) Host backup script
The last configuration option to explain is backup_script, which I used above to call an executable script that dumps databases. This option lets you run a script as part of the snapshot and add its output to the archive. For instance, you can connect to the remote host and do database dumps for MySQL and/or PostgreSQL. Create the necessary database dump script.
# vim /etc/rsnapshot/laptop.dump_databases.sh
Add the appropriate information. This script will dump two MySQL databases and one PostgreSQL database I have running on my laptop.
#!/bin/sh
# MySQL databases
for db in wordpress testpress
do
    ssh root@laptop "mysqldump --opt $db" > "$db.mysqldump"
done

# PostgreSQL databases
for db in docmgr
do
    ssh root@laptop "pg_dump $db" > "$db.pgsqldump"
done
Be sure to make the script executable. You can also place the script anywhere else you'd like if you don't want executables in /etc, for instance /usr/local/bin/ or /usr/local/sbin.
# chmod +x /etc/rsnapshot/laptop.dump_databases.sh
In the above laptop.rsnapshot.conf file, I specified a backup script as follows, remember tabs:
backup_script	/etc/rsnapshot/laptop.dump_databases.sh	laptop_databases/
This places the database dumps in the directory laptop_databases within each daily, weekly, and monthly backup snapshot. This should complete a basic host configuration. Next you can set up automation.
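rsnapshot runs a backup_script inside a temporary working directory and then moves whatever the script created there into the snapshot, which is why the dump script writes to relative file names. To try a script by hand in the same way, something like the following sketch may help. try_backup_script is a hypothetical helper name, and the script path must be absolute because the helper changes directory first.

```shell
# try_backup_script is a hypothetical helper (an assumption for illustration).
# It runs the given script inside an empty scratch directory, lists the
# files the script produced, and then removes the scratch directory.
try_backup_script() {
    script=$1
    scratch=$(mktemp -d) || return 1
    if ( cd "$scratch" && "$script" ); then
        ls -l "$scratch"
        rm -rf "$scratch"
    else
        echo "script failed: $script" >&2
        rm -rf "$scratch"
        return 1
    fi
}

# Example:
#   try_backup_script /etc/rsnapshot/laptop.dump_databases.sh
```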
7. Setting up a crontab
Automation is easily attainable using a simple crontab entry. Open up crontab and add the following:
# crontab -e
#MAILTO="" ##Suppresses output
MAILTO=me
###################################################################
#minute (0-59),                                                   #
#|    hour (0-23),                                                #
#|    |    day of the month (1-31),                               #
#|    |    |    month of the year (1-12),                         #
#|    |    |    |    day of the week (0-6 with 0=Sunday)          #
#|    |    |    |    |    commands                                #
###################################################################
15    02   *    *    *    /usr/bin/rsnapshot -c /etc/rsnapshot/laptop.rsnapshot.conf daily
15    03   *    *    Sun  /usr/bin/rsnapshot -c /etc/rsnapshot/laptop.rsnapshot.conf weekly
30    03   1    *    *    /usr/bin/rsnapshot -c /etc/rsnapshot/laptop.rsnapshot.conf monthly
If you don't want to receive any mail, change MAILTO to MAILTO="" or alternatively be a good boy or girl and read the crontab manual. This setup runs a daily backup at 2:15am, a weekly backup on Sunday at 3:15am, and a monthly backup on the first of the month at 3:30am.
When it's all said and done, you'll have a directory layout like below. The rsnapshot and rsync utilities make use of hardlinks, so it looks like you have a full filesystem in each backup, but you really don't. It's very smart and wise about disk usage. Even though it sounds like everything might not be there, it is.
# ll /srv/backups/snapshots/laptop
drwxr-xr-x 4 root root 4096 Jan  3 01:47 daily.0
drwxr-xr-x 4 root root 4096 Dec  7 01:46 daily.1
drwxr-xr-x 4 root root 4096 Dec  6 01:48 daily.2
drwxr-xr-x 4 root root 4096 Dec  5 01:46 daily.3
drwxr-xr-x 4 root root 4096 Dec  4 01:46 daily.4
drwxr-xr-x 4 root root 4096 Dec  3 01:46 daily.5
drwxr-xr-x 4 root root 4096 Dec  2 01:46 daily.6
drwxr-xr-x 4 root root 4096 Oct 28 01:46 monthly.0
drwxr-xr-x 4 root root 4096 Sep 30 01:46 monthly.1
drwxr-xr-x 4 root root 4096 Sep  1 01:46 monthly.2
drwxr-xr-x 4 root root 4096 Nov 25 01:46 weekly.0
drwxr-xr-x 4 root root 4096 Nov 18 01:46 weekly.1
drwxr-xr-x 4 root root 4096 Nov 11 01:46 weekly.2
drwxr-xr-x 4 root root 4096 Nov  4 01:46 weekly.3
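The hard-link behaviour described above can be seen on a small scale with plain shell tools. The sketch below does not involve rsnapshot at all. It only imitates two snapshot directories that share one unchanged file, and the directory and file names are made up for the demonstration.

```shell
# Imitate two snapshots sharing one unchanged file through a hard link.
demo=$(mktemp -d)
mkdir "$demo/daily.0" "$demo/daily.1"
echo "unchanged file" > "$demo/daily.1/notes.txt"
ln "$demo/daily.1/notes.txt" "$demo/daily.0/notes.txt"

# Both names show the same inode number and a link count of 2,
# so the data exists on disk only once.
ls -li "$demo/daily.0/notes.txt" "$demo/daily.1/notes.txt"

# When both directories are given to one du call, the shared data
# is counted a single time.
du -sh "$demo/daily.0" "$demo/daily.1"

# rm -rf "$demo"   # clean up when done
```

This is also why running du on a single snapshot directory tends to overstate how much extra space that snapshot really costs.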
If you set up database dumps, you'll find a laptop_databases directory within each directory listed above. For example:
# cd /srv/backups/snapshots/laptop/daily.0/laptop_databases
# ll
-rw-r--r-- 1 root root 692310 2008-01-04 02:15 wordpress.mysqldump
-rw-r--r-- 4 root root 291379 2008-01-02 22:49 testpress.mysqldump
-rw-r--r-- 2 root root 842345 2008-01-03 21:12 docmgr.pgsqldump
Also, you can use the hourly option if you need to back up directories and databases on an hourly basis, or if you're extremely paranoid about data. I don't have a need for this, but after this guide, I'm sure you can figure out how to add the hourly parameters should you find them useful.
To add more hosts, just follow this guide and complete the exact same process with new host names in the files you create.
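When adding hosts, the crontab lines differ only in the host name, so they can be generated rather than retyped. The following is a sketch with a hypothetical helper, print_cron_entries. It assumes the /etc/rsnapshot/HOST.rsnapshot.conf naming and the schedule used in this guide, and it only prints the lines without installing them.

```shell
# print_cron_entries is a hypothetical helper (an assumption for illustration).
# It prints the three crontab lines from this guide for the given host name.
print_cron_entries() {
    host=$1
    conf="/etc/rsnapshot/$host.rsnapshot.conf"
    printf '15 02 * * * /usr/bin/rsnapshot -c %s daily\n' "$conf"
    printf '15 03 * * Sun /usr/bin/rsnapshot -c %s weekly\n' "$conf"
    printf '30 03 1 * * /usr/bin/rsnapshot -c %s monthly\n' "$conf"
}

# Example:
#   print_cron_entries asterisk
```

If several hosts share the same lockfile path and start at the same minute, the later run will likely refuse to start while the lock is held, so it is probably wise to stagger the times or give each host configuration its own lockfile.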
8. Running rsnapshot manually
To run a manual snapshot, run rsnapshot directly from the command line and pass exactly one of the intervals defined in the host's configuration file (daily, weekly, or monthly here, plus hourly if you add it). For example:
# /usr/bin/rsnapshot -c /etc/rsnapshot/laptop.rsnapshot.conf daily
9. rsnapshot reports
If you're into noise, then rsnapshot has a nifty little reporting script that's easily set up. This will send you a small e-mail with a few details as to what occurred during the backup. It's a very simple report. First, copy the script over to somewhere such as /usr/local/bin and make it executable.
# cp /usr/share/doc/rsnapshot-1.3.0/utils/rsnapreport.pl /usr/local/bin
# chmod +x /usr/local/bin/rsnapreport.pl
Next, add --stats to the rsync_long_args option in your host rsnapshot file. Remember that the separator after the option name is a tab, not a space.
# vim /etc/rsnapshot/laptop.rsnapshot.conf
rsync_long_args	--stats --delete --numeric-ids --delete-excluded
Last, edit the crontab entries that were set up earlier to pass the results of the rsnapshot run through the rsnapreport.pl script. Keep each entry on a single line, since cron does not join lines that end in a backslash.
# crontab -e
#MAILTO=""
##########################################################
#minute (0-59),                                          #
#|    hour (0-23),                                       #
#|    |    day of the month (1-31),                      #
#|    |    |    month of the year (1-12),                #
#|    |    |    |    day of the week (0-6 with 0=Sunday) #
#|    |    |    |    |    commands                       #
##########################################################
15    02   *    *    *    /usr/bin/rsnapshot -c /etc/rsnapshot/laptop.rsnapshot.conf daily 2>&1 | /usr/local/bin/rsnapreport.pl | mail -s "laptop daily" me@myemail.com
15    03   *    *    Sun  /usr/bin/rsnapshot -c /etc/rsnapshot/laptop.rsnapshot.conf weekly 2>&1 | /usr/local/bin/rsnapreport.pl | mail -s "laptop weekly" me@myemail.com
15    04   1    *    *    /usr/bin/rsnapshot -c /etc/rsnapshot/laptop.rsnapshot.conf monthly 2>&1 | /usr/local/bin/rsnapreport.pl | mail -s "laptop monthly" me@myemail.com
Don't forget the 2>&1 redirection. It sends standard error to the same place as standard output, so error messages travel through the pipe to the report script instead of bypassing it. That's it for reporting. You should now get a report at your e-mail address after the next rsnapshot run. It will look like this:
SOURCE                     TOTAL FILES   FILES TRANS      TOTAL MB     MB TRANS   LIST GEN TIME   FILE XFER TIME
-----------------------------------------------------------------------------------------------------
laptop:/                         59076          1353      17279.45      7169.38  20.361 seconds    0.000 seconds
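The effect of 2>&1 in a pipeline can be tried in isolation. In this sketch, noisy is a made-up stand-in for any command that writes to both streams. It has nothing to do with rsnapshot itself.

```shell
# noisy is a hypothetical stand-in for a command that writes one line
# to standard output and one line to standard error.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# Without 2>&1, only standard output enters the pipe, so wc sees one line.
# (Standard error is discarded here just to keep the demonstration quiet;
# under cron it would bypass the report script.)
noisy 2>/dev/null | wc -l

# With 2>&1, both streams enter the pipe, so wc sees two lines.
noisy 2>&1 | wc -l
```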
10. Final thoughts
At home, I'm on a desktop system with two 200 GB hard drives in a RAID1 mirror. This is used to back up both my laptop and an Asterisk system I have running. If you're smart, you'll follow suit and run a RAID1 mirror or some other redundant disk scheme on your rsnapshot server. Good luck.