Meta Backup DreamHost on DreamHost Backups

DreamHost provides a snapshot backup service. However, this backup service seems a little shabby:

You may request a backup once every 30 days.

Due to some technical limitations, any users or databases greater than 4GB will not be included in these backups!

We also recommend that you maintain your own off-line backups locally.

Of course, they should complement such a meager backup service with something. It turns out that they do. DreamHost offers one Backups User per account (I’ll simply call it DreamHost Backups):

At DreamHost, you may only keep website-related content on your regular users.
You do, however, get one user per account where anything legal may be stored; your Backups User.

This user cannot have any websites pointed to it, nor may you share files via it… it is only to be used as an off-site backup for your personal files.
As such, we keep no backups of files on this account. These are already supposed to be your backups… not your only copy!
(Of course, you should always keep your own copies of all data stored with us.. we make no guarantees!)

Every full DreamHost Hosting plan includes 50GB of backups space!
(Additional usage will be charged at the rate of 10 cents / GB a month: the best backup deal on the net!)

You may access your backup user via ftp, sftp, scp, and rsync!

Surely, “anything legal” includes your legal web sites on DreamHost. And 50GB, in my opinion, is large enough for personal or small business web sites[1]. So why not a meta backup of your DreamHost web sites on DreamHost Backups?[2]

Let’s just do it. You’re sure to get your hands dirty along the way, so be warned before reading on.

Mainly, I grabbed the pith from the clever Snapshot-Style Backups:

rm -rf backup.3
mv backup.2 backup.3
mv backup.1 backup.2
mv backup.0 backup.1
rsync -a --delete --link-dest=../backup.1 source_directory/  backup.0/

and adapted it to off-site backup, which entails more effort.

The core application used is rsync, which is fortunately supported by DreamHost Backups.

But rsync alone is not enough to build multiple snapshot versions incrementally. We also need at least two other shell commands: mv and rm.

Neither of them is available due to the lack of shell support on DreamHost Backups. What we have are only ftp, sftp, scp, and rsync, remember?

So, what we can do now is just make good use of what we have. In fact, we’ll need to make extreme use of two of them.

rsync is one. Which is the other? scp can be eliminated first, for copying is already perfectly handled by rsync. Between the remaining two, I’ll take the safe side. So, sftp.

All ftp clients can do renaming; they just rename mv to rename (pun intended). To our great joy, sftp supports batch mode, which means we can put all the renames in a script and feed it to sftp, which then cranks through it dutifully without any further interference from us.
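If you would rather not type the renames by hand, a batch file can be generated. This is only a sketch: the `keep` and `batch` variables are names I made up, and the leading "-" on each command relies on OpenSSH sftp's rule that a "-" prefix stops a failing command from aborting the batch (handy on the first few runs, when the older snapshots do not exist yet).

```shell
#!/bin/sh
# Generate an sftp batch file that rotates backup.0 .. backup.(keep-1), oldest first.
set -eu
keep=4                      # hypothetical number of snapshots to keep
batch=$(mktemp)             # hypothetical location of the batch file

n=$((keep - 1))
while [ "$n" -gt 0 ]; do
    # "%s" carries the "-" prefix so printf does not mistake it for an option.
    printf '%srename backup.%d backup.%d\n' - "$((n - 1))" "$n" >> "$batch"
    n=$((n - 1))
done

cat "$batch"
# Against a real host it would then be fed to sftp like this:
#   sftp -b "$batch" backups_user@backups_host:path
```

With keep=4 this writes three lines, starting with the rename of backup.2 to backup.3, so no snapshot is overwritten before it has been moved out of the way.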

rm is the only knot left, which turns out to be a really hard one.

sftp supports rm, but only for files; it also supports rmdir, but only for empty directories. What we need is a recursive rm[3], and we are going to simulate one with rsync, of course.

NOTICE: this is the only valuable part of this fluff, so attention please.

rsync -avz --delete --include='/backup.3' --exclude='/*' any_source_dir_without_backup.3/ backups_user@backups_host:path

It is mind-boggling, I know. So let me explain it:

  • “--delete”: delete from the destination whatever is missing at the source;
  • “--exclude='/*'”: skip every top-level file and directory when comparing source and destination, except for
  • “--include='/backup.3'”: the one to be deleted. It must come before the exclude-all rule, because rsync applies the first rule that matches; otherwise it would be excluded first and never get the chance to be included;
  • because “--delete” only affects what is missing at the source, '/backup.3' must not exist there. Note the trailing slash on the source: it makes rsync compare the contents of that directory, rather than the directory itself, against the destination.

Another point that may be of interest: you’d better write a proper filter for rsync to use when syncing. It will save you both bandwidth and disk space.

I exclude my locally installed software (~/bin, ~/include, ~/lib, ~/share, ~/var), sensitive information (~/.ssh/, ~/.my.cnf; actually, I exclude all dot files under my home directory, i.e., ~/.*, since it is web sites, not a development host, that are being backed up), and other stuff that makes no sense to back up, like source packages of the software (~/soft), Vim swap files (*.swp), and backup files spewed by Emacs (*~). I put all these exclude rules in a file and pass it to rsync as the value of the “--exclude-from” option.

That’s all. The other utilities, such as mysqldump for MySQL backups and cron for scheduled automatic backups, are explained very clearly on wiki.dreamhost.com.

Lastly, I’ll wrap all these up into several scripts (assumed to be put in ~/backup_utils), for you to take away:

#!/bin/bash
# ~/backup_utils/mybackup
# Remember to make it executable, via "chmod +x ~/backup_utils/mybackup".

# dump mysql: the password is stored in ~/.my.cnf which is not to be backed up for security
mysqldump -u mysql_user -h mysql_host -A >| mysql_dump_path

# rsync and sftp both use ssh whose keys are stored in ~/.ssh/ which is not to be backed up for security
# delete the oldest
rsync -avz --delete --include='/backup.3' --exclude='/*' any_source_dir_without_backup.3/ backups_user@backups_host:path
# rename olds to olders
sftp -b ~/backup_utils/sftpcmds backups_user@backups_host:path
# send the new
rsync -avz --exclude-from="$HOME/backup_utils/rsync.filter" --delete --link-dest='../backup.1' ~/ backups_user@backups_host:path/backup.0/

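# ~/backup_utils/sftpcmds
# The leading "-" lets sftp continue when a rename fails (e.g. backup.2 does not exist yet on the first runs).
-rename backup.2 backup.3
-rename backup.1 backup.2
-rename backup.0 backup.1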

# ~/backup_utils/rsync.filter
/bin/
/include/
/lib/
/libexec/
/share/
/soft/
/var/
/.*
*~
*.swp

# crontab entry: daily backup at midnight
0 0 * * * ~/backup_utils/mybackup

PS: While I am writing this article about DreamHost backup, my backup script is backing up drafts of the article. What a meta backup!


  1. If you are running web sites larger than 50GB, I’m sure you know everything I’m talking about here much better than me, so this article is obviously not for you. 

  2. Notice that DreamHost Backups are hosted on different servers from web hosting servers, which makes much more sense for our meta backup. 

  3. lftp supports recursive rm though not very smoothly; besides, it is not pre-installed on DreamHost.