Tuesday, August 19, 2008

The New Backup Strategy

Before Leopard, I had separate backup strategies for my volatile content (things like source code and documents) and for my entire hard drive. The volatile content went into Subversion, kept on a remote, universally accessible machine. The whole-drive backup was handled by SuperDuper, which copies your entire drive in a bootable state, essentially creating a complete snapshot of it. A couple of problems came up with this setup.
  1. I had to handle the "package" files (the ones created by Apple iWork applications like Keynote and Numbers) specially, because Subversion didn't like the way the applications managed the contents. Basically, I had to zip/unzip them for version control. This wasn't terrible (I automated the process to a large degree), but still a little annoying.
  2. My Subversion repository was huge, because I had my entire Documents folder in it. However, most of the documents were there just so that I could have a backup, not because I wanted to version them. The only files I really versioned were the source files and related content.
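The zip/unzip step from item 1 could look roughly like the following. This is only a minimal sketch in Python, not the automation I actually used: the function names (`pack`, `unpack`, `pack_all`) and the list of package extensions are assumptions made up for illustration.

```python
import shutil
import zipfile
from pathlib import Path

# Assumed set of package extensions (hypothetical; adjust to your own files).
PACKAGE_SUFFIXES = {".key", ".numbers", ".pages"}


def pack(package_dir: Path) -> Path:
    """Zip a package directory into a sibling archive, e.g. Talk.key.zip."""
    archive = package_dir.with_name(package_dir.name + ".zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(package_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent so the package
                # directory name is preserved inside the archive.
                zf.write(path, path.relative_to(package_dir.parent))
    return archive


def unpack(archive: Path) -> Path:
    """Recreate the package directory next to its archive."""
    target = archive.with_suffix("")  # Talk.key.zip -> Talk.key
    if target.exists():
        shutil.rmtree(target)  # replace any stale copy
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(archive.parent)
    return target


def pack_all(root: Path) -> list[Path]:
    """Pack every package directory found under root."""
    return [pack(p) for p in sorted(root.rglob("*"))
            if p.is_dir() and p.suffix in PACKAGE_SUFFIXES]
```

The idea is to run something like `pack_all` before a commit, so that only the `.zip` files go into the repository, and `unpack` after a checkout. Note that this sketch stores files only, so empty directories inside a package are not preserved.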
The advent of Leopard and Time Machine changed my strategy. First, I separated out the Documents stuff for which I really only wanted backups and let Time Machine handle them. I put all my versionable files (like source files) in a new, much smaller Subversion repository. And, even though I have Time Machine, I still use SuperDuper to create backups. The reason I use both:
  • I want the hourly, behind-the-scenes backup provided by Time Machine.
  • I want to be able to browse backwards in time to look at previous versions of those files, and the Time Machine UI is gorgeous for that.
  • Time Machine alone isn't sufficient. To restore from Time Machine, you have to boot your machine from a startup disk, then restore the backup. Yuck! I still like SuperDuper's snapshot approach, which I've proven to myself works flawlessly (see Don't Crack Open Your Mac for the story).
  • SuperDuper and Time Machine can share the same drive, so I have a single 500 GB drive that has all my backups on it.
  • It's now easier to replicate the source code in more places (IOW, more machines) because it's much smaller.
  • You can tell Time Machine to back up only certain directories (or, more specifically, to exclude directories you don't want backed up). Because I only use Time Machine for my Documents folder, it takes only a little space.
  • Because SuperDuper creates a bootable drive image, and my external drive is FireWire, I can boot another MacBook Pro with the external drive. Yes, it's slow, but if the worst happened while on the road, and I've got to present at a conference, as long as I can borrow/steal another machine, I can boot into my machine from backup and do my presentation.
I've been using this approach for a while, and it works nicely. I leave the external hard drive hooked up all the time, and start a snapshot backup at bedtime every night. SuperDuper has a nice option that will sleep the computer when it's finished its work. So far, this is working out really well. I haven't had to restore the whole drive from SuperDuper yet on this machine (but I know that works -- I've done it on other machines), and I have used Time Machine to grab a file that accidentally got deleted.


Anonymous said...

Neal, I like your process and have experimented with SuperDuper myself as well. The only concern here is what about offsite backup, should, heaven forbid, you have a fire or flood. I'm more worried about geographically distributed backups these days with natural disasters. To fill that part of the gap, I've turned to CrashPlan to get at least my data folders over to another geography (friend on the other side of Denver, and friend in Albuquerque)... Just wanting to hear your thoughts about this aspect...

Jason Rudolph said...

Nice post, Neal. I use a similar strategy, but prefer Git over SVN. Git, besides being superior in general ;-), also makes versioning iWork files "just work."

@matthew - You're right. Adding the offsite component is key. If you want a simple roll-your-own solution, check out this backup script that Rob Sanheim put together.

LSD said...

I too use both programs, but a little differently. I keep an internal drive running TM, and I switch that out when it gets full (and keep the previous one as an archive). But I also run SuperDuper! on my drives weekly (rotating between two backup drives), and I keep those drives in a fireproof safe. I send an entire backup offsite on a yearly basis, so I could lose a lot should a local disaster occur. Since I'm clearly paranoid, maybe I need more frequent way-off-site backups.