Backup on XP (geeky)

I asked the crowd via Twitter (and thence Facebook) about backup solutions for Windows XP, and got several responses, plus a few requests to hear what I found out, so this is to summarise that.

The particular problem I want to solve is backup on to an external huge hard disk.  This post gets a bit long and techie, but the short answer is I went with NTBackup, the backup tool built in to XP.

As the canard goes, backing up is a bit like flossing, in that everybody knows you ought to do it regularly but most people don’t. Except people who’ve been burned in the past.

Luckily, my then-technophobic mother taught me that particular lesson at an early age, when she wiped my first ever full-scale program by accidentally knocking the power cable out from the back of ZX Spectrum.  (I was trying to get her to test how user-friendly I’d managed to make it, and so I also learned the valuable lesson that real users can create whole categories of problems you did not anticipate.)

Backup is one of those things that in my head is a known solved problem.  There are two interesting problems to solve – the main one is how to back up the minimum amount of stuff but still cover everything; a secondary one is how to structure the backups to make it easy to get things back.

The ‘back up the minimum amount of stuff’ problem is essentially the problem that the rsync algorithm solves: how to find the minimum amount of data to cover the changes between an original and an updated chunk of data.  So any GNU/Linux installation can use rsync as the basis for an automated (or any degree of semi-automated) backup system.

And Unix-like file systems have another property that makes the secondary problem easy: hard linking. This essentially means you have a single file on the disk, but appearing in more than one place in the directory tree (folder hierarchy, if you prefer).  This is really really useful for backup, because it means you can do a full backup – copying everything – in to one directory on your backup disk, and then subsequently do an incremental backup (just the stuff that has changed) to another directory, adding hard links to the full backup.  And you can keep doing incremental backups like that. The clever bit is that each time you do a backup, the directory looks like a complete copy of whatever you are backing up, but the extra disk space taken up is only the difference between that backup and the last one.  Even better, you can delete (unlink) arbitrary backups without losing any other data. So, for instance, you could create a backup every hour, and delete backups on a rota so you end up with backups every hour for the last day, every day for the last fortnight, every fortnight for the last few months, etc.

(If you don’t have this system, you have to keep everything between the last full backup and the last incremental backup, or you’ve effectively lost your backup.  This is very fiddly to get right, and is a common cause of problems restoring from backups.)

If you’re a half-decent Linux geek, you can easily roll your own backup system with cron, rsync and a short shell script.  If you have a Linux box but that’s more fuss than you can be bothered with, there are umpteen Open Source graphical front ends to essentially the same system. These are of variable beauty and usability.

If you have a Mac, you get Time Machine, which has Apple’s beauty and usability built in to its interface, and the power/efficiency of the Unixy approach underneath. If you have an external drive to devote to it, it really is as simple as saying ‘Time Machine, do your thing on this drive’ and remembering to plug the drive in from time to time.  This is my dream backup system.

Alas, Windows XP doesn’t have this option. And my existing backup strategy (burn DVDs at pseudorandom times, keeping manual notes of what’s been backed up and what’s not) left a lot to be desired.

The problems run moderately deep, though.  Windows doesn’t come with rsync (though there are multiple ports, but you usually have to go half-way to a dual boot system (Cygwin) to make them work properly), and it doesn’t really do hard links (actually, it can, but not in a way that’s simple and straightfoward to the user, and so hardly any software does). It has its own system of flagging changed files (the archive attribute) which is fraught with problems.

So what’s to do?

The first solution suggested (thanks @andrew_x!) was to convert the Windows machine to a dual-boot system with Linux (e.g. Ubuntu), and use that to back up the Windows data.  That has the mathematician’s appeal of reducing it to a known solved problem.  If I wanted a dual boot system anyway and planned to spend most of the time in Linux, it’d be the top choice. But I don’t (I have other machines with Linux on).  Any backup regime that has ‘reboot in to a different operating system’ as step one is unlikely to be pursued as rigorously and regularly as it should.

The next set of solutions (thanks @elpuerco63 @hockeyshooter and others) is to buy some backup software.  There are plenty, from ECM/Dantz Retrospect (which is aimed at people with several Windows boxes to back up) and similar server-based packages, to the straight-up standard consumer packages like Symantec Norton Ghost, or Symantec Norton Save and Restore. (These two are the ones of the standard paid-for offline backup tools that Which? apparently rates as Best Buys.)  All of these, however, cost actual money, which I am very keen not to spend – partly because I have very little spare cash at the moment, partly because it seems silly to spend money on something when there are good Free/Open Source Software solutions, and partly because it’d mean I couldn’t get a backup done this weekend.

There’s a plethora of back-it-up-to-the-cloud solutions. I wasn’t interested in any of those because I have:

  • a) 120 Gb to back up and a capped Internet connection,
  • b) some nervousness about sending every last drop of my personal data in to the network,
  • c) a degree of skepticism about the reliability of such services, and
  • d) a vague, woolly echo of Richard Stallman’s political objection to cloud computing – though usually this is often balanced by a similarly vague, woolly echo of David Brin’s argument that a transparent society would be a good thing, and utterly outgunned by the siren call of Convenience.

Plus they cost real money for more than a few Gb, and my first two objections apply.  (If you do only have a few Gb of files to back up, I can heartily recommend Dropbox – free for <2Gb, syncs multiple machines and platforms easily.)

You also often get simple backup software bundled in with other things: Nero (the CD-burning package) apparently has a backup feature, and many external hard drives come with some toy backup software thrown in. Mine didn’t.

What I did manage to put my hands on, though, was NTBackup, the backup tool built in to Windows XP.  (In XP Home, it’s not installed by default – you need to get your original media and find and run NTBackup.msi in \Valueadd\Msft\Ntbackup.) It lives in Start | Accessories | System Tools.

It’s not world-class stuff: you can tell it was written for the original Windows NT 3.51.  Charmingly it defaults to writing the backup to A:\BACKUP.BKF (off the top of my head, I make it that I’d need over 80,000 floppy disks to back up my data, which would be a little tedious to insert). And the interface is almost wilfully ugly.

But (a) it didn’t cost me any more money, (b) it was to hand, (c) it has a handy option for backing up the system state (including the Registry), (d) it groks Volume Shadow Copy so can copy in-use files, and (e) it worked.