Problems with Time Machine network backups on OSX

Saturday, February 7th, 2009

Background

Time Machine was a major new feature of Apple’s OSX 10.5 (Leopard): an easy-to-use backup system. This is great: something so simple that ‘your grandmother could use it’, with an interface that makes it easy to restore data.

In addition to being able to back up to a local disk (connected by USB or FireWire), you can back up over the network. Apple sell the Time Capsule NAS (network attached storage) primarily for network backups using Time Machine. Backups to a local disk are simple file-based backups, whereas backups to a network drive utilise sparse-bundle images (another new feature in Leopard).

Configuration

I assume there is a way to back up to OSX servers (such as the Xserve), and there are also numerous articles describing hacks for configuring backups to an SMB (or NFS or AFP) fileshare.

Being a big proponent of open-source solutions, I looked around for a solution using Linux, Samba and/or Netatalk, preferably one that doesn’t require any special setup on the client machine; I managed to find one that makes the fileserver look like Apple hardware – something that I intend to try at some point!

First, though, I wanted to get a few Time Machine backups under my belt, so I started with the aforementioned hacks – backing up to a Samba fileshare (on Linux).

My Linux fileserver has mirrored disks and LVM, so it’s easy to create a new logical volume (similar to a slice/partition in other configurations). I created a new LV for the backup and then configured Samba:

[timemachine]
   comment = osx backups
   path = /BACKUPS/timemachine/machines
   public = no
   writable = yes

The root of the fileshare should contain a file called “.com.apple.timemachine.supported”, and the root of the fileshare needs to be writeable by the remote users (Time Machine creates a file there before it starts the backup).
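
As a rough sketch, the marker file can be created on the Linux server with a small shell function. The name tm_prepare_share is made up for illustration, and the path in the usage comment is the one from the smb.conf above. The function only creates the marker and prints the directory's permissions for a manual check; which Unix user actually needs write access depends on how your Samba users are mapped.

```shell
#!/bin/sh
# Hypothetical helper: create the Time Machine marker file in the share root
# and show the directory's ownership and permissions for a manual check.
tm_prepare_share() {
    share_root=$1
    if [ ! -d "$share_root" ]; then
        echo "not a directory: $share_root" >&2
        return 1
    fi
    # The empty marker file that the share root should contain.
    touch "$share_root/.com.apple.timemachine.supported" || return 1
    # Print the permissions so you can confirm the backup user can write here.
    ls -ld "$share_root"
}

# Example (path taken from the smb.conf above):
#   tm_prepare_share /BACKUPS/timemachine/machines
```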

On the OSX machine (Mac) you need to enter the following command into a Terminal window to allow backups to normal fileshares:

defaults write com.apple.systempreferences TMShowUnsupportedNetworkVolumes 1

You then need to create a sparse-bundle disk image for Time Machine to back up to. This has a particular name based on the machine name and the hardware address of the main (ethernet) NIC. Assuming the computer’s name is MBP and the hardware address of the NIC is 00:01:23:45:67:89, the following command should create a 300GB sparse-bundle image:

hdiutil create -size 300g -fs HFS+J -volname "Time Machine" MBP_000123456789.sparsebundle

This command should be run on the local OSX machine, and the resulting image then copied (eg using ‘rsync -avE’ or ‘cp -r’) to the fileshare [1]; apparently the image cannot be created directly on the fileshare.
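
The naming convention can be sketched as a small shell function. The name tm_bundle_name is hypothetical. Lower-casing the hex digits is an assumption on my part, since the example address above contains only digits and so does not show which case is expected.

```shell
#!/bin/sh
# Hypothetical helper: build the sparse-bundle file name from a computer name
# and the MAC address of the main NIC (colons removed).
# Assumption: hex digits are lower-cased; the post's example does not show this.
tm_bundle_name() {
    name=$1
    mac=$(printf '%s' "$2" | tr -d ':' | tr 'A-F' 'a-f')
    printf '%s_%s.sparsebundle\n' "$name" "$mac"
}

# Example with the values used above:
#   tm_bundle_name MBP 00:01:23:45:67:89
# On the Mac the address can be read with something like the following
# (en0 is assumed to be the built-in ethernet port):
#   ifconfig en0 | awk '/ether/ {print $2}'
```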

At this point you should be able to configure Time Machine, select the fileshare you wish to use (I found that the fileshare had to be mounted by logging into it in the Finder), give a username and password and then it should be ready to start the backup.

A full backup of my 250GB disk (of which about 220GB gets backed up) took around 3 hours the first time, over a wired gigabit Ethernet connection. For me, Time Machine did not use much bandwidth during the first 10-20 minutes.

[1] If, like me, you used rsync to copy the file but forgot the -E flag, then the resource-fork will not have been copied. (On a Linux box (ext3 fs) this resource-fork gets created as the file ._MBP_000123456789.sparsebundle in the above example.) If the resource-fork is not on the remote filesystem then the “sparsebundle” image will look like a directory rather than a disk image. From my limited use, this doesn’t appear to have made any difference to the workings of Time Machine (backups worked OK, a partial restore worked OK); the only downside was that, if using Finder to browse the fileshare, you cannot mount the sparsebundle image by double-clicking (but hdiutil, and Time Machine, mount it fine). I was able to correct this problem by copying across the blank image again (from OSX to the fileshare), to a temporary subdirectory on the fileshare this time to avoid overwriting the backup data, and moving the resource-fork file back to the time-machine directory (so that the populated sparsebundle image and the resource-fork file are in the same directory). This made the populated sparsebundle then show up as a disk image under Finder. The temporary subdirectory and contents can then be removed.
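
A quick way to check for this on the Linux server is to look for the “._” companion file next to the image. This is only a sketch: has_resource_fork_file is a made-up name, and it tests for the existence of the file, not whether its contents are valid.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the "._NAME" companion file sits next to the
# given sparsebundle on the server's filesystem, fail otherwise.
has_resource_fork_file() {
    bundle=$1
    dir=$(dirname "$bundle")
    base=$(basename "$bundle")
    [ -e "$dir/._$base" ]
}

# Example (share path from the smb.conf above, image name from the example):
#   has_resource_fork_file /BACKUPS/timemachine/machines/MBP_000123456789.sparsebundle \
#       || echo "resource-fork file is missing"
```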

Failures

Corruption

Having had numerous failures whilst trying to back up with Time Machine, I decided to see if I could fix the backup image rather than just delete it and start again.

I looked for someone who had found and resolved the problem. Whilst there are loads of people who report failures (the forums are alight with them), I only found one who had managed to recover the backup image. The instructions were the normal disk-recovery type things:

  1. attach the disk image (which checks and may fix some issues)
    • sudo hdiutil attach -nomount -readwrite /path/to/sparseimage
  2. verify/repair disk
    • in disk utility, select the (unmounted) backup volume, select the First Aid tab, and click on Repair Disk
  3. check the filesystem
    • sudo fsck_hfs /dev/disk-slice-name (eg disk1s2)

Unfortunately, step 2 failed for me with “Invalid sibling link” whilst “Checking Catalog file.” As the Catalog was the problem, I ran fsck_hfs with the -r flag (check the man page) to rebuild the catalog, but this also failed, saying “The volume Time Machine could not be repaired.”

Hopefully the above will be useful to someone (including me, if I need to re-visit) – especially if you are used to unix but not necessarily ‘the depths of OSX’.

Time Machine on new backup device/file

Sometimes the backups don’t work properly and you have to delete the Time Machine backup (eg the sparse-bundle image) and start again.

I had got Time Machine working fine, backing up every hour, and then I needed to move some large files about temporarily. Rather than TM making huge backups for minimal benefit I turned off TM for a couple of days. When I turned it back on it sat ‘Preparing the backup’ for 15 hours or more (confirmed several times) – it just didn’t work. As a result the only thing I could think of doing was delete the disk image (sparsebundle), create a new one of the same name (as per the above instructions), and start again … losing all previous backups.

Following this, TM failed every time. Checking the logs (Applications -> Utilities -> Console) showed:

06/03/2009 11:52:19 /System/Library/CoreServices/backupd[11312] Backup requested due to disk attach
06/03/2009 11:52:19 /System/Library/CoreServices/backupd[11312] Starting standard backup
06/03/2009 11:52:19 /System/Library/CoreServices/backupd[11312] Network mountpoint /Volumes/timemachine not owned by backupd... remounting
06/03/2009 11:52:20 /System/Library/CoreServices/backupd[11312] Network volume mounted at: /Volumes/timemachine-1
06/03/2009 11:52:20 /System/Library/CoreServices/backupd[11312] Volume at path /Volumes/timemachine-1 does not appear to be the correct backup volume for this computer.  (Cookies do not match)
06/03/2009 11:52:25 /System/Library/CoreServices/backupd[11312] Backup failed with error: 18
06/03/2009 11:52:25 /System/Library/CoreServices/backupd[11312] Ejected Time Machine network volume.

The relevant (and problematic) part is “(Cookies do not match)”. It appears that a cookie is kept both on the machine being backed up and in the disk image holding the backup, and that the two are out of sync. To fix this you need to go into Time Machine, select the disk to use, and select “none”. Then select the relevant disk/share again. This appears to reset the cookie (or whatever it is) and allows you to run Time Machine backups again.
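
If you prefer the Terminal to the Console application, the same message can be searched for with grep. This is a minimal sketch: tm_cookie_mismatch is a made-up name, and the log path in the usage comment assumes that backupd messages end up in the system log on your machine.

```shell
#!/bin/sh
# Hypothetical helper: print any log lines reporting a cookie mismatch.
# Exit status is 0 if at least one such line was found, non-zero otherwise.
tm_cookie_mismatch() {
    grep -F 'Cookies do not match' "$1"
}

# Example, assuming backupd logs to the system log:
#   tm_cookie_mismatch /var/log/system.log
```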

Links

There are numerous similar articles around the Internet, including one mentioning a problem with sparse files that I considered a bit obvious. Sparse images/bundles only take up the disk space of the files contained in them (so if you create a 100GB sparse-image and put in 10GB of files, the sparse-image will only use up 10GB of disk space). The good point about this is that you can use the remainder of the disk space for other things … for the time being. Of course, if you do not keep an eye on available space then you could have problems! The aforementioned article describes a problem that Time Machine had when people used this ‘spare disk space’ on the server, causing (backup) data loss. Fortunately it sounds like this bug has been fixed, but it is something to bear in mind!
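
One way to keep an eye on this from the server is to compare how much the image could still grow with the free space on its filesystem. The sketch below is my own illustration, not something from the linked article: tm_headroom_check is a made-up name, and the maximum size has to be supplied by hand in kilobytes (a 300GB image is roughly 300 × 1024 × 1024 = 314572800 KB).

```shell
#!/bin/sh
# Hypothetical helper: warn if a sparse bundle could grow beyond the free
# space left on the filesystem that holds it.
#   $1 = path to the sparsebundle directory
#   $2 = maximum image size in KB (the size given when it was created)
tm_headroom_check() {
    bundle=$1
    max_kb=$2
    # Space the bundle occupies on disk right now.
    used_kb=$(du -sk "$bundle" | awk '{print $1}')
    # Space still available on the filesystem holding the bundle.
    free_kb=$(df -Pk "$bundle" | awk 'NR==2 {print $4}')
    growth_kb=$((max_kb - used_kb))
    if [ "$growth_kb" -gt "$free_kb" ]; then
        echo "WARNING: image may grow by ${growth_kb} KB but only ${free_kb} KB is free"
        return 1
    fi
    echo "OK: ${free_kb} KB free covers the remaining ${growth_kb} KB"
}

# Example (300GB image from the hdiutil command above):
#   tm_headroom_check /BACKUPS/timemachine/machines/MBP_000123456789.sparsebundle 314572800
```

Run from cron, a check like this would flag the situation before the ‘spare’ space is actually used up.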

Note

As with all hacks of this kind, the author is not responsible for any loss of data or problems when following the instructions. That said, I would like to hear if you have had problems, can think of a better way to do this, or have any other constructive feedback.

Update – 2009-07-08

By way of an update, after Konstantin’s comment, things have been working stably for me over the past few months.

Soon after writing this blog posting I noticed an error I had in my smb.conf file. I had “locking = no” on the share, for some reason (probably due to copying config from another share and not properly checking the options). I removed the line in smb.conf, created a new sparse-bundle image, and errors happened less frequently (although they still did happen for a bit).
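
To catch this kind of copy-and-paste mistake, a rough check is to scan smb.conf for shares that explicitly switch locking off. The function below is a sketch with a made-up name; it only understands simple “key = value” lines and ignores include files, so Samba's own testparm -s output remains the better guide to the effective settings.

```shell
#!/bin/sh
# Hypothetical helper: list the smb.conf sections that explicitly turn
# "locking" off. Handles simple "key = value" lines only, not includes.
shares_with_locking_off() {
    awk '
        # Section header such as [timemachine]: remember the section name.
        /^[ \t]*\[/ {
            section = $0
            sub(/^[ \t]*\[/, "", section)
            sub(/\].*$/, "", section)
            next
        }
        # Skip comment lines.
        /^[ \t]*[#;]/ { next }
        # Match "locking = no" (or false/0), but not "strict locking".
        tolower($0) ~ /^[ \t]*locking[ \t]*=[ \t]*(no|false|0)[ \t]*$/ { print section }
    ' "$1"
}

# Example (the smb.conf location varies between distributions):
#   shares_with_locking_off /etc/samba/smb.conf
```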

For information, I am currently running OSX 10.5.7 (I was running 10.5.6 both when it wasn’t working and also during the time it has been working), and Samba 3.0.25b (quite old, really). Samba has not been updated either, so I’m at a bit of a loss as to why things “suddenly” became stable.