Bandelero Loses a Disk

One of the hard drives in my IDE Bandelero failed, leaving me with a degraded RAID array. I've been spending a lot of time thinking about backup strategies for 203 gigs of data, and this failure really sealed the deal. I bought a USB 2.0 card, a chintzy USB 2.0/FireWire enclosure, and a 250 gig IDE disk.

Plus, I upgraded the array by replacing the two (non-uniform) 100GB disks with two brand-new Hitachi 160GB disks that were $60 each at a Labor Day sale. So my RAID array grew by 60GB to 263GB. Before I could rebuild the RAID devices and recreate the logical volume, I had to back up 190GB of data to the external hard drive.

The USB card and external drive worked easily after a kernel recompile to add support for the card's USB chipset and usb-storage. Linux handles it all very well.

Next, I copied the data from the degraded array to the external drive. I mounted the degraded array read-only to prevent changes while still being able to listen to the music collection from the Samba share. I used rsync in archive mode to do the copying, on the assumption that I'll make another backup in a few months. The first backup is just a full copy, but because it was made with rsync, the next backup will be incremental. This backup took 50 hours.
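Here's a rough sketch of what that copy looks like. The mount points /mnt/array and /mnt/backup and the backup_tree function name are made up for illustration, since I didn't write down the exact invocation.

```shell
# Hypothetical mount points; substitute wherever the array and the
# external drive are actually mounted.
SRC="${SRC:-/mnt/array}"
DST="${DST:-/mnt/backup}"

backup_tree() {
    # -a (archive) recurses and preserves permissions, owners, times and symlinks.
    # The trailing slash on the source copies its contents rather than
    # creating an extra directory level inside the destination.
    rsync -a "$1/" "$2/"
}

# Only run when both directories really exist on this machine.
if [ -d "$SRC" ] && [ -d "$DST" ]; then
    backup_tree "$SRC" "$DST"
fi
```

Running the same command again later is what makes the second backup incremental: by default rsync skips any file whose size and modification time already match on the destination.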

After the backup completed, I was ready to work on the new array. I replaced the physical drives in the computer. I mounted the backup read-only and shared it as the normal music share (to keep it online). Then I created a new RAID array using the new 160GB disks. Finally, I built a logical volume from the remaining RAID array and this new one.
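When juggling a degraded mirror and a freshly created one, it helps to see which arrays are missing a member. The kernel reports this in /proc/mdstat, where a status like [_U] means one half of the mirror is gone. This is only a sketch: the degraded_arrays function is a hypothetical helper, and it assumes the usual mdstat layout where the member status is the last field on the "blocks" line.

```shell
# List md arrays that are running with a missing member.
# Reads /proc/mdstat by default; pass another file to check a saved copy.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                  # remember which array we are in
        /\[[U_]+\]/   { if ($NF ~ /_/) print name }  # "_" marks a missing member
    ' "${1:-/proc/mdstat}"
}

# Only meaningful on a Linux box with the md driver loaded.
if [ -r /proc/mdstat ]; then
    degraded_arrays
fi
```

No output means every array has all of its members. For more detail on a single array, mdadm --detail /dev/md2 shows the state of each disk.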

Here are the commands I ran:


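# Rebuild the RAID volume: a two-disk RAID1 mirror from the new 160GB drives
mdadm --create /dev/md2 -l 1 -n 2 /dev/hdf1 /dev/hdh1

# Create the logical volume
pvcreate /dev/md2
# md1 was part of the old volume group, so -ff forces it to be re-initialized
pvcreate -ff /dev/md1
# Remove the old volume group's stale device directory
rm -R /dev/vg0
# -s 8m sets the physical extent size to 8MB
vgcreate -s 8m vg0 /dev/md1 /dev/md2
vgdisplay
# Allocate 33943 extents to a single logical volume
lvcreate -l 33943 vg0
# lvcreate named the volume lvol1; rename it to match fstab
lvrename vg0 lvol1 lv0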

lvdisplay /dev/vg0/lv0 
--- Logical volume ---
LV Name                /dev/vg0/lv0
VG Name                vg0
LV Write Access        read/write
LV Status              available
LV #                   1
# open                 1
LV Size                265.18 GB
Current LE             33943
Allocated LE           33943
Allocation             next free
Read ahead sectors     1024
Block device           58:0

Notice that I used 8m for the physical extent size. With the default extent size for a volume group, 4MB, a logical volume maxes out at 256GB, and I need just over that. Also notice that I renamed the logical volume from lvol1 to lv0, because that's what it's called in my fstab.
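The arithmetic behind that limit is easy to check. This sketch assumes the LVM1 cap of 65534 extents per logical volume (that number comes from the LVM1 vgcreate man page, not from my own output), and the max_lv_mib helper is just for illustration.

```shell
# Assumption: LVM1 allows at most 65534 extents in one logical volume.
MAX_LE=65534

# Largest possible logical volume, in MiB, for a given extent size in MiB.
max_lv_mib() {
    echo $(( MAX_LE * $1 ))
}

echo "4MB extents: $(max_lv_mib 4) MiB max"   # 262136 MiB, just under 256GB
echo "8MB extents: $(max_lv_mib 8) MiB max"   # 524272 MiB, just under 512GB
echo "my volume:   $(( 33943 * 8 )) MiB"      # 271544 MiB, the 265.18 GB in lvdisplay
```

So 33943 extents at 8MB each fit comfortably, while the same 265GB at 4MB each would need about 67886 extents, which is over the cap.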

Finally, I formatted the new logical volume as ReiserFS. Now I'm copying the data back to the volume from the external drive using rsync again. I expect it to take another 50 hours.
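After a copy this long, it's worth checking that nothing was missed before trusting either side. One way, sketched here with a hypothetical pending_changes helper and made-up mount points, is an rsync dry run that lists whatever it would still transfer.

```shell
# Print what rsync would still transfer from $1 to $2.
# No output means nothing is pending.
pending_changes() {
    # -n (dry run) changes nothing on disk;
    # -i itemizes each file or directory that would be updated.
    rsync -a -n -i "$1/" "$2/"
}

# Example with hypothetical mount points (not the real ones):
# pending_changes /mnt/backup /mnt/array
```

This uses rsync's default quick check, which compares size and modification time only. Adding -c compares checksums instead, at the cost of reading every file on both sides.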

After the copy is finally done, I'll remove the drive from the external enclosure, seal it up, and store it at a friend's house. That should cover me from most catastrophes. Then I'm taking the 100GB disk that didn't fail and putting it in the enclosure to use as a video disk to bounce between my various workstations.

Conclusions:

  • I'm not really happy with my IDE Bandelero. It's tough to change the drives out. I probably won't revisit it until the next failure though. At least it's hardy and done. Good enough!
  • Linux logical volumes on top of software RAID1 work really well.
  • Certain hard drive failures still hard-lock the I/O subsystem, and you have to restart the computer. So it's not a completely graceful failure, but no data is lost and the array comes up degraded and working fine.
  • External IDE hard drives offer the best mix of price, size, and speed for medium-risk backups.
  • Rsync is subtle about how it uses subdirectories. Explore the -R option.
  • Big data takes time. Be patient.
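To make the subdirectory point concrete, here's a small throwaway demo of how the trailing slash and -R change where files land. Everything happens in a temporary directory, and the file and directory names are invented for the example.

```shell
# Build a tiny source tree in a scratch directory.
demo=$(mktemp -d)
mkdir -p "$demo/src/music/rock" "$demo/a" "$demo/b" "$demo/c"
touch "$demo/src/music/rock/track.ogg"

# No trailing slash: the directory itself is copied into the destination.
rsync -a "$demo/src/music" "$demo/a/"       # a/music/rock/track.ogg

# Trailing slash: only the directory's contents are copied.
rsync -a "$demo/src/music/" "$demo/b/"      # b/rock/track.ogg

# -R (relative): the source path as typed is recreated under the destination.
(cd "$demo/src" && rsync -aR music/rock "$demo/c/")   # c/music/rock/track.ogg

# Show where everything ended up.
(cd "$demo" && find a b c -type f | sort)
```

With -R, the directory you run rsync from matters, since the whole path given on the command line is rebuilt on the other side.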