
Linux and RAID

Using a UPS and RAID at the same time

Worth a quick mention since it took me a number of embarrassing mistakes before I realised what was going on.

I have an intelligent APC UPS and, with the Nut software installed fairly quickly and painlessly just by following the installation instructions, I have the RS232 link connected up. By either:

we can quickly get the system to shut down after a short grace period, leaving the UPS to pull the plug on the actual hardware when things get really desperate. There's a problem though - every time I do this, the RAID array *always* does a re-sync when we start up again.
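As an aside, if you want to exercise this shutdown path without waiting for a real power cut, NUT can be told to start a forced shutdown by hand; as far as I recall the invocation is the one below, but check your own installation's upsmon documentation:

	# Ask upsmon to begin a forced ("FSD") shutdown sequence,
	# as if the UPS battery had reached the critical level.
	upsmon -c fsd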

The answer: while the system is waiting to go down, inside /etc/rc.d/init.d/halt, it has not yet reached the point where it calls /sbin/halt. This is a shame, because it is /sbin/halt that makes the RAID array go read-only and syncs it to disk before the shutdown or reboot. Knowing this, it is a simple matter to insert a call to:

	mdadm --readonly /dev/md0

at some appropriate point in the /etc/rc.d/init.d/halt script, taking notice of whether /etc/killpower exists or not: if it is absent we are doing a normal shutdown, and if it is present the UPS is about to kill the power automatically.
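As a rough sketch, a fragment along these lines can go near the end of the halt script (the exact placement, and using /etc/killpower as the flag for a UPS-driven shutdown, are assumptions to adapt to your own distribution):

	# /etc/killpower is created when the UPS is about to cut power itself;
	# in that case mark the array read-only now, since we never reach the
	# point where /sbin/halt would do it for us.
	if [ -f /etc/killpower ]; then
		mdadm --readonly /dev/md0
	fi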

RAID5 Recovery from Failure of Two out of Three Disks

I have what I imagine is a fairly standard set-up for RAID5 on IDE:

	/dev/hda	boot disk & root
	/dev/hd[bcd]	RAID5

Despite being on a UPS, for some reason the hardware decided to throw some strange IDE error on the secondary channel during a RAID read. I think I was lucky that it was a read at that point, and that I'm using the ext3 journalling filesystem. I read the panics on the screen and the system log (/var/log/messages).

The hardware was only 6 months old and had been working fine, and I suspected it just wanted resetting. Sadly, of course, the kernel had by this time noticed that two of the RAID disks (first /dev/hdc and then /dev/hdd) had left the array, and had recorded the fact on the remaining disk, /dev/hdb.

OHMYGOD - where is my data? I surfed the web for ages for the solution (what follows) and thought it fair to reproduce it here in case it helps someone else.

First off I powered down and, some time later, back on again. All disks were present at POST, so I assumed a transient IDE error. The standard Linux boot noticed that the system was in trouble and dropped me into a root shell for system maintenance.

Running lsraid -p -R confirmed the actual state of the array: /dev/hdb OK and the other two absent. The next step is basically to lie about the state of the array, using /etc/raidtab to rewrite the superblocks. There are plenty of warnings around about the fact that you *MUST BE SURE THAT YOUR /etc/raidtab REALLY DOES MATCH THE CONFIGURATION OF THE ARRAY*. If it doesn't, you can always construct it from the output of lsraid, as sketched below. Either way you now have a valid, if slightly untruthful, /etc/raidtab.
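For example, something along these lines captures lsraid's raidtab-style output into a scratch file so you can check it by eye before copying it into place (the scratch file name is just my choice):

	# Dump the array description in raidtab format and inspect it
	# before letting it anywhere near /etc/raidtab.
	lsraid -p -R > /tmp/raidtab.new
	less /tmp/raidtab.new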

You then edit it to reflect the first disk to go missing (check the system log if you don't know which it was), marking that one as failed and the others as present. The first disk to drop out holds the stalest data; the second only left once the array was already degraded, so its contents, together with the surviving disk, are still current. In my case this resulted in (your settings may differ):

	raiddev             /dev/md0
	raid-level                  5
	nr-raid-disks               3
	chunk-size                  64k
	persistent-superblock       1
	nr-spare-disks              0
	    device          /dev/hdb1
	    raid-disk     0
	    device          /dev/hdc1
	    failed-disk   1
	    device          /dev/hdd1
	    raid-disk     2

You can then force this configuration to be written to the array with:

	mkraid --force --dangerous-no-resync /dev/md0

This will print an entire screenful of warnings, which you read and then wish you hadn't eaten anything recently. I would recommend that you heed the fact that it is "kill or cure" by this point: you will either get your data back or a nice big empty filesystem. It also points out how to actually run the command through to completion:

	mkraid --really-force --dangerous-no-resync /dev/md0

Next we mark the array read-only:

	mdadm --readonly /dev/md0

then mount it and see if the data is readable. It's all there - what a relief. All that remains is to reboot, which will mount it read-write, and then add the third disk back in again with:

	raidhotadd /dev/md0 /dev/hdc1
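The rebuild progress can be watched in /proc/mdstat while it churns away; the watch wrapper below is just a convenience:

	# Show the rebuild progress, refreshing every couple of seconds
	watch -n 2 cat /proc/mdstat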

Wait about three hours for the re-sync, and finally the array is rebuilt and back in its former secure state.

Phew!

Finally it is time to go back and update /etc/raidtab so that all the disks are listed as raid-disk again.
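In my case that just means changing the /dev/hdc1 entry back from failed-disk to raid-disk:

	    device          /dev/hdc1
	    raid-disk     1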

