24/09/2004
RAID5 + LVM2 + recovery + resize HOWTO
I wanted to build a big fileserver that could recover from a disk crash. LVM2 with reiserfs partitions couldn’t do the trick for me. I had three 200GB disks “united” under one logical volume, formatted with reiserfs, and I wanted to test what would happen if one disk “crashed”. So I faked a crash: I shut the machine down, unplugged one disk and rebooted. I could still see the logical volume using the latest lvm2 sources and the latest version of the device mapper:
# lvm version
version LVM version: 2.00.24 (2004-09-16)
version Library version: 1.00.19-ioctl (2004-07-03)
version Driver version: 4.1.0
Unfortunately I had no luck reading the reiserfs partition. The superblock was corrupted and reiserfsck --rebuild-sb /device did not work… Salvation was impossible.
While googling for possible solutions I came upon the wonderful idea of creating a software raid5 array out of the 3 disks and putting LVM2 on top of the raid. I would lose 1 disk’s worth of space…but I would gain the ability to recover after a disk failure and to add more disks if necessary.
Before we continue I must say that you need to HAVE worked with raid and lvm before, so that some of the commands are familiar to you. This is NOT a step by step guide…it’s more like a draft of how things are done. I am not going to explain every little detail…man pages and google are always around if you have any questions.
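To put a number on the “lose 1 disk” part, here is a minimal sketch of the raid5 capacity arithmetic. The raid5_usable helper is a name made up for this example, and it assumes all members are the same size:

```shell
#!/bin/sh
# Usable RAID5 capacity: one member's worth of space goes to parity,
# so n equal members of size s give (n - 1) * s.
# raid5_usable is a hypothetical helper, not part of any raid tool.
raid5_usable() {
    members=$1   # number of disks in the array
    size=$2      # size of each member, in any unit
    echo $(( (members - 1) * size ))
}

raid5_usable 3 200   # three 200GB disks -> prints 400
```

So the 3 disks give roughly 400GB of usable space instead of 600GB.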
Enough of this…let’s start.
First of all let’s say that we have our 3 disks on /dev/hde, /dev/hdg and /dev/hdi.
1) We create 1 partition on each disk, covering the whole disk, using our favorite disk management software (fdisk, cfdisk, etc). (btw, the drives should be IDENTICAL, or at least the partitions the same size, since raid5 only uses as much of each member as the smallest one offers.)
2) Then it’s time to create the /etc/raidtab file. Its contents should look like:
raiddev /dev/md0
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
chunk-size 32
parity-algorithm right-symmetric
device /dev/hde1
raid-disk 0
device /dev/hdg1
raid-disk 1
device /dev/hdi1
raid-disk 2
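If you have more than 3 partitions, typing the device/raid-disk pairs by hand gets boring. A small sketch that prints the same raidtab for any list of partitions could look like this (make_raidtab is a made-up helper, and the fixed settings are simply copied from the file above):

```shell
#!/bin/sh
# Print a raid5 raidtab for /dev/md0 from a list of member partitions.
# make_raidtab is a hypothetical helper; the fixed settings mirror the
# raidtab shown in step 2.
make_raidtab() {
    echo "raiddev /dev/md0"
    echo "raid-level 5"
    echo "nr-raid-disks $#"
    echo "nr-spare-disks 0"
    echo "persistent-superblock 1"
    echo "chunk-size 32"
    echo "parity-algorithm right-symmetric"
    index=0
    for dev in "$@"; do
        echo "device $dev"
        echo "raid-disk $index"
        index=$((index + 1))
    done
}

make_raidtab /dev/hde1 /dev/hdg1 /dev/hdi1
```

Redirect the output to a scratch file first and compare it with your existing /etc/raidtab before overwriting anything.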
3) Now let’s create our array:
mkraid /dev/md0
4) It’s time for LVM2 now…let’s edit the /etc/lvm/lvm.conf so that we add support for raid devices. My filter line looks like this:
filter = [ "a|loop|", "a|/dev/md0|", "r|.*|" ]
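The filter is read left to right: each entry is a regular expression prefixed with a (accept) or r (reject), and the first one that matches a device decides. The sketch below only imitates that logic for the three rules above so you can see which devices get through; lvm_filter_decision is a made-up name and this is not how LVM itself is implemented:

```shell
#!/bin/sh
# Emulate the filter decision for one device path.
# Rules are checked in order; the first matching regex wins.
# lvm_filter_decision is a hypothetical helper for illustration only.
lvm_filter_decision() {
    dev=$1
    # Each rule is "<action>:<regex>", in the same order as the filter line.
    for rule in 'a:loop' 'a:/dev/md0' 'r:.*'; do
        action=${rule%%:*}
        regex=${rule#*:}
        if printf '%s\n' "$dev" | grep -Eq "$regex"; then
            if [ "$action" = a ]; then echo accept; else echo reject; fi
            return 0
        fi
    done
    echo accept   # no rule matched: LVM accepts the device by default
}

for d in /dev/md0 /dev/loop0 /dev/hde1; do
    echo "$d: $(lvm_filter_decision "$d")"
done
```

The raw disks (/dev/hde1 and friends) are rejected on purpose, so LVM only ever sees the raid device and not its members.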
5) Start initializing the LVM:
pvcreate /dev/md0 (you can issue a pvdisplay to see if all things are correct)
vgcreate test /dev/md0 (you can issue a vgdisplay to see if all things are correct)
6) Time to create a small logical volume just for testing:
lvcreate -L15000 -nbig test
(you can issue a lvdisplay to see if all things are correct)
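A note on that -L15000: without a unit suffix lvcreate takes the size in megabytes, and LVM allocates space in whole physical extents. Assuming the default extent size of 4MB (vgcreate was run without -s above), a quick sketch of the arithmetic, with lv_extents being a made-up helper:

```shell
#!/bin/sh
# Number of physical extents needed for a logical volume of size_mb,
# rounded up to whole extents. lv_extents is a hypothetical helper;
# the 4MB default is an assumption about how the volume group was made.
lv_extents() {
    size_mb=$1
    pe_mb=${2:-4}
    echo $(( (size_mb + pe_mb - 1) / pe_mb ))
}

lv_extents 15000   # prints 3750
```

lvdisplay should report the same number of extents for the "big" volume if the assumption about the extent size holds.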
7) Now there’s something that’s distro-specific. “Usually” the init scripts start lvm before software raid. In our case, on every boot, we want a) the raid started and then b) the lvm started. I am using gentoo, and gentoo had these the other way round: it first started the lvm and then the raid, which resulted in errors during the boot process. This is easily solved in gentoo by editing /etc/init.d/checkfs and moving the part about the LVM below the part about the software raid. The script is really easy to read so I don’t think anyone will have a problem with that…
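As a rough sketch of the ordering idea (this is NOT the actual gentoo checkfs code), a boot script could check that the array is up before touching LVM. The md_is_active helper and the sample text are made up for illustration; on a real system you would feed it /proc/mdstat:

```shell
#!/bin/sh
# Succeed only if the given md device is listed as active in mdstat-style
# text read from stdin (for example: md_is_active md0 < /proc/mdstat).
# md_is_active is a hypothetical helper.
md_is_active() {
    grep -q "^$1 : active"
}

# Sample text in the /proc/mdstat format, so the check can be tried anywhere.
sample='Personalities : [raid5]
md0 : active raid5 hdi1[2] hdg1[1] hde1[0]'

if printf '%s\n' "$sample" | md_is_active md0; then
    echo "md0 is up, safe to activate LVM"
else
    echo "md0 is not running, skip LVM activation"
fi
```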
8) Let’s test what we’ve done so far…Let’s format that logical volume we’ve created with ext3.
mke2fs -j /dev/test/big
9) Make an entry inside your /etc/fstab to point to a place you want to mount that logical volume…and then issue a:
mount /dev/test/big
10) You are now ready to start copying data onto that volume…I’d suggest that you copy 5-10Gb out of the first 15Gb that we’ve created (remember that -L15000 ?).
11) Now for the fake crash. We first stop the raid device (after unmounting the volume and deactivating the logical volume with lvchange -a n /dev/test/big):
raidstop /dev/md0
12) Let’s destroy one disk. Open up your favorite disk management tool again and pick one disk to destroy…let’s say /dev/hdi. Delete the partition it already has…and create a new one. All previous data is now lost!
13) If you want to make sure that you are on the right path of destroying everything…reboot your machine. Upon reboot you should get errors from the software raid and from the LVM not being able to activate the volume group “test”.
14) At the root prompt issue:
raidstart /dev/md0
and then do a: cat /proc/mdstat
You should probably see something similar to this:
cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath]
md0 : active raid5 hdi1[2] hdg1[1] hde1[0]
390716672 blocks level 5, 32k chunk, algorithm 3 [3/3] [UUU]
[========>............] resync = 43.9% (85854144/195358336) finish=115.9min speed=15722K/sec
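The numbers in brackets on the progress line are blocks done and blocks total, so you can work out the percentage yourself. A small sketch that pulls them out of such a line (resync_percent is a made-up helper, and it truncates rather than rounds):

```shell
#!/bin/sh
# Pull "done/total" out of an mdstat progress line and print the percentage
# with one decimal place, using integer arithmetic only.
# resync_percent is a hypothetical helper.
resync_percent() {
    counts=$(printf '%s\n' "$1" | sed -n 's/.*(\([0-9]*\)\/\([0-9]*\)).*/\1 \2/p')
    [ -n "$counts" ] || return 1
    set -- $counts
    tenths=$(( $1 * 1000 / $2 ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

line='resync = 43.9% (85854144/195358336) finish=115.9min speed=15722K/sec'
resync_percent "$line"   # prints 43.9
```

Note also that 390716672 blocks is exactly twice 195358336: two disks’ worth of data, with the third disk’s worth going to parity.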
15) When that finishes, raid5 has rebuilt the array onto the “new” drive that replaced the “faulty” one (both the destruction and the replacement were done in step 12).
16) Issue: vgscan
It rescans the devices so that LVM finds the volume group again. If the logical volume is still inactive afterwards, vgchange -a y test should activate it.
17) Say that you need more space on that logical volume…15Gb is not that much after all…
lvextend -L100G /dev/test/big
We’ve now turned that 15Gb logical volume into a 100Gb one…already feels much better…doesn’t it ?
18) But that’s not all, we now need to extend the ext3 filesystem to cover all that “new space”
e2fsck -f /dev/test/big ; resize2fs /dev/test/big
We first check that the filesystem is ok…and then resize it to the full extent of the logical volume.
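The order matters here: grow the logical volume first, then check the filesystem, then grow the filesystem. A dry-run sketch of the whole sequence, assuming the volume is currently unmounted and has an /etc/fstab entry; run and grow_ext3_lv are made-up helpers, and by default the script only prints the commands:

```shell
#!/bin/sh
# Print (or, with DRY_RUN=0, execute) the offline grow sequence for an
# ext3 filesystem on a logical volume. Assumes the volume is unmounted.
# run and grow_ext3_lv are hypothetical helpers.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

grow_ext3_lv() {
    lv=$1     # logical volume device, e.g. /dev/test/big
    size=$2   # new total size, e.g. 100G
    run lvextend -L"$size" "$lv" &&
    run e2fsck -f "$lv" &&
    run resize2fs "$lv" &&
    run mount "$lv"
}

grow_ext3_lv /dev/test/big 100G
```

Each step only runs if the previous one succeeded, so a failed fsck stops the resize.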
19) We are set! We just need to mount our new partition…and we now have 100Gb of space! You can now extend that even further or create more logical volumes to satisfy your needs.
This section is to come in a few days…stay tuned.
I hope that all the above helped you to create a better and more secure fileserver. Comments are much appreciated.
Filed by kargig at 02:37 under General
2 Comments
Have you ever tried to “raidstop” while a recovery was occurring after a “mdadm /dev/mdX --add /dev/scsi/cxdypz”, and had this cause the system to hard halt and not even boot back up by itself?
It’s a long story, but I did the raidstop indirectly in a Serviceguard/LX environment (shared md). Then I redid the “raidstop” directly to prove it.
I looked all over the web for someone who has come across this, that explains it but I have found no one.
This “recovery” takes 8hrs (250Gb) and I was simply moving a “service package” which ends in a raidstop.
Thank You
I have never tried such a “trick”…but it sounds bad, really bad 😐
I have had enough probs with software raid and various filesystems(ext2,ext3,reiser). Currently my system is stabilized with raid5+lvm…and I won’t change this for a long time I think