introduction to raid
(Written by Paul Cobbaut, https://github.com/paulcobbaut/)
Redundant Array of Independent (originally Inexpensive) Disks or RAID can be set up using hardware or software. Hardware RAID is more expensive, but offers better performance. Software RAID is cheaper and easier to manage, but it uses your CPU and your memory.
Where ten years ago nobody argued about the best choice being hardware RAID, this has changed since technologies like mdadm, lvm and even zfs focus more on manageability. The workload on the CPU for software RAID used to be high, but CPUs have become a lot faster.
raid levels
raid 0
raid 0 uses two or more disks and is often called striping (or stripe set, or striped volume). Data is divided into chunks; those chunks are evenly spread across every disk in the array. The main advantage of raid 0 is that you can create larger drives. raid 0 is the only raid level without redundancy.
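A minimal sketch of how this is created with Linux software raid (assuming two spare partitions /dev/sdb1 and /dev/sdc1 of type fd, prepared as shown later in this chapter):
[root@linux ~]# mdadm --create /dev/md0 --level=0 --raid-devices=2 \
/dev/sdb1 /dev/sdc1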
jbod
jbod uses two or more disks and is often called concatenating (spanning, spanned set, or spanned volume). Data is written to the first disk until it is full, then data is written to the second disk, and so on. The main advantage of jbod (Just a Bunch of Disks) is that you can create larger drives. jbod offers no redundancy.
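mdadm can also concatenate disks; it calls this the linear level. A minimal sketch, again assuming two spare partitions:
[root@linux ~]# mdadm --create /dev/md0 --level=linear --raid-devices=2 \
/dev/sdb1 /dev/sdc1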
raid 1
raid 1 uses exactly two disks and is often called mirroring (or mirror set, or mirrored volume). All data written to the array is written on each disk. The main advantage of raid 1 is redundancy. The main disadvantage is that you lose at least half of your available disk space (in other words, you at least double the cost).
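A minimal sketch of creating a mirror with mdadm (assuming two spare partitions; the practice at the end of this chapter repeats this):
[root@linux ~]# mdadm --create /dev/md0 --level=1 --raid-devices=2 \
/dev/sdb1 /dev/sdc1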
raid 2, 3 and 4 ?
raid 2 uses bit level striping, raid 3 byte level, and raid 4 is the same as raid 5, but with a dedicated parity disk. raid 4 is actually slower than raid 5, because every write has to write parity to this one (bottleneck) disk. It is unlikely that you will ever see these raid levels in production.
raid 5
raid 5 uses three or more disks, each divided into chunks. Every time chunks are written to the array, one of the disks will receive a parity chunk. Unlike raid 4, the parity chunk alternates between all disks. The main advantage of this is that raid 5 allows for full data recovery in case of one hard disk failure.
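The cost of this protection is one disk's worth of capacity: n disks of size s yield (n-1) times s of usable space. For example, the three 8GB disks used later in this chapter yield 2 x 8GB = 16GB of usable space.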
raid 6
raid 6 is very similar to raid 5, but uses two parity chunks. raid 6 protects against two hard disk failures. Oracle Solaris zfs calls this raidz2 (and also has raidz3, with triple parity).
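The usable capacity shrinks accordingly: n disks yield n-2 disks of usable space. A minimal mdadm sketch, assuming four prepared partitions:
[root@linux ~]# mdadm --create /dev/md0 --level=6 --raid-devices=4 \
/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1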
raid 0+1
raid 0+1 is a mirror(1) of stripes(0). This means you first create two raid 0 stripe sets, and then you set them up as a mirror set. For example, when you have six 100GB disks, the two stripe sets are each 300GB. Combined in a mirror, this makes 300GB total. raid 0+1 will survive one disk failure. It will only survive a second disk failure if that disk is in the same stripe set as the previously failed disk.
raid 1+0
raid 1+0 is a stripe(0) of mirrors(1). For example, when you have six 100GB disks, you first create three mirrors of 100GB each. You then stripe them together into a 300GB drive. In this example, as long as both disks of the same mirror do not fail, the array can survive up to three hard disk failures.
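Linux mdadm implements this as its own raid10 level (a layout close to, but not identical to, a classic stripe of mirrors), so you do not have to build the mirrors and the stripe in two steps. A sketch, assuming six prepared partitions:
[root@linux ~]# mdadm --create /dev/md0 --level=10 --raid-devices=6 \
/dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1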
raid 50
raid 5+0 is a stripe(0) of raid 5 arrays. Suppose you have nine disks of 100GB, then you can create three raid 5 arrays of 200GB each. You can then combine them into one large 600GB stripe set.
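There is no single mdadm level for raid 5+0, but md devices can be nested. A sketch with the nine disks from this example (assuming partitions /dev/sdb1 through /dev/sdj1):
[root@linux ~]# mdadm --create /dev/md1 --level=5 --raid-devices=3 \
/dev/sdb1 /dev/sdc1 /dev/sdd1
[root@linux ~]# mdadm --create /dev/md2 --level=5 --raid-devices=3 \
/dev/sde1 /dev/sdf1 /dev/sdg1
[root@linux ~]# mdadm --create /dev/md3 --level=5 --raid-devices=3 \
/dev/sdh1 /dev/sdi1 /dev/sdj1
[root@linux ~]# mdadm --create /dev/md4 --level=0 --raid-devices=3 \
/dev/md1 /dev/md2 /dev/md3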
many others
There are many other nested raid combinations, like raid 30, 51, 60, 100, 150, ...
building a software raid5 array
do we have three disks?
First, you have to attach some disks to your computer. In this scenario, three brand new disks of eight gigabyte each are added. Check with fdisk -l that they are connected.
[root@linux ~]# fdisk -l 2> /dev/null | grep MB
Disk /dev/sdb: 8589 MB, 8589934592 bytes
Disk /dev/sdc: 8589 MB, 8589934592 bytes
Disk /dev/sdd: 8589 MB, 8589934592 bytes
fd partition type
The next step is to create a partition of type fd on every disk. The fd type marks the partition as Linux raid autodetect. See this (truncated) screenshot:
[root@linux ~]# fdisk /dev/sdd
...
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1044, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1044, default 1044):
Using default value 1044
Command (m for help): t
Selected partition 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
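The fdisk dialog above is interactive. If you prefer to script this step, parted can do the same in one command (a sketch; syntax can vary between parted versions, and this erases /dev/sdd):
[root@linux ~]# parted -s /dev/sdd mklabel msdos mkpart primary 1MiB 100% set 1 raid on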
verify all three partitions
Now all three disks are ready for raid 5, so we have to tell the system what to do with these disks.
[root@linux ~]# fdisk -l 2> /dev/null | grep raid
/dev/sdb1 1 1044 8385898+ fd Linux raid autodetect
/dev/sdc1 1 1044 8385898+ fd Linux raid autodetect
/dev/sdd1 1 1044 8385898+ fd Linux raid autodetect
create the raid5
The next step used to be to create the raid table in /etc/raidtab. Nowadays, you can simply issue the mdadm command with the correct parameters.
The command below is split over two lines to fit this print, but you should type it on one line, without the backslash (\).
[root@linux ~]# mdadm --create /dev/md0 --chunk=64 --level=5 \
--raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
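To have the array assembled automatically at boot on most distributions, you can record it in the mdadm configuration file (the path is /etc/mdadm.conf or /etc/mdadm/mdadm.conf, depending on the distribution):
[root@linux ~]# mdadm --detail --scan >> /etc/mdadm.conf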
Below is a partial screenshot of how fdisk -l sees the raid 5.
[root@linux ~]# fdisk -l /dev/md0
Disk /dev/md0: 17.2 GB, 17172135936 bytes
2 heads, 4 sectors/track, 4192416 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 131072 bytes
Disk identifier: 0x00000000
Disk /dev/md0 doesn't contain a valid partition table
We could use this software raid 5 array in the next topic: lvm.
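If you do not continue with lvm, you can also put a filesystem directly on the array and mount it like any other block device (a sketch; /srv/data is an arbitrary mount point):
[root@linux ~]# mkfs.ext4 /dev/md0
[root@linux ~]# mkdir /srv/data
[root@linux ~]# mount /dev/md0 /srv/data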
/proc/mdstat
The status of the raid devices can be seen in /proc/mdstat. This example shows a raid 5 in the process of rebuilding.
[root@linux ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[1] sdb1[0]
16769664 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [UU_]
[============>........] recovery = 62.8% (5266176/8384832) finish=0.3min speed=139200K/sec
This example shows an active software raid 5.
[root@linux ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[1] sdb1[0]
16769664 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
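To follow a rebuild like the one above while it progresses, you can wrap the command in watch:
[root@linux ~]# watch -n 1 cat /proc/mdstat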
mdadm --detail
Use mdadm --detail to get information on a raid device.
[root@linux ~]# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sun Jul 17 13:48:41 2011
Raid Level : raid5
Array Size : 16769664 (15.99 GiB 17.17 GB)
Used Dev Size : 8384832 (8.00 GiB 8.59 GB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Update Time : Sun Jul 17 13:49:43 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Name : rhel6c:0 (local to host rhel6c)
UUID : c10fd9c3:08f9a25f:be913027:999c8e1f
Events : 18
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
3 8 49 2 active sync /dev/sdd1
removing a software raid
The software raid is visible in /proc/mdstat when active. To remove the raid completely, so you can use the disks for other purposes, stop (de-activate) it with mdadm.
[root@linux ~]# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
The disks can now be repartitioned.
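mdadm writes metadata (a superblock) on each member partition. To make sure no raid metadata lingers on the disks before you reuse them, you can wipe it as well (this destroys the array for good):
[root@linux ~]# mdadm --zero-superblock /dev/sdb1 /dev/sdc1 /dev/sdd1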
further reading
Take a look at the man page of mdadm for more information. Below is an example command that adds a new partition while removing a faulty one.
mdadm /dev/md0 --add /dev/sdd1 --fail /dev/sdb1 --remove /dev/sdb1
practice: raid
- Add three virtual disks of 1GB each to a virtual machine.
- Create a software raid 5 on the three disks. (It is not necessary to put a filesystem on it.)
- Verify with fdisk and in /proc that the raid 5 exists.
- Stop and remove the raid 5.
- Create a raid 1 to mirror two disks.
solution: raid
- Add three virtual disks of 1GB each to a virtual machine.
- Create a software raid 5 on the three disks. (It is not necessary to put a filesystem on it.)
- Verify with fdisk and in /proc that the raid 5 exists.
- Stop and remove the raid 5.
- Create a raid 1 to mirror two disks.
[root@linux ~]# mdadm --create /dev/md0 --level=1 --raid-devices=2 \
/dev/sdb1 /dev/sdc1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
[root@linux ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
8384862 blocks super 1.2 [2/2] [UU]
[====>................] resync = 20.8% (1745152/8384862) finish=0.5min speed=218144K/sec
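The earlier steps reuse commands shown in this chapter; a sketch, assuming the three disks carry one fd partition each as /dev/sdb1, /dev/sdc1 and /dev/sdd1:
[root@linux ~]# mdadm --create /dev/md0 --chunk=64 --level=5 \
--raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
[root@linux ~]# fdisk -l /dev/md0
[root@linux ~]# cat /proc/mdstat
[root@linux ~]# mdadm --stop /dev/md0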