Quick tips - RAID 0 array on NVMe drives
"Quick tips" is a series of small articles that show how to perform a single task for quick reference.
Today, how to create a RAID 0 array using the NVMe drives of an EC2 i3en instance.
RAID 0 primer
A RAID 0 array consists of striping multiple disks to form a single volume. Data written to the resulting volume is split evenly across all underlying disks.
This maximizes both disk throughput and space. The array's throughput is the sum of the throughputs of all underlying disks, and the total available space is the sum of their capacities. The downside is that losing a single underlying disk means losing all the data on the array.
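As a quick worked example of those sums (the per-disk figures below are hypothetical, roughly in line with an i3en SSD), striping two disks doubles both numbers:

```shell
# Hypothetical per-disk figures, for illustration only.
disks=2
per_disk_throughput_gbs=2   # GB/s per disk
per_disk_size_tb=6.8        # TB per disk

# RAID 0 sums both throughput and capacity across all member disks.
echo "throughput: $((disks * per_disk_throughput_gbs)) GB/s"   # → throughput: 4 GB/s
awk -v n="$disks" -v s="$per_disk_size_tb" \
    'BEGIN { printf "size: %s TB\n", n * s }'                  # → size: 13.6 TB
```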
How to create it
On EC2, certain instances come with one or more local NVMe SSDs. The i3en instance family is one of them.
Each instance has one or more local NVMe SSD(s) available under /dev/nvme${x}n1, with x ranging from 1 (for instances up to i3en.3xlarge) to 8 (for i3en.24xlarge and i3en.metal).
First, identify the usable SSDs using lsblk, i.e. look for unmounted and non-partitioned disks. In the example below, these are nvme1n1 and nvme2n1.
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 8G 0 disk
├─nvme0n1p1 259:1 0 8G 0 part /
└─nvme0n1p128 259:2 0 1M 0 part
nvme2n1 259:3 0 6,8T 0 disk
nvme1n1 259:4 0 6,8T 0 disk
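On instances with many devices, this spotting can be scripted: filter the machine-readable form of lsblk (`lsblk -rn -o NAME,TYPE,MOUNTPOINT`) to keep whole disks that have neither a mountpoint nor a partition. A sketch, demonstrated here on a captured copy of the listing above rather than on live output:

```shell
# Sample of `lsblk -rn -o NAME,TYPE,MOUNTPOINT` output, mirroring the
# listing above. In practice, pipe the real lsblk command into the awk
# filter instead of printf.
printf '%s\n' \
  'nvme0n1 disk' \
  'nvme0n1p1 part /' \
  'nvme0n1p128 part' \
  'nvme2n1 disk' \
  'nvme1n1 disk' |
awk '
  $2 == "disk" && $3 == "" { candidate[$1] = 1 }                   # unmounted whole disks
  $2 == "part" { sub(/p[0-9]+$/, "", $1); delete candidate[$1] }   # drop disks that have partitions
  END { for (d in candidate) print d }
' | sort
# → nvme1n1
# → nvme2n1
```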
Then, run the following commands. Adjust the parameters --raid-devices=2 and /dev/nvme{1,2}n1 to match, respectively, the number of disks that compose your RAID array and their device files under /dev.
These commands create a RAID 0 array from those disks, create an ext4 file system on it, mount it under /mnt/md0, and ensure that the array is automatically mounted at boot.
sudo mdadm --create --verbose /dev/md0 --level=0 --raid-devices=2 /dev/nvme{1,2}n1
sudo mkfs.ext4 -F /dev/md0
sudo mkdir /mnt/md0/ -p
sudo mount /dev/md0 /mnt/md0/
sudo chmod 777 /mnt/md0/
echo '/dev/md0 /mnt/md0/ ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab
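One caveat worth hedging: on some distributions the array can reassemble under a different name (typically /dev/md127) after a reboot unless its definition is recorded in mdadm's configuration. A sketch, assuming the Debian/Ubuntu layout (the file lives at /etc/mdadm.conf on some other distributions):

```shell
# Record the array definition so it reassembles under the same name at boot.
# Assumption: Debian/Ubuntu config path; adjust for your distribution.
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

# On Debian/Ubuntu, rebuild the initramfs so early boot picks up the config.
sudo update-initramfs -u
```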
Finally, verify with lsblk that the disks are now part of the md0 array.
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 8G 0 disk
├─nvme0n1p1 259:1 0 8G 0 part /
└─nvme0n1p128 259:2 0 1M 0 part
nvme2n1 259:3 0 6,8T 0 disk
└─md0 9:0 0 13,7T 0 raid0 /mnt/md0
nvme1n1 259:4 0 6,8T 0 disk
└─md0 9:0 0 13,7T 0 raid0 /mnt/md0
Conclusion
This is a handy trick to quickly get more disk space and increased throughput. I use it often when I need to generate large amounts of data. Here is an example with a 28 TB RAID array composed of 4 NVMe SSDs.
$ df -h /mnt/md0
Filesystem Size Used Avail Use% Mounted on
/dev/md0 28T 24.6T 3.4T 88% /mnt/md0
If you have any questions or comments, feel free to send me a tweet at @pingtimeout. And if you enjoyed this article and want to support my work, you can always buy me a coffee ☕️.