A ZFS or LVM or MD redundant heterogeneous storage proposal
I have the same problem most people have: how to create a reliable personal storage solution, given that:
- Hard drives fail with alarming regularity. Losing files is unacceptable.
- I will buy a new HDD from time to time. Inevitably, the best price/GB is a different size than the last HDD purchase.
- The previous point means that over time I have a heterogeneous collection of disks. I want to use them all, and failed disks will generally be replaced by larger disks.
- Data integrity and reliability are more important to me than speed.
So after banging my head against this problem for a few days (and in the back of my head for years), I propose the following solution. I will describe a solution that I have tested based on native Linux ZFS, which is available in an Ubuntu PPA, but LVM, MD, and btrfs can be used to achieve the same thing. For this I will use RAID1 (ZFS mirror vdevs).
- Given your set of drives, group them into two sets of disks, such that the capacity of each set is as near to the other as possible.
- Partition the larger disks such that there is a partition exactly the same size as one of the smaller disks in the other group.
- Create mirror vdevs such that each disk has its mirror on another disk.
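The grouping step above can be sketched in a few lines of Python. This is only a rough illustration, not something I have run against real disks: the function name `split_into_two_sets` and the drive names and sizes are made up for the example, and it simply tries every possible split, which is fine for the handful of drives in a home setup.

```python
from itertools import combinations


def split_into_two_sets(sizes_gb):
    """Return (set_a, set_b, usable_gb) for the most balanced two-way split.

    sizes_gb maps a (hypothetical) disk name to its capacity in GB.
    Brute force over all subsets, so only suitable for a few drives.
    """
    names = list(sizes_gb)
    if len(names) < 2:
        raise ValueError("need at least two drives to mirror anything")
    total = sum(sizes_gb.values())
    best = None  # (imbalance in GB, names in set A)
    for r in range(1, len(names)):
        for combo in combinations(names, r):
            a = sum(sizes_gb[n] for n in combo)
            imbalance = abs(total - 2 * a)
            if best is None or imbalance < best[0]:
                best = (imbalance, combo)
    set_a = list(best[1])
    set_b = [n for n in names if n not in best[1]]
    # Everything is mirrored, so the smaller set limits the usable space.
    usable = min(sum(sizes_gb[n] for n in set_a),
                 sum(sizes_gb[n] for n in set_b))
    return set_a, set_b, usable


# Assumed example: one 2TB, one 750GB, two 400GB and one 500GB drive.
drives = {"sda": 2000, "sdb": 750, "sdc": 400, "sdd": 400, "sde": 500}
set_a, set_b, usable = split_into_two_sets(drives)
print(set_a, set_b, usable)
```

The smaller of the two sets determines the usable capacity, which is why the split should be as even as possible.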
For example, consider a disk set of a new 2TB drive, an older 750GB drive, two older 400GB drives, and one older 500GB drive. The optimal mirrored partitioning has 2TB of usable space and is described in the following diagram, where ‘:’ separates partitions and ‘|’ separates disks:
+------------------------------------------------------------------+
| 2TB (sda1) : (sda2) : (sda3) : (sda4) |
+------------------------------------------------------------------+--+
| 750 GB (sdb) | 400 GB (sdc) | 400 GB (sdd) | 500 GB (sde1) :XX|
+---------------------------------------------------------------------+
Create your zpool as
zpool create archive mirror /dev/sda1 /dev/sdb mirror /dev/sda2 /dev/sdc mirror /dev/sda3 /dev/sdd mirror /dev/sda4 /dev/sde1
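Working out where the partition boundaries fall is mechanical, so here is a hedged Python sketch of it. It walks the two sets side by side, cuts a segment at every disk boundary, and assembles the `zpool create` line as a string (it does not run anything). The helper names `mirror_segments` and `zpool_create_command` are my own, and the rule "a disk used whole keeps its bare device name, otherwise its partitions are numbered from 1" is an assumption chosen to match the layout above.

```python
def mirror_segments(set_a, set_b):
    """Cut both sets into matching segments, one per mirror vdev.

    set_a and set_b are non-empty lists of (disk_name, size_gb).
    Returns a list of (disk_in_a, disk_in_b, size_gb).
    """
    segments = []
    i = j = 0
    left_a, left_b = set_a[0][1], set_b[0][1]
    while True:
        size = min(left_a, left_b)
        segments.append((set_a[i][0], set_b[j][0], size))
        left_a -= size
        left_b -= size
        if left_a == 0:  # current disk in set A is used up
            i += 1
            if i == len(set_a):
                return segments
            left_a = set_a[i][1]
        if left_b == 0:  # current disk in set B is used up
            j += 1
            if j == len(set_b):
                return segments
            left_b = set_b[j][1]


def zpool_create_command(pool, set_a, set_b):
    """Build (but do not run) a zpool create line for the mirrored layout."""
    sizes = dict(set_a + set_b)
    counters = {}

    def device(disk, size):
        if size == sizes[disk]:  # segment covers the whole disk
            return f"/dev/{disk}"
        counters[disk] = counters.get(disk, 0) + 1
        return f"/dev/{disk}{counters[disk]}"

    parts = [f"zpool create {pool}"]
    for a, b, size in mirror_segments(set_a, set_b):
        parts.append(f"mirror {device(a, size)} {device(b, size)}")
    return " ".join(parts)


set_a = [("sda", 2000)]
set_b = [("sdb", 750), ("sdc", 400), ("sdd", 400), ("sde", 500)]
command = zpool_create_command("archive", set_a, set_b)
print(command)
```

For the five-drive example this produces the same command as above; the last 50GB of sde has no partner and stays unused.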
This creates 4 mirrored vdevs. If any one of the disks fails, it can be replaced (with a disk of any size) and partitioned to recreate the missing partitions. Keep in mind that ZFS vdevs can be added to a pool but not removed. So, if at all possible, when you purchase a new drive, you want to rearrange the existing vdevs. Let’s say the next purchase is a 3TB drive. Your optimal configuration is 3.5TB usable, as described in the following diagram. This is now 5 vdev pairs. It can be achieved by appropriate partitioning and by successively failing and repartitioning the drives.
+--------------------------------------------------------------+-------------+
| 3 TB (sdf1) : (sdf2) : (sdf3) : (sdf4) | 500GB (sde) |
+--------------------------------------------------------------+-------------+-+
| 2TB (sda1) | 400GB (sdb) | 400GB (sdc) | 750GB (sdd1) : (sdd2) :X|
+------------------------------------------------------------------------------+
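For the simpler case mentioned earlier, where one disk dies and a new disk takes its place, the replacement commands could be generated along these lines. This is a sketch under assumptions: `replace_commands` is a made-up helper, `/dev/sdg` is a hypothetical new disk, and the new disk is assumed to have already been partitioned so that partition N is at least as large as the Nth piece being replaced. It only prints `zpool replace` lines; it does not partition or run anything.

```python
import re


def replace_commands(pool, vdevs, failed_disk, new_disk):
    """Return one `zpool replace` line per vdev member on failed_disk.

    vdevs is a list of (device, device) mirror pairs. Partitions on the
    new disk are assumed to be numbered 1, 2, ... in the same order.
    """
    # Match the bare disk or any of its numbered partitions.
    pattern = re.compile(re.escape(failed_disk) + r"\d*")
    commands = []
    next_part = 1
    for pair in vdevs:
        for dev in pair:
            if pattern.fullmatch(dev):
                commands.append(
                    f"zpool replace {pool} {dev} {new_disk}{next_part}")
                next_part += 1
    return commands


# The four-vdev pool from the first example.
vdevs = [("/dev/sda1", "/dev/sdb"), ("/dev/sda2", "/dev/sdc"),
         ("/dev/sda3", "/dev/sdd"), ("/dev/sda4", "/dev/sde1")]
commands = replace_commands("archive", vdevs, "/dev/sda", "/dev/sdg")
for cmd in commands:
    print(cmd)
```

Losing the big drive means four replacements, one per partition; losing one of the small drives means just one.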
Maintaining this pairing of mirrored drives could also be done with LVM or with MD RAID, the idea being to make sure each drive always has a mirror drive or partition. Because everything is mirrored, we are free to fail drives and rearrange partitions when drives are added or removed. Using LVM or MD it would be possible to remove drives and shrink the array, if desired, at the expense of recovery tools that are less sophisticated than those of ZFS or BTRFS.
Any comments on this procedure? A good script could handle the lossless allocation and rearrangement of drives. Any comments on LVM vs. MD vs. ZFS? Any comments on performance of the resulting weirdly partitioned array? Will data arrangement across multiple partitions on the same drive cause excessive head seeking and early failure?
BTRFS devs: everyone wants this, and LVM or MD are not technically necessary (and in my opinion, sub-optimal). Making it easy to maintain a redundant heterogeneous array would be a killer feature for btrfs. It’s a hack on LVM/MD/ZFS as it is. Minimizing resilver/resync is massively desirable.
Yes, this is obviously a poor-man’s Drobo. One shouldn’t need dedicated hardware for that…