I posted a few days ago asking how to set up my storage for Proxmox on my Lenovo M90q, which I have since settled. Or so I thought. The Lenovo has space for two NVMe SSDs and one SATA SSD.
There seems to be a general consensus that you shouldn’t use consumer SSDs (even NAS SSDs like WD Red) for ZFS, since there will be lots of writes, which in turn will wear out the SSD fast.
There is some conflicting information out there, with some saying it’s fine and a few GB of writes per day is okay, and others warning of several TB of writes per day.
I plan on using Proxmox as a hypervisor for homelab use with one or two VMs running Docker, Nextcloud, Jellyfin, Arr-Stack, TubeArchivist, PiHole and such. All static data (files, videos, music) will not be stored on ZFS, just the VM images themselves.
I did some research and found a few SSDs with good write endurance (see table below) and settled on two WD Red SN700 2TB in a ZFS mirror. Those drives are rated for 2500 TBW each. For file storage, I’ll just use a Samsung 870 EVO with 4TB and 2400 TBW.
SSD | Capacity | TBW | Price (€) |
---|---|---|---|
980 PRO | 1TB | 600 | 68 |
980 PRO | 2TB | 1200 | 128 |
SN 700 | 500GB | 1000 | 48 |
SN 700 | 1TB | 2000 | 70 |
SN 700 | 2TB | 2500 | 141 |
870 EVO | 2TB | 1200 | 117 |
870 EVO | 4TB | 2400 | 216 |
SA 500 | 2TB | 1300 | 137 |
SA 500 | 4TB | 2500 | 325 |
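
To put the TBW ratings in perspective, here is a rough back-of-the-envelope sketch in Python. The daily write volumes (5 GB, 50 GB, 2 TB) are illustrative assumptions rather than measurements, the function name is made up for this example, and it assumes a constant write rate and decimal units (1 TB = 1000 GB).

```python
def years_until_tbw(tbw_tb: float, writes_gb_per_day: float) -> float:
    """Years until the rated TBW is reached at a constant daily write rate."""
    days = tbw_tb * 1000 / writes_gb_per_day  # 1 TB = 1000 GB (decimal units)
    return days / 365


# TBW ratings are taken from the table; the daily write volumes are assumptions.
drives = {"SN700 2TB": 2500, "980 PRO 2TB": 1200, "870 EVO 4TB": 2400}
for name, tbw in drives.items():
    for gb_per_day in (5, 50, 2000):
        years = years_until_tbw(tbw, gb_per_day)
        print(f"{name}: {gb_per_day} GB/day -> {years:.1f} years")
```

Under these assumptions, a 2500 TBW drive would take well over a century to reach its rating at 50 GB/day, but only about 3.4 years at 2 TB/day, so the real question is which of the two write volumes the workload actually produces.
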
Is that good enough? Would you rather recommend enterprise-grade SSDs? And if so, which M.2 NVMe ones would you recommend? Or should I just stick with ext4 as a file system, losing data security and the ability to take snapshots?
I’d love to hear your thoughts about this, thanks!
ZFS without redundancy is not great in the sense that redundancy is ideal in all scenarios, but it’s still a modern filesystem with a lot of good features, just like BTRFS. The main problem will be that it can detect data corruption but not heal it automatically. Transparent compression, snapshotting, data checksums, copy-on-write (power loss resiliency), and reflinking are modern features of both ZFS/BTRFS, and BTRFS additionally offers offline-deduplication, meaning you can deduplicate any data block that exists twice in your pool without incurring the massive resources that ZFS deduplication requires. ZFS is the more mature of the two, and I would use that if you’ve already got ZFS tooling set up on your machine.
Note that the TrueNAS forums spread a lot of FUD about ZFS, but ZFS without redundancy is ok. I would take anything alarmist from there with a grain of salt. BTRFS and ZFS both store 2 copies of all metadata by default, so bitrot in the metadata will be auto-healed at the filesystem level when it’s read or scrubbed.
Edit: As for write amplification, just use `ashift=12` and don’t worry too much about it.
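
For context, `ashift` is the base-2 logarithm of the sector size ZFS assumes for the pool, so `ashift=12` means 4096-byte sectors. The Python sketch below is a simplified model, not how ZFS actually accounts for space: it only shows how a small write gets rounded up to whole sectors, and it ignores compression, metadata, and RAIDZ padding. The function names are made up for illustration.

```python
def sector_size(ashift: int) -> int:
    """Sector size in bytes implied by a given ashift (2 ** ashift)."""
    return 1 << ashift


def allocated_bytes(write_bytes: int, ashift: int) -> int:
    """Round a write up to a whole number of sectors (simplified model)."""
    size = sector_size(ashift)
    sectors = -(-write_bytes // size)  # ceiling division
    return sectors * size


for ashift in (9, 12, 13):
    alloc = allocated_bytes(1000, ashift)
    print(f"ashift={ashift}: sector={sector_size(ashift)} B, "
          f"1000 B write occupies {alloc} B")
```

In this model a larger `ashift` wastes a bit more space on tiny writes, while a value smaller than the drive's real page size forces the SSD to do read-modify-write internally, which is the amplification the setting is meant to avoid.
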
I barely scratched the surface with ZFS, so I’m not going to touch another file system for a while now. I’m fine with detecting data corruption only, since those files (on the static data storage) can be replaced easily and hold no real value for me. All other data will be either on the redundant pool or is saved to several other media and even one off-site copy.
I already wrote down `ashift=12` in my notes for when I set it up.
In general, I found there is a lot of FUD out there when it comes to data security. One I liked a lot was ECC RAM being mandatory for ZFS. Then one of the creators of it basically said: "Nah, it’s not needed more than for any other file system".
Where can I read more about good ZFS settings for a filesystem on a new RAID6 array? I don’t want to manage disks or volumes with ZFS, I’ll be doing that with mdadm, just want ZFS as filesystem instead of ext4. I assume a ZFS filesystem can grow if the space available expands later?
ZFS can grow if it has extra space on the disk. The obvious answer is that you should really be using RAIDZ2 instead if you are going with ZFS, but I assume you don’t like the inflexibility of RAIDZ resizing. RAIDZ expansion has been merged into OpenZFS, but it will probably take a year or so to actually land in the next release. RAIDZ2 could still be an option if you aren’t planning on growing before it lands. I don’t have much experience with mdadm, but my guess is that with mdadm+ZFS, features like self-healing won’t work because ZFS isn’t aware of the RAID at a low-level. I would expect it to be slightly janky in a lot of ways compared to RAIDZ, and if you still want to try it you may become the foremost expert on the combination.
> I assume you don’t like the inflexibility of RAIDZ resizing
Right, I’d like to be able to add another disk and then grow the filesystem and be done with it.
> my guess is that with mdadm+ZFS, features like self-healing won’t work because ZFS isn’t aware of the RAID at a low-level
Really? I’ll have to look into that then, because health checks are my main reason for using ZFS over ext4.
mdadm RAID should be a transparent layer for ZFS: it manages the array and exposes a raw storage device. I’m not sure why ZFS would not like that, but I don’t want to experiment if it’s not a reliable combination. I was under the impression that ZFS as a filesystem can be used without caring about the underlying disk support, but if it’s too opinionated and requires its own disk management then too bad…