Thinking about building or buying a home NAS or file server? While I can’t cover every use case, this article features my suggested small build; I’ll come back later with an update on the large build I use (a 15 hot-swap bay, 3 SSD, ESXi, ZFS, dual LSI HBA, PCIe passthrough, 32GB RAM beast that currently occupies my closet and hosts all my files plus my home lab). There’s a lot to be gained from building your own file server for home. While some of the commercial options offer extreme simplicity (Drobo), great features and appearance (Synology, Qnap), or very low price (plenty of options here, all of which seem to come with a severe lack of performance), you can build your own server with many of the same features and hand-picked hardware, and you just might learn something in the process!
Before we get started: if you get in over your head, go back to the first paragraph and review – there’s some good guidance there. Drobo is dead-stupid simple (but be prepared to pay the price), while Synology and Qnap both offer great hardware and software (and they can act as your Plex Media Server too!). Alright, let’s begin.
First, I’m assuming you fall into a certain range of needs with this recommendation: this article is for people who need between 2 and 9TB of available space and can tolerate one hard drive failing at a time (that happens to roughly 1 in 10 hard drives in the first 3 years of use. Seriously). If you build a NAS with 5 drives, that works out to roughly a 40% chance (1 - 0.9^5 ≈ 41%) of losing data within 3 years if you aren’t using RAID like I recommend in this how-to. Contact me through Twitter or email if you don’t fall into that category and I’ll try to give you some guidance. There are other dangers too; my recent obsession has been with bit rot and bit error rates, which are REALLY important if you’re considering a RAID5 build on SATA drives. Consumer SATA drives are typically rated for one unrecoverable read error (URE) per 10^14 bits, which means you should statistically expect an error somewhere in roughly every 12TB you read (oh, and guess what? That’s right in the neighborhood of the arrays we’re trying to build).
Ok, so a corrupted file is annoying, but it’s not the end of the world, right? Well, sure, except for that whole 40% chance of a failed drive we just talked about. If the data you’re storing is important, then you need some way to be sure it can survive those failed drives, something a little more assuring than a coin toss. That answer is RAID. If you haven’t heard of RAID before, go grab a Drobo and save yourself some time and frustration. But if you’re comfortable with RAID, let’s talk about what you’ve probably been using for data archival: RAID5. RAID5 is awesome, right? Sure. At least it was back in the ’90s, before we started talking about single drives in the 250GB to 500GB range. Around that point (maybe 5 years ago?) we started running into a wall. Since RAID5 uses every disk (minus one disk’s worth of parity) to calculate each stripe of data, any bit rot on any of the disks equals corrupted data. And if you do have a disk failure, you have to read all the data from every remaining disk to rebuild; since we’re expecting an unrecoverable error roughly every 12TB read, rebuilding a 4x3TB system (12TB physical, 9TB available) means reading about 9TB from the three surviving disks, which gives you close to even odds of hitting at least one corrupted stripe and killing off a file or two. That sucks.
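If you want to sanity-check that “close to even odds” claim, here’s the back-of-the-envelope math, assuming the common consumer-drive spec of one URE per 10^14 bits (a spec-sheet figure, not something I measured):

```latex
% Rebuilding a 4x3TB RAID5 after one failure means reading the three
% surviving disks: 3 x 3TB = 9TB, or about 7.2 x 10^13 bits.
P(\text{at least one URE}) = 1 - \left(1 - 10^{-14}\right)^{7.2 \times 10^{13}}
                           \approx 1 - e^{-0.72}
                           \approx 51\%
```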
So if I’m telling you we have a major issue, and that we’re almost certainly going to see some errors when reading data back from this new NAS/file server I’m about to show you how to build, then there has to be a solution, right? Well, there is!
ZFS (the Zettabyte File System) works at a high level a lot like a RAID5 system, but with some important differences (there’s a quick taste of the commands right after this list):
- It can use 1, 2, or even 3 parity drives (ZFS calls these RAIDZ1, RAIDZ2, and RAIDZ3; traditional RAID 5 and RAID 6 give you 1 or 2 parity disks, respectively)
- It can recover from a failed drive without reading every single block from every remaining drive. ZFS knows which blocks are in use and only reads those, so if your array is 50% full, your recovery takes about half as long and is about half as likely to hit a bit error
- ZFS keeps checksums of every block, which matters because of the next feature
- It can scrub a healthy array to identify and even repair bit rot. Since each stripe carries parity, and ZFS stores a checksum for every block, it can detect that a stripe is corrupt and then attempt recovery by reconstructing the data from parity, substituting each disk in turn as the suspect, until the checksum matches
- ZFS has some awesome enterprise features you could only dream about on a home server until now:
- Unlimited filesystem snapshots (how do you feel about recovering your system to exactly as it was on July 25th, 2009 at 6PM? ZFS can do that, if you’ve taken hourly or daily snapshots since then)
- On-the-fly data compression. ZFS can compress data as it’s written to disk, which can actually speed up your read/write throughput: if your physical disks can only write at 100MB/s but the data compresses to half its size, it can effectively be written at 200MB/s on that same physical disk.
- SSD cache. Yep, this is the same kind of tiered-storage trick that Dell, EMC, NetApp and others sell in even their “entry level” systems, which start at tens of thousands of dollars, but here it’s free (as in beer and as in speech).
- Datasets – these are just awesome. ZFS doesn’t have partitions; it has datasets. Each dataset acts like a “partition” of the ZFS RAID with its own mount point, sharing the free space of the full array, but each can have its own snapshots, quotas on how much data it can use, reserved space, compression settings, and more.
- This one I hesitate to mention because it’s designed for systems with TONS of RAM, more than we’ll be including in this hardware: ZFS can do block-level deduplication. It works best with 2GB or more of RAM per TB of space, so don’t turn it on unless you’ve got something like 16GB of RAM in your file server.
- Plenty of other nice bits as well, like automatic mount-point creation for your ZFS RAIDs (called pools) and datasets, built-in monitoring utilities, configuration management tools, and more
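To make all of that less abstract, here’s a quick, hypothetical taste of the commands. The pool name “tank”, the dataset name “media”, and the FreeBSD-style device names (ada1 through ada4) are made up for illustration; your device names and ZFS version will differ, so check your platform’s docs before running anything:

```
# Create a pool named "tank": single-parity RAIDZ across four disks
zpool create tank raidz1 ada1 ada2 ada3 ada4

# Check pool health, and scrub to detect (and repair) bit rot
zpool status tank
zpool scrub tank

# Take a snapshot now; roll back to it later if disaster strikes
zfs snapshot tank@2013-01-15-1800
zfs rollback tank@2013-01-15-1800

# Turn on transparent compression for everything in the pool
zfs set compression=on tank

# Create a dataset (think "partition") with its own quota;
# it gets its own mount point at /tank/media automatically
zfs create -o quota=2T tank/media

# Dedup is one more "zfs set" away, but remember: LOTS of RAM first
# zfs set dedup=on tank
```

Notice there’s no mkfs, no fstab editing: every pool and dataset mounts itself automatically (/tank and /tank/media here), which is the automatic mount point bit from the list above.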
Ok, so there it is. Sold yet?
Let’s talk about an example build. I’ll update this post tomorrow with hardware I’d recommend for a low power, 4 disk home NAS using ZFS.
Quick sneak peek:
- HP ProLiant N40L Micro Tower Server ($319) http://www.newegg.com/Product/Product.aspx?Item=N82E16859107052
- 4 x 2TB Western Digital Red ($129.99×4 = $519.96) http://www.newegg.com/Product/Product.aspx?Item=N82E1682223634
- USB Flash drive ($17.99) http://www.newegg.com/Product/Product.aspx?Item=N82E16820220253
- 8GB Unbuffered ECC Memory ($55.99) http://www.newegg.com/Product/Product.aspx?Item=N82E1682013926
- 120GB Intel 520 Series SSD ($129.99) http://www.newegg.com/Product/Product.aspx?Item=N82E16820167095
The SSD really isn’t needed for plain file server purposes, but if you’re interested in an SSD cache, it’ll let you hit some crazy IOPS for cheap, plus really speed up directory listings and give a few other nice little performance boosts. I’d recommend the 8GB of RAM, but you can survive just fine on the 2GB stick that comes with the server (albeit without a few of the nice performance improvements).
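If you do spring for the SSD, attaching it as a read cache (ZFS calls this L2ARC) is a one-liner; as before, the device name here is hypothetical:

```
# Add the SSD as an L2ARC read cache for the pool
zpool add tank cache ada5

# Watch per-device activity (refreshing every 5 seconds) to see it warm up
zpool iostat -v tank 5
```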
So that’s the basic hardware: an 8TB (6TB usable for storage) ZFS NAS starting at $856, or $1042 for the all-out beast of a box that should handle hosting your home lab without much issue. The hardware is all designed for very low power usage, somewhere around 30-32 watts idle (with disks) and 37-45 watts under load (based on what others have reported). It’ll blow away anything you can get from the likes of Drobo, Synology, or Qnap until you start talking prices in the $550-$850 range, before even buying the drives.