Saturday, September 5, 2009

NFSv4 & Gigabit Ethernet: A Semi-Functional Combination

First: Dig up Old Box. Second: Off With Its Head!


So I had an old P4 desktop that had served me well from 2003 until the beginning of 2008, when I built my current Core2 box. It had been sitting around collecting dust until a few days ago, when I grabbed 2GB of DDR400 RAM and slapped Arch Linux on it. It's an old Northwood-core P4 that came with one onboard Gigabit card, and I slapped a second one in to make it a dual-homed host. Finally, after the OS was installed and I could get networking automatically running at boot... I pulled the video card, both to save on power and because I wanted to see how a normal desktop PC would behave with zero graphics capabilities.



So I'm running headless, which is fine for right now since I managed to get the installation right enough that it gets onto the network and runs SSH. One of the reasons I have two network cards is that the (better) one is hooked directly to my main PC and has a static IP address. It's there for performance, since I run NFS over that interface, and also for reliability: assuming the headless box can boot at all, it can get that static interface configured with a minimum of hassle. Of course... that assumes the machine can boot....
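
For the curious, on Arch (circa 2009) a static interface like that is just a couple of lines in /etc/rc.conf. A minimal sketch, with placeholder interface names and addresses rather than my actual ones:

    # /etc/rc.conf (snippet) -- hypothetical addresses
    # eth1 is the card wired straight to the desktop; no gateway needed,
    # since that link never routes anywhere else.
    eth0="dhcp"
    eth1="eth1 192.168.10.2 netmask 255.255.255.0 up"
    INTERFACES=(eth0 eth1)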



Right now my main concern is that this headless box is only manageable as long as I can get it onto a network. That's fine for normal operations, and this thing runs a very stripped-down set of software that doesn't have to be updated often, but you never know when an update will result in a system that refuses to reboot without manual intervention. Since this box doesn't have normal serial ports, my only current option is to slap a video card back in and reboot... but I've asked some other people if they have solutions for that problem... we'll see.




NFSv4: FTW?


The next step is to actually do something with this nifty headless box, and I chose to make it an NFS server. While I've used NFS as a client in larger settings like school, this was the first time I had ever set up a box to serve NFS. I chose NFS mostly to experiment, and because I've heard it has better performance than something like Samba in a pure-Linux environment, which is what I'm running. This isn't some crazy storage array; right now I only have one disk in the server, with about 150GB free for the actual NFS export. I'm doing this more to get experience for the 4TB RAID array I intend to build at some undetermined future date, when buying a bunch of 2TB disks is even cheaper than it is now ;-)



The setup was actually pretty easy, although I didn't bother with the genuinely complicated part, which is setting up security. I did make it so the /usr/sbin/rpc.mountd server only listens on my dedicated network interface, so that my desktop (client) is the only machine physically capable of trying to mount the server. So when I say that NFSv4 is relatively "simple" to set up, I'm sort of skipping all the Kerberos stuff that is necessary, but also a real PITA to set up, in the real world. I'm not saying it's as easy as right-clicking a folder and selecting "share this", but the Arch Linux Wiki has pretty clear instructions, and it doesn't take too long to set up. In fact, the Arch Wiki is informative enough that I'm not going to bother repeating it... but I will share some of my own specific issues below.
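
To give a flavor of what "pretty easy" means here, this is roughly the shape of the server side; the paths and client address are made up for illustration, and the Arch Wiki has the authoritative version:

    # /etc/exports (snippet) -- hypothetical paths and client address
    # NFSv4 exports hang off a single pseudo-root marked with fsid=0.
    /srv/nfs4          192.168.10.1(rw,fsid=0,no_subtree_check)
    /srv/nfs4/storage  192.168.10.1(rw,no_subtree_check)

    # Re-read /etc/exports and show what is currently exported:
    exportfs -ra
    exportfs -v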




Teething Issues


I'm not expecting the setup to be perfect, but I definitely understand now why some people really hate NFS (even though I think it's OK). When it's working, it's just as if the mounted partition were on your own machine, but when something goes wrong, you'll see a mounted partition that hangs every single application that so much as tries to list the NFS directory. Fortunately, umount -f was invented just for these situations, but with the NFS client's default configuration, if your server flakes out, expect your client machine to hang pretty badly, without even a timeout. These problems rear their heads most often when I reboot the client computer and try to re-establish my NFS mount. Even if I cleanly unmounted before the reboot, I've noticed that my NFS share will often be in a semi-comatose state where I can mount it, but any writes to the share start to hang. See below for why I think some of this may be hardware-related, but it would be great if the NFS client were a little more cognizant that Bad Things Happen (TM) and didn't assume there is a perfect NFS server 100% of the time.
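
One mitigation worth mentioning (a trade-off, since it swaps infinite hangs for possible I/O errors) is mounting with a soft timeout instead of the default hard behavior. A rough sketch, with a made-up server name and mount point:

    # Hypothetical client-side mount: 'soft' gives up after retrans
    # attempts of timeo tenths-of-a-second each, instead of retrying
    # forever like the default 'hard' behavior.
    mount -t nfs4 -o soft,timeo=50,retrans=3 fileserver:/ /mnt/nfs

    # When the server flakes out anyway:
    umount -f /mnt/nfs     # force
    umount -l /mnt/nfs     # lazy detach, as a last resort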



The underlying issue: Flaky Gigabit cards


I've noticed that most of my problems have boiled down to flaky gigabit Ethernet cards. Now, these cards "work" in Linux in as much as there are drivers and you can configure the interfaces, but I need more than that for what I want to do. In order to get as close to real gigabit speeds as possible, I went to enable jumbo frames... which causes all sorts of problems with low-grade hardware. The good news: the Marvell Yukon 2 PCIe controllers on my new machine can jump to 9000-byte frames without any issues, and it appears that the old Intel gigabit controller built into the motherboard of my old machine does 9000-byte frames too. However, before I got to this point I had to go through the process of dealing with cards that use the r8169 and skge drivers... and it was not pretty. The r8169 card worked with jumbo frames up to 7200 bytes (it refused to use the more common 9000-byte size), but after this worked for a few hours... the card dropped off the face of the earth and could not be reached, even after I rebooted the machine. Even worse, the card with the skge driver would come up fine and let you set an MTU of 9000 bytes, but then any real packet activity immediately froze the card and any pending NFS operations.
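
For reference, actually turning on jumbo frames is the trivial part; it's one command on each end (the interface name here is just an example):

    # Bump the MTU to 9000 bytes (as root); both NICs, and any switch
    # in between, have to cope with it or you get the weirdness above.
    ip link set dev eth1 mtu 9000
    ip link show dev eth1    # confirm the new MTU stuck

    # Old-school equivalent:
    ifconfig eth1 mtu 9000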



The main moral of my story: be VERY selective with the Ethernet cards you use if you want to do anything more intensive than a simple home network. In my case, the Intel Ethernet controllers are outstanding (they may be $5 more on Newegg, but you get what you pay for), and the Marvell 88E8056 controllers built into my X38 motherboard are also quite good. Be warned: just because a card accepts a jumbo-frame MTU does not mean it will actually work when put under real stress. By the way, the performance with jumbo frames is quite a bit better than without, especially because my P4 box is not exactly a heavyweight in the CPU category and all that extra packet processing does slow things down. Just about anything made this century can handle a saturated 100Mbit connection, but once you get above that you need to make sure the parameters are tweaked in order to get the performance listed on the box.
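
A quick sanity check I'd suggest: send a full-size ping with fragmentation forbidden, so you know jumbo frames actually survive the whole path rather than just being accepted by the driver (the address is a placeholder):

    # 8972 data bytes + 8 ICMP header bytes + 20 IP header bytes = 9000-byte frame.
    # If this fails while a normal ping works, something in the path
    # isn't really passing jumbo frames.
    ping -M do -s 8972 -c 4 192.168.10.2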



How It's Working Now



The good news first: if I do a large continuous write to my network share, I'm averaging a little over 70 megabytes per second, and from the saw-tooth graph I get in gkrellm, the network itself is no longer the bottleneck, since it can easily exceed 100 megabytes/sec in bursts but then drops down because the old disk in the server can't keep up. By the way, for those of you who like SCP (and I do too), I get about 40 megabytes per second across the wire with jumbo frames, and 30 megabytes with normal frames. With NFS, my P4 box spends a bunch of time in the IOWAIT state since the disk is the bottleneck, but with SCP the CPU overhead becomes dominant, since all that cryptography takes its toll at high speeds.
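
If you want to reproduce that kind of number, a big sequential write with dd is enough to see where the bottleneck is (the paths and hostname here are placeholders):

    # Write ~2GB to the NFS mount; dd prints the average throughput when done.
    # conv=fdatasync forces the data to disk before the timing stops.
    dd if=/dev/zero of=/mnt/nfs/bigfile bs=1M count=2048 conv=fdatasync

    # Rough SCP comparison of the same amount of data over the same wire:
    dd if=/dev/zero of=/tmp/bigfile bs=1M count=2048
    scp /tmp/bigfile fileserver:/tmp/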



So I've got a nifty NFS server set up, and a fun platform to mess around with too. I eventually intend to turn my current desktop machine into a RAID file server as well as a home-network server providing DHCP, DNS caching, printer, firewall, and routing services, so this is a good first step to try things out.