VMware Server and NFS? Am I alone?
So the last few weeks I’ve been playing with VMware Server. A lot. I’ve also been trying out the ESX version to see how it compares with the free offering. This is my first “serious” experience with VMware-style virtualization for server consolidation and I have to say that I’m quite impressed. Both products are solid, stable, and basically work as advertised. I’ve been running the free VMware Server (formerly GSX) product on Red Hat Enterprise Linux 4 WS (Update 3), FWIW.
The plan, like everyone else’s in the universe, is to consolidate a bunch of 3-to-5-year-old 1U servers that started out as application experiments and ended up becoming fairly critical parts of our shop. I’ve been testing (and will deploy) with a few 16GB quad-core X4100s. The goal is to be able to handle a host failure (hardware, most likely) and swing that host’s VMs over to another box in short order. Since most of these servers do very little IO, the natural choice is to host the VMs on one of the NetApp filers. A NAS setup, in theory, avoids having to copy VMs in the event of a host failure. I’ve also been experimenting with using the filer’s snapshotting abilities for nice “bare metal” recovery goodness. And this is where it gets a little interesting, because in all the literature, case studies, and chats I’ve had with current VMware customers it seems that everybody opts to use iSCSI. Not NFS.
Which, frankly, I don’t get. ’Cause in the testing I’ve been doing under RHEL 4 I’m getting better performance with the VMs on an NFS share than on an iSCSI LUN (off the same volume on the same filer). My theory is that with iSCSI I have to lay a journaled filesystem (ext3 in this case) across the LUN, and then of course inside the VM I’m also running ext3. Which means I’m “double journaling”. I suppose I could switch to ext2 on the iSCSI LUN and/or the VM, but that just makes me nervous. My guess is that the extra journaling is what’s making the iSCSI-hosted VM run slower than the NFS-hosted VM.
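For anyone who wants to poke at the double-journaling theory on their own gear, here is a minimal sketch of the kind of host-side check I mean: write a file in small chunks, fsync after every chunk, and time it. The function name and the default sizes are made up for illustration, and this only exercises the host filesystem under a directory, not a running VM, so treat it as a rough sanity check rather than a real benchmark.

```python
import os
import sys
import tempfile
import time


def fsync_write_throughput(directory, total_mb=16, block_kb=64):
    """Write total_mb of data into a temp file under `directory`,
    fsync after every block, and return the observed rate in MB/s.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            start = time.perf_counter()
            for _ in range(blocks):
                f.write(block)
                f.flush()              # push Python's buffer to the OS
                os.fsync(f.fileno())   # force the OS to commit to storage
            elapsed = time.perf_counter() - start
    finally:
        os.remove(path)                # always clean up the scratch file
    return total_mb / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Pass one or more directories to compare, e.g. an NFS mount and an
    # ext3-on-iSCSI mount (whatever paths you actually use).
    for d in sys.argv[1:]:
        print(f"{d}: {fsync_write_throughput(d):.1f} MB/s")
```

Run it against a directory on the NFS mount and one on the iSCSI-backed filesystem, a few times each, and compare. The per-block fsync is deliberate: it is the sort of write pattern where journal commits should hurt the most, if the theory holds.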
Stability-wise, the NFS-hosted VMs haven’t had a single glitch. I’ve pounded the piss out of the VMs, pulled the ethernet cable, force-reset the VM from the Server Console over and over, and outright yanked the power cord out of the host. So far everything looks a-ok, with full filesystem checks on the guest OS showing no signs of problems.
Which raises the question: am I missing something? Why are there pages of forum posts and official docs and speaking tours highlighting iSCSI but hardly a mention of NFS-hosted VMs? ’Cause from where I’m standing NFS looks both simpler and faster than iSCSI for this job.
Note: I’m aware that the situation is different with ESX, where you really want to be using VMware’s special-purpose “VMFS” filesystem to hold your VMs’ virtual disks. There, iSCSI starts to make a lot more sense since you need to feed VMFS LUNs of some flavor or another. Whether or not VMFS-on-iSCSI actually makes much of a difference on ESX (vs, say, virtual disks on NFS) is something I’m going to be spending a few days testing next week. All I’ve tested so far is VMFS on local disks, and in that setup it is indeed faster than running vmdks on ext3. Sadly, VMware’s license forbids me from actually disclosing anything even remotely resembling a “benchmark”. Sigh…