VMware ESXi Server Keeps Running with Failed RAID Array

We at Corner Edge Solutions LOVE VMware.  It’s not hard to tell from our blog, but this past week we found a new reason to fall in love all over again.  One of the ESXi 4 servers in our cluster had a double drive failure in its RAID 5 array.  On a typical setup that would have completely crashed the server, but since this one runs VMware ESXi with all the VM data stores on an iSCSI storage device, we had ZERO impact on our environment.  ESXi is the lightweight version of the original ESX server; it runs entirely in memory and does not require disk access once it has been loaded at startup.

Since this machine was part of a cluster, we simply migrated the VMs on the failed server to the other working ESXi server through vSphere vCenter Server.  The working VMware server was able to overcommit the available physical memory by almost 50% with room to spare.  We then took down the server with the bad drives to rebuild it.  We also took this opportunity to install the OS onto a USB flash drive, which is installed internally in the server, and to remove the remaining two working hard drives so the server runs in a completely diskless configuration.  With a small amount of configuration in VMware, the newly rebuilt server was ready to join the cluster again, and the VMs were then evenly distributed throughout the cluster, all without ever having to power off a VM.  That means never having to send out maintenance notices to customers that their hosted servers will be offline, and keeping our uptime intact.  The whole process took only about 5 hours as well.  When was the last time you had a total failure on a system RAID array where nothing went down, and everything was upgraded and repaired in 5 hours?


Data Store Size Limits in VMware ESX and ESXi

Well, you took the leap and are now virtualized.  You’re now doing more than ever, and data size is growing rapidly.  Time to add a new virtual hard drive to your machine, but wait…  I said 500 GB, why is it only 256 GB?  Well, you hit a limitation of the data store in VMware under your current default configuration.  Check what your data store block size is.  Here’s where to find it:

Highlighted you will see the datastore block size.

When data stores are created, their default block size is 1 MB, which gives you a maximum virtual hard drive size of 256 GB.  So how do you get larger VHDs?

Hopefully you are reading this and have a brand new ESX/ESXi setup, in which case you can just delete the data store and recreate it, choosing a different block size.  If you already have machines running on the data store, you have a project ahead of you, because deleting the data store will erase all data on that drive.  You will either have to start from scratch or get creative before you make the change (there are some ideas for working around this below).

If you have the disk space to cover 2TB, then I would go with the maximum block size of 8MB, which gives you a maximum virtual HDD size of 2TB.  There is no noticeable I/O performance difference from using the maximum size, so use the largest block size to maximize your storage options.  Here is a quick reference of the block sizes you can choose and the maximum VHD size each one gives you:

Block Size     Max VHD size

1MB            256GB
2MB            512GB
4MB            1TB
8MB            2TB
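If you want to sanity-check a planned disk size against the table above, here is a minimal Python sketch.  The function name and the lookup dictionary are made up for illustration; the limits are simply the four rows from the table, treating 1TB as 1024GB.

```python
# Block size (MB) -> maximum virtual disk size (GB), copied from the table above.
# 1TB and 2TB are written as 1024GB and 2048GB (an assumption about the units).
MAX_VHD_GB = {1: 256, 2: 512, 4: 1024, 8: 2048}


def smallest_block_size_mb(disk_gb):
    """Return the smallest block size (MB) whose limit covers a disk of disk_gb."""
    for block_mb in sorted(MAX_VHD_GB):
        if disk_gb <= MAX_VHD_GB[block_mb]:
            return block_mb
    raise ValueError(f"{disk_gb} GB is larger than the 2TB limit of an 8MB block size")


if __name__ == "__main__":
    for size_gb in (200, 500, 1500):
        block = smallest_block_size_mb(size_gb)
        print(f"{size_gb} GB virtual disk: needs at least a {block}MB block size")
```

For the 500 GB disk from the start of this post, the sketch reports a 2MB block size as the minimum, which is why the default 1MB data store refused it.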

Already have servers running?  How do you fix it?

If you already have the data store in use and can afford some downtime for a maintenance window, here is a workaround, assuming you have more than one ESX(i) server at your disposal.  You can power the VM down and use the free VMware vCenter Converter (http://www.vmware.com/products/converter/) to move the virtual machine from one ESX(i) server to another.  Figure on about 1 minute per gigabyte of hard drive size when moving it over a gigabit network.  Once the VM is moved to its new location, power it on and make sure all is working well before you delete the VM from disk on the original ESX(i) server.  Once all the VMs are moved off the ESX server, you can go ahead and remove the data store and create a new one using the new block size.
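To size the maintenance window, you can turn the rule of thumb above into a quick estimate.  This is only a rough sketch: the function name and the example disk sizes are hypothetical, and the 1 minute per gigabyte rate is the ballpark figure quoted above, not a measured value for your network.

```python
def estimated_move_minutes(disk_sizes_gb, minutes_per_gb=1.0):
    """Estimate total transfer time for a list of VM disk sizes given in GB."""
    return sum(disk_sizes_gb) * minutes_per_gb


if __name__ == "__main__":
    # Hypothetical VMs of 40 GB, 100 GB and 250 GB, moved one after another.
    vm_disks_gb = [40, 100, 250]
    total_minutes = estimated_move_minutes(vm_disks_gb)
    print(f"Plan for roughly {total_minutes / 60:.1f} hours of transfer time")
```

Leave yourself extra time on top of the estimate for powering each VM on and checking it before deleting the original.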

If you happen to have your VMs in a cluster with vMotion, this task is even easier, as you can change the location of a VM’s data store through the migrate option.  If you don’t have any other ESX servers, you could probably move the VMs to VMware Server instead, but at that point you would probably be better off just adding multiple drives to the VMs, which would be a lot less work.

Here is a nice reference guide from VMware with this and other important configuration information: http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_config_max.pdf

Pressing F8 during Windows install on VMware ESX 3.5

After scouring the Intertubes for an answer and not finding a solution, I feel this is a worthy tidbit of new technical information I should share with the rest of the world.

It’s a simple yet annoying problem.

While installing a fresh copy of Windows XP Pro on a VMware ESX 3.5 host, I got to the usual “Licensing Agreement” page (you know, the one with all the legalese). It asks for the F8 key to be pressed to “agree”.

To my surprise, pressing F8 just wouldn’t work! No combination of CTRL, ALT, Shift, or ASCII codes would work either. I was stuck on the agreement page!

I had seen something like this before with a VMware Workstation installation: if I wanted to enter a VM’s BIOS settings, I needed to use a PS/2 keyboard (in addition to my regular USB keyboard) in order to press “ESC” to enter the configuration page.  The only theory I had in mind was that maybe the USB devices weren’t loaded early enough to access the BIOS, while PS/2 support was. *shrug*

Well, back to ESX. I tried pressing F8 through both the VIC2.5 Console tab and through a separate Console window. No luck.

It turns out that the only way to get F8 through was to access the VIC2.5 from another workstation (one using a PS/2 keyboard), and finally F8 was accepted. Weird.

Yes, it’s a simple yet annoying problem.

(For the record, my workstation has a Microsoft Natural Ergonomic Keyboard 4000 (ver 1.0))

Update 20080814: One reader’s comment really fixes this beautifully

hey4ndrw Says:
If you have the Microsoft Wireless Natural Multimedia ergonomic keyboard, and no amount of leaning on the F8 key to select “I Agree” works when reinstalling XP, try tapping once on the F Lock button (one key to the right of F12), then pressing F8. Worked for me.