2009-09-23

Nehalem Processors Are Great

A client came to me a few months ago requesting help with a problem they had.

They have a system that does some (I guess you can call it) grid computing. They were using 20 desktops running XP to perform some calculations, and the process was taking 10 hours, utilizing ~100% CPU on each machine throughout.

Because of a change that was made in the algorithm, the process would now take 22 hours to complete for the same amount of calculations, and that was not an acceptable result.

We wanted to test whether a large number of virtual machines could be crammed onto a few servers to do the work, and we came up with the following solution:

  1. 3 IBM x3550 servers, each with dual E5430 processors
  2. 8 GB RAM
  3. ESX3i on each of them
  4. 8 Windows Server 2003 VMs on each server (total of 24 VMs)

We saw that while the virtual machines were busy with the calculation process, each was utilizing 100% of its vCPU, bringing the host very close to 100% utilization. With one VM per physical core, this made 8 VMs the hard limit on each host.

The amazing thing was that the same run that would have taken 22 hours on the physical desktops now took 8 hours on these three servers (2.75 times as fast, or a 175% increase in performance). The client was thrilled!
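For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the two run times quoted above. The function names are mine and purely illustrative; the point is that "how many times as fast" and "percent increase" are different numbers for the same result.

```python
# Run times quoted in the post, in hours.
DESKTOP_HOURS = 22.0  # projected run on the 20 XP desktops after the algorithm change
VM_HOURS = 8.0        # measured run on 24 VMs across the three x3550 hosts


def speedup(old_hours: float, new_hours: float) -> float:
    """Return how many times as fast the new run is compared to the old one."""
    return old_hours / new_hours


def percent_increase(old_hours: float, new_hours: float) -> float:
    """Return the throughput gain as a percentage (a factor of 2.0 is +100%)."""
    return (speedup(old_hours, new_hours) - 1.0) * 100.0


if __name__ == "__main__":
    factor = speedup(DESKTOP_HOURS, VM_HOURS)
    gain = percent_increase(DESKTOP_HOURS, VM_HOURS)
    print(f"{factor:.2f}x as fast, a {gain:.0f}% increase in throughput")
```

So 22 hours down to 8 is a factor of 2.75, which works out to 175% more work done per hour.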

We tested different configurations with more VMs and lower CPU limits, but since utilization was 100% regardless of the speed of the vCPU, the optimal configuration remained 8 VMs per host.

The cost for the whole design (including a test server and a management server) was:

[Image: prices1]

Fast forward 6 months. Budget issues, etc. etc.

The client came back to say that they were ready to go forward with the project. But there was one slight problem: during that period IBM had announced that they were moving to a new series of servers (x3550M2) and that the old ones were no longer available for purchase.

And also during that period ESX4 was released.

New servers were tested with the same data as before, this time on an x3550M2 with dual E5530 processors.

We fired up the tests with the same 8 VMs as before. The results were pretty much the same, except for one small thing: we were seeing that the CPU was only 50% utilized (or, more correctly, only 50% of the cores were being used). Huh??

Where did I get another 8 cores from? The answer - Hyper-Threading!

With Hyper-Threading enabled, the machine recognized 16 logical cores (8 physical cores, 2 threads each). So we deployed another 8 VMs (16 in total on one host). And of course there was no problem of RAM, because since all the machines are exactly the same,

I am not saying that this will work in all use cases, but in this one it did. We ran the tests, and instead of the 8 hours we had with the previous hardware, the job now completed in 5 (a 60% increase in performance). With this metric, we could now reduce the number of physical hosts from 3 to 2. Also, the VMware software in the original configuration (Foundation Accelerator Kit) was now replaced with VMware Essentials at a lower price (40%). In the final configuration, the system had 2 ESX hosts and 32 VMs.
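The sizing above can be sketched in a few lines of Python. This is only an illustration under one assumption taken from how the tests behaved: one single-vCPU VM per logical CPU. The `Host` class and `vm_slots` helper are hypothetical names of mine, not anything from VMware or IBM tooling; the core counts are those of the dual quad-core E5430 (no Hyper-Threading) and the dual quad-core E5530 (Hyper-Threading enabled).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """CPU layout of one ESX host (hypothetical helper, for illustration only)."""
    sockets: int
    cores_per_socket: int
    threads_per_core: int  # 2 with Hyper-Threading enabled, otherwise 1

    def logical_cpus(self) -> int:
        return self.sockets * self.cores_per_socket * self.threads_per_core


def vm_slots(host: Host, host_count: int) -> int:
    """Total VMs across all hosts, assuming one single-vCPU VM per logical CPU."""
    return host.logical_cpus() * host_count


old_host = Host(sockets=2, cores_per_socket=4, threads_per_core=1)  # dual E5430
new_host = Host(sockets=2, cores_per_socket=4, threads_per_core=2)  # dual E5530

old_total = vm_slots(old_host, host_count=3)
new_total = vm_slots(new_host, host_count=2)

# Run times quoted in the post: 8 hours on the old hardware, 5 on the new.
gain_percent = (8.0 / 5.0 - 1.0) * 100.0

if __name__ == "__main__":
    print(f"old design: {old_total} VMs on 3 hosts")
    print(f"new design: {new_total} VMs on 2 hosts")
    print(f"run time 8h -> 5h: {gain_percent:.0f}% more throughput")
```

Under that assumption, two of the new hosts offer more VM slots than three of the old ones, which is what made dropping a host possible.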

The new price for the project:

[Image: prices2]

So yes, the Nehalem processors are a good thing, and in this specific case they managed to lower the costs and boost the performance.

Hope you enjoyed the ride!