Channel: VMware Communities : Discussion List - All Communities

Newer IBM servers not vMotion compatible despite identical CPUs

I just installed a new HS22V blade, identical to a batch of others installed about a year ago.  The CPUs are identical, the UEFI version is identical, and the Processor Settings in the firmware are identical.  It still isn't vMotion-compatible with the existing cluster of blades, though.  Crazy.  With help from VMware tech support and an IBM KB article, we finally figured it out and fixed it.

It seems that IBM servers that shipped with UEFI older than 1.10 came with the Westmere "AES" feature disabled, while servers that shipped with 1.10 or later have it enabled.  Upgrading an older server to 1.10, 1.12, etc. doesn't change this: it still has AES disabled.  For some reason AES isn't something you can enable or disable from the BIOS settings, so it's not obvious what's out of sync.

The difference shows up in the CPUID level 0x1, ecx row.  Mine was 0000:0010:1001:1010:0010:0010:0000:0011 (or from the CPUID CD, ID1ECX: 0x029ee3ff).  Yours may differ depending on which other CPU features you have enabled, such as VT-x and XD.  The bits that broke vMotion compatibility were these:  ----:--1-:----:----:----:----:----:--1- (in other words, the two 1s in that mask needed to be 0s for vMotion to work).  In our case, disabling AES on the new blade turns those two 1s into 0s and makes it vMotion-compatible with the other Westmere blades in our cluster.
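To make the bit mask easier to read, here is a minimal Python sketch that decodes the colon-grouped ecx row quoted in this post and reports whether the two bits in question are set.  The bit names (PCLMULQDQ for bit 1, AESNI for bit 25) come from Intel's CPUID leaf 1 documentation, not from the IBM KB article, and the function names are made up for illustration.

```python
# Bits of CPUID leaf 0x1, ECX that matter here (names per Intel's CPUID docs;
# this mapping is an addition, the post only shows the positions).
FEATURE_BITS = {1: "PCLMULQDQ", 25: "AESNI"}


def parse_mask(mask: str) -> int:
    """Convert a colon-grouped binary string (bit 31 first) to an integer."""
    bits = mask.replace(":", "")
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected 32 binary digits")
    return int(bits, 2)


def set_features(ecx: int) -> list[str]:
    """Return the names of the tracked feature bits that are set in ecx."""
    return [name for bit, name in sorted(FEATURE_BITS.items())
            if (ecx >> bit) & 1]


if __name__ == "__main__":
    ecx = parse_mask("0000:0010:1001:1010:0010:0010:0000:0011")
    print(hex(ecx))           # 0x29a2203
    print(set_features(ecx))  # ['PCLMULQDQ', 'AESNI']
```

On a blade with AES disabled, the same row would have both of those bits cleared and `set_features` would return an empty list.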

AES ("Advanced Encryption Standard") instructions enable fast and secure data encryption and decryption.  I suppose if you have VMs doing a lot of encryption/decryption you might benefit from having this enabled.  For us it's not worth the hassle of rebooting all our VMs to get it enabled on all our hosts (there's a vMotion brick wall between hosts with AES enabled and hosts with AES disabled).  I suppose we could enable EVC while we fix each host, and that might let us fix the issue without rebooting VMs, but I don't see enough benefit to us to justify the time, so we're just disabling it on the new blade and moving on.
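The "brick wall" can be pictured as splitting the hosts into groups by those two ecx bits.  The Python sketch below does only that, under loud assumptions: the host names and ecx values are hypothetical, and real vMotion compatibility checks look at many more CPU features than these two bits.

```python
from collections import defaultdict

# The two CPUID.1:ECX bits identified above (bit 25 and bit 1).
AES_BITS = (1 << 25) | (1 << 1)


def vmotion_groups(hosts: dict[str, int]) -> dict[int, list[str]]:
    """Group host names by the state of the AES-related bits.

    Simplified model: hosts in different groups are assumed to be unable
    to vMotion to each other; nothing else is compared.
    """
    groups: dict[int, list[str]] = defaultdict(list)
    for name, ecx in hosts.items():
        groups[ecx & AES_BITS].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical inventory: two older blades (AES off), one new (AES on).
    hosts = {
        "blade-old-1": 0x009A2201,
        "blade-old-2": 0x009A2201,
        "blade-new": 0x029A2203,
    }
    for key, names in vmotion_groups(hosts).items():
        state = "AES enabled" if key == AES_BITS else "AES disabled"
        print(f"{state}: {names}")
```

If this prints more than one group, the hosts are out of sync on AES and need to be brought to the same setting.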

To disable AES (the alternative is to enable it on the existing blades, which one way or another will require you to reboot all VMs sooner or later), you create a bootable CD from the IBM ISO linked below.  When the server boots from that CD, it automatically fixes the problem without you having to push a button.  Here's what the screen looks like once the CD finishes booting (as you can tell, it's as if IBM created this CD specifically for this exact issue):

[Screenshot: aes-disable.jpg]

Here's the goods:

KB Reference: http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5086963&brandind=5000020

Bootable ISO for changing the AES feature: ftp://testcase.boulder.ibm.com/eserver/fromibm/xseries/BoMC-2.20-uEFI-AesEnable-to-enabled-vmotion-fix.iso

Description of the AES CPU feature:  http://software.intel.com/en-us/articles/intel-advanced-encryption-standard-aes-instructions-set/

Benny

