Virtualization with Hyper-V & The NUMA Tax Is Not Just About Dynamic Memory


First of all, to join in this little discussion you need to know what NUMA is and what it does. You can read up on that on the Intel (or AMD) web sites, for example http://software.intel.com/en-us/blogs/2009/03/11/learning-experience-of-numa-and-intels-next-generation-xeon-processor-i/ and http://software.intel.com/en-us/articles/optimizing-software-applications-for-numa/. Do have a look at this SQL Skills blog post http://www.sqlskills.com/blogs/jonathan/post/Understanding-Non-Uniform-Memory-AccessArchitectures-(NUMA).aspx as well, which has some great pictures to help visualize the concepts.

What Is It And Why Do We Care?

We all know that a CPU contains multiple cores today: 2, 4, 6, 8, 12, 16, etc. So in terms of a physical CPU we tend to talk about a processor that fits in a socket and about cores for logical CPUs. When hyper-threading is enabled you double the number of logical processors seen and used. It is said that Hyper-V can handle hyper-threading, so you can leave it on. The logic being that it will never hurt performance and can help to improve it. I suggest you test it :-) as there was a performance bug with it once. A processor today contains its own memory controller, and access to memory from that processor is very fast. The NUMA node concept is older than multi core processor technology, but today you can state that a NUMA node translates to one processor/socket and that all cores contained in that processor belong to the same NUMA node. Sometimes a processor contains two NUMA nodes, like the AMD 12 core processors. In the future, with the ever increasing number of cores, we'll perhaps see even more NUMA nodes per processor. You can state that all Intel processors since Nehalem with QuickPath Interconnect and AMD processors with HyperTransport are NUMA processors. But to be sure, check with your vendors before buying. Assumptions, right?
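If you want to see how sockets, cores and logical processors stack up on a given host, a quick WMI query from PowerShell will show you. A minimal sketch; NumberOfCores and NumberOfLogicalProcessors are available on the Win32_Processor class from Windows Server 2008 onwards:

# List each physical processor (socket) with its core and logical processor counts.
# On a NUMA system each socket typically corresponds to (at least) one NUMA node.
Get-WmiObject -Class Win32_Processor |
    Select-Object DeviceID, NumberOfCores, NumberOfLogicalProcessors
# With hyper-threading enabled, NumberOfLogicalProcessors is double NumberOfCores.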

Beyond NUMA nodes there is also a thing called processor groups, which help Windows use more than 64 logical processors (its former limit) by grouping logical processors; Windows handles 4 such groups, meaning Windows today can support 4 x 64 = 256 logical processors in total. Due to the fact that memory access within a NUMA node is a lot faster than between NUMA nodes, you can see where a potential performance hit is waiting to happen. I tried to create a picture of this concept below. Now you know why I don't make my living as a graphical artist ;-)

[Diagram: NUMA nodes and processor groups]

 

To make it very clear: NUMA is great and helps us in a lot of ways. But under certain conditions and with certain applications it can cause us to take a (serious) performance hit. And if there is anything certain to ruin a system administrator's day, it is a brand new server with a bunch of CPUs and loads of RAM that isn't running any better (or is even running worse) than the one you're replacing. Current hypervisors like Hyper-V are NUMA aware, and the better server applications like SQL Server are as well. That means that under the hood they are doing their best to optimize CPU & memory usage for performance. They do a very good job actually, and depending on your environment you might never know of any issue or even of the existence of NUMA.

But even with a NUMA knowledgeable hypervisor and NUMA aware applications you run the risk of having to go to remote memory. The introduction of Dynamic Memory in Windows 2008 R2 SP1 even increases this likelihood, as there is a lot of memory reassigning going on. Dynamic Memory actually educated a lot of Hyper-V people on what NUMA is and what to look out for. Until Dynamic Memory came on the scene, and the evangelizing that came with it by Microsoft, it was "only" the people virtualizing SQL Server, Exchange & other big hungry applications who were very aware of NUMA, with its benefits and potential drawbacks. If you're lucky the application is NUMA aware, but not all of them are, even the big names.

A Peek Into The Future

What is interesting, as it bears on this discussion, is that leaked screenshots from Hyper-V 3.0 or vNext … have NUMA configuration options for both memory and CPU at the virtual machine level! See Numa Settings in Hyper-V 3.0 for a picture. So the times that you had to script WMI calls (see http://blogs.msdn.com/b/tvoellm/archive/2008/09/28/looking-for-that-last-once-of-performance_3f00_-then-try-affinitizing-your-vm-to-a-numa-node-.aspx) to assign a VM to a NUMA node might be over soon (speculation alert), and it seems like a natural progression from the ability to disable NUMA spanning with W2K8R2 SP1 Hyper-V in case you need it to avoid NUMA issues at the Hyper-V host level. Hyper-V today is already pretty NUMA aware and as such it will try to get all memory for a virtual machine from a single NUMA node; only when that can't be done will it span across NUMA nodes. So, as stated, Hyper-V with Windows Server 2008 R2 SP1 can prevent this from happening, as we can disable NUMA spanning for a Hyper-V host now. The downside is that a VM can't get more memory even if it's available on the host.
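As a side note, you can read that host wide NUMA spanning setting from PowerShell as well. A minimal read-only sketch; the root\virtualization namespace and the NumaSpanningEnabled property are as I recall them on W2K8R2 SP1, so verify on your build (the checkbox itself lives in the Hyper-V Settings GUI):

# Read the host-wide NUMA spanning setting on a W2K8R2 SP1 Hyper-V host.
$vsmssd = Get-WmiObject -Namespace "root\virtualization" `
    -Class Msvm_VirtualSystemManagementServiceSettingData
$vsmssd.NumaSpanningEnabled   # True means VMs are allowed to span NUMA nodes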

[Screenshot: the NUMA spanning setting in Hyper-V Manager]

A working approach to reduce possible NUMA overhead is to limit the number of CPUs, as fewer sockets means a larger share of the memory is local to each one: with 2 CPUs each controls 50% of the memory, with 4 CPUs only 25%, etc. So with more CPUs (and NUMA nodes) the risk of NUMA spanning grows very fast; the arithmetic below spells this out. For memory intensive applications, scaling out is the way to go. Actually you could state that we scale up the NUMA nodes per socket (lots of cores with the largest amount of directly accessible memory possible) and as such do not scale up the server. Try to keep your virtual machines tied to a single CPU on a dual socket server to prevent any indirect memory access and thus a performance hit. But that won't always work. If you ever wondered when an 8/12/16 core CPU comes in handy, well voila … here's a perfect case: packing as many cores as possible on a CPU becomes very handy when you want to limit sockets to prevent NUMA issues but still need plenty of CPU cycles. This should work as long as you can address large amounts of RAM per socket at fast speeds and the CPU internally isn't cut up into too many NUMA nodes, which would be scaling out NUMA nodes within the same CPU; we don't want that or we're back to a performance penalty.
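To put some numbers on that spanning risk, here's a trivial PowerShell illustration of how the share of local memory per NUMA node shrinks as the socket count grows (assuming one NUMA node per socket and evenly populated memory banks; the 128 GB figure is just an example):

# Share of host RAM that is local to a single NUMA node, assuming one
# NUMA node per socket and an even spread of DIMMs across the sockets.
$totalGB = 128
foreach ($sockets in 2, 4, 8) {
    "{0} sockets: {1} GB ({2:P0}) of local memory per NUMA node" -f `
        $sockets, ($totalGB / $sockets), (1 / $sockets)
}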

Stacking The Deck

One way of stacking the deck in your favor is to keep the heavy apps on their own Hyper-V cluster. Then you can tweak it all you want to optimize for SQL Server, Exchange, etc. When you throw these virtual machines into your regular clusters, or for crying out loud onto a VDI cluster, you're going to wreak havoc on the performance. Just like mixing server virtualization & VDI is a bad idea (don't do it), throwing vCPU hungry, memory hogging servers on those clusters just kills the performance and capacity of a perfectly good cluster. I have gotten into arguments over this, as some think one giant cluster for whatever need is better. Well no, you'll end up micromanaging the placement of VMs with very different needs on that cluster, effectively "cutting" it up into smaller "cluster parts". Now, are separate clusters for different needs always the better approach? No. If you only have some small SQL Server needs you can get away with one nice cluster. It depends, I know, the eternal consultant's answer, but I have to say it. I don't want to get angry mails from managers because someone set up a 6 node cluster for a couple of SQL Server Express databases ;-) There are also concepts called testing, proof of concept, etc. It's called evidence based planning. Try it; it has some benefits that become very apparent when you're going to virtualize beefy SQL Server, SharePoint and Exchange servers.

How do you even know it is happening, apart from empirical testing? Aha, excellent question! Take a look at the "Hyper-V VM Vid Numa Node" counter set and read this blog entry on the subject: http://blogs.msdn.com/b/tvoellm/archive/2008/09/29/hyper-v-performance-counters-part-five-of-many-hyper-vm-vm-vid-numa-node.aspx. And keep an eye on the event log for the events described at http://technet.microsoft.com/hi-in/library/dd582929(en-us,WS.10).aspx (for some reason there is no comparable entry for W2K8R2 on TechNet).
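From PowerShell you can enumerate and sample those counters directly. A minimal sketch; the counter set name is as the blog above gives it, and if your build names it slightly differently, Get-Counter -ListSet "Hyper-V VM Vid*" will find it:

# List the counters in the Hyper-V VM Vid Numa Node set.
(Get-Counter -ListSet "Hyper-V VM Vid Numa Node").Counter

# Sample all instances of that set a few times to spot remote memory use.
Get-Counter -Counter "\Hyper-V VM Vid Numa Node(*)\*" -SampleInterval 5 -MaxSamples 3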

Conclusions

To conclude, all of the above, people, is why I'm interested in some of the latest generation of servers. The architecture of the hardware allows a processor to address twice the "normal" amount of memory when you only put two CPUs on a quad socket motherboard. The Dell PowerEdge R810 and the M910 have this; it's called a FlexMem Bridge and it allows more memory to be available without a performance hit. They also allow for more memory per socket at higher speeds, because if you hang a lot of memory directly off one CPU you see a speed drop. A Dell R710 with 48 GB of RAM runs at 1066 MHz, but put 96 GB in there and you fall back to 800 MHz (the snippet below shows how to check this). So yes, bring on those new quad socket motherboards with just 2 sockets used, a bunch of fast directly accessible memory, in a neat 2 unit server package with lots of space for NIC cards & FC HBAs if needed. Virtualization heaven :-) That's what I want so I can give my VMs running SQL Server 2008 R2 & "Denali" (when can I call it SQL Server 2012?) a bigger amount of directly accessible memory from their NUMA node. This can be especially helpful if you need to run NUMA unaware applications like SAP or such. Testing is the way to go to know how well a NUMA aware hypervisor and a NUMA aware application figure out the best approach to optimize the NUMA experience together. I'm sure we'll learn more about this as more information becomes available and as technology evolves. For now we optimize for performance with NUMA where we can, when we can, with what we have :-)

For Exchange 2010 (we even have virtualization support for DAG mailbox servers now as well) scaling out is easier, as we have all the neatly separated roles and control just about everything down to the mail client. With SQL Server applications this is often less clear. There is a varied selection of commercial and home grown applications out there and a lot of them can't even scale out, only up. So your mileage may vary. But for resource & memory heavy applications under your control, for now, scaling out is the way to go.
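As an aside, you don't need a reboot into the BIOS to check what speed your DIMMs ended up running at; WMI exposes it. A quick sketch; note that on some hardware the Speed property can come back empty:

# Show size and configured speed (MHz) of every installed DIMM.
Get-WmiObject -Class Win32_PhysicalMemory |
    Select-Object BankLabel, DeviceLocator, @{n="SizeGB";e={$_.Capacity / 1GB}}, Speed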

A Brighter Future For Public Folders?


The Exchange Team posted a blog entry asking for feedback on how we use public folders. Nice to see they are taking an interest again. For the past 4 years the mantra was "move away from them", "do it now while you still have the time", etc. SharePoint was always put forward as the number one replacement option. For some scenarios this is indeed a good choice, but let's face it, for some public folder uses there is no decent replacement, and that hurts us as they haven't seen any decent improvements in the last 2 Exchange releases. I know public folders have always been a bit problematic and finicky for us administrators. They tend to need a bit of voodoo and patience to troubleshoot and get running smoothly (see a previous blog post of mine for an example of this). But instead of using that as an excuse to get rid of them they could also choose to invest in making them as reliable and robust as mail databases. Giving them the same high availability features might also be a welcome improvement, especially now with DAGs in Exchange 2010.

Especially in the Exchange 2007 era Microsoft was actively promoting getting rid of them. But they are still around because so many people use them and there is no decent alternative for all scenarios. In that respect Microsoft does listen to their customers. But we want improvements. Some of the functionality we need is there, but we really need more robust, reliable and highly available public folders. As a shared mail instrument for both sending and receiving mail in a team, public folders beat shared mailboxes and SharePoint any time. They also shine for maintaining a shared repository of contacts. I'm not a proponent of using public folders as a document repository, but I understand that their relatively simple usage and data protection via replicas still sound attractive to some versus the complexity of SharePoint. Sure, SharePoint has more to offer, but perhaps they don't need those capabilities, and to make matters even less attractive, it's quite an effort to migrate from public folders to SharePoint.

So that left us public folder users feeling a bit abandoned, with a message to get out but no easy path to anywhere else that serves all our needs. So until today all my customers are still using public folders and want to keep using them. They are worried, however, that one day they will be left out in the cold. But perhaps there is a better future on the horizon for public folders. Microsoft is asking us to "Help us learn more about how you use public folders today!" in that blog post. The emphasis is on "usage scenarios, folder management habits or thought process around public folder data organization". So if you need and use public folders in any way and you'd like them to get more attention and evolve into more robust and functional instruments, give Microsoft your feedback. Exchange 2010 has brought us great features & very affordable high availability, together with support for virtualization. Now we either need a better alternative to public folders than the ones we have now or (my preference) we need better public folders. Since consumption of public folders occurs mostly in Outlook I would suggest the latter. And while we're asking, bring back access to folder shares in OWA ;-)

Exchange 2010 SP1 Rollup 3 Pulled – BlackBerrys sending duplicate messages


Just a quick notification. Due to the duplicate message issue with RIM BlackBerry devices and Exchange 2010 SP1 Rollup 3, Microsoft is temporarily pulling RU3. If you don't use BES and have no other issues, don't sweat it. If you wanted RU3 for UDP support with Outlook 2003 or to fix the DAG copies GUI bug, you'll have to wait, especially if you have BlackBerry devices. More on the Exchange Team Blog here.

Exchange 2010 SP1 Rollup 3 Released: Fixes Bug since SP1 in EMC & Brings Back UDP Support


UPDATE March 9th 2011: I have installed Exchange 2010 SP1 Rollup 3 at one site and it did indeed finally fix this issue.

The Microsoft Exchange Team Blog just announced the release here: Released: Update Rollup 3 for Exchange 2010 SP1 and Exchange 2007 SP3. This is good news for all the folks out there that got bitten by the Exchange 2010 SP1 bug that causes the Exchange Management Console (EMC) not to show all database copies after upgrading to Exchange 2010 SP1. I've blogged about this in EMC Does Not Show All Database Copies After Upgrade To Exchange 2010 SP1 and chimed in to the discussion at Database copies are not all showing up in EMC after SP1 upgrade on the Exchange forums. So apart from cheers for the UDP notifications returning in support of Outlook 2003, let's hear it for the EMC case sensitivity bug getting fixed :-)

After a while Microsoft also blogged about this: Database copies fail to display after upgrading to Exchange 2010 Service Pack 1.

We got notified around October 13th that they would include the fix in Exchange 2010 SP1 Rollup 3 but that they were working on an interim update. They dropped the ball there, because communication died about the latter and we were left to conclude we would have to wait for Rollup 3. Well, that took its time. It's now March 2011. One of the reasons I think it took so long for Rollup 3 to arrive is the decision to re-add UDP support to Exchange 2010 for use with Outlook 2003, as blogged about in Microsoft Listens To Customers & Adds UDP Notification Support Back to Exchange 2010.

In the end we have a silly and long unaddressed bug fixed and a welcome aid in migrating customers to Exchange 2010 that are running Outlook 2003. I do wonder, however, whether Microsoft would have fixed this sooner if the bug had been with PowerShell in the EMS and not in the EMC. Sure, it wasn't a blocking issue, as you could manage everything perfectly using PowerShell and it was only a GUI bug, but for some users/customers this is not as obvious, and it made 'm feel a bit like 2nd class citizens, so we had to do some extra "damage" control on that front as well.
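For reference, this is the kind of EMS one-liner that kept working fine all along while the console misbehaved (run from the Exchange Management Shell on a box with the management tools installed):

# List every copy of every mailbox database with its status and queue
# lengths - the information the EMC failed to display after SP1.
Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize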

Microsoft Listens To Customers & Adds UDP Notification Support Back to Exchange 2010


Well, after almost 14 months of deploying Exchange 2010 and tweaking the Outlook 2003 settings via GPOs to give users an acceptable experience, Microsoft adds support for User Datagram Protocol (UDP) notification functionality back into Microsoft Exchange Server 2010. By doing so they recognize that a lot of businesses & organizations will be using Outlook 2003 for a while and that not all of them were happy to deal with the way Outlook 2003 functions with Exchange 2010. More information on the UDP issue can be found here: http://support.microsoft.com/kb/2009942 (In Outlook 2003, e-mail messages take a long time to send and receive when you use an Exchange 2010 mailbox). Now most of my customers use cached mode where possible, and a GPO setting to reduce the Maximum Polling Frequency registry entry to 5 seconds helped. But there are places where cached mode is not an option (Terminal Services) or people don't accept this change in behavior and go with Outlook 2007 instead of 2010, or even choose to deploy Exchange 2007 over 2010. All because of this dropping of the UDP notification support.
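For completeness, that client side tweak boils down to a single registry value. A sketch of setting it directly with PowerShell; the path and value name are as I recall them from KB 2009942, so double-check against the KB, and in production you'd push this via GPO rather than run it per machine:

# Lower Outlook 2003's maximum polling frequency to 5 seconds (5000 ms)
# for mailboxes on Exchange 2010, per the KB 2009942 workaround.
$key = "HKCU:\Software\Policies\Microsoft\Office\11.0\Outlook\RPC"
New-Item -Path $key -Force | Out-Null
Set-ItemProperty -Path $key -Name "Maximum Polling Frequency" -Value 5000 -Type DWord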

Now this functionality will be back in Exchange Server 2010 Service Pack 1 Rollup 3 (SP1 RU3). Good news for people dealing with Outlook 2003 and Exchange 2010. Less good news for the people dealing with the GUI bug that Exchange 2010 SP1 introduced, where the Exchange Management Console does not show all database copies after upgrading to Exchange 2010 SP1. This is set to be fixed in Rollup 3, but to get the UDP support back they adjusted the release schedule for the E2K10 SP1 Rollup 3, which is now expected to be released in March 2011. So we'll have to wait a bit longer for that fix. Note that you need to be running Exchange 2010 SP1 to get this backward compatibility support for Outlook 2003.

Read this announcement on the Exchange Team Blog: UDP Notification Support Re-added to Exchange 2010

Exchange 2010 SP1 Public Folder High Availability Returns with Roll Up 2


A lot of people were cheering during the interactive session on Exchange 2010 SP1 High Availability with Scott Schnoll and Ross Smith of the Exchange Team. They announced (in between goofing around) that the alternate server functionality for public folders, which provides failover to the clients (so they can select another public folder database to connect to) and which is sadly missing from Exchange 2010, would return with Exchange 2010 SP1 Rollup 2. This feature is needed by Outlook to automatically connect to an alternate public folder database, and its return means that high availability will finally be achievable for public folders in Exchange 2010 SP1. That's great news, and frankly an "oversight" that shouldn't have happened even in Exchange 2010 RTM. The issue is described in the knowledge base article "You cannot open a public folder item when the default public folder database for the mailbox database is unavailable in an Exchange Server 2010 environment", which you can find here: http://support.microsoft.com/kb/2409597.

In previous versions of Exchange you made public folders highly available to Outlook clients by having replicas; a sketch of how replicas are set up follows below. The Outlook clients could access a replica on another server if the default public folder database, as defined in the client settings of the mailbox database, was not available. Clustering in Exchange 2010 does nothing for public folders. In Exchange 2010 the Outlook clients connect directly to the mailbox server in order to get to a public folder, so they do not leverage the CAS or CAS array. Also, the DAG does not support public folders, and as clustering happens at the database level on DAG members and no longer at the server level, we no longer get any high availability for the clients with clustering in Exchange 2010. Sure, if you have multiple replicas the data is highly available, but access to another replica/database/server for public folders doesn't happen automatically in Outlook when you're running Exchange 2010. To make that happen you need an alternate server to be offered to the client for selection. But as this feature was missing in Exchange 2010 up until SP1 Rollup 1, in reality you needed to keep using Exchange 2003/2007 until now to have public folder high availability. Exchange 2010 SP1 Rollup 2 will change that. I call that good news.
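As mentioned above, the replicas themselves are managed from the EMS. A small sketch with made-up folder and database names; for whole folder trees Exchange also ships the AddReplicaToPFRecursive.ps1 script in its Scripts folder:

# Give a public folder a second replica on another public folder database.
Set-PublicFolder -Identity "\Sales\Contacts" -Replicas "PFDB01", "PFDB02"

# Verify which databases now hold a replica of the folder.
(Get-PublicFolder -Identity "\Sales\Contacts").Replicas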

Exchange 2010 Public Folder Worries At Customer: No existing ‘PublicFolderProxyInformation’ matches the following Identity


A customer was recently using the EMC GUI in their Exchange 2010 environment, having a look at the public folder properties, when they got this error:

—————————
Microsoft Exchange
—————————
Can't log on to the Exchange Mailbox server 'DAGMBX.demolab.com'. No existing 'PublicFolderProxyInformation' matches the following Identity: '\demolab\HeadQuarters\FincanceDepartment\FiscalUnit'. Make sure that you specified the correct 'PublicFolderProxyInformation' Identity and that you have the necessary permissions to view 'PublicFolderProxyInformation'. It was running the command 'Get-MailPublicFolder -Identity "\demolab\HeadQuarters\FincanceDepartment\FiscalUnit" -Server "DAGMBX.demolab.com"'.
—————————
OK
—————————


Hey … when did this start? They never complained about this before, but then, did they ever use it? This was probably the first time they tried to look at or edit the public folder permissions after doing the following over the past month, in this particular order:

  1. Moving to Exchange 2010 SP1
  2. Removing the last Exchange 2007 servers from the organization.

Now I know about a bug that exists and that was recently blogged about by Dan Rowley in Exchange 2010 get-mailpublicfolder \name returns No existing 'PublicFolderProxyInformation'. The point is that there should be a mailbox database mounted on the server that has the System Attendant mailbox associated with it. However, that is not what is going on here. The mailbox servers are members of a DAG and all of them host a copy of the public folder database. The replication runs fine, users can work with them, and the remaining Outlook 2003 users report no issues. But there is more in that blog: "Basically the work around is to mount a mailbox store on the server that is generating the error, or if there is a database already mounted – verify the system attendant is properly configured to point to a valid homemdb." Now that last point is interesting, and indeed that was the issue here. On two members of the DAG the homeMDB attribute was not set. Now what could be the root cause of this? I don't know, certainly not in this case. All things have been done by the book … Ah well, luckily the fix is not very difficult. We need to put a valid entry in the homeMDB attribute. In this case we'll take the value from the DAG member that had it filled in, which points to what seems to be the most recently created database in the DAG. In Exchange 2010 this is done as described below. Note we have a DAG here, so we can work with any database that has a valid copy on the server(s) in question.

How to check the homeMDB attribute value:

  • Start ADSI Edit and navigate to CN=Configuration,DC=,DC=,DC=/Services/Microsoft Exchange//Administrative Groups/Exchange Administrative Group (FYDIBOHF23SPDLT)//Servers/MBXServerWithIssue
  • Right-click Microsoft System Attendant, and then click Properties to display the Attributes list and find the homeMDB attribute.
  • If the homeMDB attribute has a value, make sure it points to a valid mailbox database. If the value of the homeMDB attribute is empty (not set) or incorrect, you need to fix this. (If you'd rather script this check, see the sketch right below.)
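Here's a small PowerShell sketch that searches the configuration partition for every Microsoft System Attendant object and prints its homeMDB value, flagging the empty ones. It's read-only, but test in a lab first all the same:

# Find all Microsoft System Attendant objects in the configuration
# partition and report their homeMDB values.
$rootDSE = [ADSI]"LDAP://RootDSE"
$configNC = $rootDSE.configurationNamingContext
$searcher = New-Object System.DirectoryServices.DirectorySearcher
$searcher.SearchRoot = [ADSI]"LDAP://$configNC"
$searcher.Filter = "(cn=Microsoft System Attendant)"
foreach ($result in $searcher.FindAll()) {
    $mdb = $result.Properties["homemdb"]
    if ($mdb.Count -gt 0) { "{0} -> {1}" -f $result.Path, $mdb[0] }
    else                  { "{0} -> homeMDB NOT SET" -f $result.Path }
}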


How to fix the homeMDB attribute value:

  • In ADSI Edit, navigate to CN=Configuration,DC=,DC=,DC=/Services/Microsoft Exchange//Administrative Groups/Exchange Administrative Group (FYDIBOHF23SPDLT)/Databases
  • Right-click a mailbox database that is local (non-DAG) or has a valid copy on the server (DAG), select Properties, and in the Attributes list, select distinguishedName, and then click View.
  • Copy the value of the distinguishedName attribute and close the dialogs.


NOTE: in this particular case we could copy the value that was filled in for the homeMDB attribute on one of the DAG members. You might not have one set on any of them.

  • Right-click Microsoft System Attendant, and then click Properties to get to the Attributes list, click homeMDB, and then choose Edit.
  • In the Value box, paste the value that you copied from the distinguishedName attribute.
  • Close the dialog boxes and exit ADSI Edit. (A scripted version of these steps follows below.)
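The same edit can be scripted. A sketch using the demolab.com names from the error above; both distinguished names here are hypothetical placeholders for this example, so adjust them to your organization and be as careful as you would be in ADSI Edit:

# Point the System Attendant's homeMDB at a database that has a valid
# copy on this server. Both DNs below are placeholders for this example.
$saDN = "CN=Microsoft System Attendant,CN=DAGMBX,CN=Servers," +
        "CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," +
        "CN=DemoLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=demolab,DC=com"
$dbDN = "CN=MailboxDatabase01,CN=Databases," +
        "CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," +
        "CN=DemoLab,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=demolab,DC=com"
$sa = [ADSI]"LDAP://$saDN"
$sa.Put("homeMDB", $dbDN)
$sa.SetInfo()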

When you’ve done this you’ll find the following entry in the application event log:

Log Name:      Application
Source:        MSExchangeSA
Date:          11/2/2010 3:25:59 PM
Event ID:      9159
Task Category: General
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      DAGMBX.demolab.com
Description:
Microsoft Exchange System Attendant has detected that the system attendant object in the DS has been modified. System Attendant needs to restart the Microsoft Exchange Free Busy Publishing Service.


After that, wait about 10 minutes for AD to replicate, make sure to close the EMC and start it again, and voila, it’s fixed.