New KB Article 2494016 Related to Windows Server 2008 R2 SP1 Hyper-V: Stop error 0x0000007a When Using CSV in Redirected Access


Well, not a day after my blog post Extra Info on Clustering & Hyper-V with Dynamic Memory When You Start With Windows Server 2008 R2 SP1, on important hotfixes for Hyper-V clustering with Windows Server 2008 R2 SP1, Microsoft released a new hotfix for the issue below. I’ll add it to that post to keep it up to date.

Stop error 0x0000007a occurs on a virtual machine that is running on a Windows Server 2008 R2-based failover cluster with a cluster shared volume, and the state of the CSV is switched to redirected access

The KB article with instructions on how to get the hot fix is here: http://support.microsoft.com/kb/2494016/en-us?sd=rss&spid=14134

The scenario is detailed as follows:

Consider the following scenario:

  • You enable the cluster shared volume (CSV) feature on a Windows Server 2008 R2-based failover cluster.
  • You create a virtual machine on the CSV on a cluster node.
  • You start the virtual machine on the cluster node.
  • You move the CSV owner to another cluster node, and you change the state of CSV to redirected access.
  • The connection that is used for redirected access is switched to another connection when one of the following scenarios occurs:
    • The cable for local area network (LAN) is disconnected.
    • The related network adapter is disabled.
    • The connection is switched by using Failover Cluster Manager.

In this scenario, you receive a Stop error message that resembles the following in the virtual machine:

STOP 0x0000007a ( parameter1 , parameter2 , parameter3 , parameter4 )
KERNEL_DATA_INPAGE_ERROR

Note

  • The parameters in this Stop error message vary, depending on the configuration of the computer.
  • Not all "0x0000007a" Stop error messages are caused by this issue.
  • You may also receive other Stop error messages when this issue occurs. For example, you may receive a "0x0000004F" Stop error message.
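
If you collect Stop lines from several guests and want to sort out which ones are candidates for this KB, a small parser helps. What follows is only a minimal sketch in Python: the function name `parse_stop_line` and the lookup table are my own illustration and not part of the KB article, and the table names only the 0x7A code because that is the only name given above. Remember that a 0x7A match is a hint, not proof, since not all 0x0000007a Stop errors are caused by this issue.

```python
import re

# Only 0x7A is named in the KB text quoted above; any other code stays "UNKNOWN" here.
KNOWN_STOP_CODES = {0x7A: "KERNEL_DATA_INPAGE_ERROR"}

# Matches lines shaped like: STOP 0x0000007a ( p1 , p2 , p3 , p4 )
STOP_LINE = re.compile(r"STOP\s+(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")


def parse_stop_line(line):
    """Return (code, name, parameters) for a Stop line, or None if it does not match."""
    match = STOP_LINE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    # Split the parameter list on commas and drop surrounding whitespace.
    params = [p.strip() for p in match.group(2).split(",") if p.strip()]
    return code, KNOWN_STOP_CODES.get(code, "UNKNOWN"), params


if __name__ == "__main__":
    sample = "STOP 0x0000007a ( parameter1 , parameter2 , parameter3 , parameter4 )"
    print(parse_stop_line(sample))
```

The parameters are kept as plain strings because, as the note says, they vary with the configuration of the computer.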

Extra Info on Clustering & Hyper-V with Dynamic Memory When You Start With Windows Server 2008 R2 SP1:


Here’s a quick “heads up” if you’re starting to use or thinking about using Windows Server 2008 R2 SP1 for your Hyper-V clusters. The most common issues I’ve seen in the wild are:

  1. http://workinghardinit.wordpress.com/2011/04/01/kb2230887-hotfix-for-dynamic-memory-with-windows-2008-standard-web-edition-does-not-apply-to-without-hyper-v-editions/ This one is being worked on and the hotfix will be re-released to support the “Without Hyper-V” SKU of Windows Server 2008 SP2.  It’s a simple oversight but one that can be important when your Hyper-V clusters are filled with that SKU.
  2. We also got bitten by this one: Déjà vu Bug: The network connection of a running Hyper-V virtual machine may be lost under heavy outgoing network traffic on a computer that is running Windows Server 2008 R2 SP1. Luckily, the hotfix was already available.
  3. And then one to heed: read the TechNet forum discussion about Cluster Validation Bug In Windows 2008 R2 SP1 – Disk has a Persistent Reservation on it. They are also working on a fix. I’ve written a blog post on this and I suggest you read it and also take note of the discussion in the TechNet forum.

    UPDATE: The hotfix for issue 3 has become available today, April 26th 2011 as announced on the TechNet forum here:

    A hotfix is now available that addresses the Win2008 R2 service pack 1 issue with Validate on a 3+ node cluster. This is KB 2531907. The KB article and download link will be published shortly, in the mean time you can obtain this hotfix immediately free of charge by calling Microsoft support and referencing KB 2531907.   Update 27/05/2011 Here is the link: http://support.microsoft.com/kb/2531907/en-us?sd=rss&spid=14134

Another one that I haven’t seen in the wild is:

Windows Server 2008 R2 installation may hang if more than 64 logical processors are active. There is a workaround and a hotfix for this one.

Issue: When you try to install Windows Server 2008 R2 on a computer that has more than 64 logical processors, Windows Setup may stop responding in one of the following operations:

  • Initialization of Windows Setup
  • One of the two restarts that are required to complete Setup

Cause: This issue occurs because of an error in the Network Driver Interface Specification (NDIS) driver.
When a computer has more than 64 logical processors, the NDIS driver does not correctly handle some operations. Therefore, the computer encounters stop responding issues and other system failures.

I don’t have any nodes under my care that have more than 64 logical processors, so I guess that’s why. But with ever more cores available it’s bound to happen in the near future.

Update 2: To keep me busy, this KB article on a BSOD with CSV and redirected access, for which a hotfix is available, was released within 24 hours of me posting this blog:

Stop error 0x0000007a occurs on a virtual machine that is running on a Windows Server 2008 R2-based failover cluster with a cluster shared volume, and the state of the CSV is switched to redirected access

The KB article with instructions on how to get the hot fix is here: http://support.microsoft.com/kb/2494016/en-us?sd=rss&spid=14134


The Dilbert® Life Series: The Carbon Copy, Blind Carbon Copy e-mail Pandemic


Disclaimer: The Dilbert® Life series is a string of posts on corporate culture from hell and dysfunctional organizations running wild. This can be quite shocking and sobering. A sense of humor will help when reading this. If you need to live in a sugar coated world where all is well and bliss and think all you do is close to godliness, stop reading right now and forget about the blog entries. It’s going to be dark. Pitch black at times actually, with a twist of humor, if you can laugh at yourself.

Have you ever worked in an office where no one ever takes responsibility and all communication is CC’d and BCC’d to an absurd number of people? There are corporate “cultures” that act very different from our own style. Size often has nothing to do with it. But this habit is just another symptom and indication of blatant management failure.

These are organizations where no one feels like they can make decisions or take actions without involving half of the company in some form of meeting or committee. CYA (Cover Your Ass) in action. One of the symptoms is the fact that just about anyone who has (or thinks they have) some important or urgent information sends all mail with some managers in CC or even BCC. Often the middle management acts the same way and before you know it more CC & BCC recipients are involved, making the entire mail flow a mess and proving without any doubt that you’re in kindergarten. Decisions are postponed or never made. No one is going to take responsibility for a decision, that much is for sure. So when a decision is finally made it is often made by the wrong person, too late, and it is probably not the best one. Basically you have a management structure where no one knows who’s responsible and that is utterly dysfunctional.

This is also a symptom of another issue: managers without authority. Yes, they are not very good at their area of expertise; they can’t delegate or organize and lack real people skills. These are often found in middle management where they can be used by the upper level. After all there needs to be someone between the hammer (management) and the anvil (employees). You see, authority does not come from your rank or pay grade. It comes from what you know and can do and the support you get from other well respected managers or leaders. If you need to CC all your bosses and all bosses of the people you’re mailing, that indicates that you’re a whining kid that can’t hack it. And no, simply not using CC or BCC anymore won’t solve that problem. I thought I’d mention this as they tend to think and act rather simplistically. We have a saying: “You salute the rank, not the man. You respect the man, not the rank.”

Anyway, the mail process is a mess as most people on the mail are not involved, don’t care, don’t need to care, shouldn’t care and hopefully don’t want to care. Once it got so out of control I added some more people to the CC list and wrote sarcastically at the top of the mail body that we really should make an effort to senselessly involve as many people as we possibly could. Not very nice, I know. Shouldn’t do that, I know. Some got the message, some didn’t. Another solution is to ignore the mail. Really, if so many people, including a bunch of managers above and way above me, are in the CC list I would not have the arrogance to assume I have anything to say in the matter and thus I await their proposals or decisions.

The best employee a manager can have is Vanilla Ice. Really “…If there was a problem yo I’ll solve it …”, that’s what a good manager needs and wants. You see, your boss has better things to do than micro manage the details of your incompetence. You know your end state, so all you need to do is figure out what you need and how you’ll achieve it. Results, that’s what your boss really needs, not details and moaning about how hard middle management is. I know shit flows down and gripes flow up but try to maintain a balance or you’ll find yourself holding a pink slip or being promoted to where you can do the least damage and annoy the least number of people. I secretly think some people have that as a cunning plan.

But if you’re stuck with a couple of micro managers, do not despair. You can work around them, unless they surround you. In the latter case, break out and run! They deal with urgent and very pseudo important problems that are actually just details which are benign in nature and are not negatively affected by all that overzealous attention. So the trick is to keep it that way. You have to treat them like mushrooms: keep them in the dark and feed ‘m shit. As long as they don’t know any better and keep getting their “data” fix they are lovable. Whatever you do, don’t give ‘m real information or show them the real problems. Micromanagers really can’t handle them. Ambitious ones that get into the light and get gourmet food can become very dangerous, both to themselves and the organization. Now you do need to keep ‘m involved, as they need to sign and approve work and proposals. So give ‘m finished work, solutions that are ready to go. Forget about involving them in the details or the decision making process, they’ll just get lost. And guess what, this is how a good boss should work and act, so we have a win-win situation for the entire company!

Now you know how to help prevent e-mail from becoming a burden instead of a useful tool. No CC or BCC unless really needed. Go practice it.

BriForum Europe 2011 & The Experts Conference Europe 2011


Great news from the educational & conference front. First of all, I’m attending BriForum in London, United Kingdom in May (http://briforum.com/Europe/index.html). That’s good news; normally we’d have to pop over the big pond to go to that one, so this is pretty neat. And timely: due to some prospecting I’m doing for disaster recovery, business continuity and application aware storage in a virtualized environment, it’s a good match and I hope to get into some educational discussions about the challenges we all face. Some of the storage vendors we’re interested in are there as well, so there is certainly some potential to make it a good experience.

And it was just recently confirmed that The Experts Conference is coming to Europe. TEC2011 Europe will be held in Frankfurt, Germany from October 17th to October 19th 2011. This conference is high quality and created to fill the needs of the most experienced users, which is one of the reasons I would like to attend. The more you learn & grow the more you bump into the next level of challenges, and being able to learn from high level content and interact with experienced speakers and attendees who are dealing with the same issues can be very rewarding. Attendees of TechEd have a way to measure the level of the sessions: they are all supposed to be Level 400 only. Quest is hosting this, so they certainly should be able to round up the expertise. I’m going to make it to the new “track” at this conference and that’s “Virtualization & Cloud”. More information can be found here http://www.theexpertsconference.com/europe/2011/virtualization-cloud-training/overview/

The timing of these conferences is pretty good. As I said, we’re doing a lot of prospecting right now and hope to get a lot of information from attending these. For anyone interested in why I attend conferences and why I think they are valuable, see my blog post on this subject http://workinghardinit.wordpress.com/2010/06/05/why-i-find-value-in-a-conference/

Windows Hyper-V Server R2 SP1 is available for download


Ever since Windows 2008 R2 SP1 became available people have been waiting for Windows Hyper-V Server R2 to catch up. The wait is over as last week Microsoft made it available on their website http://technet.microsoft.com/en-us/evalcenter/dd776191.aspx. That’s a nice package to have when it serves your needs and there’s little to argue about. Guidance on how to configure it and how to get remote management set up has been out for a while and is quite complete, so that barrier shouldn’t stop you from using it where appropriate. If you’re starting out, head over to José Barreto’s blog to get a head start. Here’s some more information on the subject http://technet.microsoft.com/en-us/library/cc794756(WS.10).aspx and naturally there are some tools around to help out if the Microsoft provided tools are not to your liking http://coreconfig.codeplex.com/. So there you go, now you have a free and very capable hypervisor available to the public that gives you high availability, Live Migration, Dynamic Memory and RemoteFX, and they even threw their software iSCSI Target 3.3 into the free package so you can build a free iSCSI SAN supported by Microsoft. Life is good.

Déjà vu Bug: The network connection of a running Hyper-V virtual machine may be lost under heavy outgoing network traffic on a computer that is running Windows Server 2008 R2 SP1


Anyone who’s been doing virtualization with Hyper-V on Windows 2008 R2 has a good chance of having seen the issue described in http://support.microsoft.com/kb/974909/en-us

You install the Hyper-V role on a computer that is running Windows Server 2008 R2.

  • You run a virtual machine on the computer.
  • You use a network adapter on the virtual machine to access a network.
  • You establish many concurrent network connections, or there is heavy outgoing network traffic.

In this scenario, the network connection on the virtual machine may be lost. Additionally, the network adapter is disabled.
Note You have to restart the virtual machine to recover from this issue.

We’ve seen this one on VM’s that have indeed a lot of outgoing traffic.  In our environment the situation looks like this:

  • You can access the VM with Hyper-V Manager or SCVMM but not via RDP as all network connectivity is lost. The status of the guest NIC is always “Enabled” but there is no traffic/connectivity.
  • You can try to disable the NIC but this takes a very long time and when you try to enable it again this never succeeds. Disconnecting the NIC from the virtual network and connecting it again doesn’t help either.
  • You need to shut down the guest but this takes an extremely long time, so long you really can’t afford to wait to see if it ever succeeds. It seems to hang at shutting down with a “non whirling whirly”. So finally you’ll power off the VM and start it up again. Apart from entries related to having no connectivity the event logs are “clean” and there is no indication as to what happened.

Well this exact same issue is back with Windows 2008 R2 SP1. That’s the bad news. The good news is there is a hotfix for it already so you can fix it. You can read up on this issue in Knowledge Base article 2263829  and request the hotfix here. Instructions to get the hotfix are in there as well as a reference to the previous fixes for Windows 2008 R2 RTM.

Consider the following scenario:

  • You install the Hyper-V role on a computer that is running Windows Server 2008 R2 Service Pack 1 (SP1).
  • You run a virtual machine on the computer.
  • You use a network adapter on the virtual machine to access a network.
  • You establish many concurrent network connections. Or, there is heavy outgoing network traffic.

In this scenario, the network connection on the virtual machine may be lost. Additionally, the network adapter may be disabled.
Notes

  • You must restart the virtual machine to recover from this issue.
  • This issue can also occur on versions of Windows Server 2008 R2 that do not have SP1 installed. To resolve the issue, apply the hotfix that is described in one of the following Microsoft Knowledge Base articles:

    974909 (http://support.microsoft.com/kb/974909/ ) The network connection of a running Hyper-V virtual machine is lost under heavy outgoing network traffic on a Windows Server 2008 R2-based computer
    2264080 (http://support.microsoft.com/kb/2264080/ ) An update rollup package for the Hyper-V role in Windows Server 2008 R2: August 24, 2010
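
To keep the KB numbers straight, the mapping above can be written down as a tiny lookup. This is just a sketch in Python under my own naming: `HOTFIXES_BY_HOST_LEVEL` and `hotfix_candidates` are hypothetical names, and the "RTM"/"SP1" labels are labels I chose. The KB numbers are the ones quoted in this post, and the level you pass in is the level of the Hyper-V host.

```python
# KB numbers as quoted in this post, keyed by the service pack level of the Hyper-V host.
# The "RTM" and "SP1" keys are labels chosen for this sketch.
HOTFIXES_BY_HOST_LEVEL = {
    "RTM": ("974909", "2264080"),  # Windows Server 2008 R2 without SP1: either of these
    "SP1": ("2263829",),           # Windows Server 2008 R2 SP1
}


def hotfix_candidates(host_level):
    """Return the KB numbers that apply to a host at the given level ('RTM' or 'SP1')."""
    key = host_level.strip().upper()
    if key not in HOTFIXES_BY_HOST_LEVEL:
        raise ValueError(f"unknown host level: {host_level!r}")
    return HOTFIXES_BY_HOST_LEVEL[key]


if __name__ == "__main__":
    for level in ("RTM", "SP1"):
        print(level, "->", ", ".join(hotfix_candidates(level)))
```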

Oh yeah, people often seem confused as to where to install the hotfix. Does it go on the Hyper-V hosts and/or on the guests? It’s a hypervisor bug in Hyper-V so it goes on the hosts. Have a nice weekend.

Cluster Validation Bug In Windows 2008 R2 SP1 – Disk has a Persistent Reservation on it


Pretty soon after the RTM release of Windows 2008 R2 SP1 we were discussing a bug on the TechNet forum (Hyper-V Cluster issues after applying Win2008 R2 SP1 on a 3 node Cluster!) here. If you have a Windows 2008 R2 SP1 cluster with more than 2 nodes you get the following warning when you run cluster validation:

List Potential Cluster Disks

Disk with identifier 2sef8cdf has a Persistent Reservation on it. The disk might be part of some other cluster. Removing the disk from validation set

“Normally” you would expect such a warning if the LUN ever belonged to another cluster and it needs the old reservation cleared. To do that you would use the following command on the node that throws the warning (where in this example the disk is disk 2 in disk manager/diskpart), after making sure it is not in use anywhere else in the SAN:

"cluster node clusternode1 /clearpr:2"

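If you need to do this on more than one node or disk, it is easy to mistype the node name or disk number. Here is a minimal sketch in Python that only assembles the command shown above from its two variable parts. The helper name `build_clearpr_command` is my own invention, and the sketch deliberately does not run anything: clearing a reservation on a disk that is in use elsewhere is exactly what you want to avoid, so check first and run the printed command yourself.

```python
def build_clearpr_command(node_name, disk_number):
    """Assemble 'cluster node <node> /clearpr:<disk>' as an argument list.

    This only builds the command; it does not execute it.
    """
    if not node_name or any(ch.isspace() for ch in node_name):
        raise ValueError("node_name must be a non-empty name without whitespace")
    if not isinstance(disk_number, int) or disk_number < 0:
        raise ValueError("disk_number must be a non-negative integer")
    return ["cluster", "node", node_name, f"/clearpr:{disk_number}"]


if __name__ == "__main__":
    # Same values as in the example above: node clusternode1, disk 2.
    print(" ".join(build_clearpr_command("clusternode1", 2)))
```

The disk number is the one shown in disk manager/diskpart on the node that throws the warning, as in the example above.
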
However, this is not the cause here, nor was it for most others in this discussion. And I’m pretty sure no SAN software or MPIO software is putting a reservation on there either, so what is this? A bug? Well yes, it has been confirmed by Microsoft support that it is indeed a bug and that a fix will be made available by April 18th 2011.

This was not a show stopper bug, but it could be one if you needed to add a host to a cluster and confirm all is well and supported. However if you’re certain you’ve done everything right you can choose not to run cluster validation.

I will update this blog with more information when the fix becomes available.

UPDATE:  The hotfix has become available today, April 26th 2011 as announced on the TechNet forum here:

A hotfix is now available that addresses the Win2008 R2 service pack 1 issue with Validate on a 3+ node cluster.  This is KB 2531907.  The KB article and download link will be published shortly, in the mean time you can obtain this hotfix immediately free of charge by calling Microsoft support and referencing KB 2531907. Update 27/05/2011 Here is the link: http://support.microsoft.com/kb/2531907/en-us?sd=rss&spid=14134