Design Considerations For Converged Networking On A Budget With Switch Independent Teaming In Windows Server 2012 Hyper-V


Last Friday I was working on some Windows Server 2012 Hyper-V networking designs and investigating the benefits & drawbacks of each. Some fellow MVPs were also working on designs in that area and some interesting questions & answers came up (thank you Hans Vredevoort for starting the discussion!).

You might have read that for low cost, high value 10Gbps network solutions I find the switch independent scenarios very interesting, as they keep complexity and costs low while optimizing value & flexibility in many scenarios. Talk about great ROI!

So now let’s apply this scenario to one of my (current) favorite converged networking designs for Windows Server 2012 Hyper-V: two dual NIC LBFO teams. One is used for virtual machine traffic and one for other network traffic such as Cluster/CSV/Management/Backup traffic. You could even add storage traffic to that, but in this particular case storage connectivity is provided by Fibre Channel HBAs. Also note that with teaming we forego RDMA/SR-IOV.

For the VM traffic the decision is rather easy. We go for Switch Independent with Hyper-V Port mode. Look at Windows Server 2012 NIC Teaming (LBFO) Deployment and Management to read why. The exceptions mentioned there do not come into play here and we are getting great virtual machine density this way. With lesser density 2-4 teamed 1Gbps ports will also do.

But what about the team we use for the other network traffic? Do we use Address Hash or Hyper-V Port mode? Or better put, do we use native teaming with tNICs as shown below, where we can use DCB or Windows QoS?

image

Well, one drawback here with Address Hash is that only one member will be used for incoming traffic with a switch independent setup. QoS with DCB and policies isn’t that easy for a system admin and the hardware is more expensive.

So could we use a virtual switch here as well with QoS defined on the Hyper-V switch?

image

Well as it turns out in this scenario we might be better off using a Hyper-V Switch with Hyper-V Port mode on this Switch independent team as well. This reaps some real nice benefits compared to using a native NIC team with address hash mode:

  • You have a nice load distribution of the different vNICs’ send/receive traffic over the team, with each vNIC tied to a single member of the NIC team. This way we don’t get into a scenario where we only use one NIC of the team for incoming traffic. The result is a better balance between incoming and outgoing traffic, as long as none of those exceeds the capability of one of the team members.
  • Easy to define QoS via the Hyper-V Switch even when you don’t have network gear that supports QoS via DCB etc.
  • Simplicity of switch configuration (complexity can be an enemy of high availability & your budget).
  • Compared to a single team of dual 10Gbps ports you can get a much higher VM density, even when the VMs have rather intensive network traffic, and the non VM traffic gets lots of bandwidth as well.
  • Works with the cheaper line of 10Gbps switches
  • Great TCO & ROI
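
To make the first bullet a bit more tangible, here is a minimal sketch in Python that models where inbound traffic lands in both modes. The vNIC names, the traffic figures and the round-robin pinning of vNICs to team members are assumptions for illustration only; the real placement is decided by the teaming software.

```python
from collections import defaultdict


def inbound_load_hyperv_port(vnic_inbound_gbps, team_members):
    """Hyper-V Port mode: each vNIC is tied to one team member, so its
    inbound traffic lands on that member. Round-robin pinning is an
    assumption made for this sketch."""
    load = defaultdict(float)
    for index, gbps in enumerate(vnic_inbound_gbps.values()):
        member = team_members[index % len(team_members)]
        load[member] += gbps
    return {member: load[member] for member in team_members}


def inbound_load_address_hash(vnic_inbound_gbps, team_members):
    """Switch independent + Address Hash: all inbound traffic arrives on a
    single team member (the first one in this sketch)."""
    total = sum(vnic_inbound_gbps.values())
    return {m: (total if i == 0 else 0.0) for i, m in enumerate(team_members)}


# Hypothetical inbound demand per host vNIC, in Gbps.
demand = {"Management": 0.5, "CSV": 3.0, "LiveMigration": 4.0, "Backup": 2.5}
members = ["NIC1", "NIC2"]

print(inbound_load_hyperv_port(demand, members))
print(inbound_load_address_hash(demand, members))
```

In this made-up example Hyper-V Port mode spreads the inbound load over both members, while Address Hash would push all of it through a single 10Gbps port.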

With a dual 10Gbps team you’re ready to roll. All software defined. Making the switches just easy to use providers of connectivity. For smaller environments this is all that’s needed. More complex configurations in the larger networks might be needed high up the stack but for the Hyper-V / cloud admin things can stay very easy and under their control. The network guys need only deal with their realm of responsibility and not deal with the demands for virtualization administration directly.
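
As a rough illustration of what QoS on the Hyper-V switch can give you, the sketch below assumes weight-based minimum bandwidth, where under contention each vNIC is guaranteed a share equal to its weight divided by the sum of all weights. The vNIC names, the weights and the 10Gbps link figure are hypothetical.

```python
def minimum_bandwidth_gbps(weights, link_gbps):
    """Translate relative weights into the minimum bandwidth each vNIC is
    guaranteed when the link is congested."""
    total = sum(weights.values())
    return {name: link_gbps * weight / total for name, weight in weights.items()}


# Hypothetical weights for the host vNICs on one 10Gbps team member.
weights = {"Management": 10, "CSV": 30, "LiveMigration": 40, "Backup": 20}
shares = minimum_bandwidth_gbps(weights, link_gbps=10)

for name, gbps in shares.items():
    print(f"{name}: at least {gbps:.1f} Gbps under contention")
```

When the link is not congested any vNIC can use more than its share, which is what makes this approach attractive on a budget.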

I’m not saying DCB, LACP or Switch Dependent is bad, far from it. But the cost and complexity scare some people, while they might not even need it. With the concept above they could benefit tremendously from moving to 10Gbps in a really cheap and easy fashion. That’s hard (and silly) to ignore. Don’t over engineer it, don’t IBM it and don’t go for a server rack PhD in complex configurations. Don’t think you need to use DCB, SR-IOV, etc. in every environment just because you can or because you want to look awesome. Unless you have a real need for the benefits those offer, you can get simplicity, performance, redundancy and QoS in a very cost effective way. What’s not to like? If you worry about LACP etc., consider this: Switch independent mode allows for firmware upgrades with nearly no service down time compared to stacking. It’s been working very well for us and avoids the expense & complexity of vPC, VLT and the like. Life is good.

I’m Attending The 2013 MVP Global Summit


Well, that time of the year is getting closer again. It’s something different, unique and somewhat exclusive. It’s the 2013 MVP Global Summit!

image

For this summit MVPs from all over the world converge on Bellevue/Redmond near Seattle. The summit takes place on and around the Microsoft campus, where we discuss our favorite & most important MSFT technologies in depth amongst each other and with Microsoft staff.

I have the good fortune of being able to attend again this year. I have to express my thanks to our top management for this. This is very valuable to both me and my employers. It’s also fun to discuss the technology you work with amongst so many like-minded people in the same business. The amount of knowledge sharing, insights and ideas around Redmond creates a stimulating buzz and I loved every moment of it last year. I met many great professionals and interesting people with whom, from breakfast till after dinner drinks, we had a truckload of interesting discussions. It’s a bit of a geek fest.

So I’m looking forward to all this and also to meeting up again with some MSFT employees and professionals from the Seattle area I got to know last time.

The MVP summit is also a good time to pass feedback from others on to Microsoft. You’re not in the driver’s seat when it comes to the direction Windows and Hyper-V will take. However, you cannot have your opinions taken into consideration unless you let them be heard. So, please feel free to share any remarks, feedback or feature requests you’d like the virtualization, cluster, storage, file share, network, etc. product teams to know about. You can post them in the comments for all to see. Too shy to post it publicly? You can send me an e-mail via the contact form on my blog or direct message me via @workinghardinit on Twitter.

Now the entire summit is under NDA (Non Disclosure Agreement) but that doesn’t mean it’s a pure diplomatic mission. We all love the technology, that is for sure, but we also pass along the bad and the ugly next to the good. It’s not marketing or indoctrination. If it was, MVPs would not spend the time and money to attend.

That’s where the words “independent” and “real world” come into play. We’re not a bunch of fan boys. The communication goes both ways and I think that makes this event extra valuable to both parties. I’m looking forward to the 2013 MVP Summit and I have a lot of feedback and questions based on using Windows Server 2012 and Hyper-V in real life.

Cluster Aware Updating – Cluster CNO Name 15 Characters (NETBIOS name length) GUI Issue


There seems to be a small bug in the Cluster Aware Updating GUI when the cluster name exceeds 15 characters. In our example we’ll look at a cluster with the name XXXCLUSSQLSERVERS or xxxclussqlservers.test.lab. We’ll try to connect to that cluster to do some cluster aware updating.

Click on the dropdown arrow and select our cluster

image

 

Once selected, click “Connect”

image

 

Now we’re greeted by this little message

image

No, you didn’t make a typo, as you selected the cluster from the drop down list. You also know that your cluster is up and running. So what happened? Well, the GUI queries AD and returns the CNOs it finds. Those are limited to the NetBIOS name and as such are at most 15 characters long. In this case the name is XXXCLUSSQLSERVERS and this gives a CNO of XXXCLUSSQLSERVE, which is not found as a cluster.

The fix is easy and simple. Just type in the full cluster name, XXXCLUSSQLSERVERS, and voilà. You can connect and are on your way.
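
The truncation at play can be sketched in a few lines of Python. This is only an illustration of the 15 character NetBIOS limit described above, not how the CAU GUI is actually implemented; the function names and the SQLCLUSTER01 example are made up for this example.

```python
NETBIOS_MAX_LENGTH = 15


def netbios_name(cluster_name):
    """Return the NetBIOS style name: the host label of the (possibly fully
    qualified) name, cut off at 15 characters and upper-cased."""
    host_label = cluster_name.split(".")[0]
    return host_label[:NETBIOS_MAX_LENGTH].upper()


def dropdown_entry_is_usable(cluster_name):
    """True when the truncated name still equals the real cluster name,
    in other words when the name is 15 characters or shorter."""
    return netbios_name(cluster_name) == cluster_name.split(".")[0].upper()


print(netbios_name("XXXCLUSSQLSERVERS"))           # XXXCLUSSQLSERVE
print(netbios_name("xxxclussqlservers.test.lab"))  # XXXCLUSSQLSERVE
print(dropdown_entry_is_usable("XXXCLUSSQLSERVERS"))
print(dropdown_entry_is_usable("SQLCLUSTER01"))
```

The 17 character name loses its last two characters, which is exactly the name the drop down list offers and fails to connect to.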

image

Let’s see if the FQDN is accepted as well, shall we? And yes, the below screenshot proves this.

http://workinghardinit.files.wordpress.com/2012/12/image43.png?w=584

Conclusion

So this is not a problem once you know this. The CAU GUI returns the cluster CNO name and that’s the NetBIOS name, which can be only 15 characters long. Selecting it in CAU to connect to the cluster doesn’t work. You need to fill out the complete name. As we demonstrated, the CAU GUI also accepts a FQDN. To prevent running into this issue, consider not making your cluster names longer than 15 characters, as then the CNO and the cluster name will be identical. That is a smart thing to do anyway, as you’ll avoid possible duplicate CNOs trying (and failing) to be created, or other bugs.

In PowerShell you always submit the cluster name so you don’t hit this issue. Perhaps the GUI drop down list could translate the CNOs into the actual cluster names?

Windows Server 2012 Hyper-V Supports IPsec Task Offloading


IPsec has been around for a while now. In an ever more security conscious & regulated world you want and/or are required to protect your network communication by authenticating and encrypting the contents of at least some of your network traffic. Think about SOX and HIPAA and you’ll see that trade or government security requirements are not going anywhere but up for us all. This is not just restricted to military or intelligence organizations.

We’ve seen the ability to offload IPsec traffic to the NIC for a while now. This is great, as IPsec processing is a very CPU intensive workload. Unfortunately it didn’t work for virtual machines. Until now IPsec offload was only available to host/parent workloads in Windows Server 2008 R2. The virtualization of high volume network traffic workloads that require encryption means a serious hit on the resources of the host. If you’re willing to pay you might get by by throwing extra host & CPU power at the issue. But what if the load means a single virtual machine with 4 vCPUs can’t hack it? Game over. Sure, Windows Server 2012 Hyper-V allows for up to 64 vCPUs per virtual machine now, but that is very costly, so this is not a very cost effective solution. So in some cases this led to those workloads being marked as “unsuited for virtualization”.

But with Windows Server 2012 Hyper-V we get a very welcome improvement: a virtual machine can now also offload the IPsec processing to the physical NIC on the host. That frees up a lot of CPU cycles to perform more application-level work, resulting in better virtualization densities, which means lower costs etc.

Let’s take a look at where you can set this in the Hyper-V GUI. You’ll find it under Network Adapter / Hardware Acceleration.

image

IPsec offload is also managed by the Hyper-V switch, which controls whether the offloading will be active or not. This is to prevent IPsec offload from stopping the services if insufficient resources are available. Please do note that IPsec, when required in the guest, will be done anyway, creating an extra CPU burden. So this does not disable IPsec, just the offloading of it. On top of this, and in the gravest extreme, you can guarantee that IPsec servers get the resources they need by sacrificing less important guests if needed, by using virtual machine prioritization. The fact that you can configure the number of security associations helps to balance the needs of multiple virtual machines requiring IPsec offload.

To conclude, this wouldn’t be Windows Server 2012 if you couldn’t do all this with PowerShell. Take a look at Set-VMNetworkAdapter and notice the following parameter:

-IPsecOffloadMaximumSecurityAssociation <UInt32>

This specifies the maximum number of security associations that can be offloaded to the physical network adapter that is bound to the virtual switch and that supports IPsec Task Offload. The thing to notice here is that specifying a value of zero disables the IPsec Offload feature.
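
To illustrate how this value could be used to balance several virtual machines, here is a small planning sketch in Python. The NIC capacity of 4096 security associations, the VM names and the weights are all hypothetical. The only things taken from the text above are the Set-VMNetworkAdapter parameter and the rule that a value of zero disables the offload.

```python
def plan_security_associations(nic_sa_capacity, vm_weights):
    """Split the SA capacity of the physical NIC over the VMs in proportion
    to their weight. A weight of 0 results in 0 SAs for that VM."""
    total = sum(vm_weights.values())
    if total == 0:
        return {vm: 0 for vm in vm_weights}
    return {vm: (nic_sa_capacity * w) // total for vm, w in vm_weights.items()}


def offload_enabled(max_security_associations):
    """A value of zero disables IPsec offload for the virtual NIC."""
    return max_security_associations > 0


def to_powershell(vm_name, max_security_associations):
    """Build the matching command line as text. Nothing is executed here."""
    return (f"Set-VMNetworkAdapter -VMName {vm_name} "
            f"-IPsecOffloadMaximumSecurityAssociation {max_security_associations}")


# Hypothetical capacity and weights.
plan = plan_security_associations(4096, {"vpn-gateway": 3, "sql-01": 1, "test-vm": 0})
for vm, sa_count in plan.items():
    state = "on" if offload_enabled(sa_count) else "off"
    print(f"{to_powershell(vm, sa_count)}   # offload {state}")
```

The guest that gets zero still does IPsec if it is required there, it just pays for it with its own vCPU cycles.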

image

TechEd Europe 2012 (Amsterdam, 25-29 June 2012)


After a sad year of no TechEd Europe in 2011, one of our favorite tech conferences for Microsoft technologies is back in full force. Ladies & Gentlemen, TechEd Europe 2012 will be here sooner than you think.

It’s more than just technical training, it is networking, white board sessions and passionate discussion amongst peers, experts & Microsoft employees who built the products. If you still need more technical content than that take a look at the pre-conference agenda for a full day of expertly delivered education.

No this is not just a commercial, I haven’t missed a TechEd Europe yet this century and for good reasons. If you’d like to read why take a look at this blog post Why I Find Value In A Conference

There will be loads of sessions on all products in the System Center 2012 and Windows 8. In the developer sphere there’s the .NET Framework 4.5 & Visual Studio 2012 to look forward to. Combine this with a lot of experience based guidance on current technologies and you can’t afford to miss out. To avoid disappointment register as soon as possible to join your fellow IT Pros & Developers.

Hope to see you there!

Windows 8 introduces SR-IOV to Hyper-V


We dive a bit deeper into SR-IOV today. I’m not a hardware or software network engineer, but this is my perspective on what it is and why it’s a valuable addition to the toolbox of Hyper-V in Windows 8.

What is SR-IOV?

SR-IOV stands for Single Root I/O Virtualization. The “Single Root” part means that the PCIe device can only be shared with one system. The Multi Root I/O Virtualization (MR-IOV) is a specification where it can be shared by multiple systems. This is beyond the scope of this blog but you can imagine this being used in future high density blade server topologies and such to share connectivity among systems.

What does SR-IOV do?

Basically SR-IOV allows a single PCIe device to emulate multiple instances of that physical PCIe device on the PCI bus. So it’s a sort of PCIe virtualization. SR-IOV achieves this with NICs that support it (hardware dependent) by using physical functions (PFs) and virtual functions (VFs). The physical device (think of this as a port on a NIC) is known as a Physical Function (PF). The virtualized instances of that physical device (that port on our NIC that gets emulated x times) are the Virtual Functions (VFs). A PF acts like a full blown PCIe device and is configurable; it acts and functions like a physical device. There is only one PF per port on a physical NIC. VFs are only capable of data transfers in and out of devices and can’t be configured or act like real PCIe devices. However, you can have many of them tied to one PF, and they share the configuration of the PF.

It’s up to the hypervisor (software dependency) to assign one or more of these VFs to a virtual machine (VM) directly. The guest can then use the VF NIC ports via a VF driver (so there need to be VF drivers in the integration components) and traffic is sent directly (via DMA) in and out of the guest to the physical NIC, bypassing the virtual switch of the hypervisor completely. This reduces CPU overhead and increases performance of the host, and as such also helps with network I/O to and from the guests. It’s as if the virtual machine uses the physical NIC in the host directly. The hypervisor needs to support SR-IOV because it needs to know what PFs and VFs are and how they work.

image

So SR-IOV depends on both hardware (NIC) and software (hypervisor) that support it. It’s not just the NIC by the way, SR-IOV also needs a modern BIOS with virtualization support. Now most decent to high end server CPUs today support it, so that’s not an issue. Likewise for the NIC. A modern quality NIC targeted at the virtualization market supports this. And of course SR-IOV also needs to be supported by the hypervisor. Until Windows 8, Hyper-V did not support SR-IOV, but now it does.

I’ve read in an HP document that you can have 1 to 6 PFs per device (NIC port) and up to 256 “virtual devices” or VFs per NIC today. But in reality that might not be viable due to the overhead in hardware resources associated with this. So 64 or 32 VFs might be about the maximum, but still, 64*2=128 virtual devices from a dual port 10Gbps NIC is already pretty impressive to me. I don’t know what the limits are for Hyper-V 3.0, but there will be limits to the number of SR-IOV NICs in a server and the number of VFs per core and host. I think they won’t matter too much for most of us in reality. And as technology advances we’ll only see these limits go up, as the SR-IOV standard itself allows for more VFs.
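
The arithmetic above is simple enough to put in a tiny Python sketch. The even split of bandwidth over the VFs is purely an assumption to get a feel for the numbers; real traffic will not be divided that neatly.

```python
def total_virtual_functions(ports, vfs_per_port):
    """Number of virtual functions a NIC exposes across all of its ports."""
    return ports * vfs_per_port


def even_share_gbps(port_gbps, vfs_per_port):
    """Bandwidth per VF if every VF on a port were busy at the same time and
    the port bandwidth were split evenly (an assumption for this sketch)."""
    return port_gbps / vfs_per_port


for vfs in (32, 64):
    total = total_virtual_functions(ports=2, vfs_per_port=vfs)
    share = even_share_gbps(port_gbps=10, vfs_per_port=vfs)
    print(f"{vfs} VFs per port: {total} VFs on a dual port NIC, "
          f"{share * 1000:.2f} Mbps each if all are saturated")
```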

So where does SR-IOV fit in when compared to VMQ?

Well, it does away with some overhead that still remains with VMQ. VMQ took away the overload of a single core in the host having to handle all the incoming traffic. But the hypervisor still has to touch every packet coming in and out. With SR-IOV that issue is addressed, as it allows moving data in and out of a virtual machine to the physical NIC via Direct Memory Access (DMA). So with this the CPU bottleneck is removed entirely from the process of moving data in and out of virtual machines. The virtual switch never touches it. To see a nice explanation of SR-IOV take a look at the Intel SR-IOV Explanation video on YouTube.


VMQ Coalescing tried to address some of the pain of the next bottleneck of using VMQ, which is the large number of interrupts needed to handle traffic if you have a lot of queues. But as we discussed already, this functionality is highly under-documented and it’s a bit of a black art, especially when NIC teaming and some advanced NIC software issues come into play. Dynamic VMQ is supposed to take care of that black art and make it more reliable and easier.

Now in contrast to VMQ & RSS that don’t mix together in a Hyper-V environment you can combine SR-IOV with RSS, they work together.

Benefits Versus The Competition

One of the benefits that Hyper-V 3.0 in Windows 8 has over the competition is that you can live migrate to a node that’s not using SR-IOV. That’s quite impressive.

Potential Drawback Of Using SR-IOV

A drawback is that by bypassing the Extensible Virtual Switch you might lose some features and extensions. Whether this is very important to you depends on your environment and needs. It would take me too far for this blog post, but Cisco seems to have enough aces up its sleeve to offer an integrated management & configuration interface for both the networking done in the extensible virtual switch and the SR-IOV NICs. You can read more on this over here: Cisco Virtual Networking: Extend Advanced Networking for Microsoft Hyper-V Environments. Basically they:

  1. Extend enterprise-class networking functions to the hypervisor layer with Cisco Nexus 1000V Series Switches.
  2. Extend physical network to the virtual machine with Cisco UCS VM-FEX.

Interesting times are indeed ahead. Only time will tell what the many vendors have to offer in those areas & for what types of customer profiles (needs/budgets).

A Possible Usage Scenario

You can send data traffic over SR-IOV if that suits your needs. But perhaps you’ll want to keep that data traffic flowing over the extensible Hyper-V virtual switch. But if you’re using iSCSI to the guest, why not send that over the SR-IOV virtual function to reduce the load on the host? There is still a lot to learn and investigate on this subject. As a little side note: how are the HBAs in Hyper-V 3.0 made available to the virtual machines? SR-IOV, but the PCIe device here is a Fibre Channel HBA, not a NIC. I don’t know any details but I think it’s similar.

Full Steam Ahead With Windows 8 & Hyper-V in 2012


Some History

There have been a good number of people who’ve always used, some a lot more and some others a lot less, a bit of Microsoft bashing to gain some extra credibility or try to position other products as superior. Sometimes this addressed, at least, some real challenges and issues with Microsoft products. A lot of the time it doesn’t. I have always found this ridiculous. In the early years of this century I was told to get out of the Microsoft stack and into the LAMP stack to make sure I still had a job in a few years’ time. My reaction was to buy Inside SQL Server 2000 among other technology books. The paradox is that in some cases, like with some storage integrators, the ones doing the bashing are forgetting that their customers are often heavily invested in the Microsoft stack.

I Still Have A Job

As you might have realized already, I still have a job today. I’m very busy, building more and better environments based on Microsoft technologies. Microsoft does not get everything right. Who does? Sometimes it takes more than a few tries, sometimes they fail. But they also succeed in a lot of their endeavors. They are able to learn, adapt and provide outstanding results, with a very good support system to boot (I would dare say that you get out of that what you put into it). Given the size and nature of the company, combined with IT evolving at the speed of light, that’s not an easy task.

Today that ability translates into the upcoming release of Windows 8. Things like Hyper-V 3.0, the new storage and networking features, the improvements to clustering and the file system are the current state of an evolution. A path from Windows 2000 over Windows 2003 (R2) to the milestone Windows 2008, which was improved with Windows 2008 R2. Now Windows 8, being the next generation, improves vastly on that very good and solid foundation. With Windows 8 we’ll take the next step forward in building highly scalable, highly available, feature rich and very functional solutions in a very cost effective manner. On top of that we can do more now than ever before, with less complexity and with affordable standard hardware. If you have a bigger budget, great, Windows 8 will deliver even more and better bang for the buck if and when your hardware vendors get on the bandwagon.

Windows 8 & Storage

One of the things the Windows BUILD Conference achieved is that it made me want to buy hardware that I couldn’t get yet. Just try asking DELL or HP for RDMA support on 10Gbps and you get a bit of a vacant stare.

Another thing is that it made me look at our storage roadmap again. One of the few sectors in IT that is still very expensive is storage. Some of the storage vendors might start to feel a bit like a major network gear vendor. You know, the one that has also seen the effects of serious competition from high quality but lower cost kit. Just think about what Storage Pools/Spaces will do for affordable, easy to use and rich storage solutions. There is value both with standard off the shelf (read affordable) hardware and with modern SANs that leverage the Windows 8 features. Heed my warning, storage vendors. You’re struggling in the SMB market due to complexity, cost and way too much overhead and expensive services. Well, it’s only going to get worse. You’ll have to come up with better proposals or you’ll end up being high end / niche market players in the future. Let’s face it, if I can buy a Supermicro chassis with the disks of my choosing I can build my own storage solution for cheap and use Windows 8 to meet my storage needs. Perhaps it’s 80/20 but hey, that’s great. It’s not that much better with more expensive solutions (vendor disks are ridiculously overpriced) and the support process is sometimes a drain on your workforce’s time and motivation. And yes, you paid for that. Compare this with being able to buy some spare parts on the cheap and having it all available off the shelf from the vendors. No more calls, no more bureaucratic mess for return parts, no more IT illiterate operators to work through before you reach support that can be substandard as well. Once you reach a certain level of hardware quality there is not that much difference any more except for price and service. Granted, some vendors are better at this than others. The really big ones often struggle to get this right.

I’ve been in this business long enough to know that all stuff breaks. SLAs are fine for lawyers and for management. CYA is part of doing business. But as an IT Pro in the field you need reliable people, gear and services. On top of that you have to design for failure. You know things will break. So it should be as cheap, easy and fast as possible to fix, while your design and architecture should cope with the effects of a failure. That’s what IT Pros need and that’s what keeps things running (not that SLA paper in the mailbox of your manager).

Show the Windows customers a bit more love than you have done in the past. Some in the storage industry tend to look down on the Windows OS. But guess what, it is your largest customer base. Unless you want to end up in the same niche as a very expensive personal trainer for Hollywood stars (tip: there’s not a huge job market there) you’d better adjust to new realities. A lot of them are doing that already, some of them aren’t. To those: get over it and leverage the features in Windows 8. You’ll be able to sell to a more varied public and at the high end you’ll have even better solutions to offer. Today I notice way too many storage integrators who haven’t even looked at Windows 8. It’s about time they started … really, like today. I mean, how do you want to sell me storage today if you can’t answer my queries on Windows 8 & System Center 2012 support and integration? To me this is huge! I want to know about ODX, RDMA, SMI-S and yes, I want you to be able to tell me how your storage deals with CSVs. You should know about the consumption of SCSI-3 persistent reservations and have a rock solid hardware VSS provider. If you can do that it creates the warm fuzzy feeling a customer needs to make that leap of faith.

When I look at the network improvements in Windows 8, things like RDMA, SMB 2.2 and File Transfer Offload, and what that means for file sharing and data intensive environments, I’m pretty impressed. Then there is Hyper-V 3.0 and its many improvements. Only a fool would deny that it is a very good, affordable & rich hypervisor with a bright future as far as hypervisors go (they are not the goal, just a means to an end). Live Storage Migration, an extensible virtual switch, monitoring of the virtual switch, Network Virtualization, Hyper-V Replica, … it’s just too much to mention here. But hop on over to Windows 8 Hyper-V Feature Glossary by Aidan Finn. He’s got a nice list up of the new features relevant to the Hyper-V crowd. Again, we see improvements for all business sizes, from SMB to enterprise, including the ISPs and Cloud providers. Windows 8 is breaking down barriers that would prevent its use in various environments and scenarios. Objections based on missing features, scalability, performance or security in multi tenancy environments are being wiped off the map. If you want to see some musings on this subject just look at Group Video Interview: What is your favorite Hyper-V feature in Windows 8?

2012 & Beyond

Hyper-V is growing. It has already won the hearts and minds of many smaller Microsoft shops but it’s also growing in the enterprise. The hybrid world is here when you look at the numbers, even if it’s not yet the case in your neck of the woods. Why? Cost versus features. Good enough is good enough. Especially when that good is rather great. On top of that the integration is top notch, it won’t cost you a fortune and it will save you a lot of plumbing hassle.

Basically everyone can benefit from all this. You’ll get more and better at a lesser or at least a more affordable cost. Even if you don’t use any Microsoft technologies you’ll benefit from the increased competition. So everyone can be happy.

Experts2Experts Conference London (UK) 2011


I’m at the Experts2Experts Conference in London and I’m having a great time talking shop, tech & business with my fellow IT Pro colleagues from around Europe. Aidan Finn, Jeff Wouters, Carsten Rachfahl, Ronnie Isherwood.

It might be fun for Microsoft to join us for some of these lunch & dinner time discussions. It would provide them with great feedback, ideas and concerns. Very educational. While we’re discussing Citrix, VMware, Microsoft & ISV solutions (RES, Appsense) this is not a vendor centric conference. Sure, we all work with these products, but we’re discussing them from our point of view. The challenges, the issues, the successes & failures are discussed and mentioned.

There’s a high density of virtualization, private cloud, desktop virtualization (VDI, Terminal Servers, Application Virtualization, Client hosted virtual desktops etc.) expertise at the conference to make it interesting.

Tomorrow I’ll be sharing some musings on “High Performance & High availability Networks for Hyper-V Clusters” during my session.

Direct Connect iSCSI Storage To Hyper-V Guest Benefits From VMQ & Jumbo Frames


I was preparing a presentation on Hyper-V cluster high availability & high performance networking by, you guessed it, presenting it. During that presentation I mentioned Jumbo Frames & VMQ (VMDq in Intel speak) for the virtual machine, Live Migration and CSV networks. Jumbo frames are rather well known nowadays, but VMQ is still something people have read about and at best have tinkered with, and not many are using it in production.

One of the reasons for this is that it isn’t explained and documented very well. You can find some decent explanations of what it is and does for you, but that’s about it. The implementation information is woefully inadequate and, as with many advanced network features, there are many hiccups and intricacies. But that’s a subject for another blog post. I need some more input from Intel and/or MSFT before I can finish that one.

Someone stated/asked that they knew that Jumbo frames are good for throughput on iSCSI networks and as such would also be beneficial to iSCSI networks provided to the virtual machines. But how about VMQ? Does that do anything at all for IP based storage? Yes it does. As a matter of fact it’s highly recommended by MSFT IT in one of their TechEd 2010 USA presentations on Hyper-V and storage.

So yes, enable VMQ on both NIC ports used for iSCSI to the guest. Ideally these are two dedicated NICs connected to two separate switches to avoid a single point of failure. You do not need to team these on the host or have Multipath I/O (MPIO) running for this at the parent level. The MPIO part is done in the virtual machine guests themselves, as that’s where the iSCSI initiator lives with direct connect. And to address the question that followed: you can also use Multiple Connections per Session (MCS) in the guest if your storage device supports this, but I must admit I have not seen this used in the wild. And then, finally coming to the point, both MPIO and MCS work transparently with Jumbo Frames and VMQ. So you’re good to go.

WDeployConfigWriter Account Issues – Trouble Shooting Web Deploy 2.0 With Lessons Learned


Here’s a small recap of a troubleshooting incident we dealt with recently and that served as a coaching exercise for troubleshooting. It seems we have Web Deploy 2.0 in use for in house deployments of web apps. It seems to be a valued asset as well. At least valuable enough to land a help request on the desk of one of the young, eager, smart and upwardly mobile IT Professionals when it stops working and they need some assistance.

Hello ICT,

To deploy our we websites remotely we use web deployment service (see http://technet.microsoft.com/en-us/library/dd569087(WS.10).aspx for more info).

This service runs under the network service account by default. Deploying fails now. In the security log on the server I find  "The specified account’s password has expired".

Does anyone know the password of this account?

Best regards,

Hardworking Web Guy In Trouble

Basically we have enough information to know something went wrong and that they need it to work again. But that’s about it. Password for the network service account expired? They also included an error log and reading it teaches us something. The lesson to be learned here: investigate yourself, read the logs, interpret them. Don’t let patients give you a diagnosis. Their input is critical, but you need to draw your own conclusions.

An account failed to log on.

Subject:
                Security ID:                           LOCAL SERVICE
                Account Name:                    LOCAL SERVICE
                Account Domain:                NT AUTHORITY
                Logon ID:                              0x3e5

Logon Type:                                         8

Account For Which Logon Failed:
                Security ID:                           NULL SID
                Account Name:                    WDeployConfigWriter
                Account Domain:                lab.test

Failure Information:
                Failure Reason:                     The specified account’s password has expired.
                Status:                                0xc000006e
                Sub Status:                            0xc0000071

Process Information:
Caller Process ID: 0x1f44
Caller Process Name: C:\Windows\System32\inetsrv\WMSvc.exe

What did we just read and learn? No, it’s not the Network Service account whose password has expired. It doesn’t work that way, so that was our first indication that something isn’t quite right in the support ticket. As you can see, the real problem account is mentioned in the error log: WDeployConfigWriter. That account is indeed a local account.

WdeployAccounst
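Reading the log by eye works fine, but the same interpretation can be sketched in a few lines of code. The snippet below is a minimal sketch, not an official parser: it pulls the failing account, the calling process and the status codes out of the event text and maps the codes to their NTSTATUS names. The function name `summarize_failed_logon` and the regular expressions are my own assumptions about the layout of the event text shown above.

```python
import re

# A few NTSTATUS values that commonly show up in "An account failed to log on" events.
NTSTATUS = {
    0xC000006A: "STATUS_WRONG_PASSWORD",
    0xC000006D: "STATUS_LOGON_FAILURE",
    0xC000006E: "STATUS_ACCOUNT_RESTRICTION",
    0xC0000071: "STATUS_PASSWORD_EXPIRED",
}

SAMPLE_EVENT = r"""
Subject:
    Account Name:    LOCAL SERVICE
Account For Which Logon Failed:
    Account Name:    WDeployConfigWriter
Failure Information:
    Status:          0xc000006e
    Sub Status:      0xc0000071
Process Information:
    Caller Process Name: C:\Windows\System32\inetsrv\WMSvc.exe
"""


def _field(pattern, text):
    """Return the first captured group of pattern in text, or None."""
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1).strip() if match else None


def _status_name(code):
    """Translate a hex status string to its NTSTATUS name when we know it."""
    return NTSTATUS.get(int(code, 16), code) if code else None


def summarize_failed_logon(event_text):
    # Skip the Subject block: it names the caller, not the account that failed.
    _, _, failed = event_text.partition("Account For Which Logon Failed:")
    return {
        "account": _field(r"^\s*Account Name:\s*(\S+)", failed),
        "status": _status_name(_field(r"^\s*Status:\s*(0x[0-9A-Fa-f]+)", failed)),
        "sub_status": _status_name(_field(r"^\s*Sub Status:\s*(0x[0-9A-Fa-f]+)", failed)),
        "caller": _field(r"^\s*Caller Process Name:\s*(.+)$", failed),
    }


print(summarize_failed_logon(SAMPLE_EVENT))
```

The point is the same as in the prose: the account that matters is the one under "Account For Which Logon Failed", and the sub status tells you why the logon failed.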

 

Cool, now we check what service runs under that account by looking in the Services panel … none! The easy way to check is to sort on the "Log On As" column. You won’t find WDeployConfigWriter. Right …, what else do we learn from the Services panel? Well, we do have a service called Web Deployment Agent Service running under the local Network Service account. We can stop and start it just fine, so there is nothing wrong with the Network Service account, which is as expected, and this service is not our culprit. What we also learn is that this is Web Deploy 2.0.

Service

 

So the Web Deployment Agent Service has nothing to do with the problem at hand. Where is that WDeployConfigWriter account being used and what is its status? Let’s take a look.

WdeployAccountsettings

 

Hey, how could this account have expired? This is impossible. Unless they changed it while trying to fix the error. We check this with a quick phone call and yes, they did exactly that. The good thing is that this web guy is a professional and tells us what they did. Some people think this might get them into trouble and won’t do that. It doesn’t change anything, things are what they are, but it does make communication harder when you discover people act that way… So the lesson here is to double check & verify what happened if at all possible. Originally the settings were:

WDeployAccountOriginalSettings 
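For reference, the checkboxes in that account dialog correspond to bits in the flags field of a local account, as defined in the Windows SDK header lmaccess.h. The sketch below only decodes such a flags value. How you obtain it (for example via NetUserGetInfo) is left out, and the two example values are hypothetical illustrations of "before" and "after" settings, not values read from the server in this story.

```python
# Flag bits as defined in lmaccess.h for the flags field of a local user account.
UF_ACCOUNTDISABLE = 0x0002
UF_PASSWD_CANT_CHANGE = 0x0040      # "User cannot change password"
UF_NORMAL_ACCOUNT = 0x0200
UF_DONT_EXPIRE_PASSWD = 0x10000     # "Password never expires"
UF_PASSWORD_EXPIRED = 0x800000

FLAG_NAMES = {
    UF_ACCOUNTDISABLE: "account disabled",
    UF_PASSWD_CANT_CHANGE: "user cannot change password",
    UF_NORMAL_ACCOUNT: "normal account",
    UF_DONT_EXPIRE_PASSWD: "password never expires",
    UF_PASSWORD_EXPIRED: "password expired",
}


def decode_account_flags(flags):
    """Return the human readable names of all known bits set in flags."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]


# Hypothetical example values: password may expire and the user may change it ...
original = UF_NORMAL_ACCOUNT
# ... versus both checkboxes ticked afterwards.
changed = UF_NORMAL_ACCOUNT | UF_PASSWD_CANT_CHANGE | UF_DONT_EXPIRE_PASSWD

print(decode_account_flags(original))
print(decode_account_flags(changed))
```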

 

They changed them after they ran into issues, hoping that checking those options might fix it. Well no, expired is expired and you can’t fix it like that. You do indeed need to correct the settings if you don’t want the password to expire, and you can even prevent the user from changing it, but you also need to set a new password when it has already expired. After doing so we contact the hardworking web guy in trouble to let him test, and we predict a new error: whatever runs under that account will now fail to run due to an incorrect password. And guess what? “Unknown user name or bad password” in the security log.

Log Name:      Security
Source:        Microsoft-Windows-Security-Auditing
Date:          24/06/2011 10:30:39
Event ID:      4625
Task Category: Logon
Level:         Information
Keywords:      Audit Failure
User:          N/A
Computer:     server1.lab.test
Description:
An account failed to log on.

Subject:
    Security ID:        LOCAL SERVICE
    Account Name:        LOCAL SERVICE
    Account Domain:        NT AUTHORITY
    Logon ID:        0x3e5

Logon Type:            8

Account For Which Logon Failed:
    Security ID:        NULL SID
    Account Name:        WDeployConfigWriter
    Account Domain:        lab.test

Failure Information:
    Failure Reason:        Unknown user name or bad password.
    Status:            0xc000006d
    Sub Status:        0xc000006a

Process Information:
    Caller Process ID:    0x1f44
    Caller Process Name:    C:\Windows\System32\inetsrv\WMSvc.exe

 

The user wants to do a repair install, or uninstall and reinstall the application, to “get a quick fix”, but we do not give in and keep troubleshooting. It’s better to learn what the cause really is and how to fix it instead of relying on wishful reinstalling.

So where is the thing that runs under that account? We start a quick search in the registry and on the file system for the account name, just in case it’s configured in the registry or a configuration file, and let it run while we keep investigating. We also send a tweet into the universe, as perhaps someone out there knows this and can help out. We search the internet for Web Deploy 2.0 and WDeployConfigWriter. This results in very few hits, hmmm, interesting … One of them is http://blogs.iis.net/msdeploy/archive/2011/04/05/announcing-web-deploy-2-0-refresh.aspx
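The file system part of such a search can be sketched as below. This is a minimal illustration and not the tool we used: the function name `find_files_mentioning` and the size limit are my own choices. It walks a directory tree and reports every file that contains the account name, checking both UTF-8 and UTF-16 encodings because Windows configuration files come in both.

```python
import os


def find_files_mentioning(root, needle, max_bytes=5_000_000):
    """Yield paths under root whose content contains needle (case-insensitive)."""
    lowered = needle.lower()
    # Look for the name as plain UTF-8/ASCII and as UTF-16 little endian.
    variants = [lowered.encode("utf-8"), lowered.encode("utf-16-le")]
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) > max_bytes:
                    continue  # skip very large files to keep the search quick
                with open(path, "rb") as handle:
                    data = handle.read().lower()
            except OSError:
                continue  # unreadable or locked file, move on
            if any(variant in data for variant in variants):
                yield path
```

You would call it with a starting folder and the account name, for example `list(find_files_mentioning(some_folder, "WDeployConfigWriter"))`, and let it run in the background while you continue investigating elsewhere.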

In that blog post we learn a few things. The most important is the one line I formatted in bold and red in the blog snippet right below. I also enlarged the picture from the blog post to make it readable, so you can see where in IIS to find what we learned here:

Notice that Web Deploy setup created two new local user accounts:

- WDeployConfigWriter, which has Write permissions to the IIS server’s applicationHost.config. This is used by delegation rules for createApp, appPoolNetFx and appPoolPipelineMode.

I’ve included the entire block of text from where this was taken below.

1. Easier setup for non-administrator deployments on IIS7

One of the common requests from our users was to make it easier to setup Web Deploy so non-administrators can publish to their sites. Typically, you will need to do this if you are running a shared hosting environment or if you are administering a build machine and you do not want users to have admin access.

If you launch the Web Deploy installer and choose “Custom”, you will notice a new option, “Configure for Non-administrator Deployments”:

clip_image001

If you choose this option, Web Deploy will automatically create Management Service Delegation rules for the following providers, as well as user the accounts needed for providers like createApp and recycleApp that need elevated privileges.

These are the rules you will have in the Management Service Delegation UI in IIS Manager after you install this component:

Notice that Web Deploy setup created two new local user accounts:

- WDeployConfigWriter, which has Write permissions to the IIS server’s applicationHost.config. This is used by delegation rules for createApp, appPoolNetFx and appPoolPipelineMode.

- WDeployAdmin, which is an administrator. This is used by delegation rules for recycleApp.

If you prefer to create these rules by hand, uncheck the component in the installer. We also provide a PowerShell script for creating delegation rules (more on this later in the post) if you prefer that route.

Well armed with this information we go have a look at the Management Service Delegation:

ManagementServiceDelegation

 

Where we indeed find createApp, appPoolNetFx and appPoolPipelineMode:

ManagementServiceDelegationWebdeployconfig

 

So now we take a look at what we can configure here and, sure enough, double clicking on a rule opens the Edit Rule form:

ManagementServiceDelegationWebdeployconfigSettings

 

So we click on Edit security credentials and are welcomed by this form:

ManagementServiceDelegationWebdeployconfigSettingsPW1

 

So we enter the account name and the new password we set before (remember to do this for both providers):

ManagementServiceDelegationWebdeployconfigSettingsPW2

 

Guess what, end user happy, things are working again. Yay! From service down report to helpdesk to fully operational again in less than an hour, with a technology new to the service desk. Well done, young, eager, smart and upwardly mobile IT Pro, with lessons learned.

How did this happen, and how did they end up with this funky configuration (an expiring password on an account of which no one knows what it is used for or where it is configured)? Aha, operational control => know the configuration of what you use and know why it is configured that way and where it’s configured. Is it a mistake/assumption in the installer that the accounts WDeployConfigWriter and WDeployAdmin have their passwords set to expire and can be changed by the user, or did somebody mess with them after the install? Well, I did the test by setting it up on a test server and found that they are indeed installed with their passwords set to expire and that the password can be changed by the user. The installer assumes that the person doing the install knows and realizes the implications. I’m not saying either setting is wrong, but you should know why, when and where. There is no documentation on this as far as we could find right now, and perhaps the installer should mention the benefits/risks of both types of configuration and ask what to choose. This, together with better documentation, could help prevent this issue. As always, no guarantees given.

Overall lesson: don’t assume things, trust but verify …