In Defense of Switch Independent Teaming With Hyper-V


For many old timers (heck, that includes me) NIC teaming with LACP mode was the best of the best, at least when it comes to teaming options. Other modes often led to active/passive setups and less-than-optimal aggregation of receiving network traffic. Basically, and perhaps oversimplified, I could say the other options were only used if you had no other choice to get things to work. Which we did a lot … I used Intel’s different teaming modes for various reasons in the past (before we had MLAG, VLT, VPC, …). Trying to use LACP where possible was a good approach in the past in physical deployments and early virtualized environments, when 1Gbps networking dominated the datacenter realm and Windows did not have native support for LBFO.

But even LACP, even in those days, had some drawbacks. It’s the most demanding form of teaming. For one, it required switch stacking. This demands the same brand and type of switches, and it means you have no redundancy during firmware upgrades. That’s bad, as the only way to work around it is to move all workloads to another rack unit … if you even had the capability to do that! So even in days past we chose different modes of teaming out of need or because of the above limitations for high availability. But the superiority of NIC teaming with LACP still stands for many, and as modern switches support MLAG, VLT, etc. the drawback of stacking can be avoided. So does that mean LACP for NIC teaming is always the superior choice today?

Some argue it is, and now they have found support in the documentation about the Microsoft CPS system. Look, even if Microsoft chose to use LACP in that solution, it’s based on their particular design and the needs of that design; I do not concur that this is the best choice overall. It is however a valid and probably the best choice for their specific setup. I applaud the use of MLAG (when available to you at no or very low cost) to have all bases covered, but it does not mean that LACP is the best choice for the majority of use cases with Hyper-V deployments. Microsoft actually agrees with me on this in their Windows Server 2012 R2 NIC Teaming (LBFO) Deployment and Management guide. They state that a Switch Independent configuration with Dynamic distribution (or Hyper-V Port if on Hyper-V but not on W2K12R2) is the best possible default choice for teaming in both native and Hyper-V environments. I concur, even if perhaps not that strongly for native workloads (it depends). Exceptions to this:

  • Teaming is being performed in a VM (which should be rare),
  • Switch dependent teaming (e.g., LACP) is required by policy, or
  • Operation of a two-member Active/Standby team is required by policy.

In other words, in 2 out of 3 cases the reason is a policy, not a technically superior solution …
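If you want to set that recommended default up from PowerShell instead of the GUI, a minimal sketch could look like the one below. The team, NIC and vSwitch names are just placeholders for whatever your environment uses.

  # Create a switch independent team with dynamic load balancing (the recommended default)
  New-NetLbfoTeam -Name "HostSwitchTeam" -TeamMembers "NIC1","NIC2" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic -Confirm:$false

  # Bind a Hyper-V virtual switch to the team interface
  New-VMSwitch -Name "vSwitch" -NetAdapterName "HostSwitchTeam" -AllowManagementOS $false

  # Verify the teaming mode and load balancing algorithm
  Get-NetLbfoTeam -Name "HostSwitchTeam" | Format-List Name, TeamingMode, LoadBalancingAlgorithm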

Note that there are differences between Address Hash, Hyper-V Port & the new Dynamic distribution modes, and the latter has made things better in W2K12R2 in regards to bandwidth, but you’ll need to read the white papers. Use Dynamic as the default, it is the best. Also note that LACP/Switch Dependent doesn’t mean you can send & receive to and from a VM over the aggregated bandwidth of all team members. Life is more complicated than that. So if that’s your main reason for switch dependent and you think you’re done => beware Winking smile.

Switch Independent is also way better for optimization of VMQ. You have more queues available (sum-of-queues) and the IO path is very predictable & optimized.
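You can quickly see what you’re working with on a host; nothing vendor specific is assumed here:

  # Check VMQ support and the number of receive queues each team member offers
  Get-NetAdapterVmq

  # See which virtual machines actually have a VMQ queue assigned right now
  Get-NetAdapterVmqQueue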

If you don’t control the switches there’s a lot more cross-team communication involved to set up teaming for your hosts. There’s more complexity in these configurations, so more possibilities for errors or bugs. Operational ease is also a factor.

The biggest drawback could be that for receiving traffic you cannot get more than the bandwidth a single team member can deliver. That’s true, but optimizing receiving traffic has its own demands and might not always be that great if the switch configuration isn’t that smart & capable. Do I ever miss the potential ability to aggregate incoming traffic? In real life I do not (yet), but in some configurations it could do a great job of optimizing that when needed.

When using 10Gbps or higher you’ll rarely be in a situation where receiving traffic exceeds what a single 10Gbps team member can deliver, and if you want to handle that amount of traffic you really need to leverage DVMQ. And as said, switch independent teaming with Hyper-V Port or Dynamic mode gives you the most bang for the buck, as you have more queues available. This drawback is mitigated a bit by the fact that modern NICs have a way larger number of queues available than they used to have. But if you have more than one VM that is eating close to 10Gbps in a non-lab environment and you’re planning to have more than 2 of those on a host, you need to start thinking about 40Gbps instead of aggregating a fistful of 10Gbps cables. Remember the golden rule: a single bigger pipe is always better than a bunch of small pipes.

When using 1Gbps you’ll be at that point sooner, and as 1Gbps isn’t a great fit for (Dynamic) VMQ anyway I’d say, sure, give LACP a spin to try and get a bit more bandwidth, but will it really matter? In native workloads it might, but with a vSwitch? Modern CPUs eat 1Gbps NICs for breakfast, so I would not bother with VMQ. But when you’re tied to 1Gbps it’s probably due to budget constraints and you might not even have stackable, MLAG, VLT or other capable switches. The arguments can be made, it depends (see Don’t tell me “It depends”! But it does!). In any case I start saving for 10Gbps Smile

Today, with the PC8100 series and the N4000 series available (budget 10Gbps switches; yes, I know “budget” is relative, but in the 10Gbps world they offer outstanding value for money), I tend to set up MLAG with two of these per rack. This means we have all options and needs covered at no extra cost and without sacrificing redundancy under any condition. However, look at the needs of your VMs and the capability of your NICs before using LACP for teaming by default. The fact that switch independent works with any combination of budget switches to get redundancy doesn’t mean it’s only to be used in such scenarios. That’s a perk for those without more advanced gear, not a consolation prize.

My best advice: do not over-engineer it. Engineer it for the best possible solution for the environment at hand. When choosing a default it’s not about the best possible redundancy and bandwidth under certain conditions. It’s about the best possible redundancy and bandwidth under most conditions. It’s there that switch independent comes into its own, today more than ever!

There is one other very real, but luckily also very rare, case where LACP/switch dependent will save you and switch independent won’t: dead switch ports, where the port becomes dysfunctional. Switch independent protects against NIC, switch and cable failures, but here it doesn’t help you as it doesn’t know anything is wrong (it’s about link failures, not logical issues on a port).

For the majority of my Hyper-V deployments I do not use switch dependent / LACP. The one situation where I did had to do with Windows NLB in combination with ICMP multicast.

Note: You can do VLT, MLAG, stacking and still leverage switch independent teaming, LACP or static switch dependent is NOT mandatory even when possible.

Hyper-V Technical Preview Live Migration & Changing Static Memory Size


I have played with Hot Add & Remove Static Memory in a Hyper-V vNext virtual machine before and I love it. As I’m currently testing (actually sometimes abusing) the Technical Preview a bit to see what breaks, I sometimes test silly things. This is one of them.

I took a Technical Preview VM with 45GB of memory, running in a Technical Preview Hyper-V cluster, and live migrated it. I then tried to change the memory size up and down during the live migration to see what happens, or at least to confirm nothing goes “BOINK”. Well, not much: we get a notification that we’re being silly. So no failed migrations, no crashed or messed-up VMs or, even worse, hosts.

image
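For those who want to repeat the silliness, this is roughly what I did; the VM and node names are just examples from my lab:

  # From one console: live migrate the clustered VM to another node
  Move-ClusterVirtualMachineRole -Name "TP-TestVM" -Node "TPNode2" -MigrationType Live

  # From a second console, while the migration is still in flight: try to resize the static memory
  Set-VM -Name "TP-TestVM" -StaticMemory -MemoryStartupBytes 50GB

The second command is the one that gets politely refused while the live migration is running.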

It’s early days yet but we’re getting a head start, as there is a lot to test and that will only increase. The aim is to get a good understanding of the features, the capabilities and the behavior to make sure we can leverage our existing infrastructures and Software Assurance benefits as fast as possible. Rolling cluster upgrades should certainly help us do that faster, with more ease and less risk. What are your plans for vNext? Are you getting a feeling for it yet or waiting for a more recent test version?

Hyper-V Guest Protected Network Testing Tip


I’ve been pinged a few times over the years with people saying that the new protected network feature does not work for them. This setting is set per vNIC of the virtual machine.

image

The issue lies in how & what people test, barring any number of other reasons why a live migration might not start or complete. What people tend to do is disable a NIC to which the vSwitch is connected. But a Protected Network is about media sense loss detection of network disconnects, and this requires the NIC to actually be there and enabled. Remember, we’re talking about the NIC on the host connected to the virtual switch. A physical link failure here, meaning that the virtual switch the protected virtual network adapter is connected to no longer has network connectivity, will lead to all the VMs with the protected network setting enabled being live migrated to another node in the cluster that still has a connected virtual switch for the same network. The latter is to avoid senseless virtual machine migrations to other nodes that might also have lost connectivity due to a failed physical switch.

So the point is that testing by disabling the NIC in the OS will not do. You need to unplug the cables to the virtual switch, disable the port on the switch or even shut down the switch (a bit drastic).

Do note that it can take a little time for the live migration to kick in (it varies a bit), but it beats having to wait for the issue to be resolved. You’ll see event id 1255 logged when the VMs lose network connectivity:

image
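If you’d rather check for that event with PowerShell than dig through the event viewer, something along these lines works; the exact log the event lands in can differ per setup, so treat the log name below as an assumption to adjust:

  # Look for event id 1255 (clustered VM lost network connectivity) on the node
  Get-WinEvent -FilterHashtable @{ LogName = 'Microsoft-Windows-FailoverClustering/Operational'; Id = 1255 } |
      Select-Object TimeCreated, Id, Message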

In this day and age, with NIC teaming to redundant switches & the fact that you might be using converged networking, these tests aren’t as simple as you might think. Also, don’t pull out all of the cables used for clustering if you want the cluster to be able to help you out here with a live migration. When the other cluster nodes can’t talk to the node you’re testing in any way, it will be kicked out of the cluster, the VMs will go down, be moved to another node and started there. This might seem obvious, but if you are using a teamed 10Gbps solution in a converged setup this might cause exactly that.

Another thing to note is that if you have a virtual switch with a dedicated backup network exposed to hosts & VMs that can tolerate downtime, you might want to disable protected networks on that vNIC, as you don’t want to live migrate the VMs off when that network has an issue. It all depends on your needs & tastes.

Last but not least please behave, and don’t do anything silly in production when testing this. Be careful in your testing.

Golden Nuggets: Windows Server 2012 R2 Failover Cluster CSV Placement Policy


Some enhancements only become truly evident to people when they see them in action. For many features this means something needs to go wrong before they kick in. Others are more visible during normal operations. This is the case with the CSV enhancements in Windows Server 2012 R2 Failover Clustering.

One golden nugget here is the CSV placement policy (which really shines in combination with SOFS/Storage Spaces). This will spread ownership of the CSVs amongst the cluster nodes to ensure a balanced distribution. In a failover cluster, one node is the “coordinator node” (owner) for a CSV. The coordinator node owns the physical disk resource that is associated with a logical unit (LUN). All I/O operations for the file system on that LUN are routed through the coordinator node. In previous versions there was no automatic rebalancing of coordinator node assignment. This means that all LUNs could potentially be owned by the same node. In Storage Spaces & SOFS scenarios this becomes even more important.
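You can see the placement policy at work for yourself with a quick query on the cluster; a minimal sketch:

  # Show which node currently owns (coordinates) each CSV
  Get-ClusterSharedVolume | Select-Object Name, OwnerNode

  # Count the CSVs per owner node to check how balanced the distribution is
  Get-ClusterSharedVolume | Group-Object OwnerNode | Select-Object Name, Count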

The benefits

  • It helps all nodes carry their share of the workload as it load balances the disk I/O.
  • Failovers of CSV owners are potentially quicker and more predictable/consistent as an even distribution ensures that no one node owns a disproportionate number of CSVs.
  • When losing storage access the number of CSVs that are in redirected mode is potentially smaller as they are evenly distributed. In an unbalanced cluster, in a worst-case scenario, it could be all of them.
  • When using SOFS with Storage Spaces it makes sure the Storage Spaces Ownership is distributed fairly.

When does it happen

  • Each time a node leaves or joins the cluster. This means you don’t need to intervene manually or via PowerShell to get an even distribution. This goes both for exiting nodes and for adding a new node. The new node will get a CSV assigned if there is a surplus on one of the existing nodes.
  • The process also works when you start a failover cluster after it has been shut down.

When customers see this in action (it’s most obvious when they add a node, as then they are normally watching) they generally smile as the cluster does its job, getting the best possible results out of their hardware.

The Hyper V Amigos Showcast Episode 6: Storage Spaces


Everybody is very busy and I’m a bit tired, but here’s the 6th episode of the Hyper-V Amigos showcast. In this episode we get to play a bit with Storage Spaces in Carsten’s lab.

As always we had a lot of fun doing so and thanks to Carsten Rachfahl and the assistance of Kerstin (his charming wife, also an MVP, in Office 365) we could simulate hardware failures & film them for you!

 

Carsten & I discuss several scenarios and what’s happening during failovers. Carsten assists customers with this a lot, so he has some of the most varied experience with Storage Spaces and SOFS out there! Interesting stuff, and for people who haven’t even looked at Windows Server 2012 or later yet, a wake-up call to start, as the world is not limited to what we once knew. It’s not your daddy’s Windows anymore Winking smile

I hope you enjoy it and we’re already planning for the next one!

Dell generation 13 servers & Intel E5 v3 18 core CPUs are upon us in a world where per-core licensing is a reality


As I watched the Intel E5 v3 launch event & DELL releasing their next generation servers to the public for purchase, there is a clear opportunity for hardware renewal next year. I’m contemplating what the new Intel E5 v3 18-core processors

image

and the great DELL generation 13 PowerEdge Servers mean for the Hyper-V and SQL server environments under my care.

image

For the Hyper-V clusters I’m in heaven. At least for now, as Windows is still licensed per socket at the time of writing. vNext has me worried a bit, thinking about what would happen if that changes to core-based licensing too. Especially with SQL Server virtualization. I do hope that if MSFT ever goes for per-core licensing for the OS they might consider giving us a break for dedicated SQL Server Hyper-V clusters.

image

For per-core licensing with SQL Server Enterprise we need to run the numbers and be smart in how we approach this. Especially since you need Software Assurance to be able to have license mobility & failover / high availability. All this at a time you’re told significant cost cutting has to happen across the board.
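Running those numbers is simple arithmetic; here’s a hedged little sketch with made-up cluster sizes. SQL Server Enterprise per-core licenses come in 2-core packs, but do check the rules in your own agreement, this is just the back-of-the-envelope part:

  $nodes          = 3    # nodes in the dedicated SQL Server Hyper-V cluster
  $socketsPerNode = 2
  $coresPerSocket = 18   # the new Intel E5 v3 top bin

  $totalCores   = $nodes * $socketsPerNode * $coresPerSocket
  $twoCorePacks = [math]::Ceiling($totalCores / 2)

  "Physical cores to license: $totalCores"
  "SQL Server 2-core license packs needed: $twoCorePacks"

At 108 cores for a modest 3-node cluster you see immediately why core counts, node counts and consolidation discipline matter.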

So what does this mean? The demise of SQL Server in the enterprise, like some suggest? Nope. The direct competitors of SQL Server in that arena are even more expensive. The alternatives to SQL are just that: alternatives; in certain scenarios you don’t need SQL (Server) or you can make do with SQL Server Express. But what about all the cases where you really do need it? You’ll just have to finance the cost of SQL Server. If that’s not possible, the business case justifying the tool is no longer there, which is valid. As the saying goes, if you can’t afford it, you don’t need it. A bit harsh, yes, I realize, but this is not a life-saving medicine we’re talking about but a business tool. There might be another reason your SQL Server licensing has become unaffordable. You might be wasting money due to how SQL Server is deployed and used in your environment. To make sure you don’t overpay you need to evaluate whether SQL Server consolidation is what is really needed to save the budget.

Now please realize that consolidation doesn’t mean stupidly under provisioning hardware & servers to make budget work out. That’s just plain silly. For some more information on this, please read Virtualizing Intensive Workloads on Hyper-V, Can It Be Done

So what is smart consolidation (not all specific to SQL Server by the way):

  • You have to avoid physical SQL Server sprawl with a vengeance.
  • You need to consolidate SQL Servers aggressively.
  • Virtualize on a dedicated SQL Server Hyper-V cluster if possible
  • Favor scale out over scale up in the Hyper-V scenario to keep node costs reasonable and allow for affordable expansion.
  • Use 2 socket servers and replace the hardware faster to keep the number of needed cores down.
    • This allows you to leverage modern commodity, high performance storage, networking and compute where you can in order to optimize workloads & minimize costs.
    • It helps save on power consumption & cooling.
    • More nodes with fewer cores (the scale-out approach) reduce VM density per node but also keep the cost of adding a node, which is your scaling block with a fixed cost (SQL Server per-core licensing, or, if it comes to that, the OS as well), under control. It’s all about balance and it isn’t as easy as it seems.
  • Play the same game with storage. This can be a harder sale to make internally. Traditionally people hang on to storage longer due to the high CAPEX. I have said it before: storage vendors have to deliver more & better. Even the challengers & hyper-converged systems are still too expensive to really get into a short renewal cycle for most organizations.

Be smart about it. A great DBA can make a difference here and some hardcore performance tuning is what can save a serious amount of money. If on top of that you have some good storage & network skills around, you can achieve a lot. Next to the fact that you’ll have to spend serious money for serious workloads, the ugly truth is that consolidation requires you to find your peak loads and scale for those with a vengeance. Look, maxing out one server on which one SQL Server is running isn’t that bad. But 3 SQL Servers running at peak performance spread over a 3-node Hyper-V cluster dedicated to SQL Server VMs might kill performance all over!

The good news is I have solid ideas, visions, plans and options to optimize both the on-premises & cloud parts of networking, storage & compute. Remember that there is no one size fits all. Execution follows strategy. The potential for very performant, cost-effective & capable solutions is right there. I cannot give you a custom solution for your needs in a blog post. One danger with fast release cycles is that they require yearly OPEX, and if an organization cannot guarantee that, the shift in design to solutions with less longevity could become problematic when they can’t come up with the money. Cutting some of the “fat” means you will not be able to handle longer periods of budget drought very well. There is no free lunch.

So measure twice & cut once or things can go wrong very fast and become even more expensive.

You might think this sounds a bit pessimistic. No, this is an opportunity, especially for a Hyper-V MVP who happens to be an MCDBA Winking smile. The IT skills shortage is only growing bigger all over the planet, so not too many worries there; I won’t have to collect empty bottles for a living yet. The only so-called “drawback” here could be that the environments I take care of have been virtualized and optimized to a high extent already. The reward for being good is sometimes not being able to improve things by orders of magnitude. Bad organizations living in a dream world, the ones without a solid grasp of the realities of functional IT in practice, might find that disappointing. Yes, the “perception is reality” crowd. Fortunately the good ones will be happy to be in the best possible shape and they’ll invest money to keep it that way. Interesting times ahead.

Manually Merging Hyper-V Checkpoints


A Last-Ditch Effort

First of all you need to realize this might not work. It’s a last-ditch effort. There is a reason why you have backups (with tested restores) and why you should monitor your environment for things that are not as they should be. Early intervention can pay off.

Also see my blog post on a couple of more preferred actions.

If you have lost checkpoints, you have basically lost data, and corruption/data inconsistencies are very much a possibility and a reality. If the files have been copied and the information about which file is the parent of which is gone, the dates/timestamps are what you have to go by. You might not know for sure if you have them all.

Setting up the demo

For demo purposes we take a test VM and add files to indicate what checkpoint we’re at.

We start with ORGINAL.TXT on the desktop and we create a checkpoint, which we rename accordingly.

image

We add a file called CHECK01.TXT and we create a checkpoint, which we rename accordingly.

image

We add a file called CHECK02.TXT and we create a checkpoint, which we rename accordingly.

image

We add a file called NOW.TXT; no more checkpoints are taken.

image

The file names represent the content you’d see disappear if you applied the checkpoint and we have reflected this in the name for the checkpoints.

image

As we want to merge all the snapshots and end up with a usable VHDX, we’ll work back from the most recent differencing disk until all is merged. As you can see this is a straightforward situation and I hope you’ll never have to deal with a vast collection of subtrees Smile.

Finding out what the parents of the avhdx files are

In this demo it’s pretty obvious what snapshots exist and what avhdx files they represent. We’ve even shown you the single tree visualized in Hyper-V Manager. In reality bad things have happened and you don’t see this information anymore. So you might have to find out yourself. This is done via Inspect Disk in Hyper-V Manager. If you’re confused about what the parent of an (a)vhdx file is, this tool will help you find out or show you what the most recent one was.

image

Sometimes the original files have been renamed or moved, and then it will show you the last known valid parent.

image
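If you prefer PowerShell over the Inspect Disk GUI, Get-VHD shows you the same parent information; a small sketch with an example path, not your actual file names:

  # Show the parent of a single differencing disk
  Get-VHD -Path 'D:\Demo\DemoVM_checkpoint.avhdx' | Select-Object Path, VhdType, ParentPath

  # Walk the whole chain from the most recent avhdx down to the base vhdx
  $disk = Get-VHD -Path 'D:\Demo\DemoVM_checkpoint.avhdx'
  while ($disk.ParentPath) {
      '{0}  ->  {1}' -f $disk.Path, $disk.ParentPath
      $disk = Get-VHD -Path $disk.ParentPath
  }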

Manually Merging the checkpoints

Remember to make a copy of all files as a backup! Also make sure you have enough free disk space … you need working space! You might need another shot at this. As we want to merge all the snapshots and end up with a usable VHDX, we’ll work back from the most recent differencing disk until all is merged into the oldest one, which is the vhdx. You can look at the last modified time stamps to find out the correct order in which to work. The most recent avhdx is the one used in the virtual machine configuration file, so open that file and locate the information for the virtual hard disk.

image

The configuration file’s avhdx is the one containing the “NOW” running state of the VM.

Note: You might find some information telling you to rename the extension avhdx to vhdx (or avhd to vhd). The reason for this was that in Windows 2008, Hyper-V Manager did not show avhd files in the Edit Virtual Disk wizard. You can still do this and it still works, but you do not need to. Ever since Windows Server 2008 R2, avhd (and since Windows Server 2012, avhdx) files do show up in Hyper-V Manager’s disk edit.

For some insights as to why the order is important read this blog by Ben Armstrong What happens when a snapshot is being merged? [Hyper-V]

WARNING: If you did not start with the most recent one and work your way down, which is the easiest and least confusing way, all is not lost. But you will have to reconnect the more recent (a)vhdx to its new parent. This is needed because, by merging a snapshot out of order, the more recent one will have lost its original parent.

Here’s how to do this: select Reconnect. This is the page you’ll get if you start the Edit Disk wizard, as all other options are unavailable due to the missing parent.

image

The wizard will tell you what used to be the parent and allow you to select a new one. Make sure to tick the check box for Ignore ID mismatch or the reconnect will fail, as your previous out-of-order merge has created a new (a)vhdx. If you’re in this pickle because of renamed (a)vhdx files or after a copy, this isn’t needed, by the way.

image
Follow the wizard after that and when you’re done you can launch the Edit Disk wizard again and perform a merge. It’s paramount that you do not mix up the order when doing so and that you reconnect to the correct parent, or you’ll end up in a right mess. There are many permutations, so keep it simple! Do it in order Smile. If you start having multiple checkpoint trees/subtrees, things can get confusing very fast.

You might also have to reconnect if the checkpoints have lost their connection to what they know to be their last parent for other reasons. In that case you do this and, when that’s done, you merge. Rinse and repeat. The walkthrough below assumes you have no reconnects to be done. If you do, the wizard will tell you, like in the example above.
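For completeness, the PowerShell equivalent of that reconnect is Set-VHD; the paths below are placeholders:

  # Point the orphaned differencing disk at its new parent, ignoring the ID mismatch
  Set-VHD -Path 'D:\Demo\Child.avhdx' -ParentPath 'D:\Demo\NewParent.avhdx' -IgnoreIdMismatch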

Walkthrough:

Open the Edit Disk wizard

image

Select the most recent avhdx & click “Next”

image

image

We choose to merge the avhdx

image

In our case into its parent disk

image
Verify the options are correct and click “Finish”

image

Let the wizard complete

image

That’s it. You’ve merged the most recent snapshot into its parent. That means that you have not lost the most recent state of the virtual machine as when it was running before you shut it down. This can be verified by mounting the now most recent avhdx and looking at the desktop for my user profile. You can see the NOW.TXT text file is there!

OK, dismount the avhdx and now it’s rinse and repeat.
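If you’d rather do that mount, check and dismount from PowerShell, here is a quick sketch (read-only, so you don’t touch the disk; the path is an example):

  # Mount the avhdx read-only to verify the content is what you expect
  Mount-VHD -Path 'D:\Demo\DemoVM_checkpoint.avhdx' -ReadOnly
  # ... have a look at the files on the mounted volume ...
  Dismount-VHD -Path 'D:\Demo\DemoVM_checkpoint.avhdx'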

image

image

You do this over and over again until you merge the last avhdx into the vhdx.

image

Then you have the vhdx you will use to create a new virtual machine.

image

Make sure you get the generation right.

image

Assign memory

image

Connect to the appropriate virtual switch or not if you’re not ready to do this yet

image

Use your vhdx disk that’s the remaining result of your merging efforts

image

image

When you boot that virtual machine you’ll see that all the text files are there. It’s as if you’ve deleted the checkpoints in the GUI and retained “NOW” in the vhdx.

image

image

Last but not least, you can use PowerShell or even DiskPart for this but I found that most people in this pickle value a GUI. Use what you feel most comfortable with.
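For the PowerShell route, Merge-VHD does the same job as the Edit Disk wizard; a minimal sketch, one merge at a time and again with placeholder paths:

  # Merge the most recent differencing disk into its direct parent
  Merge-VHD -Path 'D:\Demo\MostRecent.avhdx' -DestinationPath 'D:\Demo\Parent.avhdx'

  # Repeat, working your way down the chain, until the last avhdx is merged into the base vhdx

The same rule applies as in the GUI: work from the most recent checkpoint down and keep the order straight.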

Thanks for reading and I hope this helps someone. Do remember “big boy” rules apply. This is not safe, easy or obvious in each and every situation, so you are responsible for everything you do in your environment. If you’re in too deep, way over your head, etc., call in some expert help.