SMB Direct RoCE Does Not Work Without DCB/PFC


Introduction

SMB Direct RoCE Does Not Work Without DCB/PFC. “Yes”, you say, “we know, this is well documented. Thank you.” but before you sign of hear me out.

Recently I plugged to RoCE cards into some test servers and linked them to a couple of 10Gbps switches. I did some quick large file copy testing and to my big surprise RDMA kicked in with stellar performance even before I had installed the DCB feature, let alone configure it. So what’s the deal here. Does it work without DCB? Does the card fail back to iWarp? Highly unlikely. I was expecting it to fall back to plain vanilla 10Gbps and not being used at all but it was. A short shout out to Jose Barreto to discuss this helped clarify this.

DCB/PFC is a requirement RoCE

The more busy the network gets the faster the performance will drop. Now in our test scenario we had two servers  for a total of 4 RoCE ports on the network consisting of a beefy 48 port 10Gbps switches. So we didn’t see the negative results of this here.

DCB (Data Center Bridging) and Priority Flow Control are considered a requirement for any kind of RoCE deployment. RDMA with RoCE operates at the Ethernet layer. That means there is no overhead from TCP/IP, which is great for performance. This is the reason you want to use RDMA actually. It also means it’s left on it’s own to deal with Ethernet-level collisions and errors. For that it needs DCB/PFC other wise you’ll run into performance issues due to a ton of retries at the higher network layers.

The reason that iWarp doesn’t require DCB/PCF is that it works at the TCP/IP level also offloaded by using a TCP/IP stack on the NIC instead of the OS. So errors are handled by TCP/IP at a cost: iWarp results in the same benefits as RoCE but it doesn’t scale that well. Not that iWarp performance is lousy, far form! Mind you, for bandwidth management reasons,you’d be better of using DCB or some form of QoS as well.

Conclusion

So no, not configuring  DCB on your servers and the switches isn’t an option, but apparently it isn’t blocked either so beware of this. It might appear to be working fine but it’s a bad idea. Also don’t think it defaults back to iWarp mode, it doesn’t, as one card does one thing not both. There is no shortcut. RoCE RDMA does not work error free out of the box so you do have the install the DCB feature and configure it together with the switches.

Using RAMDisk To Test Windows Server 2012 Network Performance


I’m testing & playing different Windows Server 2012 & Hyper-V networking scenarios with 10Gbps, Multichannel, RDAM, Converged networking etc. Partially this is to find out what works best for us in regards to speed, reliability, complexity, supportability and cost.

Basically you have for basic resources in IT around which the eternal struggle for the prefect balance finds place. These are:

  • CPU
  • Memory
  • Networking
  • Storage

We need both the correct balance in capabilities, capacities and speed for these in well designed system. For many years now, but especially the last 2 years it very save to say that, while the sky is the limit, it’s become ever easier and cheaper to get what we need when it comes to CPU, Memory. These have become very powerful, fast and affordable relative to the entire cost of a solution.

Networking in the 10Gbps era is also showing it’s potential in quantity (bandwidth), speed (latency) and cost (well it’s getting there) without reducing the CPU or memory to trash thanks to a bunch of modern off load technologies. And basically in this case it’s these qualities we want to put to the test.

The most trouble some resource has been storage and it has been for quite a while. While SSD do wonders for many applications the balance between speed, capacity & cost isn’t that sweet as for our other resources.

In some environments were I’m active they have a need for both capacity and IOPS and as such they are in luck as next to caching a lot of spindles still equate to more IOPS. For testing the boundaries of one resource one needs to make sure non of the others hit theirs. That’s not easy as for performance testing can’t always have a truck load of spindles on a modern high speed SAN available.

RAMDisk to ease the IOPS bottleneck

To see how well the 10Gbps cards with and without Teaming, Multichannel, RDMA are behaving and what these configuration are capable of I wanted to take as much of the disk IOPS bottle neck out of the equation as possible. Apart from buying a Violin system capable of doing +1 million IOPS, which isn’t going to happen for some lab work, you can perhaps get the best possible IOPS by combining some local SSD and RAMDisk. RAMDisk is spare memory used as a virtual disk. It’s very fast and cost effective per IOPS. But capacity wise it’s not the worlds best, let alone most cost effective solution.

image

I’m using free RAMDisk software provided by StarWind. I chose this as they allow for large sized RAMDisks. I’m using the ones of 54GB right now to speed test copying fixed sized VHDX files. It install flawlessly on Windows Server 2012 and it hasn’t caused me any issues. Throw in some SSDs on the servers for where you need persistence and you’re in business for some very nice lab work.

clip_image001

You also need to be aware it doesn’t persist data when you reboot the system or lose power. This is not an issue if all we are doing is speed testing as we don’t care. Otherwise you’ll need to find a workaround and realize those ‘”flush the data to persistent storage” isn’t full proof or super fast, the SSDs do help here.

You have to register but the good news is that they don’t spam you to death at all, which I find cool. As said the tool is free, works with Windows Server 2012 and allows for larger RAMDisks where other free ones are often way to limited in size.

It has allowed me to do some really nice testing. Perhaps you want to check this out as well. WARNING: The below picture is a lab setup … I’m not a magician and it’s not the kind of IOPS I have all over the datacenters with 4 Cheapo SATA disks I touched my special magic pixie dust.

image

With #WinServ 2012 storage costs/performance/capacity are the only thing limiting you  http://twitter.yfrog.com/mnuo9fp #SMB3.0 #Multichannel

Some quick tests with a 52GB NTFS RAMDisk formatted with a 64K NTFS Allocation unit size.

image image

I also tested with another free tool from SoftPerfect ® RAM Disk FREE. It performs well but I don’t get to see the RAMDisk in the Windows Disk Management GUI, at least not on Windows Server 2012. I have not tested with W2K8R2.

NTFS Allocation unit size 4K NTFS Allocation unit size 64K
image image

SMB 3.0 Multichannel Auto Configuration In Action With RDMA / SMB Direct


Most of you might remember this slide by Jose Barreto on SMB Multichannel  Auto Configuration in one of his many presentations:image

  • Auto configuration looks at NIC type/speed => Same NICs are used for RDMA/Multichannel (doesn’t mix 10Gbps/1Gbps, RDMA/non-RDMA)
  • Let the algorithms work before you decide to intervene
  • Choose adapters wisely for their function

You can fine tune things if and when needed (only do this when this is really the case) but let’s look at this feature in action.

So let’s look at this in real life. For this test we have 2 * X520 DA 10Gbps ports using 10.10.180.8X/24 IP addresses and 2 * Mellanox  10Gbps RDMA adaptors with 10.10.180.9X/24 IP addresses. No teaming involved just multiple NIC ports. Do not that these IP addresses are on different subnet than the LAN of the servers. Basically only the servers can communicate over them, they don’t have a gateway, no DNS servers and are as such not registered in DNS either (live is easy for simple file sharing).

image

Let’s try and copy a 50Gbps fixed VHDX file from server1 to server2 using the DNS name of the target host (pixelated), meaning it will resolve to that host via DNS and use the LAN IP address 10.10.100.92/16 (the host name is greyed out). In the below screenshot you see that the two RDMA capable cards are put into action. The servers are not using  the 1Gbps LAN connection. Multichannel looked at the options:

  • A 1Gbps RSS capable Link
  • Two 10Gbps RSS capable Links
  • Two 10Gbps RDMA capable links

Multichannel concluded the RDMA card is the best one available and as we have two of those it use both. In other words it works just like described.

image

Even if we try to bypass DNS and we copy the files explicitly via the IP address (10.10.180.84)  assigned to the Intel X520 DA cards Multichannel intelligence detects that it has two better cards  that provide RDMA available and as you can see it uses the same NICs  as in the demo before.  Nifty isn’t it Smile

 image

If you want to see the other NICs in action we can disable the Mellanox card and than Multichannel will choose the two X520 DA cards. That’s fine for testing but in real life you need a better solution when you need to manually define what NICs can be used. This is done using PowerShell Smile (take a look at Jose Barrto’s blog The basics of SMB PowerShell, a feature of Windows Server 2012 and SMB 3.0  for more info).

New-SmbMultichannelConstraint –ServerName SERVER2 –InterfaceAlias “SLOT 6 Port 1”, “SLOT 6 Port 2”

This tells a server it can only use these two NICs which in this example are the two Intel X520 DA 10Gbps cards to access Server2. So basically you configure/tell the client what to use for SMB 3.0 traffic to a certain server. Note the difference in send/receive traffic between RDMA/Native 10Gbps.

On Server1, the client you see this:

image

On Server2, the server you see this:

image

Which is indeed the constraint set up as we can verify with:

Get-SmbMultichannelConstraint

image

We’re done playing so let’s clean up all the constraints:

Get-SmbMultichannelConstraint | Remove-SmbMultichannelConstraint

image

Seeing this technology it’s now up to the storage industry to provide the needed  capacity and IOPS I a lot more affordable way. Storage Spaces have knocked on your door, that was the wake up call Winking smile. In an environment where we throw lots of data around we just love SMB 3.0

Design Considerations For Converged Networking On A Budget With Switch Independent Teaming In Windows Server 2012 Hyper-V


Last Friday I was working on some Windows Server 2012 Hyper-V networking designs and investigating the benefits & drawbacks of each. Some other fellow MVPs were also working on designs in that area and some interesting questions & answers came up (thank you Hans Vredevoort for starting the discussion!)

You might have read that for low cost, high value 10Gbps networks solutions I find the switch independent scenarios very interesting as they keep complexity and costs low while optimizing value & flexibility in many scenarios. Talk about great ROI!

So now let’s apply this scenario to one of my (current) favorite converged networking designs for Windows Server 2012 Hyper-V. Two dual NIC LBFO teams. One to be used for virtual machine traffic and one for other network traffic such as Cluster/CSV/Management/Backup traffic, you could even add storage traffic to that. But for this particular argument that was provided by Fiber Channel HBAs. Also with teaming we forego RDMA/SR-IOV.

For the VM traffic the decision is rather easy. We go for Switch Independent with Hyper-V Port mode. Look at Windows Server 2012 NIC Teaming (LBFO) Deployment and Management to read why. The exceptions mentioned there do not come into play here and we are getting great virtual machine density this way. With lesser density 2-4 teamed 1Gbps ports will also do.

But what about the team we use for the other network traffic. Do we use Address hash or Hyper-V port mode. Or better put, do we use native teaming with tNICs as shown below where we can use DCB or Windows QoS?

image

Well one drawback here with Address Hash is that only one member will be used for incoming traffic with a switch independent setup. Qos with DCB and policies isn’t that easy for a system admin and the hardware is more expensive.

So could we use a virtual switch here as well with QoS defined on the Hyper-V switch?

image

Well as it turns out in this scenario we might be better off using a Hyper-V Switch with Hyper-V Port mode on this Switch independent team as well. This reaps some real nice benefits compared to using a native NIC team with address hash mode:

  • You have a nice load distribution of the different vNIC’s send/receive traffic over a single member of the NIC team per VM. This way we don’t get into a scenario where we only use one NIC of the team for incoming traffic. The result is a better balance between incoming and outgoing traffic as long an none of those exceeds the capability of one of the team members.
  • Easy to define QoS via the Hyper-V Switch even when you don’t have network gear that supports QoS via DCB etc.
  • Simplicity of switch configuration (complexity can be an enemy of high availability & your budget).
  • Compared to a single Team of dual 10Gbps ports you can get a lot higher number of VM density even they have rather intensive network traffic and the non VM traffic gets a lots of bandwidth as well.
  • Works with the cheaper line of 10Gbps switches
  • Great TCO & ROI

With a dual 10Gbps team you’re ready to roll. All software defined. Making the switches just easy to use providers of connectivity. For smaller environments this is all that’s needed. More complex configurations in the larger networks might be needed high up the stack but for the Hyper-V / cloud admin things can stay very easy and under their control. The network guys need only deal with their realm of responsibility and not deal with the demands for virtualization administration directly.

I’m not saying DCB, LACP, Switch Dependent is bad, far from. But the cost and complexity scares some people while they might not even need. With the concept above they could benefit tremendously from moving to 10Gbps in a really cheap and easy fashion. That’s hard (and silly) to ignore. Don’t over engineer it, don’t IBM it and don’t go for a server rack phD in complex configurations. Don’t think you need to use DCB, SR-IOV, etc. in every environment just because you can or because you want to look awesome. Unless you have a real need for the benefits those offer you can get simplicity, performance, redundancy and QoS in a very cost effective way. What’s not to like. If you worry about LACP etc. consider this, Switch independent mode allows for nearly no service down time firmware upgrades compared to stacking. It’s been working very well for us and avoids the expense & complexity of vPC, VLT and the likes of that. Life is good.

Windows Server 2012 NIC Teaming Mode “Independent” Offers Great Value


There, I said it. In switching, just like in real life, being independent often beats the alternatives. In switching that would mean stacking. Windows Server 2012 NIC teaming in Independent mode, active-active mode makes this possible. And if you do want or need stacking for link aggregation (i.e. more bandwidth) you might go the extra mile and opt for  vPC (Virtual Port Channel a la CISCO) or VTL (Virtual Link Trunking a la Force10 – DELL).

What, have you gone nuts? Nope. Windows Server 2012 NIC teaming gives us great redundancy with even cheaper 10Gbps switches.

What I hate about stacking is that during a firmware upgrade they go down, no redundancy there. Also on the cheaper switches it often costs a lot of 10Gbps ports (no dedicated stacking ports). The only way to work around this is by designing your infrastructure so you can evacuate the nodes in that rack so when the stack is upgraded it doesn’t affect the services. That’s nice if you can do this but also rather labor intensive. If you can’t evacuate a rack (which has effectively become your “unit of upgrade”) and you can’t afford the vPort of VTL kind of redundant switch configuration you might be better of running your 10Gbps switches independently and leverage Windows Server 2012 NIC teaming in a switch independent mode in active active. The only reason no to so would be the need for bandwidth aggregation in all possible scenarios that only LACP/Static Teaming can provide but in that case I really prefer vPC or VLT.

Independent 10Gbps Switches

Benefits:

  • Cheaper 10Gbps switches
  • No potential loss of 10Gbps ports for stacking
  • Switch redundancy in all scenarios if clusters networking set up correctly
  • Switch configuration is very simple

Drawbacks:

  • You won’t get > 10 Gbps aggregated bandwidth in any possible NIC teaming scenario

Stacked 10Gbps Switches

Benefits:

  • Stacking is available with cheaper 10Gbps switches (often a an 10Gbps port cost)
  • Switch redundancy (but not during firmware upgrades)
  • Get 20Gbps aggregated bandwidth in any scenario

Drawbacks:

  • Potential loss of 10Gbps ports
  • Firmware upgrades bring down the stack
  • Potentially more ‘”complex” switch configuration

vPC or VLT 10Gbps Switches

Benefits:

  • 100% Switch redundancy
  • Get > 10Gbps aggregated bandwidth in any possible NIC team scenario

Drawbacks:

  • More expensive switches
  • More ‘”complex” switch configuration

So all in all, if you come to the conclusion that 10Gbps is a big pipe that will serve your needs and aggregation of those via teaming is not needed you might be better off with cheaper 10Gbps leverage Windows Server 2012 NIC teaming in a switch independent mode in active active configuration. You optimize 10Gbps port count as well. It’s cheap, it reduces complexity and it doesn’t stop you from leveraging Multichannel/RDMA.

So right now I’m either in favor of switch independent 10Gbps networking or I go full out for a vPC (Virtual Port Channel a la CISCO) or VTL (Virtual Link Trunking a la Force10 – DELL) like setup and forgo stacking all together. As said if you’re willing/capable of evacuating all the nodes on a stack/rack you can work around the drawback. The colors in the racks indicate the same clusters. That’s not always possible and while it sounds like a great idea, I’m not convinced.

image

When the shit hits the fan … you need as little to worry about as possible. And yes I know firmware upgrades are supposed to be easy and planned events. But then there is reality and sometimes it bites, especially when you cannot evacuate the workload until you’re resolved a networking issue with a firmware upgrade Confused smile Choose your poison wisely.

Exploring Hyper-V Virtual Switch Port Mirroring


Windows Server 2012 brings us many new capabilities and one of those is port mirroring. You can now configure a virtual machine NIC (vNIC) who’s traffic you want to monitor as the source in the Advanced Features of the Network Adapter settings. The vNIC of the virtual machine where you’ll run a network sniffer, like Network Monitor or WireShark, against is set to “Destination”. It’s pretty much that simple to set up. Easy enough.

On the vNIC you want to monitor the traffic to and from the VM, under Settings, Network Adapter (choose the correct one), under Advanced Features you select “Source” as Mirroring mode. In this example we’re going to monitor data traffic to and from the guest Columbia.image

On the destination VM we have a dedicated vNIC set up called “Sniffie”image

On the guest VM Pegasus, where we’ll capture the network traffic via a dedicated vNIC (“Sniffie”), we set that vNIC (virtual port) to “Destination” as Mirroring node:image

So now let’s start pinging a host (ping –t crusader)  on our Source VM  Columbiaimage

And take a look on the Destination vNIC on virtual machine Pegasus where we’re capturing the traffic. The “Sniffie” NIC there is set to destination as Mirror Mode. Look at the ICMP echo reply from form 192.168.2.32 (Crusader host). Columbia is at 192.168.2.122 sending out the ICMP echo request.image

Pretty cool!

Some Technicalities

So deep down under the hood, it’s the switch extension capabilities  of the Hyper-V virtual switch that are being leveraged to achieve port sniffing. This is just one of the many functionalities that the Hyper-V extensible switch enables. The Hyper-V extensible switch itself uses port ACLs to set a rule that forwards traffic from one  virtual port to another virtual port. For practical reasons translate virtual port to vNIC in a VM and this translates into what we shown above. While it’s good to know that port ACLs are what is used by the extensible switch to do enable all kinds of advances features like port mirroring but you don’t need to worry about the details to use it.

Things to note

Initially many of us made the assumption that we’d be able to sniff the traffic form a virtual port to a port on their physical switch. This is not the case. Basically, in box, it’s a source VM that mirrors it’s network traffic form one or more virtual ports (vNICs) to a destination VM’s one or more virtual ports (vNIC).

You can send many sources to one destination. That’s fine. You could also define more destinations on the same host but that’s not really wise and practical as far as I can see. All in all, you set it up on  when needed on the source VM and you keep a destination VM with a sniffer around for the sniffing.

Also keep in mind that all this works within the boundaries of the same host. Which means that if you want to monitor a VMs network traffic when it moves across nodes in a cluster you’ll have to have "destination” virtual machine on each host. This means that when a source VM is live migrated it will mirror the traffic to that local destination VM. That works.

You could try and live migrate source & destination VMs to the same host but this is not feasible in real life. For one the capture doesn’t survive after a life migration as your sniffer loses connectivity to virtual Port / vNIC.image

Don’t be too disappointed about this. Port mirroring is not meant to be a permanent situation that you need to keep highly available anyway, bar some special environments/needs.

Whilst is it true that out of the box you can’t do stuff like sending the mirrored traffic form a guests vNIC/virtual port to a physical switch port where you attach your network sniffer laptop or so. If you throw on the CISCO Nexus 1000V it replaces the Microsoft in box “Forwarding Extensions” and than it’s up to CISCO’s implementation to determine what you can or can’t do. As this stuff is right up their sleeve they allow the Cisco Nexus 1000V mirrors traffic sent between virtual machines by sending ERSPAN to an external Cisco Catalyst switch. I have not had the pleasure of playing working with this.

Anyway, I hope this help to explain things a little. Happy sniffing and don’t get yourself into trouble, follow the rules.

I’m Presenting At The Belgian TechDays 2013


A little end of year news flash for you all. I already mentioned the contribution the local Belgian MVPs and MEET members are making to the TechDays in 2013 and now I can tell you I’m joining them for a presentation as well on March 7th in the 16:15-16:30 time slot . In the talk Windows Server 2012 Hyper-V Networking Evolved I’ll be discussing some of the network improvements in Windows Server 2012. Some are very well known others a bit less but they all work together to make Windows Server 2012 a very capable operating system that’s future proof.

Hyper-V benefits from a range of new features introduced across the entire network stack in Windows Server 2012. Some of these are native networking improvements in the operating system itself. Others leverage technology that requires supported Network Adapters & Switches that benefits Hyper-V hosts and the virtual machines that run on top of it. Come and see how even the most demanding workloads can now be virtualized without sacrificing performance, reliability, security or scalability. These features vary from easy & transparent, with almost zero configuration, to complex, requiring more design and implementation considerations. Join me for an overview of these network improvements, how they work and what they can do for your business.

We’ve been running Windows Server 2012 Hyper-V in production since August & September 2012. So we put our money where our mouth is. If you hurry up and register before the end of the year you can still get the early bird price.

TechDays Early bird banner wide

Your company cannot lose here. You gain insight & knowledge, your employer gets a well prepared and motivated employee. How’s that for a nice new years gift?

Attending the Dell Tech Summit EMEA


As you read this I’m preparing to get on my way to the DELL Tech Summit in Lisbon, Portugal for a few days. I’ll be discussing the needs we have from them as customers (and their competition actually for that matter) when it comes to hardware in the Microsoft landscape in the era of Windows Server 2012.

image

I’m very happy and eager to tell them what, in my humble opinion, they are doing wrong and what they are doing right and even what they are not doing at all Smile  I believe in giving feedback and interaction with vendors. Not that I have any illusion of self importance as to the impact of my voice on the grand scheme of things but if I don’t speak up nothing changes either. As Intel and Microsoft are there as well,  this makes for a good selection of the partners involved. So here I go:

  1. More information on storage features, specifications and roadmaps
  2. Faster information on storage features, specifications and roadmaps
    • Some of these are in regards to Windows Server 2012 & System Center 2012 (Storage Pools & Spaces, SMI-S, ODX, UNMAP, RDMA/SMB3.0 …) and some are more generic like easier & better SAN/Cluster failovers capabilities, ease of use, number of SCSI 3 persistent reservations, etc.
  3. How to address the IOPS lag in the technology evolution. Their views versus my ideas on how to tackle them until we get better solutions.
  4. Plans, if any, for Cluster In a Box (CiB) building blocks for Windows Server 2012 Private Cloud solutions.
  5. When does convergence make sense and when not cost/benefit wise (and at what level). I’d like a bit more insight into what DELLs vision is and how they’ll execute that. What will new storage options mean to that converged network, i.e. SMB 3.0, Multichannel & RDMA capable NICs. Now convergence always seems tied to one tech/protocol (VOIP in the past, FCoE at the moment) and it shouldn’t, plenty of other needs for loads of bandwidth (Live migration, Storage Live Migration, Shared Nothing Live Migration, CSV redirected mode, …).

Now while it’s important to listen to you customers, this is not easy if you want to do it right, far from it. For one we’re all over the place as a group. This is always the case unless you cater to a specialized niche market. But DELL serves both consumers and enterprises form 1 person shops to fortune 500 companies in all fields of human endeavor. That makes for nice cocktail of views and opinions I suspect.

Even more importantly than listening is processing what you hear from your customers. Do you ignore, react, or take it away as more or less valuable information. Information on which to act or not, to use in decision making, and perhaps even in executing those decisions. And let’s face it without execution decisions are pretty academic exercises. In the end management is in control and for all the feedback, advise, research that gathered and done, they are at the steering wheel and they are responsible for the results.

One thing that I do know from my fellow MVPs and the community is that for the past 12 months any vendor who would address those questions with a good plan and communications would be a top favorite while selecting hardware at many customers for a lot of projects.

Intel X520 Series NIC on Windows 2012 With Hyper-V Enabled Port Flapping Issue


When you install Windows Server 2012 RTM to a server with X520 series NIC cards you’ll notice that there is a native driver available and the performance of that driver is fantastic. It’s really impressive to see.

image

That’s great news but I’ve noticed an issue in RTM that I already dealt with in the release candidate.

The moment you install Hyper- V some of the X520 NIC ports can start flapping (connected/disconnected).  You’ll see the sequence below endlessly on one port, sometimes more.

image

image

image

As you can imagine this ruins the party in Hyper-V networking an bit too much for comfort Confused smile But it can be fixed. The root cause for this I do not know but it is driver related. The same thing happened in the release candidate. But now things are easier to fix. Navigate to the Intel Site to download their freshly released driver for the X520 series on Windows Server 2012 and install it (you don’t need to install the extra software with Advanced Network Services => native Windows NIC teaming has arrived). After that the flapping will be gone.

image

Hope this helps some folks out!

Hyper-V Shared Nothing Live Migration In Windows Server 2012– VM Mobility Rules


I see and hear some people shrug at the idea of Shared Nothing Live Migration, dismissing it as marginally useful. Some do state they’ll have it as well but that it’s not that valuable. Well I disagree totally. A lot of the time these remarks are due to a lack of understanding about how several technologies in the Microsoft stack work together. Combine this with tunnel vision and the fear of some vendors and you get a lot of FUD.

I advise you to look beyond the virtualization stack, to the issues that people who are building infrastructure for dynamic, flexible and * cloud  data centers are dealing with.

Look, as “architects” we have to design & build for failure. We all know that it’s just a matter of time before things go BOINK.  So we build in redundancy, some of this within a silo, some of this is between silos. The two approaches compliment each other. What this gives you is options and everybody who knows me, especially those who work  with me has heard my mantras: “Assumptions are the mother of all F* Ups” and “Options, options, options”. Make sure you design & build in options. This way you can maneuver your self out of a bad situation. Don’t ever assume you’re out of options, especially not when you put some in the design on purpose Winking smile. It’s also very useful beyond that because a lot of you might agree with me that silos and fork lift, down time inducing upgrades, migrations, transitions or replacements are expensive and bad. This is where Share Nothing Live Migrations comes into play. You gain mobility over silos. That silo might be a server, a cluster, storage or mixtures of them all.

With Shared Nothing Live Migration we can migrate virtual machines between those silos with nothing more than a network cable.This is huge people. You are no longer trapped in that silo. In this context it provides you with all the options & flexibility mobility gives you. even it the technology itself is not about high availability.

Some very useful scenarios

Migrate virtual machines from an old cluster to a new cluster with out any down time

  1. Migrate virtual machines from stand alone hyper-V hosts to a fail over cluster with out any down time
  2. Migrate virtual machines from one stand alone host to another one for maintenance, again, without any down time
  3. Choose different types if storage & Hyper-V deployment depending in IOPS, redundancy, availability, manageability needs. With Shared Nothing Live Migration you can be confident  that  you can move your virtual machine from one environment to the other when needs change. This is breaking the storage silo boundaries open people! This is huge … think about it.

How it works

The details are for another post but basically is made possible by the combination of Live Storage Migration and Live Migration.

First the Storage is Live Migrated

image

After the Live Storage Migration is done the state of the virtual machines is copied and synchronized.

image

This Is Mobility

I hear the competition shrug.  It isn’t high availability. Well indeed no one who understands the feature ever said it was. It’s virtual machine mobility. Look at the scenarios above and you’ll see that this ability could very well be game changer in how we look at storage & design solutions.

Speed & Performance

What did we hear on this front: “it will be too slow to be really useful”. Really? Well let’s see:

  1. The world is converging to 10Gbps and after that 40Gbps and up will come
  2. NIC Teaming in box With Windows 2012 which can provide more bandwidth.
  3. SMB 3.0 Multichannel. This provides multiple channels per connection spreading the load over multiple CPUs
  4. SMB Direct, have you seen the speeds this achieves?

Before you state that this doesn’t work on Live Migration … as confirmed at TechEd 2012 Europe with Jose Baretto this does work when both the source AND the target is an SMB 3.0 share. This means yet another reason to use SMB 3.0 share for your Hyper-V storage needs! So unlike what Tad at vLimited keeps saying, unhindered by any knowledge, it is a very valuable feature and it can be extremely fast given the right connectivity and storage that can handle the IOPS. And no, the fact that it’s unbuffered doesn’t impact this to much. Test this by using xcopy/robocopy /J with a VHD over your infrastructure.

image

Even if you’re on a budget and cannot go for the RDMA NICs & SMB 3.0 you have several options to get very decent virtual machine mobility and not be stuck in a silo. And for those who want to leverage this feature to create and agile & mobile virtual environment you have some very nice technologies available to optimize to your needs & budgets.

Conclusion

Virtual Machine mobility and storage mobility are very interesting features that provide for a previously unknown flexibility. Windows Server 2012 makes us rethink our storage approaches (I sure am) and I’m very interested in seeing how this will evolve.