
Where SMB Direct, RoCE, RDMA & DCB fit into the stack


I’m assuming most of you are at least familiar with the concepts of converged networking, SMB Multichannel and SMB Direct. This is not going to be a lesson on these subjects. We’re just setting the stage here for our simple demo configuration and its relation to real-world scenarios. This is to remind you of the why and where of what we do and demo in our next blog posts on SMB Direct over RoCE with two DCB features: Priority Flow Control (PFC) and Enhanced Transmission Selection (ETS).

Generalized and simplified, a modern virtualized data center network looks a lot like this:

image

It’s more or less converged, meaning all kinds of traffic move over the same infrastructure, which is great for standardization and your budget. Unless you get into performance issues. That’s where QoS can help. As we’re doing SMB Direct over RoCE we’ll use DCB to handle QoS. Mind you, QoS is an aid and it will not help if you try to do too much over too little bandwidth. Let’s zoom in a bit on the Hyper-V & storage side of things. In general the RDMA capable variant of a modern SOFS / Hyper-V environment network looks as below in a bit more detail:

image

The RDMA capable traffic is SMB Direct over RoCE in this use case. This is used for Live Migration, CSV Traffic & storage traffic to the SOFS Server.

DCB cannot distinguish between these SMB traffic use cases: it’s all RDMA traffic over port 445 as far as the DCB configuration is concerned. That’s why on top of DCB we leverage SMB Bandwidth Limit (see https://blog.workinghardinit.work/2013/09/03/preventing-live-migration-over-smb-starving-csv-traffic-in-windows-server-2012-r2-with-set-smbbandwidthlimit/). This prevents the live migration traffic from pushing aside the storage traffic. This is a Windows configured feature and does not rely on DCB or other forms of QoS.
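As a quick illustration (a minimal sketch, not values taken from any production setup), the SMB Bandwidth Limits feature is added as a Windows feature and then a cap is set per traffic category; the 1250MB figure below is just a placeholder you would size for your own NICs and workloads:

# Install the SMB Bandwidth Limits feature (one time, per host).
Add-WindowsFeature FS-SMBBW

# Cap SMB traffic used for live migration so it cannot starve storage traffic.
# The value is an example only, size it for your own environment.
Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1250MB

# Verify what is configured.
Get-SmbBandwidthLimit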

To make sure cluster traffic itself, backups, data copies, management etc. don’t starve each other we implement QoS leveraging DCB (the ETS part). As we need DCB with RoCE in real-world scenarios to make it lossless (the PFC part), and as you do not mix different QoS approaches on the same network stack, we stick with DCB for the other workloads on that network stack as well.

image

Mind you, this does not prevent scenarios where management and backups are done over vNICs on the Hyper-V switch and where we leverage Hyper-V QoS, as that’s on another network stack.
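For completeness, Hyper-V QoS on such a converged vSwitch is weight based; a minimal sketch, in which the switch, NIC and vNIC names are purely placeholders, could look like this:

# Create a vSwitch with weight-based minimum bandwidth (names are placeholders).
New-VMSwitch -Name "ConvergedSwitch" -NetAdapterName "NIC3" -MinimumBandwidthMode Weight -AllowManagementOS $false

# Add management and backup vNICs on the host and give them relative weights.
Add-VMNetworkAdapter -ManagementOS -Name "Management" -SwitchName "ConvergedSwitch"
Add-VMNetworkAdapter -ManagementOS -Name "Backup" -SwitchName "ConvergedSwitch"
Set-VMNetworkAdapter -ManagementOS -Name "Management" -MinimumBandwidthWeight 10
Set-VMNetworkAdapter -ManagementOS -Name "Backup" -MinimumBandwidthWeight 20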

In our lab demos we’ll keep things simple: we’ll do live migration over SMB Direct (RoCE) and we’ll simulate intense backup traffic over the same pair of NICs to illustrate a RoCE configuration that guarantees minimal bandwidth for both and keeps the RDMA traffic lossless (PFC). To make it very clear we’ll do a demo setup where we use two 10GbE NICs per host and allocate a minimum bandwidth of 90% to live migration and the remaining 10% minimum bandwidth to all other traffic (which includes our intense backup traffic). Read more about the configuration in SMB Direct over RoCE Demo – Hosts & Switches Configuration Example.
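To give you a feel for where we’re heading, the host side of that 90/10 demo idea boils down to the DCB/NetQoS cmdlets below. Treat it as a simplified sketch: priority 3 is a common but not mandatory choice, and the NIC names are placeholders; the full walk-through follows in the configuration post.

# Tag SMB Direct traffic (port 445) with priority 3 (a common, not mandatory, choice).
New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

# PFC: make priority 3 lossless, leave the other priorities alone.
Enable-NetQosFlowControl -Priority 3
Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7

# ETS: guarantee a 90% minimum bandwidth to the SMB traffic class,
# the default class keeps the remaining 10% for everything else.
New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 90 -Algorithm ETS

# Apply DCB/QoS on the RDMA capable NICs (names are placeholders for your environment).
Enable-NetAdapterQos -Name "RDMA-NIC-1", "RDMA-NIC-2"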

SMB Direct with DCB, PFC, ETS … How do I know it works?!


A question that comes up time and again is: how do you know SMB Direct is working? The question stems from a nagging feeling that configuring DCB is a bit like playing the wizard’s apprentice and that we might not completely know what we’re doing, i.e. a lack of experience.

image

Many have suspected me of brewing up DCB configurations in a dark corner of the data center where no one else dares venture, but those are unsubstantiated rumors. In coming blog posts we’ll address how to configure it end to end, and we’ll show how to find out if it’s really working and how to test that.

Finding out if it really works, testing and monitoring isn’t magic. It boils down to using tools you know. Performance counters for RDMA Activity and SMB Direct are natively available in Windows. Use them! The NIC vendors also provide very detailed counters; those are excellent and of great value when testing and confirming things work as they should. The latter is very important, because after people are satisfied SMB Direct works they want to know if DCB is configured correctly. Does PFC work, are pause frames being sent and received? Is it really lossless? Does ETS really kick in when needed, do I get the minimum bandwidth I configured? These are very valid questions people struggle with. But the answer eludes many, almost like the question whether the refrigerator light really goes out when you close the door.
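As a starting point, and without claiming this is the complete test plan of the upcoming posts, these are the kind of out-of-the-box checks I mean. Counter set names can differ a bit per OS version and NIC vendor, so list them rather than assuming exact paths:

# Is RDMA enabled and operational on the NICs?
Get-NetAdapterRdma

# Are the SMB client interfaces RDMA capable and are the multichannel connections using them?
Get-SmbClientNetworkInterface
Get-SmbMultichannelConnection

# Discover the RDMA / SMB Direct related performance counters available on this host,
# then collect a few of them while you push traffic (live migration, file copy, backup ...).
Get-Counter -ListSet "*RDMA*", "*SMB Direct*" | Select-Object -ExpandProperty Paths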

Doing this deep down in the network packets is hard and often requires a very specialized skill set and experience with packet analyzers etc. Nothing most of you can’t learn, but often this is not a priority. But with some creativity, the performance counters on Windows provided by the NIC vendors and the statistics counters on the switches, you can demonstrate that both PFC & ETS do work and kick in.

So in upcoming blogs & videos I’ll demonstrate configuring SMB Direct over RoCE leveraging two parts of DCB:

  • PFC (Priority Flow Control) – mandatory for SMB Direct over RoCE
  • ETS (Enhanced Transmission Selection) – optional but I advise you to leverage it for SMB Direct over RoCE

Actually, when doing true converged, no matter what route you go, QoS is not really optional any more.

The biggest challenge is to get people to wrap their heads around the concepts and their behavior. Once you do that you’ll understand how and why to configure it. It took me time and effort, there’s no way around it, but it’s well worth the effort.

Look, DCB is not 100% fully matured or perfect, especially in large scale environments over more than 2 or 3 hops. Frak, while I love tinkering, testing and playing with this stuff I have never been a “QoS first” person. If I can, I throw resources at the problem (CPU cycles, memory, bandwidth, …). QoS is like a gun: you only draw it when you must use it, and then you’d better do it right; otherwise you don’t touch it, bar for practice/training/education. While perfection is not of this world and improvements are being worked on (ECN), it does work and deliver. How many of you have had a large scale (> 2 hops, > 20 switches) deployment with FC, FCoE or iSCSI to worry about? So can it deliver what you need today in most scenarios? Yes! Can I fix the shortcomings of any random technology? No. Can I leverage current technologies with great success despite this? Yes! So can you. There is a reason I get hired and paid. Trust me, it’s not my looks, my bedside manner or charismatic appearance Winking smile.

Side note 1: I cannot possibly provide a switch configuration guide in a step by step fashion, as the details vary by vendor, they can also be switch model/type specific and it all depends on your environment & needs. So no, I cannot and will not attempt to write a bunch of these. This would be way too much work and way too expensive (time, hardware etc.), so unless I’m paid very generously to do so, you’re out of luck. It might be cheaper to hire me or to come to the free community sessions, presentations, ATE evenings and study up.

Help! Active Directory cannot read forest and domain functional level anymore or much ado about nothing


The other day I got a very worried request for support. Apparently the forest and domain functional level of an Active Directory deployment could no longer be read. Nothing else was wrong, everything was working just fine. If I could have a quick look? So they shared the screen with me and this is what I saw.

image

And this …

image

Well, that didn’t surprise me. They were supposed to already be on domain and forest functional level 2012 R2, as all their domain controllers had been on Windows 2012 R2 for over a year. That error message is not right! After being puzzled for a moment it hit me: this was a Windows 2008 R2 host without updated tools!

Once they used the correct version of RSAT, or checked on a Windows 2012 R2 host, all was shown to be well and the scare was over.
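By the way, if you prefer to verify this without relying on any GUI console at all, the ActiveDirectory PowerShell module (part of RSAT) reports the functional levels directly:

# Requires the ActiveDirectory RSAT module.
Import-Module ActiveDirectory

# Forest and domain functional levels, straight from AD rather than from a GUI console.
(Get-ADForest).ForestMode
(Get-ADDomain).DomainMode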

image

Nothing to see here, move along.

Windows Server Technical Preview and Hyper-V Server Technical Preview Expiration Extension


Great news for all of us who are running Windows Server Technical Preview v1 in their labs. It was due to expire on April 15th, but Microsoft announced they were working on a fix to extend that deadline. They did not mention an ETA for it, but it’s here now, see http://www.microsoft.com/en-us/download/details.aspx?id=46447

image

So download, install, reboot and you’re good to go until we get our hands on Technical Preview v2! We’ve been saved by the cavalry and life is good!

Changing the segment size of a virtual disk in PowerVault MD Storage Series


It happens to the best of us, sometimes we selected the wrong option during deployment and/or configuration of our original virtual disks. Or, even with the best of planning, the realities and use cases of your storage change, so the original choice might not be the most optimal one. Luckily on a DELL MD PowerVault storage device you do not need to delete the virtual disk or disks and lose your data to reconfigure the segment size. Even better is that you can do this online as a background process, which is a must as it can take a very long time and it would cause prohibitively long downtime if you had to take the data offline for that amount of time.

image

You have a certain control over the speed at which this happens via the priority setting, but do realize that this takes a (very) long time. Because it’s a background process you can keep working. I have noticed little to no impact on performance, but your mileage may vary.

image

image

How long does it take? Hard to predict. This is a screenshot of two 50TB virtual disks where the segment size is being adjusted online …

image

You cannot always go to the desired segment size in one step. Sometimes you only have an intermediate size available. This is the case in the example below.

image

The trick is to first move to that segment size and then repeat the process to reach the size you require. In this case we’ll first move to a 256 KB and then to a 512 KB segment size. So this again takes a long time. But again, it all happens online.

In conclusion, it’s great to have this capability. When you need to change the segment size while there is already data on the PowerVault virtual disks, you have the ability to do so online while the data remains available. That this can require multiple steps and take a long time is not a huge deal. You kick it off and let it run. No need to sit there and watch it.

Optimizing Backups: PowerShell Script To Move All Virtual Machines On A Cluster Shared Volume To The Node Owning That CSV


When you are optimizing the number of snapshots to be taken for backups, or are dealing with storage vendors’ software that leverages their hardware VSS provider, you sometimes encounter requirements that are at odds with virtual machine mobility and dynamic optimization.

For example when backing up multiple virtual machines leveraging a single CSV snapshot you’ll find that:

  • Some SAN vendor software requires that the virtual machines in that job are owned by the same host or the backup will fail.
  • Backup software can also require that all virtual machines are running on the same node when you want them to be protected using a single CSV snapshot. The better ones don’t let the backup job fail, they just create multiple snapshots when needed, but that’s less efficient and potentially makes you run into issues with your hardware VSS provider.

image

VEEAM B&R v8 in action … 8 SQL Server VMs with multiple disks on the same CSV being backed up by a single hardware VSS writer snapshot (DELL Compellent 6.5.20 / Replay Manager 7.5) and an off host proxy. Organizing & orchestrating backups requires some effort, but can lead to great results.

Normally when designing your cluster you balance things out as well as you can. That helps reduce the need for constant dynamic optimization. You also make sure that, if at all possible, you keep all files related to a single VM together on the same CSV.
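If you want to check how well you’re doing on that front, a small sketch like the one below (standard Hyper-V cmdlets, nothing exotic) lists where each VM’s configuration and virtual disks live, so you can spot VMs whose files are spread over multiple CSVs:

# List, per VM, the configuration path and the paths of all attached virtual disks.
Get-VM | ForEach-Object {
    $VM = $_
    [PSCustomObject]@{
        VM         = $VM.Name
        ConfigPath = $VM.Path
        DiskPaths  = (($VM | Get-VMHardDiskDrive).Path -join '; ')
    }
} | Format-Table -AutoSize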

Naturally you’ll have drift. If not, you have a very stable environment or are not leveraging the capabilities of your Hyper-V cluster. Mobility, dynamic optimization and high to continuous availability are what we want, and we don’t block that to serve the backups. We do try to help out the backups as much as possible, however. A good design does this.

If you’re not running a backup every 15 minutes in a very dynamic environment you can deal with this by live migrating resources to where they need to be in order to optimize backups.

Here’s a little PowerShell snippet that will live migrate all virtual machines on the same CSV to the owner node of that CSV. You can run this as a script prior to the backups starting, or you can run it as a weekly scheduled task to prevent the drift from the ideal situation for your backups becoming too big, requiring more VSS snapshots or even failing backups. The exact approach depends on the storage vendors and/or backup software you use in combination with the needs and capabilities of your environment.

cls

$Cluster = Get-Cluster
$AllCSV = Get-ClusterSharedVolume -Cluster $Cluster

ForEach ($CSV in $AllCSV)
{
    Write-Output "$($CSV.Name) is owned by $($CSV.OwnerNode.Name)"

    #We grab the friendly name (path) of the CSV
    $CSVVolumeInfo = $CSV | Select-Object -ExpandProperty SharedVolumeInfo
    $CSVPath = $CSVVolumeInfo.FriendlyVolumeName

    #We escape the \ characters so the path can be used in the -match regex below
    $FixedCSVPath = $CSVPath -replace '\\', '\\'

    #We grab all VMs whose owner node differs from the node that owns the CSV we're working with,
    #and from those we keep the ones that are located on that CSV
    $VMsToMove = Get-ClusterGroup | Where-Object {($_.GroupType -eq 'VirtualMachine') -and ($_.OwnerNode.Name -ne $CSV.OwnerNode.Name)} | Get-VM | Where-Object {$_.Path -match $FixedCSVPath}

    ForEach ($VM in $VMsToMove)
    {
        Write-Output "`tThe VM $($VM.Name) located on $CSVPath is not running on host $($CSV.OwnerNode.Name), which owns that CSV,"
        Write-Output "`tbut on $($VM.ComputerName). It will be live migrated."
        #Live migrate that VM to the node that owns the CSV it resides on
        Move-ClusterVirtualMachineRole -Name $VM.Name -MigrationType Live -Node $CSV.OwnerNode.Name
    }
}
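If you go the weekly scheduled task route mentioned above, registering the script is straightforward. The sketch below is an illustration only: the script path, schedule and task name are placeholders for your environment.

# Run the script every Sunday evening, before the weekend backups kick off (placeholders!).
$Action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Move-VMsToCsvOwner.ps1"
$Trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Sunday -At "22:00"
Register-ScheduledTask -TaskName "Align VMs With CSV Owner" -Action $Action -Trigger $Trigger -RunLevel Highest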

Now there is a lot more to discuss, i.e. what and how to optimize for virtual machines that are clustered. For optimal redundancy you’ll have those running on different nodes and CSVs. But even beyond that, you might have the clustered VMs running on different clusters, as the cluster is the failure domain. But I get the remark that my blogs are wordy and verbose, so … that’s for another time Winking smile

Essential to my Modern Datacenter Lab: Azure Site 2 Site VPN with a DELL SonicWALL Firewall


If you’re a serious operator in the part of IT that is considered the tip of the spear, i.e. you’re the one getting things done, you need a lab. I have had one (well, I upgraded it a couple of times) for a long time. When you’re dealing with cloud as an IT Pro, mostly Microsoft Azure in my case, that need has not changed. It enables you to gain the knowledge and insights that you can only acquire by experimenting and hands-on work; there is no substitute. Sometimes people ask me how I learn. A lab and lots of hands-on experimenting are a major component of my self-education and training. I put in a lot of time and some money, yes.

Perhaps you have a lab at work, perhaps not, but you do need one. A lab is a highly valuable investment in education for both your employer and yourself. It takes a lot of time and effort and it costs a bit of money. The benefits however are huge, and I encourage any employer who has IT staff to sponsor this, as the ROI is huge for a relatively small TCO.

I love the fact that in a lab you have (and want) complete control over the entire stack, so you can experiment at will and learn about the solutions you build end to end. You do need to deal with it all, but that’s all good, you learn even more, even when at times it’s tough going. Note that a home lab, even with the associated costs, has the added benefit of still being available to you even if you move between employers or between clients.

You can set up a site-to-site VPN using Windows Server 2012 R2 RRAS (see Site-to-Site VPN in Azure Virtual Network using Windows Server 2012 Routing and Remote Access Service (RRAS)) and that works. But for long-term lab work and real life implementations you’ll be using other devices. In the SOHO lab I run everything virtualized & I need internet access for other use cases than the on-premises lab. I also like to minimize the hosts/VMs/appliances I need to have running to save on electricity costs. For enterprise grade solutions you leverage solutions from CISCO, JUNIPER, CheckPoint etc. There is no need for “enterprise grade” solutions in a SOHO or small branch office environment. Those are out of budget & overkill, so I needed something else. There are some options out there, but I’m using a DELL SonicWALL NSA 220. This is a quality product for one, and I could get my hands on one in a very budget friendly manner. UTMs & the like are not exactly cheap, even without all the subscriptions, and they don’t exactly cater to the home user normally. You can go higher or lower, but I would not go below a TZ-205 (Wireless), which is great value for money and more than up to the task of providing you with the capabilities you need in a home lab.

SonicWALL NSA 220 Wireless-N Appliance

I consider this the minimum level, as I want 1Gbps (no, I do not buy 100Mbps equipment in 2015) and I want wireless to make sure I don’t need to have too many hardware devices in the lab. As said, the benefits over the RRAS solution are that it serves other purposes (UTM) and it can remain running cheaply, so you can connect to the lab remotely to fire up your hosts and VMs, which you normally power down to save power.

Microsoft only supports dynamic routing with a limited number of vendors/devices, but that doesn’t mean all others are off limits. You can use them, but you’ll have to research the configurations that work instead of downloading the configuration manual or templates from the Microsoft web site. Those are still very useful to look at as example configurations, even if they are for another product than the one you use.

Getting it to work is a multistep process:

    1. Set up your Azure virtual network.
    2. Configure your S2S VPN on the SonicWALL.
    3. Test connectivity between an on-premises VM and one in the cloud (see the sketch below this list).
    4. Build out your hybrid or public cloud.
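For step 3, a quick check from an on-premises machine is usually enough to prove the tunnel is up; the address below is a placeholder for a VM inside your Azure virtual network (and remember ICMP may be blocked by the guest firewall, so the port test is the more reliable of the two):

# Test an Azure VM over its private address through the S2S tunnel (10.20.1.4 is a placeholder).
Test-NetConnection -ComputerName 10.20.1.4 -Port 3389
Test-Connection -ComputerName 10.20.1.4 -Count 4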

Here’s a reference to get you started: Tutorial: Create a Cross-Premises Virtual Network for Site-to-Site Connectivity. I will be sharing my setup for the SonicWALL in a later blog post so you can use it as a reference. For now, here’s a schematic overview of my home lab setup to Azure (the IP addresses are fake). At home I use VDSL with a dynamic IP address, so every now and then I need to deal with it changing. I’d love to have a couple of static IP addresses to play with, but that’s not within my budget. I wrote a little Azure scheduled runbook that takes care of updating the dynamic IP address in my Azure site-to-site VPN setup. It’s also published on the TechNet Gallery.

image

You can build this with Windows RRAS, any UTM, firewall etc. device that is a bit more capable than a consumer grade (wireless) router. The nice thing is that I have multiple subnets on premises and the 10 tunnels in a standard Azure site-to-site VPN accommodate that nicely. The subnets I don’t want to see in a tunnel to Azure I just leave out of the configuration.

A tip to save money in your Azure lab for newbies: shut down everything you can when you’re done. Automate it with PowerShell. I just make sure my hybrid infra is online & the VPN active enough to make sure we don’t run into out-of-sync issues with AD etc.
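A minimal sketch of what I mean, using the classic Azure PowerShell module of that era; treat it as an illustration rather than my exact runbook and double-check the status value against your module version:

# Shut down (deallocate) every classic Azure VM that is still running so you stop
# paying compute for idle lab machines. The status string is an assumption to verify.
Add-AzureAccount
Get-AzureVM |
    Where-Object { $_.Status -ne "StoppedDeallocated" } |
    Stop-AzureVM -Force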