Video Interview On Rolling Cluster Upgrades in Windows Server vNext


Carsten Rachfahl from Rachfahl IT-Solutions (quite possibly Germany’s leading Hyper-V, Storage Spaces & private cloud consultancy) and I got together in Berlin last November at the Microsoft Technical Summit 2014. Between presenting (I delivered “What’s new in Failover Clustering in Windows Server 2012 R2”), workshops and interviews, we found some time to do a video interview.

We discussed a very welcome new capability in Windows Server vNext: “rolling cluster upgrades”, or “Cluster Operating System Rolling Upgrade” as Microsoft calls it in the Windows Server Technical Preview. I blogged about this rather soon after the release of the Technical Preview in First experiences with a rolling cluster upgrade of a lab Hyper-V Cluster (Technical Preview).
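
The gist of the process, as far as the Technical Preview shows it today: you upgrade the cluster node by node while the cluster keeps running at the old functional level, and only once every node runs vNext do you commit the change. A minimal PowerShell sketch, using the cmdlet and property names as they appear in the Technical Preview (so they may still change before RTM):

  # While nodes run mixed OS versions the cluster stays at the down-level functional level.
  Get-Cluster | Format-Table Name, ClusterFunctionalLevel

  # Once every node has been reinstalled with vNext, commit the upgrade (this step is irreversible).
  Update-ClusterFunctionalLevel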

Video interview with Didier Van Hoye about Rolling Cluster Upgrade

We’ve been able to do rolling updates of Windows NLB for a long time and we’ve been asking for that same capability in Windows Failover Clustering for many years. Now it’s finally coming! And yes, as you will notice, we like that a lot!

You need to realize that making the transition from one version to another as smooth, easy and risk free as possible is of great value to customers, as it enables them to upgrade faster and get the benefits of their investment sooner. For Microsoft it means more people move to modern environments faster, which helps with support and with delivering value in a secure and modern environment.

At the end we also joke around a bit about DevOps and how this is just a set of training wheels on the road to true site resilience engineering. All fun and all good. Enjoy!

Don’t Forget To Leverage The Benefits of RD Gateway On Hyper-V & RDP 8/8.1


So you upgraded your TS Gateway virtual machine on W2K8(R2) to RD Gateway on W2K12(R2) to make sure you get the latest and greatest functionality and cut off any signs of technology debt well in advance. Perhaps you were inspired by my blog series on how to do this, and maybe you jumped through the x86 to x64 bit hoop whilst at it. Well done.

Now, when upgrading or migrating from W2K8(R2), a lot of people forget about some of the enhancements in W2K12(R2). This is especially true if you don’t notice much by doing so. That’s why I see people forget about UDP. Why? Well, things will keep working as they did before Windows Server 2012: RD Gateway over HTTP or over RPC-HTTP (legacy clients). I have seen deployments where both the Windows and the perimeter firewall rules to allow UDP over 3391 were missing, let alone that port 3391 was allowed in the RAP. But then you miss out on the benefits UDP offers (a better user experience over less than great network connections and with graphics) as well as on those of that ever more capable thingy called RemoteFX, if you use that.

For those of you who don’t know yet: RD Gateway preferably uses the HTTP and UDP transports, which are more efficient than RPC over HTTP and scale and perform better under low bandwidth and bad connectivity conditions. When the HTTP transport channels are up (incoming & outgoing traffic), two UDP side channels are set up that can be used to provide both reliable (RDP-UDP-R) and best-effort (RDP-UDP-L) delivery of data. The UDP traffic through the RD Gateway is also secured, because it uses Datagram Transport Layer Security (DTLS). For more info see RD Gateway Capacity Planning in Windows Server 2012. Furthermore, that paper proves you have no reason not to virtualize this workload, and I concur!

So why not set it up!? Check your firewall rules on the RD Gateway server and set the rules accordingly. Do the same for your perimeter firewalls or any others in between your users and your RD Gateway.
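
If you prefer to script the check on the gateway itself, a minimal sketch with the built-in NetSecurity cmdlets looks something like this (the rule display name is just an example, use whatever naming convention you already have):

  # Verify / create an inbound rule for the RD Gateway UDP transport on port 3391.
  if (-not (Get-NetFirewallRule -DisplayName "RD Gateway UDP 3391" -ErrorAction SilentlyContinue)) {
      New-NetFirewallRule -DisplayName "RD Gateway UDP 3391" -Direction Inbound `
          -Protocol UDP -LocalPort 3391 -Action Allow
  }

  # Quick check that the gateway is actually listening on UDP 3391.
  Get-NetUDPEndpoint -LocalPort 3391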


Under the properties of your RD Gateway server you need to make sure UDP is enabled and listening on the needed IP address(es).


A client who connects over your RD Gateway server (Windows Server 2012(R2), that is) and checks the network connection properties (click the “wireless NIC”-like icon in the connection bar) sees that UDP is enabled. If they don’t see UDP as enabled and they aren’t running Windows 8 or 8.1 (or W2K12(R2)), they can upgrade to RDP 8.1 on Windows 7 or Windows Server 2008 R2! When they connect to a Windows 7 SP1 or Windows Server 2008 R2 machine, make sure you read this blog post Get the best RDP 8.0 experience when connecting to Windows 7: What you need to know, as it contains some great information on what you need to do to enable RDP 8/8.1 when connecting to Windows 7 SP1 or Windows Server 2008 R2:

  1. “Computer Configuration\Administrative Templates\Windows Components\Remote Desktop Services\Remote Desktop Session Host\Remote Session Environment\Enable Remote Desktop Protocol 8.0” should be set to “Enabled”
  2. “Computer Configuration\Administrative Templates\Windows Components\Remote Desktop Services\Remote Desktop Session Host\Connections\Select RDP Transport Protocols” should be set to “Use both UDP and TCP” => Important: After the above 2 policy settings have been configured, restart your computer.
  3. Allow port traffic: If you’re connecting directly to the Windows 7 system, make sure that traffic is allowed on TCP and UDP for port 3389. If you’re connecting via Remote Desktop Gateway, make sure you use RD Gateway in Windows Server 2012 and allow TCP port 443 and UDP port 3391 traffic to the gateway
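
If you’d rather push those two policy settings with a script instead of Group Policy, the sketch below writes what I understand to be the corresponding policy registry values (fServerEnableRDP8 and SelectTransport); treat the value names as an assumption and double-check them against the GPO on your own systems:

  # Assumed registry equivalents of the two policies above; verify before rolling this out.
  $key = "HKLM:\SOFTWARE\Policies\Microsoft\Windows NT\Terminal Services"
  if (-not (Test-Path $key)) { New-Item -Path $key -Force | Out-Null }
  New-ItemProperty -Path $key -Name fServerEnableRDP8 -PropertyType DWord -Value 1 -Force   # Enable RDP 8.0
  New-ItemProperty -Path $key -Name SelectTransport -PropertyType DWord -Value 0 -Force     # 0 = use both UDP and TCP
  Restart-Computer   # the policy settings only take effect after a reboot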

Cool, you’ve done it and you can verify it works. Under Monitoring in the RD Gateway Manager you can see 3 connections per session: one is HTTP and the other two are UDP.


Life is good. But if you want to see the difference really well demonstrated, try to connect to a Windows 7 SP1 computer with RDP 8 & TCP/UDP disabled and play a YouTube video, then do the same with RDP 8 & TCP/UDP enabled; the difference is rather impressive. Likewise if you leverage RemoteFX in a VM. The difference in experience is very clear, just try it! While you’re doing this, look at the UDP “Kilobytes Sent” stats (refresh the monitoring tab); you’ll see UDP being put to work when playing a video in your RDP session.


Load balancing Hyper-V Workloads With High To Continuous Availability With a KEMP Loadmaster


I’m working on some labs and projects with KEMP LoadMaster load balancing appliances (LM 2400, LM-R320). That will lead to some blog posts on load balancing several workloads, which all run on Windows Server 2012 R2 Hyper-V or integrate into Azure. The load balancers used in the labs are the virtual appliances. Depending on the needs and environment these are a very good, cost effective option for production as well, and depending on the version you get they scale very well. Hence their use in cloud environments; they will not hold you back at all!

To stimulate your interest in load balancing and high availability I’ve put up a video on load balancing RD Gateway services. Consider it a teaser or introduction to more about the subject.

Why use an appliance (hardware/virtual)? Well let’s look at the 2 alternatives:

  • Round robin DNS, which is also sometimes used, is just too low tech for most real life scenarios and sometimes can’t be used or is less efficient, which impacts scalability and performance. On top of that it doesn’t provide health checking for failover purposes.
  • I’ve also said before that while Windows NLB provides layer 4 load balancing out of the box, it’s pretty basic. It also often causes a lot of network grief and the implementation can be tedious. This has not improved in an ever more virtualized & cloud based world. On top of that, when network virtualization comes into play you might paint yourself into a corner as those two don’t mix. But if that’s not a concern and you’re on a budget, I’ve used it with success in the past as well.

Load Balancing In An Ever More Demanding Virtualized & Cloudy World


We’ve been using the KEMP LoadMasters for many years now and they have served us very well. You might know that Microsoft Azure has a partnership with KEMP Technologies to provide full featured load balancing in your public & hybrid cloud solutions. I’m pretty happy with that, as when we talk about load balancing with Microsoft we always end up discussing the need for more features and layer 7 support. I sometimes jokingly tease them that this is due to their Windows NLB legacy. While I have done some magic with that, it is way too limited for today’s (and yesterday’s) demands and needs. Also, the hacks used to get it to work can’t be used with network virtualization. In the cloud Microsoft has the Azure Load Balancer. Whilst nice when combined with availability sets, many of the current workloads need more. That’s exactly what the KEMP Virtual LoadMaster for Azure delivers in their partnership with Microsoft:

  • Layer 4, Layer 7 Load Balancing
  • Layer 7 (or Cookie) Persistence
  • SSL Offload/SSL Acceleration
  • Application Health Checking
  • Adaptive (Server Resource) Load Balancing
  • Layer 7 Content Switching
  • Application Acceleration: HTTP Caching, Compression & IPS

To me (and many other IT pros) KEMP is the company that opened load balancing up to everyone on this planet with budget friendly but high value solutions. They took away the barrier to better & more capable load balancing for the masses. Furthermore they keep improving, and I have seen many existing customers, including me, get ever more benefits with the newer firmware releases, even on their entry level, older models like the LM2200 that are no longer for sale. So you can keep using them or move them to the lab. They have great support and respond very quickly to vulnerabilities like Heartbleed, Shellshock and POODLE.


Another benefit of this partnership is that we can use the load balancing solution we know and trust in all our environments: on premises (physical or virtual appliance), in the cloud & at our hosting companies. Partnerships with OEMs ensure that you can use the hardware you prefer (the DELL R320 is a nice example) and their Virtual LoadMaster now even extends into the cloud. So our options are to …

… deploy an appliance …


…  virtualize the LoadMasters …


… leverage Kemp in the cloud


…. or select your own preferred OEM …


They cover all our bases with that line up and it helps with operational ease & efficiencies.

As I’m investigating some scenarios with KEMP LoadMasters in a Hyper-V environment (on premises, multi-site, Azure IaaS & multifactor authentication), you can expect to see some blog posts on this. Some of these will leverage technologies available in Windows Server vNext (Technical Preview). Lots of very interesting ideas to support high availability & flexibility that are affordable and not just point solutions.

Ah, the joy of being in virtualization is that one gets great exposure to storage, networking, cloud solutions and on premises environments. The experience & knowledge of the entire stack isn’t just fun (yes, working can be fun) but it is also what allows us to build great solutions.

Windows Server Technical Preview delivers integration services updates through Windows Update


Benefits of delivering updates to the integration services via Windows Updates

In Windows Server vNext, aka the Technical Preview, the integration services are being delivered through Windows Update (and as such via the well known tools such as WSUS, …). This is significant in reducing the operational burden of making sure they are up to date. Many of us turned to PowerShell scripting to handle this task. So did I, and I still find myself tweaking the scripts once in a while for a condition I had not dealt with before or just to get better feedback or reporting. Did I ever tell you that story about the cluster where 100 VMs did not have a virtual DVD drive (they removed them to improve performance)? … That was yet another improvement to my script => detect the absence of a virtual DVD drive. In this day and age, virtualization has both scaled up and out with ever more virtual machines per host and in total. The process of having to load an ISO in a virtual DVD drive inside a virtual machine to install upgrades to the integration services seems arcane and it’s very timely that it has been replaced by an operational process more befitting a Cloud OS Winking smile.

I have optimized this process with some PowerShell scripting and it wasn’t too painful anymore. The script upgrades all the VMs on the hosts and even puts them back in the state it found them in (Stopped, Saved, Running). A screenshot of the script in action below.

[Screenshot: the script in action]
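
If you just want a quick overview of which guests still need attention, a tiny sketch against the Hyper-V PowerShell module does the trick:

  # Report the integration services version of every VM on this host,
  # so out-of-date guests stand out at a glance.
  Get-VM | Sort-Object IntegrationServicesVersion |
      Select-Object Name, State, IntegrationServicesVersion |
      Format-Table -AutoSize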

I’m glad that it’s now integrated through Windows Update and part of other routine maintenance that’s done on the guests anyway.

But it is not only good news for us “on premises” system administrators and integrators. It’s also important for service/cloud providers and (hosted) private cloud hosters. This change means that the tenants have control of updates to the integration services of their virtual machines. They update their Windows virtual machines with all updates during their normal patch cycles and now this includes the integration services. This provides operational ease (a single method) and avoids some of the discussions about when to upgrade the integration services.

Legacy Operating Systems

Shortly after the release of the Windows Server Technical Preview, updates to the integration services for Windows guests began being distributed through Windows Update. This means that on that version the vmguest.iso is no longer needed and as such it’s no longer included with Hyper-V. So if you run an unsupported (most often legacy) version of Windows you’ll need to grab the latest possible vmguest.iso from a W2K12R2 Hyper-V host, try to install that and see if it works.
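
If you do have to go down that route, the idea looks something like the sketch below: copy vmguest.iso off a W2K12R2 host and mount it in the legacy guest. The host name, share and VM name are examples, and as said, there’s no guarantee the installation will succeed on an unsupported OS:

  # Copy the last available vmguest.iso from a W2K12R2 Hyper-V host (example UNC path).
  Copy-Item "\\W2K12R2-HOST\c$\Windows\System32\vmguest.iso" "C:\ISO\vmguest.iso"

  # Mount it in the legacy guest and run the integration services setup from inside the VM.
  Set-VMDvdDrive -VMName "LegacyVM01" -Path "C:\ISO\vmguest.iso"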

What about Linux and FreeBSD?

Well, nothing has changed there; you can read how that’s taken care of here: Linux and FreeBSD Virtual Machines on Hyper-V

Hyper-V Technical Preview Live Migration & Changing Static Memory Size


I have played with Hot Add & Remove of Static Memory in a Hyper-V vNext virtual machine before and I love it. As I’m currently testing (actually sometimes abusing) the Technical Preview a bit to see what breaks, I’m sometimes testing silly things. This is one of them.

I took a Technical Preview VM with 45GB of memory, running in a Technical Preview Hyper-V cluster, and live migrated it. I then tried to change the memory size up and down during the live migration to see what happens, or at least to make sure nothing goes “BOINK”. Well, not much happens: we get a notification that we’re being silly. So no failed migrations, crashed or messed up VMs or, even worse, hosts.
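
For those who want to repeat the experiment, this is roughly what it boils down to in a sketch with made-up VM and host names (run it against a vNext host; on 2012 R2 resizing static memory requires the VM to be off):

  # Start the live migration in the background ...
  Move-VM -Name "TPVM01" -DestinationHost "TPNODE2" -AsJob

  # ... and try to resize the static memory while it is in flight.
  # Expect a clear "not allowed right now" style notification rather than a broken VM or host.
  Set-VMMemory -VMName "TPVM01" -StartupBytes 50GB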


It’s early days yet but we’re getting a head start as there is a lot to test and that will only increase. The aim is to get a good understanding of the features, the capabilities and the behavior to make sure we can leverage our existing infrastructures and Software Assurance benefits as fast as possible. Rolling cluster upgrades should certainly help us do that faster, with more ease and less risk. What are your plans for vNext? Are you getting a feeling for it yet or waiting for a more recent test version?

Hyper-V Amigos Showcast Episode 7: The Year 2014 in Review


In the seventh edition of the Hyper-V Amigos showcast Carsten Rachfahl and I look back at 2014, talk a bit about our future plans and basically have some fun Smile

You can watch it here or on YouTube. We’ll be back this year with more showcasts & we hope to see you there!

A MVP once more in 2015 – happy New Year from a renewed MVP


Happy New Year people! May 2015 bring you happiness, good health, and good jobs/projects/customers with real opportunities for growth & advancement. Don’t forget to step out of the office, away from the consoles once in a while to enjoy the wonderful experiences and majestic views this world has to offer.


Being a January 1st MVP (my expertise is Hyper-V) means that every year on New Year’s Day I might get an e-mail to inform me I have been renewed, or not. Prior to that our MVP lead will contact us to make sure we have updated our community activities and they’ll decide on whether we’re MVP material, or not. Today I received the e-mail awarding me the MVP award for 2015.


It remains a special feeling to receive the award.  It’s recognition for what you’ve done and it means that I can enjoy the benefits that come with it: the MVP Global Summit and the interaction with the product groups at Microsoft. The summit is very valuable to me and if I knew the dates I would already book my flights and the hotel right now.

Some people think it makes us fan boys but I can assure you that’s not the case. Microsoft hears the great, the good, the bad and the ugly from us. And yes, they appreciate that as they cannot and do not want to live in an Ivory tower. So they need feedback and we’re a part of the feedback loop. We MVPs are a good mix of customers, consultants, partners & businesses working with their technologies & helping out the community to make the best use of them. Microsoft puts it like this:

“The Microsoft Most Valuable Professional (MVP) Award is our way of saying thank you to exceptional, independent community leaders who share their passion, technical expertise, and real-world knowledge of Microsoft products with others.”

The fact that we are independent is an important factor here. It makes us a valuable resource pool of hands-on experience to mix in with other feedback channels. As Aidan Finn wrote in his blog post, Feedback Matters Once Again In Microsoft, it does indeed matter. Again? It always did, but they listen more and better now Winking smile. They don’t need an "echo chamber"; they value opinions, insights and experiences. The MVP award is for the things you’ve done and do. Sure, there is a code of conduct but that doesn’t mean you cannot voice your concerns. "Independent" means that what we say doesn’t have to be sugar coated marketing. Our value is in the fact that we help out the community (their customers, partners and Microsoft itself) in the better use and development of their solutions based on our real world experiences. Microsoft discusses that here.

It opens up doors and creates opportunities, and for that I’m grateful as well. For my employers/customers it means that when you hire me you get access to not just my skills and expertise but to the collective knowledge and experience of a global network of passionate experts that have a proven track record of engagement and are recognized internationally for that. Not too shabby is it Winking smile.

SMB Direct With RoCE in a Mixed Switches Environment


I’ve been setting up a number of Hyper-V clusters with Mellanox ConnectX-3 Pro dual port 10Gbps Ethernet cards. These Mellanox cards provide a nice number of queues (128) for DVMQ and also give us RDMA/SMB Direct capabilities for CSV & live migration traffic.

Mixed Switches Environments

Now, RoCE and DCB are a learning curve for all of us and not for the faint of heart. DCB configuration is non-trivial, certainly not across multiple hops and different switches. Some say it’s to be avoided or can’t be done.

You can only get away with a single pair of (uniform) switches in smaller deployments. On top of that I’m seeing more and more different types of switches being used to optimize value, so it’s not just a lab exercise to do this. Combine this with the fact that DCB is an unavoidable technology in networking, unless it gets replaced with something better and easier, and you might as well try and learn. So I did.
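
To give an idea of what the host side of that learning curve looks like, here’s a minimal DCB sketch with the in-box QoS cmdlets. Priority 3 and the bandwidth percentage are common starting points rather than a recommendation, the adapter name is an example, and the switch configuration obviously has to match end to end:

  # Tag SMB Direct traffic (port 445) with priority 3.
  New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

  # Enable Priority Flow Control only for that priority, keep it off for the rest.
  Enable-NetQosFlowControl -Priority 3
  Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7

  # Reserve bandwidth for the SMB traffic class and apply QoS/DCB on the RDMA ports.
  New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS
  Enable-NetAdapterQos -Name "RDMA-Port-1"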

Well, right now I’m successfully seeing RoCE traffic going across cluster nodes spread over different racks in different rows at excellent speeds. The core switches are DELL Force10 S4810s and the rack switches are PowerConnect 8132Fs. By borrowing an approach from spine/leaf designs this setup delivers bandwidth where they need it at a price point they can afford. They don’t need more expensive switches for the rack or the core as these do support DCB and give the port count needed at the best price point. This isn’t supposed to be the top in non-blocking network design. Nope, but what’s available & affordable today in your hands is better than perfection tomorrow. On top of that this is a functional learning experience for all involved.

We see some pause frames being sent once in a while and this doesn’t impact speed all that much. It does guarantee lossless traffic, which is what we need for RoCE. When we live migrate 300GB worth of memory across the nodes in the different racks we get great results. It varies a bit depending on the load the switches & switch ports are under, but that’s to be expected.
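
A quick way to confirm the traffic really runs over SMB Direct rather than silently falling back to plain SMB Multichannel is to check the RDMA state on the hosts, for instance with a sketch like this:

  # Is RDMA enabled on the Mellanox ports?
  Get-NetAdapterRdma | Where-Object Enabled

  # Do the SMB client interfaces report RDMA capability?
  Get-SmbClientNetworkInterface | Where-Object RdmaCapable

  # Are the active multichannel connections RDMA capable on both ends?
  Get-SmbMultichannelConnection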

Now, tests have shown us that we can live migrate just as fast with non-RDMA 10Gbps as we can with RDMA leveraging “only” Multichannel. So why even bother? The name of the game is low latency and preserving CPU cycles for SQL Server or for storage traffic over SMB3. Why? We could just buy more CPUs/cores, right? Great, easy & fast, but then SQL licensing comes into play and it becomes very expensive. Also, storage scenarios under heavy load are not where you want to drop packets.

Will this matter in your environment? Great question! It depends on your environment. Sometimes RDMA is needed/warranted, sometimes it isn’t. But the Mellanox cards are price competitive and why not test and learn right? That’s time well spent and prepares you for the future.

But what if it goes wrong? Ah well, if the nodes fail to connect over RDMA you still have Multichannel, and if the DCB stuff turns out not to be what you need or can handle, turn it off and you’ll be good.

RoCE stuff to test: Routing

Some claim it can’t be done reliably. But hey, they said that about non-uniform switch environments too Winking smile. So will it all fall apart and will we need to standardize on iWarp in the future? Maybe, but isn’t DCB the technology used for lossless, high performance environments (FCoE but also iSCSI), so why would iWarp not need it? Sure, it works without it quite well. So does iSCSI, right, up to a point? I see these comments a lot more from virtualization admins that have a hard time doing DCB (I’m one, so I do sympathize) than I see them from hard core network engineers. As I have RoCE cards, and they have become routable now with the latest firmware and drivers, I’d love to try and see if I can make RoCE v2 or Routable RoCE work over different types of switches, but unless someone is going to sponsor the hardware I can’t even start doing that. Anyway, lossless is the name of the game whether it’s iWarp or RoCE. Who knows what we’ll be doing in 5 years? 100Gbps iWarp & iSCSI both covered by DCB vNext while FC, FCoE, InfiniBand & RoCE have fallen into oblivion? We’ll see.

VEEAM Invests in Faster & More Efficient Data Protection With Backup & Replication 8


Ever more data to protect without breaking the systems or the bank

One of my major concerns today in IT, whether it is on premises or in the cloud, is the cost, time, reliability and feasibility of backup and restores. This is true for most of us. Due to the environments in which I deliver my services my main issue with backups is the quantity of data. The amount of data is staggering and growth is not showing a downward trend.

The big four: CPU, Memory, Network & Storage

Over the years we have seen a vast increase in compute, memory, network and storage capabilities at ever better pricing. CPUs are up to 18 cores per socket as I write this. DDR4 memory is here and the cost is relatively low. We have affordable 10Gbps networking to throw at the problem as well, or in some cases 8 to 16Gbps Fibre Channel. So when it comes to CPU, memory and network we’re pretty well served.

Storage is evolving as well and we’re getting ever bigger and, if you have the budget that is, faster storage arrays in different flavors. But it remains a challenge. First of all, getting the right amount of IOPS and storage capacity at an affordable price point is a balancing act. Secondly, when dealing with backups we need to manage the source IOPS & latency against the target. But that’s not all: while you might want to squeeze every last IOPS & millisecond of latency out of your backup target, you can’t carelessly do that to your source storage. If you do, this might constitute a denial of service attack against your applications and services. Even today storage QoS is either non-existent, in its infancy or at best limited to particular workloads on storage solutions.

The force multiplier: Backup software capabilities & approaches

If you’ve made sure the above 4 resources are not your killer bottleneck, the backup software, methods, algorithms and the approach used will be either your biggest problem or your best friends. You need your backup software to be:

  • Capable
  • Scalable
  • Fast
  • Configurable
  • Scale Out

There are some challenging environments out there. To deal with this, backup software should be able to leverage the wealth of capabilities compute, network, memory & storage are offering to protect large amounts of data reliably and fast. This should be done in a smart and operationally supportable manner. VEEAM has been working on this for a long time, they keep getting better at it with every release, and it allows for scale out designs with regard to backup targets.

VEEAM Backup & Replication 8.0

There are many improvements in v8 but a couple stand out.


Consistency groups (Hyper-V)

Backup jobs can execute more than one VM backup task simultaneously from the same volume snapshot with “Allow Processing of Multiple VMs with a single volume snapshot”.


This means you can reduce the number of snapshots significantly; in the past you needed a volume snapshot per VM. VEEAM limits the maximum number of VMs you can back up per snapshot to 4 when using software VSS and to 8 with hardware VSS. They do this because under heavy load VSS/CSV sometimes has issues. This number can be tweaked to fit your needs (not all environments are created equal) with 2 registry values under the HKLM\SOFTWARE\Veeam\Veeam Backup and Replication key (a PowerShell sketch for setting them follows the list):

  • MaxVmCountOnHvSoftSnapshot (DWORD)
  • MaxVmCountOnHvHardSnapshot (DWORD) registry values
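
Setting them is a one-liner per value; a quick sketch (8 and 16 are example numbers, pick what your VSS provider and storage can actually handle):

  # Raise the per-snapshot VM limits used by VEEAM B&R (example values; tune carefully).
  $key = "HKLM:\SOFTWARE\Veeam\Veeam Backup and Replication"
  New-ItemProperty -Path $key -Name MaxVmCountOnHvSoftSnapshot -PropertyType DWord -Value 8 -Force
  New-ItemProperty -Path $key -Name MaxVmCountOnHvHardSnapshot -PropertyType DWord -Value 16 -Force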

Reducing the number of snapshots to be taken is good as it saves resources and speeds things up, and as VSS can be finicky, not taking more snapshots than absolutely necessary is a good thing.

Backup I/O Control

Another improvement is Backup I/O Control, which delivers the capability to dynamically adjust the number of backup tasks based on IOPS latency. Under Options you’ll find a new tab, I/O Control. It contains the parallel processing option that used to be under the “Advanced” tab in Veeam B&R 7.


The idea is to move to a more “policy driven” approach for handling the load backups can put on the storage. Until now we’d configure a fixed number of tasks to run against the source storage in order to keep IOPS/latency in check. But this is very static, and in a dynamic / elastic “cloud” world this isn’t very flexible, nor is it feasible to keep it tuned to the best number for the current workload.

I/O Control lets you set limits on how much latency is acceptable for your data stores. Removing or adding VMs to the data store won’t invalidate your carefully set number of tasks allowed, as it’s now the latency that’s used to dynamically tune that number for you.

I/O control has two settings:

“Stop assigning new tasks to datastore at: X ms”: VEEAM looks at the latency (IOPS) before assigning a proxy (backup target) to a virtual disk and won’t launch the task until the load has dropped. This prevents the depletion of IOPS by launching too many backups.

“Throttle I/O of existing tasks at: Y ms”: This will throttle the I/O of already running backup jobs when needed, for example when application workloads in the VMs running on the source storage kick in. The backups will be throttled so they’ll take longer, but they won’t kill the performance of the applications while they are running.

These two settings allow for dynamic, on the fly tweaking of the number of backup tasks running as well as their impact on the storage performance. Once you have determined what latency values are acceptable to you, you’re done; VEEAM handles the tweaking for you. The default values seem to reflect industry best practices (sustained latency > 20 ms is considered problematic).

In the backup job log you can see the latency being monitored.

With VEEAM B&R v8 Enterprise Plus you can even do this per data store, meaning you can optimize this per backup source. This recognizes that there is no “one size fits all” and allows for differentiation. Yet it does so in a way that does not compromise the simplicity of use that VEEAM offers. This sounds easy but from experience I know it isn’t. VEEAM manages to offer a great balance between simplicity and functionality for companies of all sizes.

Select “Configure”


In the “Datastore Latency Settings” you can add one, more or all of the data stores you are protecting with VEEAM. This allows for differentiation when you have CSVs that are used for SQL Server VMs versus stateless web servers or other workloads that are not storage I/O intensive.


Select the datastore (in our case the CSV volumes in Hyper-V Cluster)


By selecting the desired datastore and clicking “Edit” you can individually adjust the settings for that datastore.


Conclusion

It looks like we have some great additional capabilities in an already very good solution. I’ll be using these new capabilities in real life scenarios to see how they work out for us and to optimize the backups of the virtualized environments under my care. Hardware VSS providers, SANs and CSVs normally need some tweaking and care to make them run well, so that’s what we’ll be doing.