This blog post is a summary of the content that I went through in my session today at Microsoft Meets Community 3rd Edition, where I presented the session Migrate to WVD and Beyond – essentially looking into how to migrate to WVD and how to migrate other workloads from an existing datacenter and VDI platform. (https://info.microsoft.com/WE-WVD-WBNR-FY21-12Dec-11-MicrosoftmeetsCommunityWindowsVirtualDesktop-SRDEM48757_LP01Registration-ForminBody.html?wt.mc_id=AID3017714_QSG_490080)
You can view the recording here as well –> will be updated when the link is available….
This blog post is split into the following sections:
- Plan
- Assess
- Foundation
- Migrate/Rebuild/Extend
- Operate and Govern
Plan
So if you want to move from an existing VDI platform to Windows Virtual Desktop (WVD), there are a lot of requirements that you need to understand (and not just the technical ones). You are essentially moving to a cloud-based ecosystem where WVD might be just one of the workloads that is part of the big puzzle.
And of course, when hosting a VDI platform you need to have the applications and the data close together to ensure an optimized end-user experience.
In most organizations you have already invested time and money into training and processes to manage and operate the existing VDI platform. You might also have third-party management tools which are used to manage your VDI service.
Understand the state you are coming from:
- Existing VDI Solution – Technology
- Management & Operations – Process
- Knowledge and Expertise – People
These will of course change when you migrate to WVD, and you will need to ensure that you have the right skills and competencies in place to manage a WVD environment (in addition to the other workloads in Azure as well). Just to showcase: the difference from a technology perspective can be significant.
Understanding the VDI workloads and requirements:
- End-user requirements – Devices, Peripherals and Working Patterns
- End-user endpoints – Domain Join or Azure AD Based
- Workloads – Power Users or Office Workers
- Supporting Services – Print, Fileshares, Office 365, Security, VPN
- Workloads requirements – Applications and Data(bases)
Then we of course have the existing user base, which might today have a list of defined requirements when it comes to what kind of hardware/operating system they have (these might also be thin clients), where some workers might have different workloads/applications that they need to run in the VDI platform (GPU capabilities, for instance?). And if our existing user base is using Azure AD-based clients as well, this will also affect how the sign-in experience will be in WVD.
Compliance
- Metadata – stored in the US, coming to EMEA Q1 2021
Another aspect is compliance. The way that WVD is built as of now, Microsoft stores the metadata information in US-based datacenters, which might prohibit some companies with strict compliance policies from starting to use WVD. However, Microsoft will be providing the ability to store the metadata in the West Europe datacenter in March next year.
So what is collected in this metadata? Well, a lot… It consists of UPNs, who connects, when, session information and such. You can view the WVD table information list here –>
https://docs.microsoft.com/en-us/azure/azure-monitor/reference/tables/wvdconnections
And just look through the different tables with the WVD* prefix, which contain the data that is collected as metadata.
We also have to understand the destination, since if we are moving to WVD as a service there are two latency factors I want you to consider.
- Latency from end-user to WVD control/data plane (Running in Azure)
- Latency from the VDI desktop to application data (database or fileserver)
If we manage to provide a low-latency connection from the end-user to WVD but have horrible latency from the application to the database, that is not going to do any good… Microsoft has its own Virtual Desktop Experience Estimator to calculate the latency from the end-user to the WVD data plane.
Virtual Desktop Experience Estimator –> https://azure.microsoft.com/en-us/services/virtual-desktop/assessment/#estimation-tool
Both of these are important aspects when building any cloud-based VDI solution: group applications as close to the data as possible.
Of course, within Azure we can use services such as proximity placement groups to place workloads as close as possible to one another (often configured in combination with Accelerated Networking, which uses SR-IOV to bypass the virtual switch and reduce latency). NOTE: Accelerated networking is only available for Windows Server.
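To make this concrete, here is a minimal Terraform sketch (azurerm provider) of a proximity placement group and a NIC with accelerated networking enabled. The resource names, and the azurerm_resource_group.wvd and azurerm_subnet.wvd references, are my own illustrative assumptions, not from the session.

```hcl
# Minimal sketch: keep latency-sensitive VMs physically close and enable
# accelerated networking (SR-IOV) on the NIC.
# Assumes azurerm_resource_group.wvd and azurerm_subnet.wvd exist elsewhere.
resource "azurerm_proximity_placement_group" "wvd" {
  name                = "ppg-wvd"
  location            = azurerm_resource_group.wvd.location
  resource_group_name = azurerm_resource_group.wvd.name
}

resource "azurerm_network_interface" "db" {
  name                          = "nic-db01"
  location                      = azurerm_resource_group.wvd.location
  resource_group_name           = azurerm_resource_group.wvd.name
  enable_accelerated_networking = true # SR-IOV, bypasses the virtual switch

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.wvd.id
    private_ip_address_allocation = "Dynamic"
  }
}

# The VMs that should sit close together then reference the group via
# proximity_placement_group_id = azurerm_proximity_placement_group.wvd.id
```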
Of course, we need to understand the WVD ecosystem in Azure as well. While Microsoft has responsibility for the main components, there are still a lot of different options in terms of storage, networking, management and security features.
More information about the different support systems here –> https://msandbu.org/windows-virtual-desktop-ecosystem/
If you also want to build your environment in an automated fashion, you can now use Terraform, which introduced support for WVD in one of its latest releases.
Want a Visio overview? –> https://bit.ly/wvdeco
There are also some limitations that you should be aware of (and the workarounds to fix them) before you plan to build your environment in Azure, as well as what kind of supporting services you should consider in Azure.
NOTE: One thing in particular is accelerated networking and proximity placement groups, which have had big effects for some database workloads that we have worked on. If you plan to use Availability Zones, we have seen that segmenting workloads across different availability zones can affect the latency too much for legacy applications.
Just remember to turn off any read/write caching for database-related workloads in Azure; a sketch follows below.
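As a hedged sketch of what that looks like in Terraform (illustrative names; assumes the database VM is defined elsewhere as azurerm_windows_virtual_machine.db):

```hcl
# Data disk for database files, attached with host caching disabled.
resource "azurerm_managed_disk" "sqldata" {
  name                 = "disk-sqldata"
  location             = azurerm_resource_group.wvd.location
  resource_group_name  = azurerm_resource_group.wvd.name
  storage_account_type = "Premium_LRS"
  create_option        = "Empty"
  disk_size_gb         = 512
}

resource "azurerm_virtual_machine_data_disk_attachment" "sqldata" {
  managed_disk_id    = azurerm_managed_disk.sqldata.id
  virtual_machine_id = azurerm_windows_virtual_machine.db.id
  lun                = 0
  caching            = "None" # no read/write host caching for database disks
}
```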
One thing I want to highlight here: when working on WVD projects, plan ahead and get the allocated capacity you need in terms of vCPU cores. By default you have a soft quota, and you need to file a support ticket to get access to more cores. In some regions there might be limited access, and therefore you might need to wait before you start the project. Secondly, not all Azure regions are equal. This means that not all supporting services are available in all the Azure datacenters, so check before you start a project.
Also, we need to understand what kind of instance types we want to use for our WVD desktops. Which VMs provide the most power for the lowest cost? At the moment the recommendation from Microsoft is the D2(4)_v3, which uses an Intel-based CPU, but during the summer Microsoft started to roll out the Das_v4, which is powered by a new AMD EPYC-based CPU with the same CPU/memory ratio – a bit cheaper than the v3, but it packs a punch with the CPU, which I highly recommend.
Another thing: Azure Files, which is now the preferred choice for hosting FSLogix profiles for WVD, now supports SMB Multichannel, which boosts the SMB performance –> https://azure.microsoft.com/en-us/updates/smb-multichannel-preview-is-now-available-on-azure-files-premium-tier/
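For reference, a minimal Terraform sketch of a premium Azure Files share for FSLogix profiles (names are illustrative; SMB Multichannel is in preview at the time of writing and may have to be enabled outside Terraform):

```hcl
# Premium FileStorage account + share for FSLogix profile containers.
resource "azurerm_storage_account" "fslogix" {
  name                     = "stfslogixprofiles" # must be globally unique
  resource_group_name      = azurerm_resource_group.wvd.name
  location                 = azurerm_resource_group.wvd.location
  account_kind             = "FileStorage"
  account_tier             = "Premium"
  account_replication_type = "LRS"
}

resource "azurerm_storage_share" "profiles" {
  name                 = "profiles"
  storage_account_name = azurerm_storage_account.fslogix.name
  quota                = 1024 # provisioned size (GiB) drives IOPS on premium shares
}
```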
When it comes to GPU-based instances, there are two main offerings (depending on the amount of GPU memory you need to have):
- Nvv3 (NVIDIA M60 GPU) – Supports Windows 10, Windows Server and Linux – GPU passthrough.
- Nvv4 (AMD MxGPU) – Supports Windows 10 and Windows Server. You can read more about those here –> https://msandbu.org/amd-radeon-gpu-on-microsoft-azure-nvv4-series-and-vdi/
Just remember to use GPU instances which support SSD-based storage. Also remember that some VM instance types are not available in all regions, so check that before you start a project!
Assess
When you want to migrate your existing workloads to Microsoft Azure, essentially doing a lift-and-shift, you need to make sure that you do a proper assessment of the current environment, making sure that your workloads are supported to run in Azure.
For Azure migrations you can use a service from Microsoft called Azure Migrate, which can both assess and migrate workloads (and is most suitable for non-VDI servers).
Azure Migrate comes in two flavours, either agent-based or agentless, depending on which access you want to provide to your on-premises workloads.
- Agentless means that you have an Azure Migrate appliance which connects to vCenter (in this example) to read information about the virtual machines, and which also does application dependency assessment of the workloads.
- Agent-based means that you have an agent installed on the virtual machines, which is then used for the assessment; you also need to set up a replication appliance which will be used to help manage the data replication when you want to migrate the virtual machines. NB: Agent-based is required if you have UEFI-based virtual machines running on VMware.
All the data that is collected by Azure Migrate will be collected into a Log Analytics workspace, which is essentially a database where it stores data collected by the agents, such as:
- CPU, Memory, Disk Usage & Performance
- VM information
- OS version
- Dependency Data:
- Collects TCP Connection Data
- Name of Processes with active connection & destination port
- Installed Windows VM applications
- Installed Windows VM Features
- Installed Linux VM applications
This data is then used to provide an overall assessment of whether a virtual machine is supported to run in public cloud. Azure Migrate also provides an assessment which shows suggested VM sizes and disk types, and also the cost. Now, what I have seen IN MANY cases is that Azure Migrate or other third-party tools do not detect the use of Layer 2 protocols (which are not supported in Azure). Also reference this support page for Microsoft-based supported workloads:
https://docs.microsoft.com/en-us/troubleshoot/azure/virtual-machines/server-software-support
Azure Migrate can of course see if a VM and OS are supported, but it still might not be supported by the vendor if you have a black-box OS solution from Cisco, F5 or Citrix. In those cases, reach out to your application vendors. You might also be thinking about migrating to PaaS services; some services can be lifted to PaaS, but be wary of support from third-party vendors, who in most cases do NOT support moving to an Azure PaaS service.
As part of the assessment you can use Log Analytics to analyze the data sets that are collected. Log Analytics collects data into a workspace, which is essentially a read-only database. You can then use Kusto queries (essentially read-only SQL-like queries). This is an example where you query data from the VMConnection table in Log Analytics, which collects information from virtual machines and processes with network connections. This is quite important to map firewall rules and any external IP addresses that a machine might communicate with.
// VMConnection – collected by Azure Migrate agents
VMConnection
| where TimeGenerated > ago(9d)
| where Computer == "computername" // replace with your own computer name
// Ignore RDP protocol - mostly admin traffic
| where DestinationPort <> 3389
// Ignore existing monitoring tools
| where ProcessName <> "HealthService"
| where ProcessName <> "k06agent"
| where ProcessName <> "kntcma"
| where RemoteIp <> "127.0.0.1"
// Ignore link-layer multicast
| where DestinationIp <> "224.0.0.252"
// Ignore Symantec update
| where DestinationPort <> 8014
// Ignore NetBIOS
| where DestinationPort <> 138
| where Direction == "inbound"
| distinct ProcessName, RemoteIp, DestinationPort, Protocol
You can also use the Log Analytics module Service Map, which visualizes the data that is collected and shows the connections between virtual machines.
Of course, before you do an overall migration it is important to plan the required sizing of the supporting services that you might need in Azure. So here are some examples.
- FSLogix Azure Files IOPS planning using this –> https://github.com/RMITBLOG/FSLogix
- Azure P2S VPN (between 128 and 10,000 active connections depending on SKU)
- Azure VPN Gateway (650 Mbps – 2.5 Gbps throughput shared between S2S and P2S)
- Remember UDP-based VPN if possible (IKEv2, not SSTP)
- NAT & Azure Firewall with multiple public IP addresses (see the sketch after this list)
- Be wary of port exhaustion
- Outlook alone can use up to 8 outbound ports
- Rule of thumb: ~6,000 users behind a single NAT public IP, since one IP gives roughly 64,000 SNAT ports to share (applies only to the VDI platform)
- Exclude these public IP addresses from Conditional Access
- Azure Standard Load Balancer, where it applies, will also change the outbound flow of virtual machines in Azure.
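Here is the NAT gateway sketch referenced in the list above: a NAT gateway with two public IPs associated to the session host subnet, which multiplies the available SNAT ports. Everything here is illustrative and assumes azurerm_resource_group.wvd and azurerm_subnet.wvd exist elsewhere.

```hcl
# Sketch: NAT gateway with multiple public IPs to widen the SNAT port pool
# and reduce the risk of port exhaustion behind a single outbound IP.
resource "azurerm_public_ip" "outbound" {
  count               = 2
  name                = "pip-natgw-${count.index}"
  location            = azurerm_resource_group.wvd.location
  resource_group_name = azurerm_resource_group.wvd.name
  allocation_method   = "Static"
  sku                 = "Standard"
}

resource "azurerm_nat_gateway" "wvd" {
  name                = "natgw-wvd"
  location            = azurerm_resource_group.wvd.location
  resource_group_name = azurerm_resource_group.wvd.name
  sku_name            = "Standard"
}

resource "azurerm_nat_gateway_public_ip_association" "outbound" {
  count                = 2
  nat_gateway_id       = azurerm_nat_gateway.wvd.id
  public_ip_address_id = azurerm_public_ip.outbound[count.index].id
}

resource "azurerm_subnet_nat_gateway_association" "hosts" {
  subnet_id      = azurerm_subnet.wvd.id
  nat_gateway_id = azurerm_nat_gateway.wvd.id
}
```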
Foundation
Before you build or migrate any services in public cloud (regardless of which one), you need to have a secure foundation in place. Just to give you an indication of what this might look like, here are equivalents from the other cloud providers as well.
- Amazon:
- AWS Landing Zone: https://amzn.to/33g9yOQ
- Google Cloud:
- Google Cloud Foundation Design: https://bit.ly/39gqGHZ
As with any foundation or reference architecture framework, there are some essential things that need to be included:
- Organisational Structure
- Network & Shared Services
- Connectivity
- Security and Governance
- Monitoring
- Identity and Role based access
From a Microsoft perspective, there are also predefined architectures focused on compliance demands such as NIST or PCI-DSS available. The most important aspect is that you use the reference architecture as a starting point and adjust according to size and requirements; it is not a one-size-fits-all model. Here are some examples to get you started, depending on what kind of IaC ecosystem/tooling you want to use.
- Microsoft:
- Azure Well Architected Framework: https://bit.ly/368MOlP
- Terraform based foundation –> https://github.com/azure/caf-terraform-landingzones
- ARM based foundation –> https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/implementation
- Azure Security Benchmark v2 –> https://docs.microsoft.com/en-us/azure/security/benchmarks/overview
Also, when you are building a foundation, and before you deploy WVD, make sure that you have configured firewall rules or policies for the different WVD services.
- WVD Safe URL List –> https://bit.ly/2UXunKv
- Azure Firewall Rules –> https://bit.ly/3pYSlTR
If you are using Azure Firewall as part of your design, you can use this Kusto query as an example to look for denied traffic, to ensure that you have the proper rules in place before building WVD.
AzureDiagnostics
| where Category == "AzureFirewallApplicationRule"
| search "Deny"
Migrate / Rebuild / Extend
Then we come to the phase where we are about to build the services or migrate them, depending on the workload. For WVD, in most scenarios you will most likely build the new WVD setup using multi-user Windows 10, and secondly use a new way to build golden images. There are of course ways to do this differently; my main approach is using Terraform and Packer.
You can of course use Azure Resource Manager (ARM) / Pulumi / Bicep as well, but I prefer Terraform. You can also use Azure Image Builder to build the golden image (but that is based on Packer anyway).
With these two tools we can build both the foundation and the image itself. They can also be run from automated Azure DevOps pipelines to continuously automate the infrastructure.
So when you are using Terraform to build the WVD components, you essentially have four resources that you can use:
- azurerm_virtual_desktop_workspace
  - Needs to be in the US because of metadata (coming to EMEA Q1 2021)
- azurerm_virtual_desktop_host_pool
  - Requires a registration_info block to get the token – define the token as an output
  - type = Personal or Pooled
  - validate_environment = false
- azurerm_virtual_desktop_application_group
  - Requires a type (RemoteApp or Desktop)
- azurerm_virtual_desktop_workspace_application_group_association
A minimal example follows the documentation link below.
You have the documentation for the Terraform based configuration here –> https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/virtual_desktop_host_pool
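Here is that minimal example, a sketch against the azurerm provider as it looked in late 2020 (resource names, region and token expiry are my own illustrative assumptions; in newer provider versions the registration token has moved to a separate resource, so check the documentation above):

```hcl
# Sketch: the four WVD resources wired together.
# Assumes azurerm_resource_group.wvd is defined elsewhere.
resource "azurerm_virtual_desktop_workspace" "wvd" {
  name                = "ws-wvd"
  location            = "eastus" # metadata region, see the note above
  resource_group_name = azurerm_resource_group.wvd.name
}

resource "azurerm_virtual_desktop_host_pool" "pooled" {
  name                 = "hp-pooled"
  location             = "eastus"
  resource_group_name  = azurerm_resource_group.wvd.name
  type                 = "Pooled"
  load_balancer_type   = "BreadthFirst"
  validate_environment = false

  registration_info {
    expiration_date = timeadd(timestamp(), "48h") # token valid for 48 hours
  }
}

# Expose the registration token so session hosts can join the pool.
output "registration_token" {
  value     = azurerm_virtual_desktop_host_pool.pooled.registration_info[0].token
  sensitive = true
}

resource "azurerm_virtual_desktop_application_group" "desktop" {
  name                = "ag-desktop"
  location            = "eastus"
  resource_group_name = azurerm_resource_group.wvd.name
  type                = "Desktop"
  host_pool_id        = azurerm_virtual_desktop_host_pool.pooled.id
}

resource "azurerm_virtual_desktop_workspace_application_group_association" "assoc" {
  workspace_id         = azurerm_virtual_desktop_workspace.wvd.id
  application_group_id = azurerm_virtual_desktop_application_group.desktop.id
}
```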
For Packer you have to create a Packer configuration file in JSON, as described in my previous blog post here –> https://msandbu.org/automating-windows-virtual-desktop-image-build-with-hashicorp-packer/
When defining the OS image you can use this table as a reference for the different OS versions:
| Publisher Name | Offer | SKU | Description |
| --- | --- | --- | --- |
| MicrosoftWindowsDesktop | windows-10 | 20h1-evd | Win10 Ent MS 2004 |
| MicrosoftWindowsDesktop | windows-10 | 20h1-ent | Win10 Ent 2004 – Gen1 |
| MicrosoftWindowsDesktop | windows-10 | 19h2-evd | Win10 Ent MS 1909 |
| MicrosoftWindowsDesktop | windows-10 | 19h2-ent | Win10 Ent 1909 – Gen1 |
| MicrosoftWindowsDesktop | windows-10 | 19h1-evd | Win10 Ent MS 1903 |
| MicrosoftWindowsDesktop | office-365 | 20h1-evd-o365pp | Win10 Ent MS 2004 with O365 |
| MicrosoftWindowsDesktop | office-365 | 19h2-evd-o365pp | Win10 Ent MS 1909 with O365 |
| MicrosoftWindowsDesktop | office-365 | 1903-evd-o365pp | Win10 Ent MS 1903 with O365 |
| MicrosoftWindowsServer | WindowsServer | 2019-datacenter | Win Server 2019 Datacenter |
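If you prefer to resolve the newest image version in Terraform rather than hard-coding one, here is a small sketch using the azurerm_platform_image data source (the region and SKU are just examples taken from the table):

```hcl
# Look up the latest Win10 Ent multi-session 2004 image in West Europe.
data "azurerm_platform_image" "win10_evd" {
  location  = "westeurope"
  publisher = "MicrosoftWindowsDesktop"
  offer     = "windows-10"
  sku       = "20h1-evd"
}

output "latest_win10_evd_version" {
  value = data.azurerm_platform_image.win10_evd.version
}
```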
When it comes to the actual migration of the workloads, ensure that you define a playbook which contains the different steps for how to migrate servers and what to do. One thing that should always be in place is failover testing: essentially making sure that servers are replicated and started in a test virtual network in Azure which is not connected to the production environment.
This is of course to:
- Verify the OS boots properly
- Verify disks and app functionality (if possible)
- Determine steps for agent installation for Azure support
But the most important aspect when doing a migration is to understand and define the move groups. Which applications and services should be moved together? This is to ensure that applications will work consistently and that the latency between the application and other data sources is as low as possible.
Operate and Govern
Great! So now that you have built and migrated the workloads to public cloud, now what? There is a lot of data that will be collected by the different services in Azure, just to highlight a few of them. If you are using services such as VPN, Azure Firewall, Network Security Groups and such, where do you analyze and inspect the data?
The best approach is to configure diagnostics to collect this into a centralized Log Analytics workspace.
(An overview of the data that is collected and the category names.)
There are of course a set of features that should be in place to ensure monitoring of the environment and the health of the services.
Configure and set up Log Analytics/Azure Monitor as a centralized logging service, configured to collect logs from all services within Azure, including WVD; a sketch follows below.
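A hedged Terraform sketch of what that wiring can look like for the WVD host pool (workspace name, retention and log categories are my own assumptions; check the available categories for each resource type):

```hcl
# Central Log Analytics workspace plus a diagnostic setting that ships
# WVD host pool logs into it. Assumes the host pool from the earlier sketch.
resource "azurerm_log_analytics_workspace" "central" {
  name                = "law-central"
  location            = azurerm_resource_group.wvd.location
  resource_group_name = azurerm_resource_group.wvd.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
}

resource "azurerm_monitor_diagnostic_setting" "hostpool" {
  name                       = "diag-hostpool"
  target_resource_id         = azurerm_virtual_desktop_host_pool.pooled.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.central.id

  log {
    category = "Connection" # session connect/disconnect events
    enabled  = true

    retention_policy {
      enabled = false
    }
  }

  log {
    category = "Error"
    enabled  = true

    retention_policy {
      enabled = false
    }
  }
}
```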
In addition to this you should also enable Azure Monitor for VMs for your WVD instances, since this also collects performance metrics from your machines and provides us with the ability to use the WVD workbooks.
https://github.com/wvdcommunity/AzureMonitor
The second aspect: have monitoring tools in place to monitor the external endpoints of WVD! No, I'm not kidding. Since Microsoft is responsible for the availability of these services, it is important that you get notified as soon as possible if an outage happens.
Another part is to define proper monitoring within Azure Monitor, such as:
- Service Health (provides feedback on service outages, maintenance and health advisories)
- Action Groups (define an action to be taken if a threshold is reached on an alert, as part of a Service Health event, or if someone made changes to the environment; see the sketch below)
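As a sketch of the Action Group item above (illustrative names and a placeholder email address; check the required arguments for your provider version):

```hcl
# Action group that emails on-call, wired to an activity log alert for
# Service Health events in the current subscription.
data "azurerm_subscription" "current" {}

resource "azurerm_monitor_action_group" "ops" {
  name                = "ag-wvd-ops"
  resource_group_name = azurerm_resource_group.wvd.name
  short_name          = "wvdops"

  email_receiver {
    name          = "oncall"
    email_address = "oncall@example.com" # placeholder address
  }
}

resource "azurerm_monitor_activity_log_alert" "servicehealth" {
  name                = "alert-service-health"
  resource_group_name = azurerm_resource_group.wvd.name
  scopes              = [data.azurerm_subscription.current.id]
  description         = "Notify on Azure Service Health events"

  criteria {
    category = "ServiceHealth"
  }

  action {
    action_group_id = azurerm_monitor_action_group.ops.id
  }
}
```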
There are of course other great third-party tools available to make management a bit easier, which I'm not going to cover as part of this blog post.
The final thing I want to mention is that there is a lot of development being done on WVD and the ecosystem surrounding it. With WVD you also take a bigger bite of the cloud ecosystem, so it is important to understand the supporting services that you might use.
- Microsoft Defender for Endpoint for multi-user Windows 10 is currently in preview
- Intune support for multi-session Windows 10 is also in preview
- Pay attention to the latest updates and roadmap