Last week I held a session at the Norwegian Citrix User Group about best practices for setting up Citrix in Azure. Since the session was only in Norwegian, I wanted to share some of the tips and tricks here as well.
1: Planning and Limitations
Before starting a project to build VDI in Azure, you need to understand the building blocks involved and the limitations you can run into.
- Latency is a key factor, so find out how far you are from the closest Azure datacenter. A good tool for estimating the latency between end users and Azure datacenters is the Virtual Desktop Experience Estimator –> https://azure.microsoft.com/en-us/services/virtual-desktop/assessment/#estimation-tool
- Latency between the virtual desktops and backend services such as databases and other data sources matters as well, so keep the data as close to the desktops as possible.
- Allocate capacity: In some cases, you might have a large project where you need a large amount of compute capacity. Start early and make sure that you have enough compute quota long before the project starts –> Request an increase in vCPU quota limits per Azure VM series – Azure supportability | Microsoft Docs
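As a rough illustration of the capacity point above, the arithmetic for a quota request can be sketched like this. The user count, users per VM, VM size and current quota are all hypothetical numbers picked for the example, not recommendations.

```python
import math


def vcpus_required(users: int, users_per_vm: int, vcpus_per_vm: int) -> int:
    """Total vCPUs needed to host `users`, rounding the VM count up."""
    vms = math.ceil(users / users_per_vm)
    return vms * vcpus_per_vm


def quota_shortfall(required: int, current_quota: int, already_used: int = 0) -> int:
    """Extra vCPUs to request for the VM series (0 if the quota already covers it)."""
    return max(0, required + already_used - current_quota)


# Hypothetical project: 2000 users, 16 users per 8-vCPU session host.
needed = vcpus_required(users=2000, users_per_vm=16, vcpus_per_vm=8)
print(needed)  # 125 VMs * 8 vCPUs = 1000
print(quota_shortfall(needed, current_quota=350, already_used=50))  # 700
```

Quotas are set per VM series and per region, so the same calculation has to be repeated for each series you plan to use.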
|Why is this important?
|Azure NetApp Files
|1,000 virtual IP addresses in a Virtual Network
|If there are more than 1,000 allocated IP addresses the NetApp service will fail, unless you are using the new Standard network features
|24-hour RPO only, meaning that it can only run one backup each day
|Depending on RPO demands, but is supported for Azure Files as well
|Azure ARM API Calls
|12000 Read and 1200 write operations per hour against a specific subscription
|Using Enterprise Scaffold Reference Architecture and using multiple subscriptions
|Azure Active Directory Domain Services
|No support for Hybrid AD
|Lack of Enterprise Admin means that you cannot set up an inbound trust (only outbound)
|Soft quota for vCPU
|Make sure that you have defined enough quota for vCPU
|Only supported on Windows Server (and Linux), not Windows 10
|For services that require low latency you should use Accelerated Networking
|Not all services are available in every region
|There is a significant difference between all the different regions in terms of services availability.
|Since Azure uses VXLAN it requires a lower MTU for network traffic
|Some services have hardcoded MTU limits. There were previously issues with EDT as well, because MTU Discovery was not working
|Azure Active Directory Domain Services and Seamless SSO
|Does not work with Seamless SSO
|Because SSO is important
|IOPS difference between Standard and Premium
|Standard Files = 300 MiB/sec
Premium Files = 6,204 MiB/sec egress (also supports SMB Multichannel)
|Be aware that there are differences in terms of disk and network throughput for the different virtual machine types.
|If you have low network bandwidth it affects all logon processes.
|Azure Virtual Network
|No support for traditional layer two protocols such as GARP/RARP
|Services such as NetScaler ADC use those protocols for failover, so failover needs to be solved using another Azure service.
|Azure Files and Azure NetApp Files
|Can be integrated with AD domains, but only with a single domain
|Relevant if you require multi-domain support
|Microsoft M365 and Azure
|Microsoft 365 (with Office 365 applications) might not be hosted in the same region as your Azure deployment
|Latency can also affect the performance
Also consider outbound traffic. Services like Azure Firewall come with one public IP address by default, and a single IP can support up to ~5000 users. Outlook, for example, uses eight outbound ports on its own, so make sure that you have enough IP addresses for outbound connections. If a service sits behind a Load Balancer or NAT Gateway, it will use that as its default outbound path.
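To make the outbound sizing concrete, here is a minimal sketch of the arithmetic. The number of usable SNAT ports per public IP (64,000) and the ports per user (12) are assumptions made for the example. The real limits depend on which Azure service handles the outbound traffic, so check the documentation for that service before relying on the result.

```python
import math

# Assumption: usable SNAT ports per public IP. The actual figure depends on
# the service (Azure Firewall, NAT Gateway, Load Balancer) handling outbound.
ASSUMED_PORTS_PER_IP = 64_000


def public_ips_needed(users: int, ports_per_user: int,
                      ports_per_ip: int = ASSUMED_PORTS_PER_IP) -> int:
    """Estimate how many public IPs are needed for concurrent outbound flows."""
    total_ports = users * ports_per_user
    return max(1, math.ceil(total_ports / ports_per_ip))


# Hypothetical: 12 ports per user (Outlook's eight plus a few other apps).
print(public_ips_needed(users=5000, ports_per_user=12))   # 60,000 ports -> 1
print(public_ips_needed(users=20000, ports_per_user=12))  # 240,000 ports -> 4
```

With these assumed numbers a single IP covers roughly 5000 users, which lines up with the rule of thumb above. Heavier applications push the per-user port count up and the number of users per IP down.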
2: Secure Foundation
One of the most common problems organizations run into with Azure is that a PoC, a demo or a test service ends up becoming production. Before you start using anything in Azure, start with a secure foundation.
Microsoft has a reference architecture and a set of guidelines with common best practices, documented in Enterprise Scaffold (Azure enterprise scaffold is now the Cloud Adoption Framework for Azure – Cloud Adoption Framework | Microsoft Docs)
While the full reference architecture is overkill for certain customers, the main idea is to have:
- A virtual network with a secure hub, using Azure Firewall and/or a third-party network vendor
- Azure Policies (which are like Group Policies for Azure), which apply centralized policies to Azure environments. Some examples:
- Services should only be provisioned within the approved locations
- Services should not be configured with a public IP address
- Services should have backup and monitoring activated
- Infrastructure should have antivirus enabled
- RBAC: Services like Privileged Identity Management allow IT personnel to elevate themselves to higher access when needed (this applies to both Azure AD and Azure resources). Alternatively, you can use Access Packages with Entitlement Management, which gives users a similar workflow for requesting access and also supports Azure AD security groups.
- Monitoring: Make sure that Azure Monitor and Diagnostics are enabled for core services and that the information is collected into a centralized Log Analytics workspace. You can read more about setting up Azure Monitor here –> Deep dive Azure Monitor and Log Analytics | Marius Sandbu (msandbu.org). All resources in Azure can be monitored using Azure Monitor and Workbooks (which provide dashboards), but you should also configure Action Groups to forward alerts. The most common setup is integration with ITSM tools, which is also supported.
This also requires that you configure logging for each of the data sources you need. If you have a Citrix environment in Azure using Cloud Connectors and virtual NetScaler instances, you should also configure a syslog collector on a separate VM. All of these events can be collected into a single Log Analytics workspace.
NOTE: If you want to collect custom Windows Event logs, you need to configure Data Collection Rules and apply them to that group of machines.
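Going back to the Azure Policy examples in the list above, the "approved locations" rule can be sketched as a policy definition. The snippet below builds a simplified definition as a Python dictionary and prints it as JSON. The parameter name `allowedLocations`, the display name and the two regions are chosen for illustration only. Microsoft's built-in "Allowed locations" policy is more complete, so treat this as a sketch of the idea and not as the definition to deploy.

```python
import json


def allowed_locations_policy(locations):
    """Build a simplified policy definition that denies other locations."""
    return {
        "properties": {
            "displayName": "Allowed locations (sketch)",  # illustrative name
            "mode": "Indexed",
            "parameters": {
                # Hypothetical parameter name chosen for this example.
                "allowedLocations": {
                    "type": "Array",
                    "defaultValue": list(locations),
                }
            },
            "policyRule": {
                # Deny any resource whose location is not in the allowed list.
                "if": {
                    "not": {
                        "field": "location",
                        "in": "[parameters('allowedLocations')]",
                    }
                },
                "then": {"effect": "deny"},
            },
        }
    }


policy = allowed_locations_policy(["norwayeast", "westeurope"])
print(json.dumps(policy, indent=2))
```

The same if/then structure applies to the other examples: swap the condition for one that matches public IP addresses or missing diagnostics, and use an effect such as audit instead of deny if you only want to report on violations.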
Log Sources that are important
|Microsoft 365 Security
|Microsoft 365 Security
|Citrix VDA Logger
|Windows Security Events
|Azure Monitoring Agent
|Azure Monitoring Agent (Syslog forwarder): a VM with a Log Analytics agent with forwarding enabled.
|Citrix Delivery Services
|Azure Monitoring Agent
|Citrix Cloud Connector
|Citrix Cloud Logger
|Azure Monitoring Agent
|Azure Resource Manager
|Azure MFA Usage
3: Virtual Infrastructure
When you are building virtual infrastructure in Azure, there are different t-shirt sizes of VMs available. Microsoft has general recommendations with example Azure instance types, many of which are based on the latest AMD EPYC CPUs.
It should be noted that Microsoft is now releasing the next generation, v5, which runs on newer Intel and AMD CPUs. These pack more power at a lower cost with the same CPU:memory ratio, and should be the general recommendation.
There are also machine types with GPU support:
- NV6 or NVv3 – Running Nvidia M60 GPU (GRID)
- Windows 10, 2012, 2016 & 2019 & Linux (Ubuntu, Red Hat)
- NVv4 – AMD Radeon MI25 (Using MxGPU)
- Windows 10, 2016 & 2019
- Note that there have been some issues with the NVv4 and the AMD driver where VDA registration happens before the driver is working. I’ve needed to configure a setting on the Broker called SettlementPeriodBeforeUse so that the VDA waits before it is used, which gives the drivers time to start working.
Another thing to consider when setting up the virtual infrastructure is availability versus proximity, since those are mutually exclusive. For availability, you can make virtual machines redundant either within a single datacenter using Availability Sets, or across datacenters using Availability Zones.
Availability Zones = redundancy across multiple data centers within a specific region
Availability Sets = redundancy across multiple racks within a specific datacenter.
While Zones provide a higher level of redundancy, latency can be a crucial factor for certain applications.
If you want the lowest possible latency between the virtual desktops and the backend, you need to consider two features: Proximity Groups and Accelerated Networking. Proximity Groups ensure that workloads, such as desktops and backend databases, are placed as close as possible to each other. Accelerated Networking is based on SR-IOV, which bypasses the virtual switch to accelerate the packet flow. This feature is supported for Windows Server (and should be used by default) and is also supported for Citrix ADC version 13.0 build 76.29.