
Azure Stack – in-depth architecture

This was a session that I was going to present at NIC 2018, but because of a conflict I was unable to attend. I decided to write a blog post on the subject instead, since I still see a lot of traffic against my earlier article (http://msandbu.org/what-is-azure-stack-and-want-is-the-architecture/), where I covered the underlying architecture of Azure Stack and especially how storage and networking work together. In this post I want to go a bit more in-depth on some of those subjects, but also on the limitations of Azure Stack and the things you need to be aware of. I also wrote a piece on Brian Madden about Azure Stack being an appliance and what that means for the type of hardware it uses (http://www.brianmadden.com/opinion/Why-do-Azure-Stack-appliances-have-to-be-certified), and Microsoft has now published a roadmap for Azure Stack as well (https://azure.microsoft.com/en-us/roadmap/?category=compute). This is part one of the in-depth architecture!

The Core Architecture:
At the core of Azure Stack we have a software-defined architecture, which uses Storage Spaces Direct (S2D) for the underlying storage and VXLAN for cross-host communication. Since S2D requires RDMA-capable hardware as part of the design, the current limit is 12 physical nodes. The hosts run Hyper-V on Server Core, and in an integrated system we also have an HLH (Hardware Lifecycle Host), which is used to run the OEM vendor-provided management tools for the hardware. On top of that there are multiple infrastructure virtual machines running on Azure Stack which make up the rest of the ecosystem.
[Diagram: Azure Stack core architecture]
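To get a feel for what this fabric looks like on a running system, you can query the admin ARM endpoint with the operator PowerShell modules. This is only a minimal sketch, assuming the Azure Stack operator modules (the AzureRM profile plus Azs.Fabric.Admin) are installed and that an "AzureStackAdmin" environment has already been registered with Add-AzureRmEnvironment; the region name ("local" on an ASDK) will differ per deployment.

# Log in against the admin ARM endpoint (environment registered beforehand)
Login-AzureRmAccount -EnvironmentName "AzureStackAdmin"

# List the physical scale unit nodes (the bare-metal Hyper-V/S2D hosts)
Get-AzsScaleUnitNode -Location "local"

# List the infrastructure role instances (the fabric VMs, such as the network controller and SLB MUXes)
Get-AzsInfrastructureRoleInstance -Location "local"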

How does Storage Work:

The bare-metal servers run Windows Server 2016 with Hyper-V as the underlying virtualization platform. The same servers also run a feature called Storage Spaces Direct (S2D), which is Microsoft’s software-defined storage feature. S2D allows the servers to pool their internal storage and provide a highly available virtual storage solution as the base storage for the virtualization layer.

S2D is then used to create virtual volumes with a defined resiliency type (parity, two-way mirror, or three-way mirror), which host the CSV (Cluster Shared Volume) shares, and a Windows Failover Cluster role is used to maintain quorum among the nodes.

S2D can use a combination of regular HDDs and SSDs (it can also be all-flash, which Cisco announced earlier today) to provide capacity and caching tiers that are automatically balanced, so hot data is placed on the fast tier and cold data on the capacity tier. When a virtual machine is created and its storage is placed on the CSV share, the virtual hard drive of the VM is split into interleaves (256 KB blocks by default), which are then scattered across the different disks and servers depending on the resiliency level. In Azure Stack the default is a three-way mirror, which provides the redundancy in the stack. On top of this there is a Service Fabric cluster which provides the tenant and admin APIs through Azure Resource Manager, and an underlying storage controller called ACS (Azure Consistent Storage). For each VM that is configured with a standard HDD, ACS inserts an IOPS limit of 500 on the hypervisor to provide consistency with Azure.

[Diagram: Azure Stack storage architecture]
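You never create these volumes yourself on Azure Stack (the fabric is locked down and deploys them for you), but to illustrate what the storage layer does underneath, this is roughly how a three-way mirrored CSV volume is created on a plain Storage Spaces Direct cluster in Windows Server 2016. A minimal sketch; the pool and volume names are placeholders.

# Illustration only - run on a regular Windows Server 2016 S2D cluster, not on Azure Stack
# Enable Storage Spaces Direct, which claims the local disks and builds the storage pool
Enable-ClusterStorageSpacesDirect

# Create a three-way mirrored volume (PhysicalDiskRedundancy 2 = tolerates two failures), formatted as CSVFS/ReFS
New-Volume -StoragePoolFriendlyName "S2D*" -FriendlyName "Volume01" -FileSystem CSVFS_ReFS -ResiliencySettingName Mirror -PhysicalDiskRedundancy 2 -Size 2TB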
The Network Fabric:

The network consists of multiple modules, such as the software load balancer (SLB MUX), which works together with a host agent service running on each Hyper-V switch and is managed centrally by the network controller, which acts as the central management point for the network. The load balancer works at layer four and is used to map a public IP and port against a backend pool on a specific port. The software load balancer uses DSR (Direct Server Return), which means that it only load balances incoming traffic; the return traffic from the backend servers goes directly from the server back to the requesting IP address via the Hyper-V switch. This feature is presented in Azure Stack as the regular load balancer.

Someone has to ensure that the software load balancing rules are in place, that the distributed firewall policies are synced and maintained, and, since we have VXLAN in place, that each host has an IP table so every node knows how to reach the different virtual machines on other hosts. This requires a centralized component that takes care of all of it: the network controller.

On Azure Stack the network controller runs as a highly available set of three virtual machines which operate as a single cluster across different nodes. The network controller has two API interfaces. The first is the northbound API, which accepts requests using REST, so if we for instance change a firewall rule or create a software load balancer rule in the Azure Stack UI, the northbound API receives that request. The network controller can also be integrated with System Center, but that is not part of Azure Stack.

The southbound API then propagates the changes to the different virtual switches on the different hosts. The network controller is intended to be a centralized management component for both the physical and the virtual network, since it uses the Open vSwitch standard, but the schema it uses still lacks some key features needed to manage the physical network.
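On Azure Stack you never call the northbound API directly (the resource providers do that for you), but to illustrate what that interface looks like, the Windows Server 2016 SDN network controller exposes its resources as plain REST objects. A rough sketch, where the endpoint name is just a placeholder and the credentials need access to the controller:

# Illustration of the network controller northbound (REST) API in Windows Server 2016 SDN
# "ncrest.contoso.local" is a placeholder for the network controller REST endpoint
$uri = "https://ncrest.contoso.local/networking/v1/loadBalancers"

# List the software load balancer resources the controller currently knows about
Invoke-RestMethod -Uri $uri -Method Get -UseDefaultCredentials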

The network controller is also responsible for managing the VPN connections, advertising the BGP routes, and maintaining session state across the hosts.

[Diagram: Azure Stack network architecture]

From a network perspective, once you have a site-to-site gateway established, there are essentially two virtual machines powering the site-to-site VPN solution for all tenants. Therefore you will not have a dedicated public IP for each gateway.

Troubleshooting and management:

When troubleshooting issues, first check whether anything is documented for the build you are running – there are a lot of documented issues and bugs (https://docs.microsoft.com/en-us/azure/azure-stack/azure-stack-update-1712 for instance). If you run into issues such as alerts in the admin portal, you will need to collect logging information from the PEP (Privileged Endpoint) to get assistance from Microsoft. Note that on an integrated system there are always three instances of the PEP running, and Microsoft recommends that you connect to the PEP from a secure VM running on the HLH. Here is an example of collecting logs through the PEP on an integrated system.
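To collect the logs you first need a remote PowerShell session against one of the PEP instances. A minimal sketch; the IP address and the CloudAdmin account are placeholders that will differ per deployment.

# Connect to one of the three PEP instances (the ERCS VMs) using the CloudAdmin credentials
# Replace the IP address with one of the PEP instances in your deployment
$cred = Get-Credential "AzureStackDomain\CloudAdmin"
Enter-PSSession -ComputerName "192.168.x.x" -ConfigurationName PrivilegedEndpoint -Credential $cred

Once inside the session you can run the log collection, for example: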

Get-AzureStackLog -OutputPath C:\AzureStackLogs -FilterByRole VirtualMachines,BareMetal -FromDate (Get-Date).AddHours(-8) -ToDate (Get-Date).AddHours(-2)

You will need to define which roles you want to collect logs for.
You can read more about the logging feature in this GitHub readme: https://github.com/Azure/AzureStack-Tools/blob/master/Support/ERCS_Logs/ReadMe.md
If an update fails, however, you are pretty much in the dark: you will need to extract these logs for the different levels and roles and send them across to Microsoft for troubleshooting. We have already needed their assistance a few times in order to troubleshoot a failed upgrade.

Security:
Microsoft has of course focused a lot on security in Azure Stack, which is one of its core advantages. Below are some of the settings that are configured on the stack.
* Data at rest encryption – All storage is encrypted on disk using BitLocker, unlike in Azure where you need to enable this at the tenant level. Azure Stack still provides the same level of data redundancy using a three-way copy of the data.
* Strong authentication between infrastructure components
* Security OS baselines – Security Compliance Manager is used to apply predefined security templates to the underlying operating system
* Disabled use of legacy protocols – Old protocols such as SMB 1 are disabled in the underlying operating system, and protocols such as NTLMv1, MS-CHAPv2, Digest, and CredSSP cannot be used
* Constrained administration – for instance, the PEP endpoint uses PowerShell JEA (Just Enough Administration)
* Least privileged accounts – The platform itself has a set of service accounts for the different services which run with least privilege
* Administration of the platform can only happen via the admin portal or the admin API
* Locked down infrastructure, which means that we have no direct access to the hypervisor level
* Windows Credential Guard – Credential Guard uses virtualization-based security to isolate secrets so that only privileged system software can access them
* Server Core is used to reduce the attack surface and restrict the use of certain features
* Windows Defender runs on each host
* Network ACLs defined in the TOR switches, the SDN layer, and on host and guest, deployed using Ansible
* Group Managed Service Accounts
* Secrets are rotated every 24 hours

Limitations:
As with any new platform, there are a lot of limitations you need to be aware of (especially if you have read up on the consistency promise between Azure and Azure Stack). If a product supports Azure using some form of PaaS service, that does not necessarily mean it supports Azure Stack; the application vendor needs to ensure that their product is compatible with the Azure Stack feature level. Here is a list of other limitations that I have encountered.

  • Limited set of instance types (A, D, and Dv2 series)
  • Single fault and update domain (UPDATE: changed in the 1802 build)
  • Only LRS storage
  • No support for IPv6
  • No support for Managed Disks
  • Limited support for Premium disks (performance cannot be guaranteed)
  • No support for Application Gateway or Traffic Manager
  • No support for VNET peering
  • No support for Azure SQL (only SQL Server, which is served through a SQL connector)
  • Only support for the Basic VPN SKU (and only a single HA pair of gateway nodes, which provides VPN for all tenants)
  • No network QoS on the NIC (can allow for noisy neighbors)
  • Only some marketplace items are available (for instance Windows 10 is missing, along with other fun parts of the marketplace)
  • No customer-specific gateway (same public IP for all gateway connections)
  • A lot of Azure services, such as Data Factory, cannot use Azure Stack storage (hardcoded URLs in the different services)
  • No support for the Azure integration features that are part of SQL Server (such as Stretch Database or SQL backup) against Azure Stack
  • No support for Citrix on Azure Stack (meaning no Citrix NetScaler or Provisioning options available)
  • No support for Azure Files
  • Max blob size 195 GB (UPDATE: changed in the 1802 build)
  • Max disk size 1 TB
  • No support for point-to-site VPN
  • No support for Docker machine drivers
  • Troubleshooting is mainly dumping logs to the Microsoft support team
  • Some UI bugs, such as when defining DNS settings on a virtual network

Since the release there has been one update each month, which shows the dedication to the platform and the ecosystem, but Microsoft needs to make it easier to run edge processing and to have more Azure features that support Azure Stack integration. One thing I want to highlight is that Azure Stack really excels at networking 😉 but that is no wonder given the networking backend it provides.

[Image: RDMA networking in Azure Stack]

Earlier today Cisco also announced an all-flash version of Azure Stack, so now Microsoft really needs to fix the scalability issues.

Azure Standard load balancer

A while back Microsoft announced a new load balancing tier in Microsoft Azure called Azure Load Balancer Standard, which is a new SKU of the existing load balancing service in Azure and is still in preview as I’m writing this blog post.
Azure provides different load balancing solutions: Application Gateway (which provides layer 7 and SSL-based load balancing), Traffic Manager (which provides geo-redundancy using DNS-based load balancing), and the Load Balancer service, which is aimed at layer 4 load balancing.

There are many differences between the Standard SKU and the old Basic SKU:

|  | Standard SKU | Basic SKU |
|---|---|---|
| Backend pool endpoints | Any virtual machine in a single virtual network, including a blend of virtual machines, availability sets, and virtual machine scale sets | Virtual machines in a single availability set or virtual machine scale set |
| Availability Zones | Zone-redundant and zonal frontends for inbound and outbound, outbound flow mappings survive zone failure, cross-zone load balancing | / |
| Diagnostics | Azure Monitor, multi-dimensional metrics including byte and packet counters, health probe status, connection attempts (TCP SYN), outbound connection health (SNAT successful and failed flows), active data plane measurements | Azure Log Analytics for public Load Balancer only, SNAT exhaustion alert, backend pool health count |
| HA Ports | Internal Load Balancer | / |
| Secure by default | Closed by default for public IP and Load Balancer endpoints; a network security group must be used to explicitly whitelist traffic for it to flow | Open by default, network security group optional |
| Outbound connections | Multiple frontends with per-rule opt-out. An outbound scenario must be explicitly created for the virtual machine to be able to use outbound connectivity. VNet Service Endpoints can be reached without outbound connectivity and do not count towards data processed. Any public IP addresses, including Azure PaaS services not available as VNet Service Endpoints, must be reached via outbound connectivity and count towards data processed. When only an internal Load Balancer is serving a virtual machine, outbound connections via default SNAT are not available. Outbound SNAT programming is transport protocol specific, based on the protocol of the inbound load balancing rule. | Single frontend, selected at random when multiple frontends are present. When only an internal Load Balancer is serving a virtual machine, default SNAT is used. |
| Multiple frontends | Inbound and outbound | Inbound only |
| Management operations | Most operations < 30 seconds | 60-90+ seconds typical |
| SLA | 99.99% for data path with two healthy virtual machines | Implicit in VM SLA |
| Pricing | Charged based on number of rules and data processed inbound or outbound associated with the resource | No charge |

Like AWS, Microsoft now charges based on load balancing rules and data processed (only for the Standard SKU; the Basic one is still free).

Load Balancing rules:
First 5 rules: $0.025/hour
Additional rules: $0.01/rule/hour

Data processed through the load balancer:
$0.005 per GB
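As a rough worked example using the preview prices above (which may of course change): a Standard load balancer with 8 rules running for a 730-hour month and processing 500 GB would cost about 730 x $0.025 for the first 5 rules ($18.25), plus 3 x 730 x $0.01 for the three additional rules ($21.90), plus 500 x $0.005 for the data processed ($2.50), for a total of roughly $42.65 for the month.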

The biggest changes in this new tier are that 1) it supports Availability Zones (which went GA today), 2) it has much better diagnostics options, and 3) it provides something called HA Ports, which I’ll come back to a little later in this post. To configure an Azure Load Balancer Standard you might need to use the CLI or PowerShell; the example below uses the Azure CLI.

Create resource group
az group create --name myResourceGroupSLB --location changelocation

Create public IP with Standard SKU
az network public-ip create --resource-group myResourceGroupSLB --name myPublicIP --sku Standard

Create Standard load balancer
az network lb create --resource-group myResourceGroupSLB --name myLoadBalancer --public-ip-address myPublicIP --frontend-ip-name myFrontEnd --backend-pool-name myBackEndPool --sku Standard

Now, looking at some of the new functionality such as HA Ports: this feature helps you with high availability and scale for network virtual appliances (NVAs) inside virtual networks. It can also help when a large number of ports must be load balanced, since we can load balance entire port ranges by setting the port value to 0 and the protocol to All. The internal Load Balancer resource then balances all TCP and UDP flows, regardless of port number. NOTE: if you want to use HA Ports to provide HA for an NVA in Azure, please make sure that the vendor has verified that the appliance works with HA Ports.
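As a rough sketch of what such a rule looks like, here is a hypothetical PowerShell example, assuming the AzureRM.Network module and an existing internal Standard load balancer that already has a frontend IP configuration and a backend pool (all names are placeholders):

# Hypothetical example: add an HA Ports rule to an existing internal Standard load balancer
# Protocol "All" with frontend/backend port 0 makes the rule cover every TCP and UDP port
$lb = Get-AzureRmLoadBalancer -Name "myInternalLB" -ResourceGroupName "myResourceGroupSLB"

$lb | Add-AzureRmLoadBalancerRuleConfig -Name "HAPortsRule" -Protocol All -FrontendPort 0 -BackendPort 0 -FrontendIpConfiguration $lb.FrontendIpConfigurations[0] -BackendAddressPool $lb.BackendAddressPools[0]

# Push the updated configuration back to Azure
$lb | Set-AzureRmLoadBalancer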

One of the most welcome features is that we can now monitor VIP and DIP availability directly from the console, which was a really cumbersome process with the Basic load balancer tier.

[Screenshot: VIP availability metrics for the load balancer]

You can now also see DIP availability for a single IP address.

[Screenshot: DIP availability for a single IP address]

So Azure Load Balancer Standard makes it easier to load balance NVAs, for instance where we have a hub-and-spoke layout on the virtual networks, and the ability to monitor DIP availability to make it easier to troubleshoot the health probes is something I’ve been missing for some time!