Author Archives: Marius Sandbu

Azure Stack – in-depth architecture

This was a session I was going to present at NIC 2018, but because of a conflict I was unable to attend. Since I still see a lot of traffic against my earlier article on the subject (http://msandbu.org/what-is-azure-stack-and-want-is-the-architecture/), where I covered the underlying architecture of Azure Stack and especially how storage and networking work together, I decided to write a blog post instead. In this post I want to go a bit more in-depth on some of those subjects, but also on the limitations of Azure Stack and the things you need to be aware of. I also wrote a piece on Brian Madden's site about Azure Stack being an appliance and what that means for the type of hardware it uses (http://www.brianmadden.com/opinion/Why-do-Azure-Stack-appliances-have-to-be-certified), and Microsoft has now published a roadmap for Azure Stack as well –> https://azure.microsoft.com/en-us/roadmap/?category=compute. This is part one of the in-depth architecture!

The Core Architecture:
At the core of Azure Stack we have the software-defined architecture, which uses Storage Spaces Direct (S2D) for the underlying storage and VXLAN for cross-host communication. Since S2D requires RDMA as part of the hardware design, the current limit is 12 physical servers. The hosts run Hyper-V on Server Core, and on an integrated system we also have an HLH (Hardware Lifecycle Host) which is used to run the OEM vendor-provided management tools for the hardware. In addition, there are multiple virtual machines running on Azure Stack which make up the rest of the ecosystem.
core

How does Storage Work:

The bare-metal servers run Windows Server 2016 with Hyper-V as the underlying virtualization platform. The same servers also run a feature called Storage Spaces Direct (S2D), which is Microsoft's software-defined storage feature. S2D allows the servers to share their internal storage between themselves to provide a highly available virtual storage solution as the base storage for the virtualization layer.

S2D is then used to create virtual volumes with a defined resiliency type (two-way mirror, three-way mirror or parity) which host the CSV shares, and it uses a Windows Failover Cluster role to maintain quorum among the nodes.

S2D can use a combination of regular HDDs and SSDs (it can also be all-flash, which Cisco announced earlier today) to create capacity and caching tiers which are automatically balanced, so hot data is placed on the fast tier and cold data on the capacity tier. When a virtual machine is created and its storage is placed on the CSV share, the virtual hard drive of the VM is chopped into interleaves of blocks, 256 KB by default, which are then scattered across the different disks on the different servers depending on the resiliency level. In Azure Stack a three-way mirror is used by default to provide redundancy in the stack. On top of this we have a Service Fabric cluster which provides the tenant and admin APIs through Azure Resource Manager, and an underlying storage controller called ACS. For each VM that is configured with a standard HDD, ACS inserts an IOPS limit of 500 IOPS on the hypervisor to provide consistency with Azure.
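As a rough illustration of what the default resiliency means, this is how a three-way mirrored CSV volume would be created with plain S2D PowerShell. On Azure Stack the volumes are created for you by the deployment automation, so this is just a sketch, and the pool name, volume name and size are placeholders:

# Assumes an existing S2D cluster and storage pool; names and size are placeholders
New-Volume -StoragePoolFriendlyName "S2D on AzureStackCluster" -FriendlyName "ObjStore01" -FileSystem CSVFS_ReFS -ResiliencySettingName Mirror -PhysicalDiskRedundancy 2 -Size 2TB

# Verify the resiliency of the resulting virtual disk (three data copies = three-way mirror)
Get-VirtualDisk -FriendlyName "ObjStore01" | Select-Object FriendlyName, ResiliencySettingName, NumberOfDataCopies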

storage
The Network Fabric:

The network consists of multiple modules, such as the software load balancer (MUX), which works together with a host agent service on each Hyper-V host and is managed centrally by the network controller, which acts as the central management plane for the network. The load balancer works on layer 4 and is used to map a public IP and port against a backend pool on a specific port. The software load balancer uses DSR (direct server return), which means that it only load balances incoming traffic; the return traffic from the backend servers goes directly from the server back to the requesting IP address via the Hyper-V switch. This feature is presented in Azure Stack as the regular load balancer.

The software load balancing rules need to be in place, the distributed firewall policies need to be synced and maintained, and since we have VXLAN in place all the hosts need an IP table so each node knows how to reach the different virtual machines on the other hosts. This requires a centralized component to take care of it, and that component is the network controller.

On Azure Stack the network controller runs as a highly available set of three virtual machines which operate as a single cluster across different nodes. The network controller has two API interfaces. The first is the northbound API, which accepts requests using REST; for instance, if we change a firewall rule or create a software load balancer in the Azure Stack UI, the northbound API receives that request. The network controller can also be integrated with System Center, but that is not part of Azure Stack.

The southbound API then propagates the changes to the different virtual switches on the different hosts. The network controller is intended to be a centralized management component for both the physical and the virtual network since it builds on the Open vSwitch database schema, but the schema is still lacking some key features needed to manage the physical network.
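On Azure Stack you never talk to the network controller directly since the infrastructure is locked down, but on a plain Windows Server 2016 SDN deployment the same northbound API can be queried through the NetworkController PowerShell module. A small sketch with a placeholder connection URI, just to show the kind of resources the controller holds:

# Placeholder northbound REST endpoint for the network controller cluster
$ncUri = "https://nc.contoso.local"

# List the virtual networks and software load balancers the controller knows about
Get-NetworkControllerVirtualNetwork -ConnectionUri $ncUri | Select-Object ResourceId
Get-NetworkControllerLoadBalancer -ConnectionUri $ncUri | Select-Object ResourceId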

The network controller is also responsible for managing the VPN connections, advertising the BGP routes and maintaining session state across the hosts.

network

From a network perspective, once you have a site-to-site gateway established you essentially have two virtual machines powering the site-to-site VPN solution for all tenants. Therefore you will not have a dedicated public IP for each gateway.

Troubleshooting and management:

When troubleshooting issues, make sure to check if anything is documented for your version build; there are a lot of documented issues and bugs (https://docs.microsoft.com/en-us/azure/azure-stack/azure-stack-update-1712 for instance). If you run into issues such as alerts in the admin portal, you will need to collect logging information from the PEP (Privileged EndPoint) to get assistance from Microsoft. Here is an example you can run using the PEP to collect logs on an integrated system (note that on an integrated system there are always three instances of the PEP running), and Microsoft recommends that you connect to the PEP from a secure VM running on the HLH.
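Connecting to the PEP is a constrained PowerShell remoting session against one of the ERCS instances; a minimal sketch where the IP address is a placeholder for one of the three PEP virtual machines and the credential is the cloud admin account:

$cred = Get-Credential -Message "Azure Stack cloud admin credentials"
# The IP address below is a placeholder for one of the three ERCS/PEP virtual machines
Enter-PSSession -ComputerName "192.168.200.224" -ConfigurationName PrivilegedEndpoint -Credential $cred

Once inside the session, the log collection itself looks like this: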

Get-AzureStackLog -OutputPath C:\AzureStackLogs -FilterByRole VirtualMachines,BareMetal -FromDate (Get-Date).AddHours(-8) -ToDate (Get-Date).AddHours(-2)

You will need to define which roles you want to collect logs for.
You can read more about the logging feature in this GitHub readme: https://github.com/Azure/AzureStack-Tools/blob/master/Support/ERCS_Logs/ReadMe.md
If an update fails you are pretty much in the dark; you will need to extract these logs for the different levels and roles and send them to Microsoft to get the failure troubleshooted, and we have already needed their assistance a few times to troubleshoot a failed upgrade.

Security:
Of course, Microsoft has focused a lot on security in Azure Stack, which is one of its core advantages. Below are some of the settings which are configured on the stack.
* Data at rest encryption – All storage is encrypted on disk using BitLocker, unlike in Azure where you need to enable this on a tenant level. Azure Stack still provides the same level of data redundancy using a three-way copy of the data.
* Strong authentication between infrastructure components
* Security OS baselines – Using Security Compliance Manager to apply predefined security templates to the underlying operating system
* Disabled use of legacy protocols – Old protocols such as SMB 1 are disabled in the underlying operating system, and legacy authentication protocols such as NTLMv1, MS-CHAPv2, Digest, and CredSSP cannot be used.
* Constrained administration – for instance, the PEP endpoint uses PowerShell JEA (Just Enough Administration)
* Least privileged accounts – The platform itself has a set of service accounts for the different services, which run with least privilege
* Administration of the platform can only happen via the admin portal or the admin API
* Locked down infrastructure, which means that we have no direct access to the hypervisor level
* Windows Credential Guard – Credential Guard uses virtualization-based security to isolate secrets so that only privileged system software can access them (see the sketch after this list)
* Server Core is used to reduce the attack surface and restrict the use of certain features
* Windows Defender on each host
* Network ACLs defined in the TOR switches, the SDN layer, and host and guest, which are deployed using Ansible
* Group Managed Service Accounts
* Secrets are rotated every 24 hours
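As a side note to the Credential Guard item above: you cannot log on to the Azure Stack hosts to verify this yourself, but on any Windows Server 2016 machine the state of virtualization-based security can be checked like this, purely as an illustration of what the feature exposes:

# Shows whether virtualization-based security and its services (such as Credential Guard) are configured and running
Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root\Microsoft\Windows\DeviceGuard |
    Select-Object VirtualizationBasedSecurityStatus, SecurityServicesConfigured, SecurityServicesRunning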

Limitations:
Of course, as with any new platform, there are a lot of limitations that you need to be aware of (especially if you have read up on the consistency between Azure and Azure Stack). If a system supports Azure using some form of PaaS service, it does not necessarily mean that it supports Azure Stack; the application vendor will need to ensure that their product is compatible with the Azure Stack feature level. Here is a list of other limitations that I have encountered.

  • Limited number of instance types (A, D and Dv2 series)
  • Single fault and update domain (UPDATE: changed in the 1802 build)
  • Only LRS storage
  • No support for IPv6
  • No support for Managed Disks
  • Limited support for Premium disks (cannot guarantee performance)
  • No support for Application Gateway or Traffic Manager
  • No support for VNET peering
  • No support for Azure SQL (only SQL Server, which is served through a SQL connector)
  • Only support for the Basic VPN SKU (and only a single HA pair of nodes providing VPN for all tenants)
  • No network QoS on the NIC (can allow for noisy neighbors)
  • Only some marketplace items (for instance Windows 10 is missing, along with other parts of the marketplace)
  • No customer-specific gateway (same IP for all gateway connections)
  • A lot of Azure services such as Data Factory cannot use Azure Stack storage (hardcoded URLs in the different services)
  • No support for the Azure-integrated SQL Server functionality (Stretch Database or SQL backup to Azure) against Azure Stack
  • No support for Citrix on Azure Stack (meaning no Citrix NetScaler and no Provisioning options available)
  • No support for Azure Files
  • Max blob size 195 GB (UPDATE: changed in the 1802 build)
  • Max disk size 1 TB
  • No support for point-to-site VPN
  • No support for Docker Machine drivers
  • Troubleshooting is mainly dumping logs to the Microsoft support team
  • Some UI bugs, such as when defining DNS settings on a virtual network

Since the release there has been one update each month, which shows the dedication to the platform and the ecosystem, but Microsoft has to make it easier to run edge processing and needs more Azure features that support Azure Stack integration. One thing I want to highlight is that Azure Stack really excels at networking 😉 but no wonder, given the networking backend it provides.

rdmaazurestack

Also, earlier today Cisco came out with an all-flash version of Azure Stack, so now Microsoft really needs to fix the scalability issues.

Azure Standard load balancer

A while back Microsoft announced a new load balancing tier in Microsoft Azure called Azure Load Balancer Standard, which is a new SKU of the existing load balancing service in Azure and is still in preview as I'm writing this blog post.
Azure provides different load balancing solutions: Application Gateway (which provides layer 7 and SSL-based load balancing), Traffic Manager (which provides geo-redundancy using DNS-based load balancing) and the Load Balancer service, which is aimed at layer 4 load balancing.

Now there are many differences between the new Standard SKU and the old Basic SKU:

Backend pool endpoints
  Standard SKU: any virtual machine in a single virtual network, including a blend of virtual machines, availability sets and virtual machine scale sets
  Basic SKU: virtual machines in a single availability set or virtual machine scale set

Availability Zones
  Standard SKU: zone-redundant and zonal frontends for inbound and outbound, outbound flow mappings survive zone failure, cross-zone load balancing
  Basic SKU: not available

Diagnostics
  Standard SKU: Azure Monitor, multi-dimensional metrics including byte and packet counters, health probe status, connection attempts (TCP SYN), outbound connection health (SNAT successful and failed flows), active data plane measurements
  Basic SKU: Azure Log Analytics for public Load Balancer only, SNAT exhaustion alert, backend pool health count

HA Ports
  Standard SKU: internal Load Balancer
  Basic SKU: not available

Secure by default
  Standard SKU: default closed for public IP and Load Balancer endpoints, and a network security group must be used to explicitly whitelist traffic for it to flow
  Basic SKU: default open, network security group optional

Outbound connections
  Standard SKU: Multiple frontends with per-rule opt-out. An outbound scenario must be explicitly created for the virtual machine to be able to use outbound connectivity. VNet Service Endpoints can be reached without outbound connectivity and do not count towards data processed. Any public IP addresses, including Azure PaaS services not available as VNet Service Endpoints, must be reached via outbound connectivity and count towards data processed. When only an internal Load Balancer is serving a virtual machine, outbound connections via default SNAT are not available. Outbound SNAT programming is transport protocol specific, based on the protocol of the inbound load balancing rule.
  Basic SKU: Single frontend, selected at random when multiple frontends are present. When only an internal Load Balancer is serving a virtual machine, default SNAT is used.

Multiple frontends
  Standard SKU: Inbound and outbound
  Basic SKU: Inbound only

Management operations
  Standard SKU: Most operations < 30 seconds
  Basic SKU: 60-90+ seconds typical

SLA
  Standard SKU: 99.99% for data path with two healthy virtual machines
  Basic SKU: Implicit in VM SLA

Pricing
  Standard SKU: Charged based on number of rules and data processed inbound or outbound associated with the resource
  Basic SKU: No charge

Like AWS, Microsoft now charges based upon load balancing rules and data processed (only for the Standard SKU; the Basic one is still free).

Load Balancing rules:
First 5 rules: $0.025/hour
Additional rules: $0.01/rule/hour

Data processed through the load balancer:
$0.005 per GB

The biggest changes in this new tier are that it supports Availability Zones (which went GA today), it has much better diagnostics options, and lastly it provides something called HA ports, which I'll come back to a little later in this post. To get started configuring an Azure load balancer Standard you may need to use the CLI or PowerShell; the example below uses the Azure CLI.

Create Resource Group
az group create --name changename --location changelocation

Create Public IP with Standard SKU
az network public-ip create --resource-group changename --name myPublicIP --sku Standard

Create Standard load balancer
az network lb create --resource-group changename --name changename --public-ip-address myPublicIP --frontend-ip-name myFrontEnd --backend-pool-name myBackEndPool --sku Standard

Now looking at some of the new functionality such as HA ports: this feature helps with high availability and scale for network virtual appliances (NVAs) inside virtual networks. It can also help when a large number of ports must be load balanced, since we can load balance entire port ranges by setting the port value to 0 and the protocol to All. The internal Load Balancer resource then balances all TCP and UDP flows, regardless of port number. NOTE: if you want to use HA ports to provide HA for an NVA in Azure, please make sure that the vendor has verified the appliance to work with HA ports.
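Continuing the CLI example above, an HA ports rule is simply a load balancing rule with the protocol set to All and the ports set to 0. This is just a sketch; HA ports only apply to an internal Standard load balancer, so the frontend and backend pool names below assume an internal configuration:

az network lb rule create --resource-group changename --lb-name changename --name myHAPortsRule --protocol All --frontend-port 0 --backend-port 0 --frontend-ip-name myFrontEnd --backend-pool-name myBackEndPool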

One of the most welcome features is that we can now monitor VIP and DIP availability directly from the console, which was a really cumbersome process in the Basic load balancer tier.

lbazure

You can now also see availability on DIP based upon a single IP address.

lbazure2

So using Azure Load Balancer Standard makes it easier to load balance NVAs, for instance where we have a hub-and-spoke layout on the virtual network, and the ability to monitor the availability of DIPs to make it easier to troubleshoot the health probes is something that I have been missing for some time!
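The same availability metrics can also be pulled out programmatically. A rough sketch using the Azure CLI, where the resource ID is a placeholder for your Standard load balancer; VipAvailability and DipAvailability are the data path and health probe metrics shown in the portal:

az monitor metrics list --resource /subscriptions/<sub-id>/resourceGroups/changename/providers/Microsoft.Network/loadBalancers/changename --metric VipAvailability --interval PT1M
az monitor metrics list --resource /subscriptions/<sub-id>/resourceGroups/changename/providers/Microsoft.Network/loadBalancers/changename --metric DipAvailability --interval PT1M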

 

Is this the year of Unified Application Delivery?

It is always amusing when I see on social media "Is this going to be the year of VDI?", a question which has been asked for many years already. The issue with VDI projects back in the day (I'm starting to write like an old guy…) was that the architecture and storage didn't quite scale, since VDI projects were heavy when it came to resource usage, and therefore launching a successful VDI project was difficult; of course, there weren't a lot of good products in the market either. I don't know how many blog posts and reference architectures I looked at about "scaling and designing VDI", "how to scale VDI for IOPS" and so on. The first uprising in VDI projects started when the new players in the software-defined market came along; hyper-converged infrastructure made VDI projects a lot easier in terms of performance and brought more players into the VDI space as well. Now I've seen large reference VDI projects from most of the different hyper-converged / software-defined players in the market, and I'm also seeing more and more VDI deployments leveraging the cloud, since the economics can make it suitable for many use cases. So is 2018 going to be the year of VDI? Not gonna happen! My personal opinion is that the year of VDI is not going to happen this year either. I believe that the VDI ship has sailed, and moving forward more and more SaaS-based services are going to replace the Windows-based applications at a much faster rate. There is, of course, going to be a need for VDI solutions to deliver applications for a long time to come, but we need to look beyond VDI and focus more on application delivery (and not just Windows-based).

However, the key moving forward is to be able to deliver all these applications in a single unified manner: combining the Windows-based applications, the Linux-based applications, and those sloppy web-based apps which have their own authentication mechanism or use basic web authentication, while also delivering the modern web applications which support open authentication mechanisms such as SAML and OAuth. The key is also to have a single security plane to control access and maintain security for these applications, and a single source of truth for identity as well.


Handling security in a cloud-based scenario, such as with Google G Suite or Office 365, also requires more investment into the different CASB products, such as Microsoft's Cloud App Security, to allow integration directly with the cloud providers. So is it actually possible to build this type of unified application delivery platform? We are pretty close, so what kind of products do we need?

Identity:
There are multiple identity solutions that can be used; most tend to look at the cloud-based identity sources such as Azure AD and Google Identity, since they are proven at scale and have advanced functionality both to set up federation/trust and a rich ecosystem which allows for building new applications that support them as identity sources. Both have a lot of built-in mechanisms to handle authentication such as SAML and OAuth. Azure AD, however, has a bit more security features built in compared to Google at this time. There are also other solutions which provide these built-in authentication capabilities, such as Ping Identity, Okta and One Identity, and for on-premises deployments we also have VMware Identity Manager. This depends on where you want the identity source to be located. Of course, vendors such as Ping, Okta and One Identity are companies which focus only on the identity field and have proven products for that purpose.

Application Delivery Platform:
Here we also have a couple of solutions which can deliver both Windows and Linux applications from a single platform, such as Citrix XenDesktop and VMware Horizon. Both of these platforms support LDAP/AD but also other authentication mechanisms such as SAML to support user-based authentication from the end-user to the backend. This allows us, for instance, to authenticate against XenDesktop or Horizon using any of the identity sources listed above.

Gateway:
This is of course to handle traffic and proxy connections with authorization rules against internal resources, such as on-premises web applications which only support basic web authentication, or to handle traffic to a backend VDI or RDS host. Both Citrix NetScaler and VMware Identity Manager can handle authentication mapping from SAML to, for instance, basic web authentication. They differ in function: NetScaler is the more advanced beast since it is a network appliance focused on ADC but with advanced functionality to handle authentication and authorization, while VMware Identity Manager is more aimed at handling user lifecycle management and application access, with VMware handling traffic flow through its Unified Access Gateway. The online identity providers can also handle traffic against basic web applications, and Microsoft has Azure Active Directory Application Proxy, which allows authentication and traffic flow against on-premises web applications using Kerberos, for instance.

Security:
Identity is the new firewall, which makes even more sense in this type of environment where we cannot control the end-user's traffic and therefore need to make sure there are security mechanisms in place to ensure data confidentiality. Of the large vendors, only Microsoft has a solution which falls within the CASB (Cloud Access Security Broker) domain to handle connections and activities against a SaaS product; the product, Cloud App Security, is now tightly integrated with Azure AD as well. VMware and Citrix have some policy controls which determine what kind of activity an end-user can do within a terminal server environment, and conditional access on the device connecting, but they do not have any functionality to control what a user can do within a SaaS service. Of course, controlling the user and what the user does is only a small part of the puzzle; we also need to be able to control, to an extent, the endpoint the end-user is using. Most vendors (Citrix, VMware, Microsoft and Okta) have MDM solutions which allow us to make more advanced authentication rules to determine if an end-user should have access to a business-critical application from certain endpoints. Microsoft and VMware both have conditional access rules where we can build a solution to ensure that a device is compliant before it gains access to an application or system.

Unified Application Portal:
There are multiple portals we can utilize to expose all these different applications from multiple sources; however, we will be taking a closer look at those from Citrix, VMware and Microsoft.

Azure AD My Apps:
My Apps in Azure is directly integrated into Azure Active Directory and can expose applications which have been added to Azure AD. This also covers Office 365, which has its own app launcher that reflects and shows the same applications. It can also add other third-party applications using SAML (Citrix and VMware can be added here, but just as a hard link), but VDI desktops cannot be shown directly here. Azure also has support for internal web applications using Azure AD Application Proxy, which can publish internal web applications and supports SSO using Kerberos and NTLM. The good thing is that this also integrates directly with Office 365, so applications can be shown directly in the end-user's app launcher. Of course, we can protect access using Azure MFA and Azure conditional access, which can now be integrated with Cloud App Security.

2017-01-18_11-39-57

VMware Workspace One:

Workspace One is a combination of VMware Identity Manager, AirWatch Enterprise Mobility Management Suite and Horizon, running as a local server setup. The Workspace portal allows us to present VDI/RDSH desktops and applications combined with web-based applications using SAML, and we can also protect resources using conditional access rules. VMware has its own MFA solution as well that can be used to provide additional security on top. The advantage here is that we can present both Windows/Linux applications and web applications within a single portal, with multiple security policies on top.

Citrix Workspace / Unified Gateway:

Citrix Workspace Services is the future workspace portal from Citrix, which will be able to serve both applications and data from the same UI, more similar to the Office 365 app launcher with SharePoint data in it. A similar setup is possible with Unified Gateway, where we can present Windows/Linux applications and desktops, SAML-based applications using NetScaler as the SAML SP, and RDP sessions using RDP proxy, etc. Citrix also has the advantage of being able to deliver VPN solutions as well, so it provides a strong range of different solutions which can be presented from within the same portal.

So what does the workspace of tomorrow look like? I'm guessing that this is what many vendors are working towards solving, or finding the secret ingredient for.
With more and more business applications becoming web-based, there is no denying that the workspace will need tight integrations with modern SaaS products such as Google G Suite, Salesforce, ServiceNow, Workday and Office 365, but it must also be able to integrate with on-premises applications on legacy systems, such as Windows/Linux-based applications and desktops, as well as older internal web applications. The workspace will also need certain security policies in place, such as conditional access, to give a more granular approach to security when it comes to giving access to applications and/or SaaS services. We also need certain security products on the backend to take control of the data and the API access to the SaaS services to ensure compliance and so on.

 

 

 

Windows 10 and Server 2016 network enhancements

There have been a lot of new enhancements to the networking stack in Windows 10 and Server 2016 which I wanted to write a bit more about. Earlier I wrote a bit about TCP Fast Open, which is available in Windows 10 and Microsoft Edge to reduce the initial TCP handshake (http://msandbu.org/increasing-microsoft-edge-performance-using-tcp-fast-open-on-netscaler/), but looking at the rapid release cycle in Windows, there is more new stuff that has been introduced over the last couple of years. Much of the functionality is defined in NDIS (https://docs.microsoft.com/en-us/windows-hardware/drivers/network/overview-of-ndis-versions), which is the Windows specification for how drivers should be created for network communication. Some of the new features that have been introduced are:

  • CUBIC support: The Windows 10 Creators Update added support for the congestion control algorithm CUBIC, which is actually the default congestion algorithm in Linux. The main goal of CUBIC is to improve the scalability of TCP over fast, long-distance networks, and to keep the congestion window at the saturation point for longer.
    The following commands can be used to enable CUBIC globally and to return to the default Compound TCP (requires elevation):

    • netsh int tcp set supplemental template=internet congestionprovider=cubic
    • netsh int tcp set supplemental template=internet congestionprovider=compound
  • Fast Connection Teardown: TCP connections in Windows are by default preserved for about 20 seconds to allow for fast reconnection in the case of a temporary loss of wired or wireless connectivity. However, in cases such as docking and undocking this is a long delay; the Fast Connection Teardown feature can signal the Windows transport layer to instantly tear down TCP connections for a fast transition.
  • ISATAP and 6to4 disabled by default: With the uptake of IPv6, these transition protocols are now disabled by default, but they can be enabled using Group Policy. Teredo is the last transition technology that is expected to stay in active use because of its ability to perform NAT traversal to enable peer-to-peer communication.
  • Windows TCP AutoTuningLevel: Before the Creators Update, the TCP receive window autotuning algorithm depended on correct estimates of the connection's bandwidth and RTT; the new algorithm adapts to the BDP (bandwidth-delay product) much more quickly than the old algorithm and converges faster on the maximum receive window value for a given connection (see the sketch after this list for how to inspect the current settings).
  • Recent ACKnowledgement (RACK): RACK uses the notion of time, instead of packet or sequence counts, to detect losses, for modern TCP implementations that support per-packet timestamps and the selective acknowledgment (SACK) option. RACK is enabled only for connections that have an RTT of at least 10 ms, in both Windows Client and Server 2016, to avoid spurious retransmissions for low-latency connections, and it is only enabled for connections that successfully negotiate SACK.
  • Windows Low Extra Delay BAckground Transport (LEDBAT): LEDBAT is a way to transfer data in the background quickly, without clogging the network. Windows LEDBAT transfers data in the background and does not interfere with other TCP connections, by only consuming unused bandwidth. When LEDBAT detects increased latency that indicates other TCP connections are consuming bandwidth, it reduces its own consumption to prevent interference; when the latency decreases again, LEDBAT ramps up and consumes the unused bandwidth. LEDBAT is only exposed through an undocumented socket option and can only be used by approved partners.
  • RSSv2: Compared to RSSv1, RSSv2 shortens the time between the measurement of CPU load and updating the indirection table, which avoids slowdowns during high-traffic situations. This is part of the Windows 10, version 1709 kernel.
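As referenced in the AutoTuningLevel item above, here is a small sketch of how to inspect which congestion provider and receive window autotuning level the different TCP setting templates are using (template names can vary between Windows builds):

# Shows the congestion provider and receive window autotuning level per TCP setting template
Get-NetTCPSetting | Select-Object SettingName, CongestionProvider, AutoTuningLevelLocal

# The same information through netsh
netsh interface tcp show supplemental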

This YouTube video from Ignite last year goes into detail on the different improvements that have been introduced into Windows over the course of the last year –> https://www.youtube.com/watch?v=BlBWUGcYCQQ

And of course, having a strong networking stack is important to handle modern web applications and connections from different endpoints over different types of network connectivity. In the next blog post I will focus a bit more on the container networking aspects that have been introduced in Windows.

 

 

 

My thoughts on Citrix buying Cedexis and what it is?

Earlier today Citrix announced publicly that they have bought the company Cedexis. (If you didn’t catch the news you can read the official blogpost here –>  https://www.citrix.com/blogs/2018/02/12/citrix-acquires-cedexis/)

Being the tech-curious mind that I am, I started to read through the official blog post, but it didn't give me any clarity on what kind of value it would actually bring to Citrix. I also hadn't heard about the company before (other than some mentions on social media from time to time), so I decided to do some research and take a closer look at how Citrix can benefit from it.

Looking into the company, I noticed that they have a set of products which make up the core, called Cedexis ADP (Application Delivery Platform), which is aimed at more intelligent load balancing using a combination of real-user monitoring and synthetic monitoring to make the correct decision on where to route the data.

clip_image002

The platform is split into smaller parts, where the core consists of the following applications.

Radar: a product which gathers real-user telemetry from thousands of users worldwide (you can see some of the interesting statistics here: https://live.cedexis.com/), giving detailed mappings of outages, response times and so on. It uses a simple JavaScript snippet embedded within a content or application provider's pages to collect information about the performance and availability of a data center or delivery platform. (You can also access some nifty reports here as well –> https://www.cedexis.com/get-the-data/country-report/)

Sonar: a liveness check service that can be used to monitor web-based services for availability. Sonar works by making HTTP or HTTPS requests to a URL from multiple points of presence around the world.

Openmix: a SaaS global load balancing service which uses information such as Radar's real-time feeds of end-user telemetry and server or application monitoring data from Sonar to do intelligent global load balancing. We can also combine this with other data, such as cost/performance, and define our own rating of a service if we, for instance, have a service available in multiple locations/platforms. The cool thing about Openmix being a cloud service is that it is available via DNS and HTTP; an example is here –> https://github.com/cedexis/openmixapplib/wiki/Openmix-HTTP-API#overview

Fusion: in addition to Radar and Sonar data, Openmix can use third-party data as part of its decision criteria, which means it can integrate with an existing synthetic monitoring service you already use, or make cost-based decisions using usage data from a CDN provider. Here is a picture of the supported integrations that Fusion has, which can be used to determine the best path.

clip_image004

There are also some newer integrations, such as Datadog, which for instance allow more efficient routing logic based upon Datadog alerts.

So, looking at the products we can see that Cedexis has multiple tools to determine the optimal path, including the use of real-time user information and synthetic testing combined with third-party integrations using custom metrics, plus a global SaaS load balancing service. For instance, if we have a service which is available in multiple locations across multiple cloud providers, how can we ensure that an end-user is directed along the optimal route? We have multiple sources of logic, such as Radar (how is the network, or the CDN the content is served from, performing?), Sonar (what is the RTT of the application from the ongoing tests?) and information from Fusion (a New Relic APM integration, for instance, which shows that service Y is performing slowly because of DB errors), and from that information we can deduce the correct path. However, Cedexis is missing the product to handle the actual load balancing between the end-users and the backend services, and is dependent on someone else to do the local load balancing and handle SSL traffic. NetScaler, on the other hand, is missing the products to do more intelligent load balancing based upon real user telemetry, instead of just doing health checks against the backend web server or doing GSLB based upon user proximity and such.

I can see the value of integrating the Cedexis platform into the NetScaler portfolio, seeing that it could make for a much more powerful, smart application delivery system. So, this is just my personal idea of how the portfolio could look as an integrated solution: we could have NetScaler MAS feeding Fusion using web analytics, for instance, along with the performance usage on the NetScalers, which would then make it easier for Openmix to decide whether the end-users should be load balanced to region X or Y based upon the weight defined on the application or service.

image

So, just some initial thoughts on the Cedexis platform. I'm looking forward to trying the platform out in a real scenario and seeing what plans Citrix has for it moving forward.

Cloud Wars – IBM vs Microsoft vs Google vs Amazon IaaS

In my previous blog post I did a short overview of the different cloud vendors, a bit about their focus areas and a bit about strengths and weaknesses. The blog post can be found here –> http://bit.ly/2CrBgZA. In this post I want to focus more on IaaS and the offerings surrounding it. First I will describe a bit about each vendor, then I'll go into a bit more comparison, also including the price/performance factor, and end with some focus on automation functionality and additional services.

IBM:
As mentioned in my previous blog post, IBM with its SoftLayer capabilities has had an extreme focus on bare metal, in addition to traditional IaaS, and with the extended partnership with VMware they can also provide the vCloud Foundation package (which is also a prerequisite for VMware HCX) or just a plain ESXi with vCenter deployment. On the bare-metal options we can choose between hourly or monthly pre-configured servers, or customize with single to quad processor solutions that range from 4 to 72 cores. We can also order bare-metal servers with dedicated GPU offerings (such as K2, K80, M60 and P100). One of the cool features in terms of pure IaaS is that they offer pure block storage as an option, using iSCSI, or just plain file storage using NFS. In terms of scalability they can only offer up to 56 cores and 242 GB RAM for a single virtual machine, which is a lot smaller than most offerings in Azure, Google and AWS. IBM, like AWS and Azure, also offers predefined instance sizes, and when setting up an instance you can also define what kind of network connectivity you want; by default you get a 100 Mbps uplink and private connectivity which is free, but if you want to up it to 1 Gbps you need to pay extra. The main issue is that many of the concepts such as availability zones and other options for HA are not available in IBM compared to GCP, AWS and Azure.

In terms of automation, IBM has a feature called Cloud Schematics which is natively based upon Terraform, so it is basically wrapping REST API calls using an IBM provider in Terraform (https://ibm-cloud.github.io/tf-ibm-docs/). We also have the ability to run provisioning scripts at boot as part of a deployment. One of the things I feel is missing in IBM when it comes to automation is the ability to provide more overall system management capability, such as Azure Automation or AWS Systems Manager.

Google:
Google, compared to the others, has the most simplified deployment of virtual machines. They are also the only vendor that has the option to define custom instance sizes (for a bit higher price, of course), and they offer the most flexibility when it comes to GPUs, since we can add GPUs to any type of instance, as well as in disk types and sizes.

For automation, Google has an API framework called Google Cloud Deployment Manager, which uses a YAML-based syntax, but providers from Terraform or Puppet can also be used to do the deployment. Google also has the option to run start-up scripts on each virtual machine, which allows for scripting of software and services inside the virtual machines. Google provides up to 96 vCPUs and 1433 GB of memory on their largest instances. They do not, however, have any form of bare-metal option, compared to IBM, but that is not their focus either; like AWS, Google has gone into a partnership with Nutanix on a hybrid cloud model, which is going to be interesting to follow. Another cool thing about Google is that they provide live migration of instances by default to handle maintenance updates on their infrastructure.

For deployment of redundant solutions you deploy instances across multiple zones within a region (which is a similar setup to what Amazon Web Services does and what Azure does with Availability Zones).

From a management perspective, Google has been really good at developing their Cloud Shell solution, which allows for easy access to virtual instances directly from the browser and also allows for simple access by auto-inserting the SSH key as part of the setup. One of the coolest things about Google is their core infrastructure and the network backbone, which is called Andromeda (https://cloudplatform.googleblog.com/2017/11/Andromeda-2-1-reduces-GCPs-intra-zone-latency-by-40-percent.html) and now allows them to provide low-latency, high-bandwidth connections for east-west traffic. Their SDN is also global, meaning that if you create a virtual network it will by default be available in all the different regions (where different subnets are placed within each region but are all interconnected).

Azure:
Microsoft has also been doing a lot of work recently, investing heavily into new options such as new GPU offerings with the P100 and P40 cards, but also with the introduction of Availability Zones (still in preview for most services), which now allows for a level of redundancy which is pretty similar to zones on GCP and AWS. Microsoft has also introduced loads of new instance types, such as the burstable compute B-series, and now with the GA of accelerated networking, which allows for SR-IOV based network deployment of instances in Azure.

From a management perspective Microsoft has been doing a lot around regular operations, such as with Log Analytics, which can now do patch management and provide multiple pieces of monitoring across different platforms, and also integrate different PaaS services to allow for a single hub to do monitoring across most of the services. Simple EDI-based tools such as Logic Apps and Azure Automation allow us to set up simple to more complex automation jobs to do automated deployments and start/stop virtual instances based upon a trigger or schedule. They also provide a lot more tools when it comes to migration and backup compared to the other vendors, with Azure Migrate and Azure Site Recovery.

Microsoft has also been investing a lot in their Cloud Shell solution, which allows us to run az cli (bash-based) and Azure PowerShell cmdlets directly from the browser (and as of 1 February they also support Ansible directly from the Cloud Shell interface).

One of the issues with Azure from an IaaS perspective is the lack of flexibility in mixing, for example, GPU cards with different instance types, or scalable IOPS together with disk size. Microsoft is also focusing a lot on building partners in the ecosystem to support automation, and has been doing a lot when it comes to Terraform, which now covers a lot of the resources in Azure directly.

AWS:
When it comes to IaaS, Amazon provides most of the services, both when it comes to bare metal (coming) and support for VMware, along with different options depending on whether you need reserved instances, reserve capacity or godzilla virtual machines. They also provide different storage options and the ability to scale IOPS depending on the size of the storage. The upcoming support for VMware will also provide a whole new level of infrastructure solutions (the service is now available but still limited to certain regions in the US).

AWS also provides multiple management tools to make things easier, such as AWS Systems Manager (which can also target on-premises virtual machines), and they even provide their own AWS Managed Services where they manage the IaaS solutions for you. AWS also has a service called OpsWorks, which provides automation based upon Puppet and Chef as a managed service and can then be used to deliver configuration management against your own environment in AWS. AWS also has CloudWatch and CloudTrail to track events, logs, activity and API usage across AWS accounts.

AWS also has multiple options when it comes to GPU offerings, such as the P2 and G2 series which come with a dedicated GPU card, or Elastic GPUs, a software-defined GPU offering which allows us to attach a GPU to almost any type of instance.

Summary:
Now, the fun part is that most providers are delivering more and more services to help with automation and system management, such as managed container engine clusters and different advisor roles which can detect cost or security issues, for instance by checking against best practices according to the cloud provider.

The interesting part is mainly around the container solutions that most providers are now fighting about. Both Microsoft and AWS have their own container instance solution, where you just provision a container based upon an image and don't have to worry about the infrastructure beneath (AWS Fargate and Azure Container Instances), and both of them also provide other container solutions such as Amazon EC2 Container Service and Azure Container Service. The fun part is that all four providers support Kubernetes as the container orchestration engine and have supporting features to build upon it, such as container registry solutions or CI/CD solutions.

Technical comparison: the intention here is to have a short table comparing some of the different infrastructure services from each vendor. It does not measure the quality of the services, just whether a service exists and what it is called.

Each row lists the offering per provider (Microsoft, Google, Amazon, IBM):

High Performance Computing services: Microsoft – Azure Batch; Google – n/a; Amazon – Amazon Batch; IBM – IBM Spectrum, IBM Aspera
Reserve capacity instances: Microsoft – Low Priority VMs; Google – Preemptible instances; Amazon – Spot instances; IBM – n/a
Reserved instances: Microsoft – Reserved Instances; Google – Committed use; Amazon – EC2 Reserved Instances; IBM – n/a
Dedicated instances: Microsoft – n/a; Google – n/a; Amazon – EC2 Dedicated Instances; IBM – n/a
Bare metal hosts: Microsoft – n/a; Google – n/a; Amazon – Yes (announced); IBM – Yes
Burstable instances: Microsoft – Yes; Google – Yes; Amazon – Yes; IBM – No
VM metadata support: Microsoft – Yes; Google – Yes; Amazon – Yes; IBM – Yes
Custom instance sizes: Microsoft – No; Google – Yes; Amazon – No; IBM – Yes
Compute service identity: Microsoft – Yes; Google – Yes; Amazon – Yes; IBM – No
High performance disk: Microsoft – Premium Disk; Google – SSD persistent disk, Local SSD; Amazon – SSD EBS; IBM – SSD Octane
GPU instances: Microsoft – N-series (NV, NC, ND); Google – Flexible GPU; Amazon – P2 instances / Flexible GPU; IBM – only as bare metal
Nested virtualization support: Microsoft – Yes; Google – Yes (beta); Amazon – No; IBM – Yes
Hybrid story: Microsoft – Azure Stack; Google – Nutanix; Amazon – VMware; IBM – VMware
GPU cards supported: Microsoft – M60, K80, P40, P100; Google – K80, P100, AMD S9300; Amazon – M60, Custom GPU, V100; IBM – P100, M60, K80
Desktop as a service: Microsoft – third party; Google – third party; Amazon – Workspaces & AppStream; IBM – third party
Scale set: Microsoft – VM Scale Sets; Google – Instance Groups; Amazon – Auto Scaling; IBM – Auto Scale
Godzilla VM: Microsoft – Standard_M128 (128 vCPU, 3800 GB); Google – n1-highmem (96 vCPU, 1433 GB); Amazon – x1.32xlarge (128 vCPU, 4 TB); IBM – 56 vCPU, 242 GB (larger only as bare metal)
Skylake support: Microsoft – Yes; Google – Yes; Amazon – Yes; IBM – n/a
VMware support: Microsoft – Yes (announced); Google – n/a; Amazon – Yes (limited to the US); IBM – Yes
Billing for VMs: Microsoft – per minute; Google – per second; Amazon – per second (for some); IBM – per hour
Deployment & automation service: Microsoft – Azure Resource Manager; Google – Google Cloud Deployment Manager; Amazon – CloudFormation; IBM – IBM Cloud Schematics
CLI: Microsoft – PowerShell, Azure CLI; Google – gcloud CLI, Cloud Tools for PowerShell; Amazon – AWS CLI, AWS Tools for PowerShell; IBM – Bluemix CLI
Monitoring & logging: Microsoft – Log Analytics, Azure Monitor; Google – Stackdriver; Amazon – CloudWatch, CloudTrail; IBM – Monitoring and Analytics
Optimization: Microsoft – Azure Advisor; Google – native service in the UI; Amazon – Trusted Advisor; IBM – n/a
Automation tools: Microsoft – Azure Automation; Google – n/a; Amazon – AWS OpsWorks for Chef and Puppet; IBM – Cloud Automation, Workload Scheduler
Support for third-party configuration and infrastructure tools: Microsoft – Chef, Puppet, Terraform, Ansible, SaltStack; Google – Chef, Puppet, Terraform, Ansible, SaltStack; Amazon – Chef, Puppet, Terraform, Ansible, SaltStack; IBM – Terraform
Cloud Shell support: Microsoft – Yes; Google – Yes; Amazon – No; IBM – No
EDI tools: Microsoft – Azure Logic Apps; Google – n/a; Amazon – n/a; IBM – n/a

In the next blog post I will take a closer look at some price comparisons, comparing apples to apples in some benchmarks which measure speed of deployment using the different deployment tools, as well as in-VM speed on different levels.

Citrix FAS with Azure AD and Error 404 Not Found

So, a short blog post on an issue I faced this week.

Working at a customer this week, we were setting up Citrix with SAML-based authentication from the MyApps portal using Azure Active Directory. In order to set this up properly we needed to implement Citrix FAS to do SSO directly from an Azure AD joined Windows 10 device. One of the issues we were facing was when a user clicked on the Citrix app from the MyApps portal and opened multiple tabs, or closed the existing tab where the Citrix application was opened: the end user received a standard 404 error from Citrix StoreFront.

The reason for this was that a Gateway session cookie was inserted when the user accessed the Gateway from Azure MyApps. The request from Azure AD was redirected to /cgi/samlauth and forwarded to the IIS server, and since the session cookie matched an existing connection, the connection failed. My initial idea was to use responder or rewrite policies, but after some thinking I noticed that they were ignored because AAA processing in the NetScaler packet flow takes precedence over those features.

The end solution was quite simple. We created a virtual directory on the StoreFront IIS server.

iis2

and created a redirect on that virtual directory back to the NetScaler Gateway setup.

iis2
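For reference, the same two steps can also be done from PowerShell with the WebAdministration module. A rough sketch where the site name, virtual directory name, physical path and gateway URL are all placeholders for your own environment:

Import-Module WebAdministration

# Create the virtual directory on the StoreFront IIS site (names and paths are placeholders)
New-WebVirtualDirectory -Site "Default Web Site" -Name "CitrixAuthRedirect" -PhysicalPath "C:\inetpub\CitrixAuthRedirect"

# Enable an HTTP redirect on that virtual directory back to the NetScaler Gateway address
Set-WebConfiguration -Filter /system.webServer/httpRedirect -PSPath "IIS:\Sites\Default Web Site\CitrixAuthRedirect" -Value @{enabled="true"; destination="https://gateway.contoso.com"; httpResponseStatus="Found"}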

After I did this, the end user could open up the application as normal.

Cloud Wars IBM vs Microsoft vs Amazon vs Google – Part 1 Overview

Looking back at 2017, there has not been as much activity on my blog as I had planned, and one of the main reasons is that I have been quite caught up in work. My work takes me back and forth between multiple products, platforms and customer cases: it can be a DevOps project on Azure, an IoT project on GCP, a DR solution on IBM or an HPC setup on AWS. So after plunging into most of these platforms, I've decided to start 2018 on my blog with some fresh perspectives on the major cloud platforms and focus on their strengths and weaknesses. This post reflects my personal experiences with the platforms and shows some of their core capabilities, since one of the most frequent questions I get at work is "Where do I start, and why should I choose X over Y?"

I like to compare cloud platforms to cars. Most of them can drive you from place A to B, but the models have different exteriors and comfort levels, maybe seven seats, and some have a faster and stronger engine. The point is that most platforms provide most of the same services; some have better quality, different prices and different options. For instance, all four vendors provide a similar form of cloud orchestration language, such as Cloud Deployment Manager, Azure Resource Manager, CloudFormation and IBM Cloud Schematics.

cloudwars

So let us start off this post series with an overview of the four major cloud platforms on the market. (Also note that I've been part of the technical comparison of the major cloud platforms on whatmatrix.com, which you can see here
–> https://www.whatmatrix.com/comparison/Public-Cloud-Platforms#.) I'll get back to a more technical comparison in part two of this blog series and focus a bit more on different levels such as IaaS/bare metal, identity, PaaS, big data & IoT, ML and containers.

IBM Cloud:
ibmcloud

Historically IBM has been focusing a lot on IaaS services with its SoftLayer platform, which has been IBM's public cloud offering for IaaS and bare metal. On the other hand, IBM has been building up Bluemix as well, which has been focusing on PaaS services and is based upon Cloud Foundry; this is also where the ML/AI service Watson has its home. The problem is that public cloud on IBM has been available on two different platforms, Bluemix and SoftLayer, and IBM has regions where they offer both but also places where they only offer IaaS and not the other. This has been really confusing at times, and it has been noticed by Gartner as well, since they haven't had the complete service offering compared to the others. IBM is now focusing a lot on merging these two platforms to provide all cloud functionality from what is now called IBM Cloud.

Unlike the other competitors, IBM is mostly building its PaaS services on third-party open-source products. For instance, the serverless feature in IBM is based upon Apache OpenWhisk, unlike Azure, which has Functions, and Amazon, which has Lambda, both of which are closed. They also have other IaaS options based upon VMware and Veeam, for instance, where they are a lot further ahead in the race against AWS, and last but not least their underlying orchestration tool for infrastructure as code is based upon Terraform.

Also, one of the things I value when working with a platform is the community around it, especially on Stack Overflow and other social media channels such as Twitter. Unfortunately, IBM has the smallest community based upon the statistics I've seen on Stack Overflow and social media and looking at meetups in the Nordics.

When it comes to PaaS services, even if IBM is focusing a lot on reusing open-source platforms such as Cloud Foundry and is standardizing on Kubernetes, they are nowhere near the same functionality offerings as the others in the market. The core strength of IBM Cloud at the moment, as I see it, is the IaaS/bare-metal and VMware offering that they have. Another core strength is the focus on private cloud with their IBM Cloud Private solution, where they can provide a scalable PaaS solution for on-premises workloads.

Google Cloud Platform:


An example of Google with Google Cloud Shell
To be honest I haven’t done a lot of work on GCP before I started working with it about 1,5 year ago, and what I see is that Google’s cloud platform is pretty similar to their search engine, focus on ease of use and speed.
Also I’ve seen that in those cases I’ve been working on GCP, it has come out as the cheapest option between the four vendors, also that Google offers the fastest (compute, storage, network) infrastructure as well, but don’t take my word for it, see for yourself –> https://www.cloudbenchmark.com/

Google also offers the most flexible IaaS offering, where we can define custom VM instances and any type of disk configuration, and we can also use Skylake processors and multiple GPU offerings. So I can easily say that Google has the most impressive core infrastructure. However, Google does not have any bare-metal offering such as IBM has with SoftLayer, and compared with Microsoft and IBM, Google has no private cloud offering; they have therefore gone into a partnership with Nutanix to bridge the gap –> https://www.nutanix.com/press-releases/2017/06/28/nutanix-teams-google-cloud-fuse-cloud-environments-enterprise-apps/ and they have limited support and integrations with on-premises infrastructure. Then again, this allows them to focus entirely on their public cloud offering.

Google is also missing some of the PaaS services compared to what AWS and Azure provide. I think Google's strategy is not to provide a bunch of different PaaS services which can overlap, but to streamline on a few selected services. Some services that Google provides are quite unique, for instance BigQuery, which is one of my favorite services! I also believe that one of the "weaknesses" Google currently has is the ecosystem surrounding it: many third-party companies and vendors today support or have some form of integration with AWS and Azure, but not with Google (and the same goes for IBM).

Another thing with Google is the community. Unlike IBM, I see a lot more meetups in the Nordics in particular, and many partners focusing on it as well, but little activity on social media. The last thing I want to mention is that since Google is the home of Kubernetes, they also have the best managed container engine for it on GCP, but unlike Azure, for instance, they do not have support for other orchestration frameworks such as Swarm or DC/OS as a service. Also, the release of Azure Container Instances and AWS Fargate, which focus more on the containers themselves rather than on managing a cluster of virtual machines underneath, changes the game a little, so I hope that Google will release something here soon.

Microsoft Azure
azure

So much has happened in Azure the last year; they have announced multiple new regions (which makes Microsoft the vendor with the most regions, but not the largest) to cover more ground. We can also see, based upon all the announcements from Microsoft Ignite, that their core focus moving forward is Azure, Azure and Azure, with little to no announcements around their current private cloud core products. More focus has also shifted to private/public cloud offerings with Azure Stack.

Microsoft now provides an impressive list of virtual machine instances (although not as flexible and scalable as the other vendors) and an impressive list of different PaaS services, and they have done a great job on the container focus in Azure. Based upon all the announcements from 2017, I believe that Microsoft has made the largest investment into containers & DevOps features of the four vendors. Microsoft is also building close integrations with the existing software that they sell to customers today, to make it easy for them to move resources to Microsoft Azure moving forward, and for some customers this makes it the only logical choice. I can also see this in the ETL tools they provide to make it easy for customers to move and transform data from multiple sources into Microsoft Azure. Microsoft also, in my opinion, has the best visualization options with Power BI.
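
As a small illustration of that container focus, here is a minimal sketch using the AzureRM PowerShell modules of the time (assuming AzureRM.ContainerInstance is installed and you are already logged in with Login-AzureRmAccount); the resource group, container name and image are just placeholder values, not anything prescriptive.

```powershell
# Minimal sketch (assumptions: AzureRM + AzureRM.ContainerInstance modules, already logged in)
New-AzureRmResourceGroup -Name "aci-demo-rg" -Location "westeurope"

# Run a single nginx container as an Azure Container Instance - no VM or cluster to manage
New-AzureRmContainerGroup -ResourceGroupName "aci-demo-rg" `
                          -Name "aci-demo" `
                          -Image "nginx" `
                          -OsType Linux `
                          -IpAddressType Public `
                          -Port 80

# Check the assigned public IP once provisioning is done
(Get-AzureRmContainerGroup -ResourceGroupName "aci-demo-rg" -Name "aci-demo").IpAddress
```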

Of the four vendors, I see that Microsoft is the most focused on the regular infrastructure customers. Azure provides easy support for delivering backup services for IaaS and on-prem, which integrates directly with Hyper-V and VMware, as well as migration tools which make it easy to move workloads from on-prem or other cloud providers into Azure. Together with all the different integration options with Azure AD, and new features such as the modern version of RDS, SQL Server with Stretch Database and other scenarios such as hybrid Active Directory, this makes Azure a strong player for hybrid cloud scenarios, covering pure IaaS, big data and identity options as well. Lastly, Microsoft has also released Azure Stack, where they try to bring the Azure ecosystem to on-premises workloads as well, which makes Azure even more the logical choice when a customer wants to move to public cloud.
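
To give an idea of how little it takes to get started on the backup side, here is a minimal sketch, again assuming the AzureRM PowerShell module and an existing resource group; the vault and group names are placeholders. The same vault can then protect Azure IaaS VMs as well as on-prem Hyper-V/VMware workloads via the respective agents.

```powershell
# Minimal sketch (assumptions: AzureRM.RecoveryServices module, existing resource group "backup-rg")
New-AzureRmRecoveryServicesVault -Name "demo-vault" `
                                 -ResourceGroupName "backup-rg" `
                                 -Location "westeurope"

# Set the vault context so subsequent backup cmdlets target this vault
$vault = Get-AzureRmRecoveryServicesVault -Name "demo-vault" -ResourceGroupName "backup-rg"
Set-AzureRmRecoveryServicesVaultContext -Vault $vault

# Choose storage redundancy for the vault (LRS or GRS)
Set-AzureRmRecoveryServicesBackupProperties -Vault $vault -BackupStorageRedundancy GeoRedundant
```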

One of the downsides with Azure is that performance is not their key asset on certain features, and limited options in certain areas of IaaS make it somewhat difficult at times. It might be that they are focusing too much on adding value-add services and forgetting to focus on the underlying platform itself.

Amazon Web Services
aws
Amazon is the clear market leader when it comes to public cloud. Even if they do not have the most regions, they still have the largest market share. Looking at the technical capabilities and the range of different PaaS features, most of the others are nowhere near the PaaS ecosystem they have; for instance, just look at Amazon RDS and Amazon S3 and how extensive the service capability is. Also, having customers such as Netflix using their cloud platform is a pretty good statement of their status. What I also see with Amazon, since they have spent the longest time in this marketspace, is the community around it. Looking at all the user groups, meetups and different communities around AWS, it is huge! I should also mention that Amazon is ranked as the clear market leader in both IaaS and storage in the Gartner Magic Quadrant. And what I often see is that if a third-party vendor supports cloud or has some form of cloud integration, you can bet $10 that it is mostly integration with AWS.

One of the things I also noticed at re:Invent is that there was a high focus on DevOps and containers, with support for Kubernetes and AWS Fargate (which focuses on container instances instead of a managed container cluster), but also on the partnership with VMware, which will now allow customers to provision VMware ESXi hosts running on AWS infrastructure (combining the market leader in private cloud with the market leader in public cloud), which is a strong statement when it comes to hybrid cloud. They are still a bit behind IBM, especially on the VMware support and worldwide availability. Another large focus area at AWS was machine learning capabilities and media services (which mostly have been areas where Azure has had the upper hand).

One of the downsides of AWS has so far been the lacking interest in hybrid or private cloud and the limited offerings for on-prem solutions (one of the few being the Storage Gateway).

Other ramblings
So that was a short introduction to some of the core strengths and weaknesses of the four vendors. Looking at the community and ecosystem around the vendors, there is no denying that the largest is focused on AWS. Below is a screenshot from the Stack Overflow developer survey from last year, showing AWS and Azure in the top 10 categories by questions asked.

caputre2

Also, looking at all the threads on Reddit, it seems like there is a lot more activity focused on AWS than on the other providers as well.

reddit

But of course I have focused a bit too much on the IaaS and PaaS offerings of a cloud provider. There is also no denying that the close integration between a cloud provider and other SaaS offerings, providing consistent identity and access control across the solutions, is something that Microsoft and Google especially are quite good at, for instance having one account for both collaboration tools and the cloud platform. So this has been a somewhat short introduction to the public cloud vendors and some of my experience with them. I would also love to get any feedback on your own experiences with these platforms and your take on it. The next blog in the series will focus a bit more in-depth on IaaS and look a bit more into the details of the differences between the vendors.


So why choose Citrix over Microsoft RDS?

A question came up a couple of days ago about doing a refresh of this blog post, since this is a topic that appears frequently on Twitter from time to time, so I decided to do a rewrite of it. So why should we choose Citrix over Microsoft RDS? Isn't RDS good enough in many circumstances? And has Citrix outplayed its role in the application/desktop delivery market? Not yet… This question has also appeared in my head many times over the last year: what is an RDS customer missing out on compared to XenDesktop? So I decided to write this blog post showing the different features which are NOT included in RDS, along with an architectural overview of the two solutions and the strengths of each. NOTE: I'm not interested in discussing pricing here; I'm a technologist and therefore this is mostly going to be a feature-matrix show-off.

Architecture Overview

Microsoft RDS has become a lot better over the years, especially with the 2012 release and actually having central management in Server Manager, but a lot of the architecture is still the same. We can now also have the Connection Broker in an Active/Active deployment as long as we have a SQL Server (Note: 2016 TP5 added support for Azure SQL Database for that part). External access is handled by the Remote Desktop Gateway (which is a web service that proxies TCP and UDP traffic to the actual servers / VDI sessions), and we also have the Web Access role where users can get applications and desktops and start their remote connections.

image
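
As a side note, the Active/Active Connection Broker setup mentioned above is configured with the RemoteDesktop PowerShell module; a minimal sketch might look like the following, where the server names and the SQL connection string are of course placeholders for your own environment.

```powershell
# Minimal sketch (assumptions: RemoteDesktop module, an existing RDS deployment and a reachable SQL Server)
Set-RDConnectionBrokerHighAvailability -ConnectionBroker "rdcb01.domain.local" `
    -DatabaseConnectionString "DRIVER=SQL Server Native Client 11.0;SERVER=sql01.domain.local;Trusted_Connection=Yes;APP=Remote Desktop Services Connection Broker;DATABASE=RDCB" `
    -ClientAccessName "rdbroker.domain.local"

# Add the second broker to the now highly available deployment
Add-RDServer -Server "rdcb02.domain.local" `
             -Role "RDS-CONNECTION-BROKER" `
             -ConnectionBroker "rdcb01.domain.local"
```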

But the Remote Desktop application which is built into the operating system still does not have good integration with an RDS deployment to show "business applications", and with Microsoft pushing a lot towards Azure, they should have better integration there to show business applications and web applications from the same kind of portal.

From a management perspective, as I mentioned, everything is still done using Server Manager (which is a GUI add-on to PowerShell, where a lot can also be done directly). Server Manager is still kind of clunky for larger deployments, and it does not give any good insight into how a session is being handled; you would need System Center, digging into event logs, or third-party tools to get more information. But we can now centrally provision the different roles directly from Server Manager, and the same goes for application publishing, which makes things a lot easier!
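
That central provisioning and application publishing can also be scripted instead of clicked through in Server Manager; a minimal sketch with hypothetical server and application names could look like this.

```powershell
# Minimal sketch (assumption: RemoteDesktop module on a management server, domain-joined role servers)
New-RDSessionDeployment -ConnectionBroker "rdcb01.domain.local" `
                        -WebAccessServer "rdweb01.domain.local" `
                        -SessionHost "rdsh01.domain.local"

# Create a collection and publish an application into it
New-RDSessionCollection -CollectionName "Office Apps" `
                        -SessionHost "rdsh01.domain.local" `
                        -ConnectionBroker "rdcb01.domain.local"

New-RDRemoteApp -CollectionName "Office Apps" `
                -DisplayName "Notepad" `
                -FilePath "C:\Windows\System32\notepad.exe" `
                -ConnectionBroker "rdcb01.domain.local"
```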

Microsoft is also coming with RDmi, most likely next year, which will introduce an easier way to deliver RDP using App Services in Azure. It allows us to host services such as the RDmi gateway, web access, connection broker and diagnostics in Azure and place our RDSH servers anywhere, most likely using some form of connector between the local servers and Azure Web Apps (quite similar to what Citrix is doing with Citrix Cloud and Cloud Connectors).

image

Microsoft has also released Honolulu, a modern take on Server Manager which is based upon HTML5 and has support for extensions; RDmi will be supported there when it is released.

image

Citrix has carried the FMA architecture over from the previous XenDesktop versions, but the architecture may still resemble RDS. NOTE: The overview is quite simplified, because I will dig into the features later in the post. With Citrix we have more moving parts, even in this simplified view. With RDS I would need a load balancer for my Gateways and Web Access servers; with Citrix, in larger deployments, you have NetScaler, which can serve as a proxy server and load balance the required Citrix services as well. Citrix also has a better management solution in Desktop Studio, which allows for easy integration with other platforms and simple image management using MCS, plus we have Director, which can be used for troubleshooting and monitoring of the Citrix infrastructure and for end-user support.

image
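
Much of what Studio and Director expose is also available through the Citrix PowerShell SDK, which is part of why the management story feels richer; a minimal sketch, assuming the XenDesktop SDK snap-ins are installed and run against a Delivery Controller, could be:

```powershell
# Minimal sketch (assumption: XenDesktop PowerShell SDK installed, run on/against a Delivery Controller)
Add-PSSnapin Citrix.Broker.Admin.V2

# Quick overview of machine state across the site (registered, unregistered, in use, and so on)
Get-BrokerMachine | Group-Object -Property SummaryState

# List current sessions with user and protocol information
Get-BrokerSession -MaxRecordCount 500 |
    Select-Object UserName, DesktopGroupName, Protocol, SessionState
```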

The Protocol

So in most cases, the question I see is: HOW GOOD IS THE PROTOCOL? Again and again I've seen many people state that RDS is as good as Citrix ICA, but again, I'll just post this picture and let it state the obvious. You need facts!

Luckily I’ve done my research on this part.

While RDP is mostly a one-trick pony, where we can do some adjustments in Group Policy to adjust the bandwidth usage or use regular QoS, it is still quite tied to the networking stack of the Windows NDIS architecture, which is not really adjustable. NOTE: With Windows Server 2016 most traffic is redirected through the UDP port, but it is difficult to define how much bandwidth (KB/s) each remoting channel should be allowed to use.
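
For the record, the tuning we do have on the RDP side is mostly either DSCP tagging with a QoS policy or steering the transport via Group Policy; a minimal sketch of both follows. The DSCP value and the SelectTransport value mapping are assumptions on my part, so verify them against the "Select RDP transport protocols" policy before using this.

```powershell
# Minimal sketch: tag RDP traffic (TCP/UDP 3389) with a DSCP value so network gear can prioritize it
New-NetQosPolicy -Name "RDP-Traffic" `
                 -IPProtocolMatchCondition Both `
                 -IPDstPortMatchCondition 3389 `
                 -DSCPAction 46   # assumption: EF marking, adjust to your own QoS design

# Force RDP to TCP only (mirrors the 'Select RDP transport protocols' GPO) - value mapping assumed, verify first
New-Item -Path "HKLM:\SOFTWARE\Policies\Microsoft\Windows NT\Terminal Services" -Force | Out-Null
Set-ItemProperty -Path "HKLM:\SOFTWARE\Policies\Microsoft\Windows NT\Terminal Services" `
                 -Name "SelectTransport" -Value 1 -Type DWord
```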

(ThinWire vs Framehawk vs RDP) https://msandbu.wordpress.com/2015/11/06/putting-thinwire-and-framehawk-to-the-test/
Now with Citrix we can have different protocols depending on the use case. For instance, a good friend of mine and I ran a Citrix session over a connection with 1,800 ms latency using ThinWire+ and it worked pretty well, while RDP didn't work that well; on the other hand we tried Framehawk on a connection with 20% packet loss, where it worked fine and RDP didn't work at ALL.

But again this shows that we have different protocols that we can use for different use-cases, or different flavours if you will. 

clip_image002

Another thing is that in most cases XenDesktop is deployed behind a NetScaler Gateway, which has loads of options to customize TCP settings at a more granular level than we could ever do in Windows without, in some cases, messing with the registry. So is RDP a good enough protocol for end users? Sure it is! But remember a couple of things:

  • Mobile users accessing over crappy hotel Wi-Fi (latency, packet loss)
  • Roaming users on 3G/4G connections (TCP retransmissions, packet loss)
  • Users with HIGH performance requirements (consuming a lot of bandwidth)
  • Connections without UDP (firewall requirements)
  • Multimedia requirements (3D, CAD applications)

With these types of end-users, Citrix has the better options also now with Adaptive Transport.

UPDATE: Citrix has now released EDT, which by default uses UDP as the transport mechanism (you can see a bit more about protocol benchmarking here –> http://msandbu.org/xendesktop-edt-over-netscaler-benchmarking/ ) and which performs a lot better than regular TCP in most scenarios. You can also see a comparison of HDX versus RDP here –> https://bramwolfs.com/2017/11/29/a-comparison-between-display-protocols-and-codecs/ ; note that RDP operates at 4:4:4.

Also, as of late, Citrix supports H.265 (which is the successor to H.264 –> https://docs.citrix.com/en-us/receiver/windows/current-release/about.html; note however that this requires a physical GPU server-side).

Image management

Image management is the top crown: being able to easily update images and roll out the changes when updates are needed in a timely fashion, without causing too much downtime / maintenance.

With RDS there is no straightforward solution to do image management. Yes, RDS has single-image management, but this is mainly for VDI setups running on Hyper-V, which is the supported solution for it. A downside is that it requires Hyper-V in order to do this through Server Manager. It is not yet clear how this will be affected by RDmi, but against Azure it is possible to use ARM-based templates to deploy RDS servers automatically.
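
For the Azure case, a minimal sketch of such an ARM-based deployment could use the community rds-deployment quickstart template; the template URL and parameter names below are assumptions, so check the current template before using it.

```powershell
# Minimal sketch (assumptions: AzureRM module, the rds-deployment quickstart template and its parameter names)
$templateUri = "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/rds-deployment/azuredeploy.json"

New-AzureRmResourceGroup -Name "rds-demo-rg" -Location "westeurope"

New-AzureRmResourceGroupDeployment -ResourceGroupName "rds-demo-rg" `
    -TemplateUri $templateUri `
    -TemplateParameterObject @{
        adminUsername         = "rdsadmin"
        adminPassword         = "<your password>"
        adDomainName          = "demo.local"
        numberOfRdshInstances = 2
    }
```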

Citrix, on the other hand, has many more options in terms of OS image management. For instance, Citrix has Machine Creation Services, which is a storage-based way to handle OS provisioning and changes to virtual machines, and which I described in my other post on MCS and Shadow Clones ( https://msandbu.wordpress.com/2016/05/13/nutanix-citrix-better-together-with-shadow-clones/ )

image

Citrix also has Provisioning Services, which allows images to be distributed / streamed over the network. Virtual machines and physical machines can be configured with PXE boot, stream an operating system down and store it in RAM. Doing updates to the image then just requires a reboot.

Another thing to think about here is the hypervisor support, where in most cases PXE supports both physical and virtual. MCS is dependent on doing API calls to the hypervisor layer, but it already has support for:

  • VMware
  • XenServer
  • Hyper-V with SCVMM
  • Azure (with native support for most of the Azure components)
  • Amazon EC2
  • CloudPlatform
  • Nutanix

Other features that Citrix has:
  • Cloud-based services available now (services such as Citrix Cloud, XenApp Essentials, XenDesktop Essentials)
  • Remote PC (a golden gem which allows a physical computer to be accessed remotely using the same Citrix infrastructure; just install a VDA agent and publish it, and it can then be accessed using Citrix Receiver. Even though Microsoft has RDP built into each OS, there is no central management of it and no built-in support for adding these machines to the gateway; each user has to remember the IP or FQDN in that case)
  • App-V and Configuration Manager integration and management (Citrix actually has App-V management capabilities directly from Studio; they also have an integration pack for Configuration Manager which for instance allows the use of WoL for Remote PC, and it can leverage the Configuration Manager integration to do application distribution and direct publishing for customers that use Configuration Manager heavily)
  • App Layering, which allows us to do application and user layers (based upon Unidesk)
  • WEM – Workspace Environment Management, to allow more in-depth policy control and system resource management
  • NetScaler Insight – to allow better insight into the HDX channel and see how the traffic is distributed between screen, printer, audio and video, for instance
  • Smart Tools – allows us to use for instance Smart Scale, which works flawlessly in cloud settings to stop/start XenApp hosts based upon a schedule http://msandbu.org/citrix-smartscale-and-microsoft-azure/
  • VM hosted applications (allows us to publish applications which in some scenarios can only be installed on a client computer)
  • Linux support (Citrix can also deliver virtual desktops or dedicated virtual desktops from Linux using the same infrastructure)
  • Full 3D support (Microsoft still has a lot of limitations here using RemoteFX vGPU, and it can also support DDA using Hyper-V and on Azure, but Citrix has multiple solutions, for instance vGPU from NVIDIA or GPU passthrough directly from XenServer, VMware or even AHV)
  • Full VPN and endpoint analysis using NetScaler Gateway (NetScaler Gateway with SmartAccess has a lot of different options to do endpoint analysis using OPSWAT before clients are allowed access to a Citrix environment)
  • Integration with Citrix NetScaler and Intune to deliver conditional access – many are adopting EMS with Intune for MDM, which now supports Citrix deployment and access via NetScaler and Azure AD integration
  • Skype for Business HDX optimization pack (allows offloading of Skype audio and video directly to an endpoint from the servers)
  • Universal Print Server (allows for easier management of print drivers)
  • System Center Operations Manager management packs (part of the Comtrade deal which allows Platinum customers to use management packs from Comtrade to get a full overview of the Citrix infrastructure; Citrix now also provides OMS modules to leverage OMS for monitoring Citrix environments as well)
  • More granular control using Citrix policies (which allow us to define more settings for Flash redirection, sound quality, bandwidth QoS and much more)
  • Browser content redirection
  • HTML5-based access (StoreFront supports HTML5-based access, which opens up for Chromebook access; Microsoft is still developing their HTML5 web front-end)
  • A hell of a lot better management and insight using Director!
  • Local App Access (allows us to "present" locally installed applications inside a remote session)
  • Better Group Policy filtering (based upon where resources are connecting from and using SmartAccess filters from NetScaler)
  • Performance optimization (using for instance PVS and write cache to RAM with overflow to disk, you are not restrained to the resources of the backend infrastructure, which allows for a better user experience)
  • Zone-based deployment, which allows users to be redirected to their closest datacenter based upon RTT
  • Mix of different OS versions; with Citrix we have a VDA agent that can be used on different OS versions and managed from the same infrastructure, while Microsoft has limited management for each OS version
  • SAML-based authentication to provide SSO directly to a Citrix environment

NOTE: Did I forget a crucial feature or something in particular? Please let me know!

One of the things I do feel Microsoft is doing right at the moment is Project Honolulu, developing a more HTML5 / REST-based UI to make server management easier, so I sure hope that Citrix is moving in that direction as well.

Summary

So why choose Citrix over Microsoft RDS? Well, to be honest, Citrix has a lot of features which make it more enterprise friendly.

  • Easier management and monitoring capabilities
  • Better image-management and broad hypervisor/cloud support + Performance Optimization
  • Better protocol which is multi-purpose (ThinWire, EDT, Adaptive Transport, etc)
  • Broader ecosystem support (Linux, HTML5, Chromebooks)
  • NetScaler (Optimized TCP, Smart Access, Load balancing)
  • GPU support for different workloads
  • Remote PC support
  • Collaboration support with Skype for Business
  • Zone based deployment
  • Layering capabilities (personalization and application)

But there is also no denying that RDS works in most cases, and it all comes down to the requirements of the business; the most important factor in any type of app delivery platform is that it provides the best possible end-user experience.

So to sum it up: you can have a Toyota Yaris which gets you from A to B just fine, or you can have a garage filled with different cars depending on requirements, with a bunch of different features which make the driving experience better, because that is what matters in the end… end-user experience!

Review – Goliath application availability monitor

One of the issues with an RDS/Citrix/Horizon environment is actually capturing what the experience feels like for an end user and being able to detect and see how the end user experiences the logon process. Most monitoring tools today focus on the performance of the terminal servers, looking at CPU/memory and available storage, or checking whether services are actually running, using service monitoring tools like System Center Operations Manager and so on. The issue with these is that they are infrastructure focused, which of course is an important aspect, but we also need to look at the end-user layer. This is something Goliath has worked closely on with the release of Application Availability Monitor, which allows us to monitor end-user applications and desktops using real-time logon tests performed as an end user from different locations. They also provide visibility into all applications and desktops being launched, with reports and drill-down analytics detailing whether logons succeeded, failed, or were slow.

They also provide screenshots of each process to make it easier for helpdesk to determine where the issue lies.

The architecture of the product is pretty simple: it consists of the Goliath Availability server, which maintains the state of the connectivity tests and stores the results in a SQL Server database that can either be installed locally as part of the Goliath server or on a remote setup. NOTE: If you download the trial from their website, the product will by default install with SQL Express embedded in the installation. We also have the availability agents, which actually perform the tests against the different environments, regardless of whether it is Microsoft RDS, Citrix XenDesktop or Horizon View.

image

Depending on what kind of environment you want to test against, there are some small differences in what we need to configure on the endpoint and in the environment we want it to test against. We define a schedule to check application availability from each of our environments, and Goliath will do a step-by-step interaction and take screenshots to determine where any type of error might occur. For instance, in the example below we can see that my resource Administrative Desktop is suddenly not available.

image

The test is based upon a schedule which I have defined and the agent it is run from. Here we can see an example where a desktop is not available but all the other components are available and working, hence what we see in the availability analysis. In a scenario where there are issues further into the session, you will get a screenshot which shows where the issue lies.

Citrix-Epic-Login-Monitoring

So, using Application Availability Monitor from Goliath allows us to get a clearer picture of how the environment is doing, not just by monitoring individual services and processes, but by combining this with a simulated end-user logon process to see where the process stops.