Author Archives: Marius Sandbu

Citrix and utilizing it with EMS and Azure AD Joined Devices

This post is based upon a session that I presented at Citrix User Group Ireland (you can view the SlideShare presentation here –> https://www.slideshare.net/mariussandbu/citrix-with-microsoft-ems). The session was about how we can leverage Citrix with EMS (Enterprise Mobility and Security), and it also showed the configuration of Citrix FAS together with Azure AD.

Now the focus of this post is purely on having Azure AD with Azure AD Joined devices (not hybrid), where authentication happens in Azure AD and not on-premises, but there are some other supported workloads and topologies further down.

I have previously written about setting up SSO between Azure AD and Citrix FAS, which is one of the core components for getting simple SSO to an on-premises environment (http://msandbu.org/setting-up-citrix-sso-with-windows-10-and-azure-ad-join/), and also about how to tune StoreFront to get SSO working properly, especially in cases where the end-users close the browser itself (http://msandbu.org/citrix-fas-with-azure-ad-and-error-404-not-found/).

This allows end-users to access Citrix as part of Azure AD using, for instance, the My Apps portal (or end-users can continue to use NetScaler Gateway as their application portal, but the Azure AD portal is easily accessed from Windows 10 Azure AD Joined devices).


If customers are moving towards Azure AD, it also means that computer objects and user objects are stored in Azure Active Directory, which in turn requires some other tools to handle security, and some other features as well, such as printing.

Moving clients out to Azure AD brings a lot of security benefits: we no longer have a large Kerberos domain with 10,000+ clients that can communicate directly with each other, with file servers and print servers, and with the Active Directory Domain Controllers, a setup which makes it easier for a single client to spread ransomware across the environment.

With EMS we also get other services, such as the following (which I will come back to in another blog post):

  • Azure AD Identity Protection (Allows us to monitor Azure AD users and take actions against suspicious activities)
  • Cloud App Security (Allows us to secure end-users and data across SaaS using a Cloud Access Security Broker)
  • Windows Defender ATP (Allows us to monitor the end-user device for suspicious activity and take actions against the device)
  • Azure ATP (Allows us to monitor against suspicious activity against Active Directory)
  • Intune (Allows us to deploy policies and compliance rules to end-user devices)

Of course, in the middle of all this is Conditional Access, which allows us to use data from both Azure AD and Windows Defender ATP to determine if an end-user should be allowed access to a certain application. We can also require that all traffic to a specific SaaS application goes through Cloud App Security acting as a forward web proxy. So how do these features work with Citrix? Using Azure AD and FAS, we can only connect to Citrix using Receiver for Web.
NB: If you are using Azure MFA and have enabled it for all users, this will effectively override Conditional Access rules.


So what other aspects of Citrix can we manage or configure using Microsoft EMS?

We can now manage deployment of VPN profiles through Microsoft Intune, which allows us to deploy, for instance, an Always-On VPN profile directly to Intune-managed devices. This was previously only available for iOS and Android, but is now supported on Windows 10 as well, as long as the NetScaler 12.0.57 endpoint client is installed to be able to read the configuration.


And since Citrix is supported running VPN in Microsoft Azure, it allows us to easily build a new, modern workspace client with VPN together with Citrix in Microsoft Azure. Using certificate authentication with the SCEP protocol in Intune as well, we can easily have a process where we deploy a completely new endpoint to end-users. By defining an auto-triggered VPN, we can also connect a VPN profile directly to a desktop application that we have running on the desktop (https://docs.microsoft.com/en-us/windows/security/identity-protection/vpn/vpn-profile-options).
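To make that more concrete, here is a rough sketch of what a ProfileXML payload for an auto-triggered, plugin-based VPN could look like when deployed through the Intune VPNv2 CSP. This is just an illustration: the server URL, plugin package family name and application path are placeholders, not real Citrix values.

# a hedged sketch of a VPNv2 ProfileXML payload; all values below are placeholders
$profileXml = @"
<VPNProfile>
  <AlwaysOn>true</AlwaysOn>
  <PluginProfile>
    <ServerUrlList>gateway.example.com</ServerUrlList>
    <PluginPackageFamilyName>ExampleVendor.VpnPlugin_0000000000000</PluginPackageFamilyName>
  </PluginProfile>
  <AppTrigger>
    <App>
      <Id>C:\Program Files\ExampleApp\App.exe</Id>
    </App>
  </AppTrigger>
</VPNProfile>
"@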

When it comes to application deployment via Intune, we have two options that work: we can deploy applications using the native built-in mechanism, which only supports MSI-based deployments and works great with the NetScaler Gateway plugin (this is, of course, an issue with Citrix Receiver since that is an exe file), or we can use PowerShell scripts. Luckily, Aaron Parker made a Citrix Receiver installer which can be deployed through PowerShell (https://github.com/aaronparker/Intune/tree/master/Apps).
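As a minimal sketch of that approach (the download URL here is a placeholder; use Aaron's repo above for a maintained version), such a wrapper script essentially boils down to downloading the exe and running it silently:

# download the Receiver installer to a temp location (placeholder URL)
$source = "https://example.com/CitrixReceiver.exe"
$installer = Join-Path $env:TEMP "CitrixReceiver.exe"
Invoke-WebRequest -Uri $source -OutFile $installer
# run the installer silently and wait for it to complete
Start-Process -FilePath $installer -ArgumentList "/silent /noreboot" -Wait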

Now there are also other supported workloads, which I have not described in detail:

  • NetScaler with Azure MFA using NPS with the MFA extension (you can read more about it here –> https://docs.microsoft.com/en-us/azure/active-directory/authentication/howto-mfaserver-nps-vpn). This is only useful if you want to replace your current MFA provider with a RADIUS provider against the NetScaler, and it should not be combined with Conditional Access.
  • NetScaler with Intune and Graph API NAC access. This requires an Enterprise license, but it allows the NetScaler to check for end-device compliance through Intune before devices are allowed to authenticate with a NetScaler Gateway (note that this works with the NetScaler VPN configuration); you can think of this as a replacement for OPSWAT or an endpoint scan (https://docs.citrix.com/en-us/netscaler-gateway/12/microsoft-intune-integration/configuring-network-access-control-device-check-for-netscaler-gateway-virtual-server-for-single-factor-authentication-deployment.html)
  • StoreFront with native Receiver and Azure AD SAML authentication. As mentioned earlier, the native Receiver doesn't work well with Azure AD authentication as long as it is on the outside, but Citrix Receiver works with SAML authentication when it is on the inside, and this can be configured with Azure AD and MFA using Conditional Access. This is useful if you want two-factor authentication on the inside for certain users, such as business executives.

Now, of course, these are some of the steps involved in setting up a simple SSO mechanism and building a VPN to reach those legacy applications. In the next posts I will focus a bit more on building a security policy which combines WDATP with Conditional Access.


A review of Citrix Analytics

This post is also based on a session I had at Citrix User Group, about Citrix Analytics. Even though Citrix Analytics is still not released, I did a lot of research about the product in advance. So in this post, I will go into a bit of depth about the product, about the features that are available now, and also about what I think is missing in the product as of now.

Citrix Analytics was announced at Citrix Synergy last year. At its core, it is about machine learning and analytics of data that is already available: gathering the data from different sources into a big data platform, and using historical data from these sources to build a baseline and predict what normal and abnormal behavior is. It is also about moving from being reactive to being proactive.

Most monitoring tools today are reactive, meaning they see that a process stops, a server goes down or a service stops running, and therefore we need to go and troubleshoot. With analytics, we try to shift that focus to being proactive: "here we have historical data showing that, based upon the last 12 months, this occurred on the same date because of user load on the server", and based upon this historical data we can take action. The same method can be looked at from a security perspective. For instance, say we have someone in HR, let's call him Dave, and every day he accesses the HR system; this has been his trend for the last 6 months, from the same physical device in the same location. Suddenly he accesses another application from another system in another location. This falls into abnormal behavior, based upon which we might have a risk, and then we need an automated action.


Citrix Analytics is going to be available in three modules, but right now only the Security module is available (it is now in preview, and you can request access here –> https://www.citrix.com/products/citrix-cloud/form/citrix-analytics/). Analytics can gather information from Citrix products only, which means XenDesktop/XenApp, ShareFile, NetScaler and Citrix XenMobile.

The data collected from these sources is then placed into Citrix Analytics (which is a cloud-only service), which consists of a data lake, event processing and machine learning, and will store information for 13 months to generate a baseline (or user trends) based upon the historical data. Of course, having data stored for this long a period allows the system to create more accurate models of user behavior.

NOTE: Even if Analytics is a cloud-only platform it can still get data from existing on-premises deployments

Now, Citrix Analytics can also take actions against these systems. If we, for instance, have a user who suddenly is marked as high risk (based upon risk indicators: a failed EPA scan or an unknown location, for instance), we can directly disconnect the end-user from XenDesktop or terminate the session. So all the data collected from the different sources can be turned into actions.

To get the data into Analytics, we need to have agents installed. For NetScaler we need the Management and Analytics (MAS) service, which sends AppFlow data to provide the session information; for XenDesktop we need an agent installed on the Delivery Controller, and we also need to define Citrix Director access, because Analytics taps into Director to get the historical data stored there.

Source: https://docs.citrix.com/content/dam/docs/en-us/citrix-analytics/downloads/citrix-analytics-getting-started-guide.pdf

Now, as mentioned, Analytics creates a risk score per user to determine whether they are seen as high risk or not, and if they are on a certain risk level based upon risk conditions, we can take actions. So, for instance, if we see an excessive level of external file sharing from a particular user, or any other type of activity which might be a risk indicator, we can take action on that rule, such as disabling the user's access or logging the account off NetScaler.

Source: https://www.youtube.com/watch?time_continue=1389&v=BJc_ePqHTa4

Now, as a product, Citrix Analytics shows some promise in that it can automatically detect abnormal behavior and react to it. When it comes to the limitations of the product, so far as I see it:

1: It doesn’t as of now have any integrations with a SIEM tool to forward alerts/actions directly. Or any form of API that can be called upon to get that type of information (at least to my knowledge, there might be some API underneath but it is not documented yet)
2: It is Citrix only – when it comes to sources and actions, it is for now limited to other Citrix products, which is something that they need to extend. Citrix also announced something called Citrix Access Control during Synergy (source: https://www.citrix.com/blogs/2018/05/08/secure-the-access-and-use-of-saas-web-apps-in-your-digital-workspace/), which provides SaaS access control. Now, this also extends into Office 365, so it might be that in the long run Analytics can also take action against Office 365. Hopefully, Analytics can also re-use information across tenants, so that if they see suspicious behavior from the same IP address across tenants, they can take action on it.
3: I see it overlapping a bit with Azure AD / Intune and Conditional Access – with Conditional Access we also have multiple conditions that we can use to determine access or take action on a particular user or device. Now, Conditional Access doesn't use any form of analytics, but we have risk levels which are based upon information from Azure ATP, Windows Defender ATP, Azure AD and device security, which determine if a user should get access or not. Also, Microsoft has its own Security Graph API, which holds a lot of historical and analytics data, and Microsoft also has Cloud App Security, which can act as a proxy in web sessions and deny/allow access to the application.

Now, what I would love to see here is an integration between Citrix and Microsoft, so we could have an integration point between Conditional Access and Citrix when it comes to sources, and then have actions against Azure AD, SaaS and Citrix environments. That would be really awesome!


Gotchas with Citrix Cloud

After spending a couple of days with the best Citrix User Group in the world! (cugtech.no) I wanted to publish this blog post, which is based on one of my sessions, about Citrix Cloud gotchas. I got some personal feedback after the session because I delivered my honest opinion about the product in general: the current limitations, what works, and what I feel Citrix needs to improve moving forward, which is what I want this post to focus on. The focus here is on the XenApp and XenDesktop offerings in Citrix Cloud; I have another post on Analytics coming a bit later. Now, some interesting facts about the backend architecture.

Backend:
Communication between the control plane and on-premises is done through Cloud Connectors. The Cloud Connectors are just Windows Servers installed with that specific component. Most of the backend services run on Microsoft Azure, using a combination of App Service, Service Bus, Storage Blobs and virtual infrastructure. The control plane is now available in the US, EMEA or Asia Pacific, and the NGaaS (NetScaler Gateway as a Service) service is available in 12 regions worldwide and uses a form of GSLB with proximity to route users to the closest region. Because of the Service Bus architecture, the Cloud Connector acts as a Service Bus subscriber and listens for jobs from the control plane; therefore, the Cloud Connector doesn't need any public IP, since traffic is never initiated from Citrix Cloud down to the Cloud Connectors. Also, with Citrix Cloud, the Cloud Connectors replace the DDC role and act as the control point for the VDAs, but the Cloud Connector is stateless, unlike the DDC.

  • Note: If you are like me, an early adopter of Citrix Cloud, you might be placed in the US plane, and as of now there are no migration offerings to move a tenant from one location to another. In most cases, you would need to rebuild your environment.

Citrix's goal is to maintain at least a 99.9% SLA, which allows for roughly 43 minutes of downtime each month (0.1% of a 30-day month).


Offerings: Citrix Cloud with XenApp and XenDesktop comes in many different flavors. I'm not going into detail on each of these offerings, because the differences between them are listed here –> https://www.citrix.com/content/dam/citrix/en_us/documents/reference-material/xa-xd-deployment-options-feat-comp-matrix.pdf. The biggest challenges I have with these offerings right now are three things:
1: No capability to mix different options, which means that we cannot have, for instance, 10 users on XenApp Essentials and 20 users on XenDesktop Essentials.
2: No ability to use concurrent licensing, only user/device.
3: No unified UI across the offerings; right now some are still using Citrix Studio, while Citrix is also building a new web UI.

Now, as part of Citrix Cloud there are two optional components, NGaaS and Citrix Workspace; both services can be enabled through the control plane.


NetScaler Gateway as a Service: This is a managed cloud service which replaces the regular NetScaler ICA proxy in front of a Citrix environment, with the traffic going through the Cloud Connector to the VDA. As mentioned, traffic will always flow through the Cloud Connector, via a Windows service which is responsible for that traffic. When an end-user connects through NGaaS, the user is routed to the closest of the 12 points of presence worldwide.


Pros:
  • Runs as a managed service
  • Doesn't require any dedicated public IP or certificate, since the service runs on top of the Cloud Connector
  • Highly available worldwide (on 12 different points of presence)

Cons:
  • Only ICA proxy service
  • No options for advanced features such as SmartAccess or HDX Insight (AppFlow)
  • Some additional latency
  • No support for EDT (UDP-based transport)

Citrix Workspace: This is the new name for the cloud-based StoreFront, which is available for all customers who joined Citrix Cloud after December 2017 (NB: not yet available for customers who subscribed to Citrix Cloud before then; they will be migrated soon). Like NGaaS, it is a fully managed service, which can aggregate all Citrix applications and has a feature in tech preview to provide SSO to third-party applications.

Pros:
  • Runs as a managed service
  • Doesn't require any dedicated public IP or certificate

Cons:
  • No options for advanced features such as Optimal Gateway Routing
  • No options for advanced UI changes (some features, such as logo changes, are now possible)
  • No support for regular on-premises MFA providers; MFA can only be done through Azure MFA

Availability:
Now, most Citrix Cloud services are US-based, but Citrix has announced that the control plane is also available in EMEA, which makes management a bit easier since the latency is considerably lower. However, you should be aware that not all services are available in EMEA yet; for instance, the App Layering feature still requires connecting to the US endpoint.


Security:
When it comes to security, all traffic is encrypted between the different components, and Active Directory credentials are not stored; they need to be entered each time we update a machine catalog or make changes to an existing one. Credentials to the hypervisor and/or cloud are stored in the hosting connection. Since Citrix is managing the infrastructure, we have no access to the underlying infrastructure, and we also don't have administrative logging capabilities in Citrix Cloud, so if we want logs on what has happened, we need to contact Citrix Cloud support (within 30 days to get that information). Note that Citrix Cloud login can also be set up using Azure AD credentials; if you are using this, make sure Azure AD is set up with Azure MFA (because if someone manages to gain access to your Azure AD account, they can actually delete an entire machine catalog).

Other components:
Other components also support Citrix Cloud. PVS can support Citrix Cloud, but this requires version 7.7 and the download of a specific Citrix Cloud PowerShell SDK, and you would still need to set up an on-premises license server and SQL to store the information (https://docs.citrix.com/en-us/provisioning/cloud-connector.html). App Layering is available in Citrix Cloud, but only the management plane; you will still need the on-premises appliance (ELM) to handle the actual layering jobs. WEM is not there yet, but it was recently announced that it will be available in Citrix Cloud soon: https://www.citrix.com/blogs/2018/04/30/workspace-environment-management-service-coming-soon-to-citrix-cloud/

Other things missing:
There are also some other features missing, such as App-V integration, and monitoring support is lacking. Since the DDC role is moving away, many of the monitoring products that people use don't support Citrix Cloud yet, and some of the management packs which were part of the Comtrade deal will no longer work, since they are dependent on some of the services that the DDC runs. Also, if we move NetScaler and StoreFront as well, they are no longer under our control, and we therefore need to handle monitoring in some other way, such as with load-testing tools. Another thing that caught my eye is the ability to run PowerShell commands natively against Citrix Cloud, which you can read more about here –> http://citrixtips.com/disabling-rearm-of-os-and-office-on-mcs-in-citrix-cloud/

Monitoring and troubleshooting:
When it comes to troubleshooting and monitoring Citrix Cloud, we only have a few options. First off is to check if there are any issues on Citrix Cloud using the Citrix Cloud status board –> https://status.cloud.com (this allows us to subscribe to alerts using SMS, phone or a webhook to forward to Microsoft Teams or Slack). The Cloud Connector itself doesn't have a dedicated event log, but writes its events into the Application log on the server it is installed on; if you are looking for errors, filter on the Citrix event sources on the Cloud Connector server.

Logs are also placed within C:\ProgramData\Citrix\WorkspaceCloud\Logs (useful if you are using a log-gathering tool such as Log Analytics), and we can view session information using the OData API against Director –> https://www.citrix.com/blogs/2018/03/23/monitor-data-for-xenapp-and-xendesktop-in-citrix-cloud-now-available-through-odata/
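As a quick way of pulling those events with PowerShell, something like the following works (a sketch; the "Citrix*" provider filter is an assumption, so adjust it to the event sources you actually see on your connector):

# list recent Application log events written by Citrix event sources
Get-WinEvent -LogName Application -MaxEvents 500 |
    Where-Object { $_.ProviderName -like "Citrix*" } |
    Select-Object TimeCreated, ProviderName, LevelDisplayName, Message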

Best practices for the Cloud Connector:
  • Don't install anything else on the Cloud Connector server (it is self-managed)
  • Set up proxy exceptions for the Cloud Connector traffic –> https://docs.citrix.com/en-us/citrix-cloud/overview/requirements/internet-connectivity-requirements.html and AV exceptions for the Cloud Connector –> https://www.citrix.com/blogs/2016/12/02/citrix-recommended-antivirus-exclusions/
  • Set up the Cloud Connector on Server Core –> https://xenappblog.com/2018/citrix-windows-server-core/ (allows for better throughput and higher security, but makes troubleshooting Citrix harder)
  • Set up the Cloud Connector on Windows Server with the CUBIC congestion algorithm –> netsh int tcp set supplemental template=internet congestionprovider=cubic
  • You need a set of Cloud Connectors for each AD domain
  • You need at least two Cloud Connectors for redundancy
  • You should have a stable internet connection

End-architecture:
Using most of the cloud-based components with Cloud Connectors on top of a hypervisor such as Nutanix, or a cloud-based deployment, you don't need much infrastructure. But as of now, if you want to leverage some of the advanced capabilities, such as HDX Insight, Optimal Gateway Routing, PVS and WEM, you are still going to need some servers to host the different components, such as licensing, SQL and management servers.


High-availability:
For high availability of the plain architecture, you just need multiple Cloud Connectors installed; they are stateless, unlike the regular DDC. However, the Cloud Connectors have Local Host Cache enabled by default, so all Cloud Connectors have SQL Server Express installed to handle that. If the internet connection drops out for more than 20 seconds, the LHC will kick in to ensure existing users are able to reconnect. Note that this doesn't work with VDI sessions, and it requires that we have a local StoreFront server.

Conclusion: Most of the management in Citrix Cloud is still done through Citrix Studio delivered via the web, which is still an MMC console, and for me personally that is not an elegant solution if we want to deliver the cloud message. Citrix needs to move to native web-based management combined with modern automation options, to make it easy for us to script and automate. Citrix also needs to remove the overhead of a Citrix deployment: looking at Microsoft RDMI, which is more of a PaaS service, Citrix should look at delivering their services as containers instead of individual servers with roles. This could also reduce the overhead on their own infrastructure with more container-based deployments, so we aren't stuck with the 25-user limit. Also, more role-based access control inside the platform itself, combined with administrative configuration control, is something that should be implemented to ensure that companies with a high level of security can adopt the solution. Finally, they should have an easier way to do migrations from on-premises to cloud; at the end of the day, a setup is just a bunch of configuration (luckily, someone in the community fixed that for us –> http://citrixtips.com/citrix-cloud-migration-tool/)

Next, I'll follow up on Citrix Analytics and its capabilities.

Azure Stack – in-depth architecture

This was a session that I was going to present at NIC 2018, but because of a conflict I was unable to attend. Therefore, I decided to write a blog post on the subject instead, since I see a lot of traffic to my earlier article on the subject (http://msandbu.org/what-is-azure-stack-and-want-is-the-architecture/), where I wrote a lot about the underlying architecture of Azure Stack and especially how storage and networking work together. So in this post, I want to go a bit more in-depth on some of those subjects, but also on the limitations of Azure Stack and things you need to be aware of. I also wrote a piece on BrianMadden.com about Azure Stack being an appliance and what that means for the type of hardware it uses: http://www.brianmadden.com/opinion/Why-do-Azure-Stack-appliances-have-to-be-certified. Microsoft has now also published a roadmap for Azure Stack –> https://azure.microsoft.com/en-us/roadmap/?category=compute. This is part one of the in-depth architecture!

The Core Architecture:
At the core of Azure Stack we have the software-defined architecture, where it uses Storage Spaces Direct (S2D) for the underlying storage and VXLAN for cross-host communication. Since S2D has RDMA requirements as part of the hardware design, the current limit is 12 physical nodes. It also runs Hyper-V on Server Core, and in an integrated system we also have an HLH (Hardware Lifecycle Host), which is used to run the OEM vendor-provided management tools for the hardware. There are also multiple virtual machines running on Azure Stack which make up parts of the ecosystem.

How does Storage Work:

The bare-metal servers run Windows Server 2016 with Hyper-V as the underlying virtualization platform. The same servers also run a feature called Storage Spaces Direct (S2D), which is Microsoft's software-defined storage feature. S2D allows the servers to share internal storage between themselves, to provide a highly available virtual storage solution as the base storage for the virtualization layer.

S2D is then used to create virtual volumes with a defined resiliency type (parity, two-way mirror or three-way mirror), which host the CSV shares, and a Windows cluster role is used to maintain quorum among the nodes.

S2D can use a combination of regular HDDs and SSDs (it can also be all-flash, which Cisco announced earlier today) to enable capacity and caching tiers, which are automatically balanced so hot data is placed on the fast tier and cold data on the capacity tier. When a virtual machine is created and its storage is placed on the CSV share, the virtual hard drive of the VM is chopped into interleaves of blocks, which by default are 256 KB, and then scattered across the different disks across the servers depending on the resiliency level. In Azure Stack, the default is a three-way mirror, which is used to provide redundancy in the Stack. We also have a Service Fabric cluster running, which provides the tenant and admin APIs through Azure Resource Manager, and an underlying storage controller called ACS. On each VM that is configured with a standard HDD, the ACS storage controller inserts an IOPS limit of 500 IOPS on the hypervisor, to provide consistency with Azure.
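Azure Stack creates and manages these volumes itself, but for reference, this is roughly what creating a three-way mirrored CSV volume looks like on a plain S2D cluster (a sketch; the pool and volume names are just examples):

# create a three-way mirrored (PhysicalDiskRedundancy 2), cluster-shared ReFS volume
New-Volume -StoragePoolFriendlyName "S2D on Cluster01" `
    -FriendlyName "Volume01" `
    -FileSystem CSVFS_ReFS `
    -ResiliencySettingName Mirror `
    -PhysicalDiskRedundancy 2 `
    -Size 2TB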

The Network Fabric:

The network consists of multiple modules, such as the software load balancer (MUX), which is a feature running on the Hyper-V switch as a host agent service, managed centrally by the network controller, which acts as central management for the network. The load balancer works on layer 4 and is used to map a public IP and port to a backend pool on a specific port. The software load balancer uses DSR (direct server return), which means that it only load balances incoming traffic; the return traffic from the backend servers goes directly from the server back to the requesting IP address via the Hyper-V switch. This feature is presented in Azure Stack as the regular load balancer.

The software load balancing rules need to be kept in place, the distributed firewall policies need to be synced and maintained, and with VXLAN in place all the hosts need an IP table so each node knows how to communicate with the different virtual machines on the other hosts. Therefore, there needs to be a centralized component which takes care of all of that, and that is the network controller.

On Azure Stack, the network controller runs as a highly available set of three virtual machines which operate as a single cluster across different nodes. The network controller has two API interfaces. The first is the northbound API, which accepts requests using REST: for instance, if we change a firewall rule or create a software load balancer in the Azure Stack UI, the northbound API will get that request. The network controller can also be integrated with System Center, but that is not part of Azure Stack.
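To illustrate, a query against the northbound API looks something like this (a sketch; the endpoint name is made up, and the resource path follows the documented /networking/v1/ convention used by the Network Controller REST API):

# list the load balancer resources known to the network controller
$ncUri = "https://nc.contoso.local"
Invoke-RestMethod -Uri "$ncUri/networking/v1/loadBalancers" -UseDefaultCredentials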

The southbound API then propagates the changes to the different virtual switches on the different hosts. The network controller is intended to be a centralized management component for both the physical and virtual network, since it uses the Open vSwitch standard, but the schema it uses still lacks some key features needed to manage the physical network.

The network controller is also responsible for managing the VPN connections, advertising the BGP routes, and maintaining session states across the hosts.


From a network perspective, once you have a site-to-site gateway established, you essentially have two virtual machines powering the site-to-site VPN solution for all tenants. Therefore, you will not have a dedicated public IP for each gateway.

Troubleshooting and management:

When troubleshooting issues, make sure to check if there is anything documented for your version build; yes, there are a lot of documented issues and bugs (https://docs.microsoft.com/en-us/azure/azure-stack/azure-stack-update-1712 for instance). If you run into issues, such as alerts in the admin portal, you will need to get logging information from the PEP (Privileged End Point) to get assistance from Microsoft (note: on an integrated system there are always 3 instances of the PEP running), and Microsoft recommends that you connect to the PEP from a secure VM running on the HLH.
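Connecting to the PEP itself is done over PowerShell remoting against one of the ERCS virtual machines, roughly like this (the IP address here is just an example; use your environment's ERCS addresses and the CloudAdmin credentials):

# open the constrained (JEA) remote session against the privileged endpoint
$cred = Get-Credential -Message "CloudAdmin credentials"
Enter-PSSession -ComputerName "10.0.0.10" -ConfigurationName PrivilegedEndpoint -Credential $cred

From within that session, here is an example you can run to collect logs on an integrated system: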

Get-AzureStackLog -OutputPath C:\AzureStackLogs -FilterByRole VirtualMachines,BareMetal -FromDate (Get-Date).AddHours(-8) -ToDate (Get-Date).AddHours(-2)

You will need to define which roles you want to get logs for.
You can read more about the logging feature in this GitHub readme: https://github.com/Azure/AzureStack-Tools/blob/master/Support/ERCS_Logs/ReadMe.md
But if an update fails, you are pretty much in the dark; you will need to extract these logs for the different levels and roles and send them across to Microsoft to get it troubleshooted. We have already needed their assistance a few times in order to troubleshoot a failed upgrade.

Security:
Of course, Microsoft has focused a lot on security in Azure Stack, which is one of its core advantages. Below are some of the settings which are configured on the Stack.
* Data-at-rest encryption – All storage is encrypted on disk using BitLocker, unlike in Azure where you need to enable this on a tenant level. Azure Stack still provides the same level of data redundancy using three-way copies of data.
* Strong authentication between infrastructure components
* Security OS baselines – Using Security Compliance Manager to apply predefined security templates to the underlying operating system
* Disabled use of legacy protocols – Old protocols such as SMB 1 are disabled in the underlying operating system, and legacy authentication protocols such as NTLMv1, MS-CHAPv2, Digest and CredSSP cannot be used
* Constrained administration – for instance, the PEP endpoint uses PowerShell JEA (Just Enough Administration)
* Least-privileged accounts – The platform itself has a set of service accounts for the different services, which run with least privilege
* Administration of the platform can only happen via the admin portal or admin API
* Locked-down infrastructure, which means that we have no direct access to the hypervisor level
* Windows Credential Guard – Credential Guard uses virtualization-based security to isolate secrets so that only privileged system software can access them
* Server Core, to reduce the attack surface and restrict the use of certain features
* Windows Defender on each host
* Network ACLs defined in the ToR switch, SDN, and host and guest, which are deployed using Ansible
* Group Managed Service Accounts
* Secrets rotated every 24 hours

Limitations:
Of course, as with any new platform, there are a lot of limitations you need to be aware of (especially if you have read up on the consistency between Azure and Azure Stack): if you have a system which has support for Azure using some form of PaaS service, it does not necessarily mean that it supports Azure Stack. The application vendor will need to ensure that their product is also compatible with the Azure Stack feature level. Here is a list of other limitations that I have encountered.

  • Limited set of instance types (A, D and Dv2 series)
  • Single fault and update domain (UPDATE: changed in the 1802 build)
  • Only LRS storage
  • No support for IPv6
  • No support for Managed Disks
  • Limited support for Premium disks (cannot guarantee performance)
  • No support for Application Gateway or Traffic Manager
  • No support for VNET peering
  • No support for Azure SQL (only SQL Server, which is served through a SQL connector)
  • Only support for the Basic VPN SKU (and only one HA pair of nodes, which provides VPN for all tenants)
  • No network QoS on NICs (can allow for noisy neighbors)
  • Only some marketplace items (for instance, Windows 10 is missing, along with other parts of the marketplace)
  • No customer-specific gateway (same IP for all gateway connections)
  • A lot of Azure services, such as Data Factory, cannot use Azure Stack storage (hardcoded URLs in the different services)
  • No support for the SQL Server functionality that integrates with Azure (Stretch Database or SQL backup) against Azure Stack
  • No support for Citrix on Azure Stack (meaning no Citrix NetScaler and no provisioning options available)
  • No support for Azure Files
  • Max blob size 195 GB (UPDATE: changed in the 1802 build)
  • Max disk size 1 TB
  • No support for point-to-site VPN
  • No support for Docker Machine drivers
  • Troubleshooting is mainly dumping logs to the Microsoft support team
  • Some UI bugs, such as when defining DNS settings on a virtual network

Since the release, there has been one update each month, which shows dedication to the platform and the ecosystem, but Microsoft has to make it easier to run edge processing and get more Azure features to support Azure Stack integration. Also, one thing I want to highlight is that Azure Stack has one thing it really excels at, which is networking 😉 but no wonder, with the networking backend it provides.


Also, earlier today Cisco came out with an all-flash version of Azure Stack, so now Microsoft really needs to fix the scalability issues.

Azure Standard load balancer

A while back, Microsoft announced a new load balancing tier in Microsoft Azure called Azure Load Balancer Standard, which is a new SKU of the existing load balancing service in Azure and is still in preview as I'm writing this blog post.
Azure provides different load balancing solutions: Application Gateway (provides layer 7 and SSL-based load balancing), Traffic Manager (provides geo-redundancy using DNS-based load balancing), and the Load Balancer service, which is aimed at layer 4 load balancing.

Now, there are many differences between the Standard SKU and the old Basic SKU:

  • Backend pool endpoints – Standard: any virtual machine in a single virtual network, including a blend of virtual machines, availability sets and virtual machine scale sets. Basic: virtual machines in a single availability set or virtual machine scale set.
  • Availability Zones – Standard: zone-redundant and zonal frontends for inbound and outbound traffic, outbound flow mappings survive zone failure, cross-zone load balancing. Basic: not available.
  • Diagnostics – Standard: Azure Monitor, multi-dimensional metrics including byte and packet counters, health probe status, connection attempts (TCP SYN), outbound connection health (SNAT successful and failed flows), active data plane measurements. Basic: Azure Log Analytics for public Load Balancer only, SNAT exhaustion alert, backend pool health count.
  • HA Ports – Standard: available for internal Load Balancer. Basic: not available.
  • Secure by default – Standard: closed by default for public IP and Load Balancer endpoints; a network security group must be used to explicitly whitelist traffic. Basic: open by default, network security group optional.
  • Outbound connections – Standard: multiple frontends with per-rule opt-out. An outbound scenario must be explicitly created for the virtual machine to be able to use outbound connectivity. VNet Service Endpoints can be reached without outbound connectivity and do not count towards data processed; any public IP addresses, including Azure PaaS services not available as VNet Service Endpoints, must be reached via outbound connectivity and count towards data processed. When only an internal Load Balancer is serving a virtual machine, outbound connections via default SNAT are not available. Outbound SNAT programming is transport-protocol specific, based on the protocol of the inbound load balancing rule. Basic: single frontend, selected at random when multiple frontends are present; when only an internal Load Balancer is serving a virtual machine, default SNAT is used.
  • Multiple frontends – Standard: inbound and outbound. Basic: inbound only.
  • Management operations – Standard: most operations < 30 seconds. Basic: 60-90+ seconds typical.
  • SLA – Standard: 99.99% for data path with two healthy virtual machines. Basic: implicit in the VM SLA.
  • Pricing – Standard: charged based on number of rules and data processed (inbound or outbound) associated with the resource. Basic: no charge.

Now, like AWS, Microsoft charges based upon load balancing rules and data processed (only for the Standard SKU; the Basic one is still free).

Load Balancing rules:
First 5 rules: $0.025/hour
Additional rules: $0.01/rule/hour

Data processed through the load balancer:
$0.005 per GB
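As a quick worked example (assuming a 730-hour month): a Standard load balancer with 8 rules and 100 GB of processed data would cost 730 x $0.025 = $18.25 for the first 5 rules, 3 x $0.01 x 730 = $21.90 for the 3 additional rules, and 100 x $0.005 = $0.50 for the data, so roughly $40.65 in total.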

The biggest changes in this new tier are that 1: it supports Availability Zones (which went GA today), 2: it has much better diagnostics options, and 3: it provides something called HA ports, which I'll come back to a little later in this post. To configure an Azure Load Balancer Standard, you currently need to use the CLI or PowerShell; the example below uses the Azure CLI.

Create a resource group:
az group create --name changename --location changelocation

Create a public IP with the Standard SKU:
az network public-ip create --resource-group myResourceGroupSLB --name myPublicIP --sku Standard

Create the Standard load balancer:
az network lb create --resource-group changename --name changename --public-ip-address myPublicIP --frontend-ip-name myFrontEnd --backend-pool-name myBackEndPool --sku Standard

Now, looking at some of the new functionality such as HA ports: this feature helps you with high availability and scale for network virtual appliances (NVAs) inside virtual networks. It can also help when a large number of ports must be load balanced, since we can load balance entire port ranges by setting the port value to 0 and the protocol to All; the internal Load Balancer resource then balances all TCP and UDP flows, regardless of port number. NOTE: if you want to use HA ports to provide HA for an NVA in Azure, please make sure that the vendor has verified the appliance to work with HA ports.
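An HA ports rule is just a regular load balancing rule with port 0 and protocol All. While the examples above use the Azure CLI, a rough equivalent with the AzureRM PowerShell module could look like this (a sketch; the resource names are examples and assume an existing internal Standard load balancer):

# add an HA ports rule (all ports, all protocols) to an existing internal load balancer
$lb = Get-AzureRmLoadBalancer -Name "myInternalLB" -ResourceGroupName "myResourceGroup"
$lb | Add-AzureRmLoadBalancerRuleConfig -Name "myHAPortsRule" `
    -FrontendIpConfiguration $lb.FrontendIpConfigurations[0] `
    -BackendAddressPool $lb.BackendAddressPools[0] `
    -Protocol All -FrontendPort 0 -BackendPort 0
$lb | Set-AzureRmLoadBalancer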

One of the most welcome features is that we can now monitor VIP and DIP availability directly from the console, which was a really cumbersome process in the Basic load balancer tier.


You can now also see the availability of a DIP based upon a single IP address.


So, using Azure Load Balancer Standard allows us to more easily load balance NVAs, for instance where we have a hub-and-spoke layout on the virtual network, and the ability to monitor the availability of DIPs, making it easier to troubleshoot the health probes, is something that I've been missing for some time!


Is this the year of Unified Application Delivery?

It is always amusing when I see on social media: "Is this going to be the year of VDI?", a question which has been going around for a lot of years already. The issue with VDI projects back in the day (starting to write like an old guy...) was that the architecture and storage didn't quite scale, since VDI projects were heavy when it came to resource usage, and therefore launching a successful VDI project was difficult; of course, there weren't a lot of good products in the market either. I don't know how many blog posts and reference architectures I looked at about "scaling and designing VDI", "how to scale VDI for IOPS", etc.

The first uprising in VDI projects started when the new players in the software-defined market came along; their hyper-converged infrastructure made VDI projects a lot easier in terms of performance, and brought more players into the VDI space as well. I've seen large reference VDI projects from most of the hyper-converged / software-defined players in the market, and I'm also seeing more and more VDI deployments leveraging the cloud, since the economics can make it suitable for many use-cases.

So is 2018 going to be the year of VDI? Not gonna happen! My personal opinion is that the year of VDI is not going to happen this year either. I believe that the VDI ship has sailed, and moving forward more and more SaaS-based services are going to replace Windows-based applications at a much faster rate. There is, of course, going to be a need for VDI solutions to deliver applications for a long time to come, but we need to look beyond VDI and focus more on application delivery (and not just Windows-based).

However, the key moving forward is to be able to deliver all these applications in a single, unified manner: combining the Windows-based applications, the Linux-based applications, and those sloppy web-based apps which have their own authentication mechanism or use basic web authentication, while also delivering the modern web applications which support open authentication mechanisms such as SAML and OAuth. The key is also to have a single security plane to control access and maintain security for these applications, and a single source of truth for identity.


Handling security in a cloud-based scenario, such as with Google G Suite or Office 365, also requires more investment in CASB products, such as Microsoft's Cloud App Security, to allow integration directly with the cloud providers. So, is it actually possible to build this type of unified application delivery platform? We are pretty close, so what kind of products do we need?

Identity:
There are multiple identity solutions that can be used. Most tend to look at the cloud-based identity sources such as Azure AD and Google Identity, since they are proven at scale and have advanced functionality, both for setting up federation/trust and for their rich ecosystems, which allow building new applications that support them as identity sources. Both have a lot of built-in mechanisms to handle authentication, such as SAML and OAuth; Azure AD, however, has a bit more security features built in compared to Google at this time. There are also other solutions which provide these built-in authentication capabilities, such as Ping Identity, Okta and One Identity, and for on-premises deployments we also have VMware Identity Manager. It depends on where you want the identity source to be located; of course, vendors such as Ping, Okta and One Identity are companies which focus only on the identity field and have proven products for that purpose.

Application Delivery Platform:
Here we also have a couple of solutions which can deliver both Windows and Linux applications from a single platform, such as Citrix XenDesktop and VMware Horizon. Both of these platforms support LDAP/AD, but also other authentication mechanisms such as SAML, to support user-based authentication from the end-user to the backend. This allows us to, for instance, authenticate against XenDesktop or Horizon using any of the identity sources listed above.

Gateway:
This is, of course, to handle traffic and proxy connections with authorization rules against internal resources, such as on-premises web applications which only support basic web authentication, or to handle traffic to a backend VDI or RDS host. Both Citrix NetScaler and VMware Identity Manager can handle authentication mapping from SAML to, for instance, basic web authentication. They differ in function: NetScaler is the more advanced beast, since it is a network appliance focused on ADC with advanced functionality to handle authentication and authorization, while VMware Identity Manager is more aimed at user lifecycle management and application access, with traffic flowing through VMware's Unified Access Gateway. The online identity providers can also handle traffic against basic web applications, and Microsoft has the Azure AD Application Proxy, which allows authentication and traffic flow against on-premises web applications using Kerberos, for instance.

Security:
Identity is the new firewall, which makes even more sense in this type of environment where we cannot control the end-user's traffic, and we therefore need to make sure there are security mechanisms in place to ensure data confidentiality. Of the large vendors, only Microsoft has a solution which falls within the CASB (Cloud Access Security Broker) domain to handle connections and activities against a SaaS product; the product, Cloud App Security, is now tightly integrated with Azure AD as well. VMware and Citrix have some policy controls which determine what kind of activity an end-user can do within a terminal server environment, and conditional access on the device connecting, but they do not have any functionality to control what a user can do within a SaaS service. Of course, controlling the user and what the user does is only a small part of the puzzle; we also need to be able to control, to an extent, the state of the endpoint the end-user is using. Almost all the vendors (Citrix, VMware, Microsoft and Okta) have MDM solutions which allow us to make more advanced authentication rules to determine if an end-user should have access to a business-critical application from certain endpoints, and Microsoft and VMware both have conditional access rules with which we can ensure that a device is compliant before it gains access to an application or system.

Unified Application Portal:
There are multiple portals which can expose all these different applications from multiple sources; however, we will take a closer look at those from Citrix, VMware and Microsoft.

Azure AD My Apps:
My Apps is directly integrated into Azure Active Directory and can expose applications which have been added to Azure AD. This also covers Office 365, which has its own app launcher that reflects and shows the same applications. It can also add other third-party applications using SAML (Citrix and VMware can be added here, but just as a hard link), but VDI desktops cannot be shown directly here. Azure also has support for internal web applications using the Azure AD Application Proxy, which can publish internal web applications and supports SSO using Kerberos and NTLM. The good thing is that this also integrates directly with Office 365, so applications can be shown directly in the end-user's app launcher. And of course, we can protect access using Azure MFA and Azure Conditional Access, which can now be integrated with Cloud App Security.


VMware Workspace One:
Workspace One is a combination of VMware Identity Manager, the AirWatch Enterprise Mobility Management suite, and Horizon, running as a local server setup. The Workspace portal allows us to present VDI/RDSH desktops and applications combined with web-based applications using SAML, where we can also protect resources using conditional access rules. VMware also has its own MFA solution that can be used to provide additional security on top. The advantage here is that we can present both Windows/Linux applications and web applications within a single portal, with multiple security policies on top.

Citrix Workspace / Unified Gateway:
Citrix Workspace Services is the future workspace portal from Citrix, which will be able to serve both applications and data from the same UI, more similar to the Office 365 app launcher with SharePoint data in it. A similar setup is possible with Unified Gateway today, where we can present Windows/Linux applications and desktops, SAML-based applications using NetScaler as the SAML SP, RDP sessions using RDP proxy, and so on. Citrix also has the advantage of being able to deliver VPN solutions as well, so it provides a strong range of different solutions which can be presented from within the same portal.

So what does the workspace of tomorrow look like? I'm guessing this is the product that many vendors are working towards, trying to find the secret ingredients.
With more and more business applications becoming web-based, there is no denying that the workspace will need tight integrations with modern SaaS products such as Google G Suite, Salesforce, ServiceNow, Workday, Office 365 and the like, but it must also be able to integrate with on-premises applications on legacy systems, such as Windows/Linux-based applications and desktops, as well as older internal web applications. The workspace will also need certain security policies in place, such as conditional access, to give a more granular approach to security when it comes to giving access to applications and/or SaaS services. We also need certain security products in the backend to take control of the data and API access to the SaaS services, to ensure compliance and so on.


Windows 10 and Server 2016 network enhancements

There have been a lot of enhancements to the networking stack in Windows 10 and Server 2016, which I wanted to write a bit more about. Earlier, I wrote a bit about TCP Fast Open, which was made available in Windows 10 and Microsoft Edge to reduce the initial TCP SYN process: http://msandbu.org/increasing-microsoft-edge-performance-using-tcp-fast-open-on-netscaler/. But looking at the rapid release cycle of Windows, more new things have been introduced over the last couple of years. Much of the functionality is defined in NDIS (https://docs.microsoft.com/en-us/windows-hardware/drivers/network/overview-of-ndis-versions), which is the Windows specification for how drivers should be created for network communication. Some of the new features that have been introduced are:

  • CUBIC support: In the Windows 10 Creators Update they also added support for the congestion algorithm CUBIC, which is actually the default congestion algorithm in Linux. The main goal behind CUBIC is to improve the scalability of TCP over fast and long-distance networks, and also to keep the congestion window much longer at the saturation point.
    The following commands can be used to enable CUBIC globally and to return to the default Compound TCP (requires elevation):

    • netsh int tcp set supplemental template=internet congestionprovider=cubic
    • netsh int tcp set supplemental template=internet congestionprovider=compound
  • Fast Connection Teardown: TCP connections in Windows are by default preserved for about 20 seconds to allow for fast reconnection in the case of a temporary loss of wired or wireless connectivity. However, in cases such as docking and undocking, that is a long delay; the Fast Connection Teardown feature can signal the Windows transport layer to instantly tear down TCP connections for a fast transition.
  • ISATAP and 6to4 disabled by default: With the uptake of native IPv6, these transition protocols are now disabled by default, but they can be enabled using Group Policy. Teredo is the last transition technology expected to remain in active use, because of its ability to perform NAT traversal to enable peer-to-peer communication.
  • Windows TCP AutoTuningLevel: Before the Creators Update, the TCP receive window autotuning algorithm depended on correct estimates of the connection's bandwidth and RTT. The new algorithm adapts to the BDP (bandwidth-delay product) much more quickly and converges faster on the maximum receive window value for a given connection.
  • Recent ACKnowledgement (RACK): RACK uses the notion of time, instead of packet or sequence counts, to detect losses, for modern TCP implementations that can support per-packet timestamps and the selective acknowledgment (SACK) option. RACK is enabled only for connections that have an RTT of at least 10 msec, in both Windows client and Server 2016, to avoid spurious retransmissions on low-latency connections, and it is only enabled for connections that successfully negotiate SACK.
  • Windows Low Extra Delay BAckground Transport (LEDBAT): LEDBAT is a way to transfer data in the background quickly, without clogging the network. Windows LEDBAT transfers data in the background and does not interfere with other TCP connections; it does this by only consuming unused bandwidth. When LEDBAT detects increased latency that indicates other TCP connections are consuming bandwidth, it reduces its own consumption to prevent interference. When the latency decreases again, LEDBAT ramps up and consumes the unused bandwidth. LEDBAT is only exposed through an undocumented socket option and can only be used by approved partners.
  • RSSv2: Compared to RSSv1, RSSv2 shortens the time between the measurement of CPU load and updating the indirection table, which avoids slowdown during high-traffic situations. This is part of the Windows 10, version 1709 kernel.
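If you want to check which congestion provider and autotuning level your own machine uses per TCP template, the built-in NetTCPIP module exposes it (the exact properties available vary a bit between Windows builds):

# show the congestion provider and receive-window autotuning per TCP settings template
Get-NetTCPSetting | Select-Object SettingName, CongestionProvider, AutoTuningLevelLocal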

This YouTube video from Ignite last year goes into detail on the different improvements that have been introduced in Windows over the course of the last year –> https://www.youtube.com/watch?v=BlBWUGcYCQQ

And of course, having a strong networking stack is important for handling modern web applications and connections from different endpoints over different network connectivity. In the next blog post I will focus a bit more on the container networking aspects that have been introduced in Windows.


My thoughts on Citrix buying Cedexis and what it is?

Earlier today, Citrix announced publicly that they have bought the company Cedexis (if you didn't catch the news, you can read the official blog post here –> https://www.citrix.com/blogs/2018/02/12/citrix-acquires-cedexis/).

Being the tech-curious mind that I am, I started to read through the official blog post, but it didn't give me any clarity on what kind of value it would actually bring to Citrix. Also, I hadn't heard about the company before (other than some mentions on social media from time to time), so I decided to do some research and take a closer look at how Citrix can benefit from it.

Looking into the company, I noticed that they have a set of products which make up the core, called Cedexis ADP (Application Delivery Platform), which is aimed at more intelligent load balancing using a combination of real-user monitoring and synthetic monitoring to make the correct decision on where to route the data.


The platform is split into smaller parts, where the core is three applications, plus Fusion for third-party data.

Radar: a product which gathers real-user telemetry from thousands of users worldwide (you can see some of the interesting statistics here: https://live.cedexis.com/), giving detailed mappings of outages, response times and the like. It works by embedding a simple JavaScript tag within a content or application provider’s pages to collect information about the performance and availability of a data center or delivery platform. (You can also access some nifty reports here –> https://www.cedexis.com/get-the-data/country-report/)

Sonar: a liveness check service that can be used to monitor web-based services for availability. Sonar works by making HTTP or HTTPS requests against a URL from multiple points-of-presence around the world.
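
As a minimal sketch (my own illustration, not Cedexis code) of what such a liveness probe boils down to — the URL and timeout below are just examples:

    # A single Sonar-style HTTP liveness probe: availability plus response time.
    import requests

    def probe(url, timeout=5.0):
        """Check a web service and report availability and response time."""
        try:
            response = requests.get(url, timeout=timeout)
            return {
                "url": url,
                "available": response.status_code < 400,
                "status": response.status_code,
                "rtt_ms": response.elapsed.total_seconds() * 1000,
            }
        except requests.RequestException as exc:
            return {"url": url, "available": False, "error": str(exc)}

    # The real service runs these checks from points-of-presence worldwide;
    # this only probes from wherever the script happens to run.
    print(probe("https://example.com/health"))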

Openmix: a SaaS global load balancing service which combines real-time feeds of end-user telemetry from Radar with server and application monitoring data from Sonar to do intelligent global load balancing. Using all these different tools we can also mix in other data, such as cost or performance, and define our own rating on a service if we, for instance, have it available on multiple locations/platforms. The cool thing about Openmix being a cloud service is that it is available via both DNS and HTTP, example here –> https://github.com/cedexis/openmixapplib/wiki/Openmix-HTTP-API#overview
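
To illustrate the kind of decision Openmix makes, here is a conceptual sketch (my own illustration, not Cedexis code or its API): filter out anything Sonar reports as down, then rank the remaining platforms by Radar-style latency scaled with a custom cost weight of the kind Fusion could supply. All names and numbers are made up:

    # Conceptual sketch of an Openmix-style routing decision.
    platforms = {
        "azure-westeurope": {"rtt_ms": 38, "available": True,  "cost_weight": 1.2},
        "aws-eu-west-1":    {"rtt_ms": 45, "available": True,  "cost_weight": 1.0},
        "onprem-oslo":      {"rtt_ms": 25, "available": False, "cost_weight": 0.8},
    }

    def pick_platform(platforms):
        # Drop platforms that fail the liveness check, rank the rest.
        live = {name: m for name, m in platforms.items() if m["available"]}
        return min(live, key=lambda name: live[name]["rtt_ms"] * live[name]["cost_weight"])

    print(pick_platform(platforms))  # -> aws-eu-west-1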

Fusion: in addition to Radar and Sonar data, Openmix can use third-party data as part of its decision criteria. Fusion can integrate an existing synthetic monitoring service you already use, or make cost-based decisions using usage data from a CDN provider. Below is a picture of the supported integrations that Fusion has, which can be used to determine the best path.

[Image: Fusion’s supported third-party integrations]

There are also some newer integrations, such as Datadog, which allow us to, for instance, route application traffic more efficiently based upon Datadog alerts.

So, looking at the products, we can see that Cedexis has multiple tools to determine the optimal path: real-time user information and synthetic testing, combined with third-party integrations using custom metrics, plus a global SaaS load balancing service. For instance, if we have a service which is available in multiple locations across multiple cloud providers, how can we ensure that an end-user is directed along the optimal route? We have multiple sources of logic: Radar (how is the network, or the CDN the content is served from, performing?), Sonar (what is the RTT of the application from the ongoing tests?) and Fusion (a New Relic APM integration, for instance, showing that service Y is performing slowly because of DB errors), and from that information we can deduce the correct path. However, Cedexis is missing a product to handle the actual load balancing between the end-users and the backend services, and is dependent on someone else to do the local load balancing and handle SSL traffic. NetScaler, on the other hand, is missing the products to do more intelligent load balancing based upon real user telemetry, instead of just doing health checks against the backend web servers or doing GSLB based upon user proximity and the like.

I can see the value of integrating the Cedexis platform into the NetScaler portfolio, seeing that it can make for a much more powerful, smart application delivery system. So, this is just my personal idea of how the portfolio could look as an integrated solution. We could have NetScaler MAS feeding Fusion with web analytics, for instance, along with performance usage from the NetScalers, which would make it easier for Openmix to decide whether the end-users should be load balanced to region X or Y based upon the weight defined on the application or service.

[Image: sketch of a possible NetScaler and Cedexis integration]

So, just some initial thoughts on the Cedexis platform. I’m looking forward to trying the platform out in a real scenario and seeing what plans Citrix has for it moving forward.

Cloud Wars – IBM vs Microsoft vs Google vs Amazon IaaS

In my previous blog post I did a short overview of the different cloud vendors, a bit about their focus areas and a bit about strengths and weaknesses. The blogpost can be found here –> http://bit.ly/2CrBgZA. In this post I want to focus more on IaaS and the offerings surrounding it. First I want to describe a bit about each vendor, then I’ll go into a bit more comparison, include the price/performance factor as well, and end with some focus on automation functionality and additional services.

IBM:
As mentioned in my previous blogpost, IBM with its SoftLayer capabilities has had an extreme focus on bare metal, with the addition of traditional IaaS, and with the extended partnership with VMware they can also provide the vCloud Foundation package (which is also a prerequisite for VMware HCX) or a plain ESXi with vCenter deployment. On the bare-metal side we can choose between hourly or monthly pre-configured servers, or customize with single to quad processor solutions that range from 4 to 72 cores. We can also order bare-metal servers with dedicated GPU offerings (such as K2, K80, M60 and P100). One of the cool features in terms of pure IaaS is that they offer pure block storage as an option, using iSCSI, or plain file storage using NFS. In terms of scalability they can only offer up to 56 cores and 242 GB RAM for a single virtual machine, which is a lot smaller than most offerings in Azure, Google and AWS. IBM, like AWS and Azure, also offers predefined instance sizes, and when setting up an instance you can define what kind of network connectivity you want: by default you get a 100 Mbps uplink and private connectivity for free, but if you want to up it to 1 Gbps you need to pay extra. The main issue is that concepts such as availability zones and other options for HA are not available in IBM’s cloud, compared to GCP, AWS and Azure.

In terms of automation, IBM has a feature called Cloud Schematics, which is natively based upon Terraform: it is basically wrapping REST API calls using an IBM provider in Terraform (https://ibm-cloud.github.io/tf-ibm-docs/). We also have the ability to use provisioning scripts which run at boot as part of a deployment. One of the things I feel is missing in IBM when it comes to automation is more overall system management capability, such as Azure Automation or AWS Systems Manager.

Google:
Google, compared to the others, has the most simplified deployment of virtual machines. They are also the only vendor with the option to define custom instance sizes (for a bit higher price, of course), and they offer great flexibility when it comes to GPUs, where we can add a GPU to almost any type of instance, as well as when it comes to disk types and sizes.

For automation, Google has an API framework called Google Cloud Deployment Manager, which uses a YAML-based syntax but can also be driven from Terraform or Puppet providers. Google also has the option to run start-up scripts on each virtual machine, which allows for scripting of software and services inside the virtual machines. Google provides up to 96 vCPUs and 1433 GB of memory on their largest instances. They do not have any bare-metal options, unlike IBM, but that is not their focus either; like AWS, Google has gone into a partnership with Nutanix on a hybrid cloud model, which is going to be interesting to follow. Another cool thing about Google is that they provide live migration of instances by default to handle maintenance updates on their infrastructure.
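
As an illustration — Deployment Manager accepts Python templates alongside YAML — a minimal template for a single VM with a start-up script might look roughly like this. The zone, image and resource names are placeholders of my own, not values from any real deployment:

    # Sketch of a Google Cloud Deployment Manager template written in Python.
    def GenerateConfig(context):
        """Return a config for one small VM with a start-up script."""
        zone = "europe-west1-b"
        return {
            "resources": [{
                "name": "demo-vm",
                "type": "compute.v1.instance",
                "properties": {
                    "zone": zone,
                    "machineType": "zones/" + zone + "/machineTypes/n1-standard-1",
                    "disks": [{
                        "boot": True,
                        "autoDelete": True,
                        "initializeParams": {
                            "sourceImage": "projects/debian-cloud/global/images/family/debian-9",
                        },
                    }],
                    "networkInterfaces": [{"network": "global/networks/default"}],
                    # The start-up script runs inside the VM at first boot.
                    "metadata": {
                        "items": [{
                            "key": "startup-script",
                            "value": "#! /bin/bash\napt-get update",
                        }],
                    },
                },
            }],
        }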

For deployment of redundant solutions, you deploy instances across multiple zones within a region (a similar setup to what Amazon Web Services does, and what Azure does with Availability Zones).

From a management perspective, Google has been really good at developing their Cloud Shell solution, which allows for easy access to virtual instances directly from the browser, with the SSH key auto-inserted as part of the setup. One of the coolest things about Google is their core infrastructure and network backbone, called Andromeda (https://cloudplatform.googleblog.com/2017/11/Andromeda-2-1-reduces-GCPs-intra-zone-latency-by-40-percent.html), which has allowed them to provide low-latency, high-bandwidth connections for east-west traffic. Their SDN is also worldwide, meaning that if you create a virtual network it will by default be available in all the different regions (different subnets are placed within each region, but they are all interconnected).

Azure:
Microsoft has also been doing a lot of work recently, investing heavily in new options such as new GPU offerings with the P100 and P40 cards, but also with the introduction of availability zones (still in preview for most services), which now allows for a level of redundancy pretty similar to zones on GCP and AWS. Microsoft has also introduced loads of new instance types, such as burstable compute (the B-series), and with the GA of Accelerated Networking they now allow for SR-IOV-based network deployment of instances in Azure.

From a management perspective Microsoft has been doing a lot around regular operations, such as with Log Analytics, which can now do patch management and provide monitoring across different platforms, and integrates different PaaS services to allow a single hub for monitoring across most of the services. Simple EDI-based tools such as Logic Apps, together with Azure Automation, let us set up everything from simple to complex automation jobs, such as automated deployments or starting/stopping virtual instances based upon a trigger or schedule. They also provide a lot more tools when it comes to migration and backup compared to the other vendors, with Azure Migrate and Azure Site Recovery.
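
As a hedged illustration of the kind of scheduled stop job described above — this is a sketch using the Azure SDK for Python (azure-identity and azure-mgmt-compute), not an actual Azure Automation runbook, and the subscription ID, resource group and tag are placeholders:

    # Deallocate every VM in a resource group that is tagged for auto-shutdown.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.compute import ComputeManagementClient

    client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

    for vm in client.virtual_machines.list("demo-rg"):
        if (vm.tags or {}).get("autoshutdown") == "true":
            # begin_deallocate stops the VM and releases its compute billing.
            client.virtual_machines.begin_deallocate("demo-rg", vm.name).wait()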

Microsoft has also been investing a lot in their Cloud Shell solution, which allows us to run az cli (bash-based) and Azure PowerShell cmdlets directly from the browser. (As of 01.02 they also support Ansible directly from the Cloud Shell interface.)

One of the issues with Azure from an IaaS perspective is the lack of flexibility when it comes to things like mixing GPU cards with different instance types, or scaling IOPS independently of disk size. Microsoft is also focusing a lot on building partners in the ecosystem to support automation, and has been doing a lot of work on Terraform, which now covers a lot of the resources in Azure directly.

AWS:
When it comes to IaaS, Amazon provides the broadest set of services, both when it comes to bare metal (coming) and VMware support, as well as different purchasing options depending on whether you need reserved instances, spot capacity or godzilla virtual machines. They also provide different storage options, with IOPS that scale depending on the size of the storage. The VMware support will also provide a whole new level of infrastructure solutions (the service is now available, but still limited to certain regions in the US).

AWS also provides multiple management tools to make things easier, such as AWS Systems Manager (which can also target on-premises virtual machines), and they even provide their own AWS Managed Services, where they manage the IaaS solution for you. AWS also has a service called OpsWorks, which provides automation based upon Puppet and Chef as a managed service, and which can be used to deliver configuration management against your own environment in AWS. Finally, AWS has CloudWatch and CloudTrail to track events, logs, activity and API usage across AWS accounts.
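
A minimal sketch, assuming boto3 and placeholder instance ID and region, of how Systems Manager can push a command to a managed instance:

    # Run a shell command on a managed instance through AWS Systems Manager.
    import boto3

    ssm = boto3.client("ssm", region_name="eu-west-1")
    result = ssm.send_command(
        InstanceIds=["i-0123456789abcdef0"],   # placeholder instance ID
        DocumentName="AWS-RunShellScript",     # built-in SSM document
        Parameters={"commands": ["uptime"]},
    )
    print(result["Command"]["CommandId"])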

AWS also has multiple options when it comes to GPU offerings, such as the P2 and G2 series, which come with a dedicated GPU card, or Elastic GPUs, a software-defined GPU offering which allows us to attach a GPU to almost any type of instance.

Summary:
Now, the fun part is that most providers are delivering more and more services to help with automation and system management, such as managed container engine clusters and different advisor services which can detect cost or security issues, checking against best practices according to the cloud provider.

The interesting part is mainly around the container solutions that most providers are now fighting about. Both Microsoft and AWS have their own container instance solution, where you just provision a container based upon an image and don’t have to worry about the infrastructure beneath (AWS Fargate and Azure Container Instances), and both of them also provide other container solutions, such as Amazon ECS and Azure Container Service. The fun part is that all four providers support Kubernetes as the container orchestration engine and have supporting features to build upon it, such as container registry solutions or CI/CD solutions.

Technical comparison: The intention here is to have a short table comparing some of the different infrastructure services from each vendor. It does not measure the quality of a service, it just notes whether the vendor has a service and what it is called.

Provider | Microsoft | Google | Amazon | IBM
High Performance Computing services | Azure Batch | – | AWS Batch | IBM Spectrum, IBM Aspera
Reserve capacity instances | Low Priority VMs | Preemptible instances | Spot Instances | –
Reserved instances | Reserved Instances | Committed use | EC2 Reserved Instances | –
Dedicated instances | – | – | EC2 Dedicated Instances | –
Bare-metal hosts | – | – | Yes (announced) | Yes
Burstable instances | Yes | Yes | Yes | No
VM metadata support | Yes | Yes | Yes | Yes
Custom instance sizes | – | Yes | – | Yes
Compute service identity | Yes | Yes | Yes | No
High-performance disk | Premium Disk | SSD persistent disk, Local SSD | EBS SSD | SSD, Optane
GPU instances | N-series (NV, NC, ND) | Flexible (attach to any instance) | P2 instances / Elastic GPUs | Only as bare metal
Nested virtualization support | Yes | Yes (Beta) | – | Yes
Hybrid story | Azure Stack | Nutanix | VMware | VMware
GPU cards support | M60, K80, P40, P100 | K80, P100, AMD S9300 | M60, Custom GPU, V100 | P100, M60, K80
Desktop as a Service | Third party | Third party | WorkSpaces & AppStream | Third party
Scale set | VM Scale Sets | Instance Groups | Auto Scaling | Auto Scale
Godzilla VM | Standard_M128: 128 vCPU, 3800 GB | n1-megamem-96: 96 vCPU, 1433 GB | x1.32xlarge: 128 vCPU, 4 TB | 56 vCPU, 242 GB (larger only as bare metal)
Skylake support | Yes | Yes | Yes | –
VMware support | Yes (announced) | – | Yes (limited to the US) | Yes
Billing for VMs | Per minute | Per second | Per second (for some) | Per hour
Deployment & automation service | Azure Resource Manager | Google Deployment Manager | CloudFormation | IBM Cloud Schematics
CLI | PowerShell, Azure CLI | gcloud CLI, Cloud Tools for PowerShell | AWS CLI, AWS Tools for PowerShell | Bluemix CLI
Monitoring & logging | Log Analytics, Azure Monitor | Stackdriver | CloudWatch, CloudTrail | Monitoring and Analytics
Optimization | Azure Advisor | Native service in UI | Trusted Advisor | –
Automation tools | Azure Automation | – | AWS OpsWorks for Chef and Puppet | Cloud Automation, Workload Scheduler
Third-party configuration and infrastructure tools | Chef, Puppet, Terraform, Ansible, SaltStack | Chef, Puppet, Terraform, Ansible, SaltStack | Chef, Puppet, Terraform, Ansible, SaltStack | Terraform
Cloud Shell support | Yes | Yes | – | –
EDI tools | Azure Logic Apps | – | – | –

In the next blog post I will take a closer look at some price comparisons, comparing apples to apples in benchmarks that measure speed of deployment using the different deployment tools, as well as VM speed at different levels.

Citrix FAS with Azure AD and Error 404 Not Found

So, a short blogpost on an issue I faced this week.

Working at a customer this week, we were setting up Citrix with SAML-based authentication from the MyApps portal using Azure Active Directory. In order to set this up properly we needed to implement Citrix FAS to do SSO directly from an Azure AD joined Windows 10 device. One of the issues we faced was when a user clicked on the Citrix app from the MyApps portal and either opened multiple tabs or closed the existing tab where the Citrix application was opened: the end user received a standard 404 error from Citrix Storefront.

The reason for this was that a Gateway session cookie was inserted when the user first accessed Gateway from Azure MyApps. The SAML request from Azure AD was redirected to /cgi/samlauth and forwarded to the IIS server, and since the session cookie matched an existing connection, the request failed. My initial idea was to use responder or rewrite policies, but after some thinking I noticed that they were ignored, because AAA processing in the NetScaler packet flow takes precedence over those features.

The end solution was quite simple: we created a virtual directory on the Storefront IIS server.

[Image: virtual directory created under the Storefront site in IIS Manager]

and then configured an HTTP redirect on that virtual directory, pointing back to the NetScaler Gateway address.
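
For reference, the redirect itself can also be expressed in the virtual directory’s web.config. A minimal sketch, assuming the IIS httpRedirect element and a placeholder Gateway URL (substitute your own):

    <configuration>
      <system.webServer>
        <!-- destination is a placeholder for your own Gateway URL -->
        <httpRedirect enabled="true"
                      destination="https://gateway.example.com"
                      httpResponseCode="Found" />
      </system.webServer>
    </configuration>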

[Image: HTTP Redirect settings on the virtual directory in IIS]

After I did this, the end user could open up the application as normal.