Status GPU and Cloud providers AWS, GCP and Azure

So as part of my new role at my new company, I’ve expanded my horizons to include AWS and GCP as well as Azure. Because of my background with Citrix and VMware, I also get a lot of questions about GPU support on these platforms.

I have previously blogged a bit about the N-series in Azure.
The N-series is great but it has some drawbacks, so I decided to write a short summary of the different cloud vendors, their support for GPU instances, and which GPU series they offer.

Microsoft Azure:
Azure supports GPUs with its N-series. The N-series uses the DDA (Discrete Device Assignment) feature of Hyper-V in Windows Server 2016, which is in essence GPU passthrough. Since Microsoft has locked the GPU to the N-series, it is only available on a limited set of instance sizes. The N-series has been GA since December 2016, but it has one issue in particular: disk I/O. As of now the N-series does include an SSD, but only as the temp disk (the D:\ drive), and it has no support for Premium Storage. That leaves us with regular HDD data disks, which have a limit of 500 IOPS. The N-series is equipped with NVIDIA cards, either K80 cards (the NC sizes) or M60 cards (the NV sizes). Certain instance sizes also come with InfiniBand, which allows for pure HPC workloads. Azure also comes with per-minute billing, which makes it easy to spin up instances for short-lived workloads.
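To show why per-minute billing matters for short-lived GPU workloads, here is a minimal Python sketch that compares per-minute and per-hour rounding for the same run. The hourly rate and the 65-minute runtime are made-up numbers for illustration, not actual prices from any provider, and the function name is my own.

```python
import math


def billed_cost(runtime_minutes: float, hourly_rate: float, granularity_minutes: int) -> float:
    """Cost of one run when usage is rounded up to the billing granularity."""
    billed_units = math.ceil(runtime_minutes / granularity_minutes)
    billed_minutes = billed_units * granularity_minutes
    return billed_minutes * hourly_rate / 60


# Hypothetical values, chosen only to illustrate the rounding effect.
HOURLY_RATE = 1.20   # assumed price per hour, not a real list price
RUNTIME = 65         # minutes the GPU instance actually runs

per_minute = billed_cost(RUNTIME, HOURLY_RATE, granularity_minutes=1)
per_hour = billed_cost(RUNTIME, HOURLY_RATE, granularity_minutes=60)

print(f"per-minute billing: {per_minute:.2f}")
print(f"per-hour billing:   {per_hour:.2f}")
```

With per-hour rounding, a run that goes just past the hour is billed for two full hours, while per-minute billing charges only for the 65 minutes used.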

Google Cloud Platform:
Google also offers GPUs, using the same passthrough model as Azure. Today Google offers K80 cards, and it has promised support for the AMD FirePro S9300 and NVIDIA Tesla P100 shortly. Unlike Azure, Google allows you to attach a GPU to any type of instance, which offers more flexibility, and it also offers per-minute billing. Because of this more flexible approach you can, for instance, easily scale from 1 to 4 cards using their CLI tool. Google also allows us to attach SSD-based storage to the machine instance, which lets us combine GPUs with high-end IOPS on storage. However, the GPU feature is still in beta; the announcement came out in February.

Amazon Web Services:
Amazon has two different GPU offerings. The first is the P2 series, which uses GPU passthrough and offers up to 16 K80 GPUs (NVIDIA) on a set of predefined instance sizes. On AWS you can also have SSD-based storage or provisioned-IOPS storage as part of the configuration. You can also use the GPUDirect capabilities, which in essence use RDMA-based technology for low-latency, high-speed DMA transfers to copy data between the memories of two GPUs on the same system/PCIe bus. The second is Elastic GPU, which Amazon Web Services announced recently and which is now in preview. It allows us to attach a virtual GPU with a set amount of virtual GPU memory to any EC2 instance type. Since this is a virtual GPU it might have some limited capabilities when it comes to OpenGL and DirectX, but AWS promises that it should have good support for both.
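To make the comparison easier to scan, the offerings described above can be captured in a small Python structure. This is only a sketch: the class, field names and helper are my own, and the values reflect what is described in this post at the time of writing, not a current feature matrix.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GpuOffering:
    provider: str
    offering: str
    gpus: Tuple[str, ...]
    attach_to_any_instance: bool   # GPU can be added to any instance type
    ssd_data_storage: bool         # SSD-backed storage usable beyond a temp disk
    status: str


# Values taken from the summary in this post (illustrative field names).
OFFERINGS: List[GpuOffering] = [
    GpuOffering("Azure", "N-series (NC/NV)", ("K80", "M60"), False, False, "GA"),
    GpuOffering("GCP", "GPU attach", ("K80",), True, True, "beta"),
    GpuOffering("AWS", "P2", ("K80",), False, True, "GA"),
    GpuOffering("AWS", "Elastic GPU", ("virtual GPU",), True, True, "preview"),
]


def attachable_anywhere(offerings: List[GpuOffering]) -> List[str]:
    """Names of offerings where the GPU is not tied to fixed instance sizes."""
    return [f"{o.provider} {o.offering}" for o in offerings if o.attach_to_any_instance]


print(attachable_anywhere(OFFERINGS))
```

Filtering on `attach_to_any_instance` picks out the flexible options, which is the main thing that separates Google and Elastic GPU from the fixed instance families.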

So what about EUC workloads? A GPU is pretty worthless if you do not have a product that supports or works with the platform. Citrix has provisioning support for both Amazon and Azure and can therefore leverage the different GPU instance types directly. Citrix also supports Windows Server 2016, which benefits the most from the DDA feature in Azure for the N-series. Citrix does not have any direct support for GCP. Even though the software can easily be installed there, NetScaler is not directly available on GCP, so it wouldn’t provide us with a lot of benefits for remote workers. Amazon has its own EUC offerings: AppStream, which uses an HTML5 protocol, and WorkSpaces, which uses Teradici. My guess is that Amazon will provide the option to boost WorkSpaces with Elastic GPU when it becomes available. However, I can assure you that Microsoft RDS works on all platforms, and with the GPU updates in Windows Server 2016 it does provide a decent user experience.

The verdict?
Amazon is leading the race, and Google is not far behind. Microsoft needs better options when it comes to more flexible GPU support and higher IOPS, for instance Premium disks and an SSD on the C:\ drive. Also, RemoteFX vGPU support for Azure would provide a good and maybe cheap option for delivering GPU VDI workloads on Azure.

0 thoughts on “Status GPU and Cloud providers AWS, GCP and Azure”

  1. Hey,
    very nice article, valuable insights. But wondering about MCS on Azure – MCS (or Azure) doesn’t provide I/O RAM caches, and because of that flash drives are not required?
