Cloud Wars – IBM vs Microsoft vs Google vs Amazon IaaS

By | February 5, 2018

In my previous blog post I did a short overview of the different cloud vendors, a bit about their focus areas and also a bit about strengths and weaknesses. The blogpost can be found here –>http://bit.ly/2CrBgZA .In this post I want to focus more on IaaS and the offerings surrounding it, first I want describe a bit about each vendor and then ill go into a bit more comparison and also include the price/performance factor here as well and end it with some focus on automation functionality and additional services.

IBM:
As mentioned in my previous blogpost, IBM with its Softlayer capabilities has had extremely focus on bare-metal, with the addition on traditional IaaS and also with the extended partnership with VMware, they can also provide vCloud Foundation package (which also is a prerequisite package for VMware HCX) or just plain ESXi with vCenter deployment. On the bare-metal options we can choose between hourly or monthly pre-configured servers or customize with single to quad processing solutions that range from 4 to 72 cores. We can also order bare-metal servers with dedicated GPU offerings such as K2, K80, M60, P100). One of the cool features in terms of pure IaaS is that they offer pure Block storage as an option as well, using iSCSI or just plain file storage using NFS which also is an option. In terms of scale ability they can only offer up to 56 cores and 242 GB RAM for a single virtual machine which is a lot smaller then most offerings in Azure, Google and AWS. IBM like AWS and Azure also offers pre defined instances sizes which can be used and when setting up an instance you can also define what kind of network connectivity you want to have, by default you get 100 mbps uplink and private connectivity which is free, but if you want to up it to 1 GB you need to pay a cost. The main issue is that of all much of the concepts such as availability zones and other options for HA is not an option in IBM compared to GCP, AWS and Azure.

In terms of Automation, IBM has a feature called Cloud Schematics which is natively based upon Terraform, so it is basically wrapping REST API calls using a IBM Provider in Terraform, https://ibm-cloud.github.io/tf-ibm-docs/ we also have the ability to run provision scripts which can be run at boot as part of a deployment. One of the things I feel is missing on IBM when it comes to automation is the ability to provider more overall system management capability such as Azure automation or AWS systems manager.

Google:
Google compared to the others have the most simplified deployment of virtual machines. Also they are the only vendor that has the option to defined custom instance sizes (for a bit higher prices of course) and also the flexibility when it comes to GPU flexibility for instance we can add GPU instances to any type of instance and also when it comes to disk type and sizes.

With Automation, Google has an API framework called Google Cloud Deployment Manager, which uses a YAML based syntax, but can also be using providers from Terraform or Puppet to do the deployment as well. Google also has the option do run start-up scripts on each virtual machine which allows for scripting of software and services inside the virtual machines. Google provides up to 96 vCPU and 1433 GB of memory on their largest instances. They however do not have any form of bare metal options, compared to IBM but that is not their focus either but like AWS, Google has gone into a partnership with Nutanix on a Hybric Cloud model which is going to be interesting to see how it turns out. Another cool thing about Google is that they provide live migrations of instances as default to handle maintance updates on their infrastructure.

For deployment of redudant solutions you need to be able to deploy instances across multiple zones within a region (Which is a simliar setup as Amazon Web Services do and Azure does with Availability Zones)

From a management perspetive, Google has been really good at developing their Cloud Shell solutions which allows for easy access to virtual instances directly from their browser and also allows for simple access with auto inserting the SSH key as part of the setup. One of the coolest things about Google is their core infrastructure and the network backbone which is called Andromeda https://cloudplatform.googleblog.com/2017/11/Andromeda-2-1-reduces-GCPs-intra-zone-latency-by-40-percent.html which now has allowed them to provide low latency high bandwidth connections on east-west traffic. Also that they SDN is also worldwide meaning that if you create a virtual network by default it will be available on all the different regions (where different subnets are placed within each region  but are all interconnected)

Azure:
Microsoft has also been doing a lot of work recently and investing heavily into new options such as new GPU offerings with the P100 and P40 cards but also with the introductions of availability zones (Still in preview for most services) which now allows for a great level of redundancy which is now pretty similar to Zones on GCP and AWS. Microsoft has also introduced loads of different new instances types with the burstable compute (B-series) and also now with the introduction of GA on Accelerated networking which allows for SRV-IO based network deployment of instances in Azure.

From a management perspective Microsoft has been doing alot around regular operations, such as with Log Analytics which can now do patch management and provide multiple pieces of monitoring across different platforms and also integrating different PaaS serivces to allow for a single hub to do monitoring across most of the services. Also with simple EDI based tools such as Logic Apps and Azure Automation allows us to setup simple and down to more complex automation jobs to do automated deployments and start/stop virtual instanced based upon a trigger or schedule. Also that they provide alot more tools when it come to migration and backup tools compared to the other vendors, with Azure Migrate and Azure Site Recovery.

Also Microsoft has been doing alot of investment into their Cloud shell solution as well which allows us to run az cli (bash based) and Azure PowerShell cmdlets directly from the browser. (As as of t01.02 they now also support Ansible directly from the cloud shell interface)

One of the issues with Azure from an IaaS perspective is the lacking flexibility to mixing like GPU cards with different instances, scaleable IOPS together with disk size. Also Microsft is focusing alot on building partners in the ecosystem to support automation and have been doing alot when it comes to Terraform which now covers alot of the resources in Azure directly.

AWS:
When it comes to IaaS, Amazon provides most of the services both when it comes to bare-metal(coming) and support with VMware. Also different options depending on if needed reserved instances or just need to get reserve capacity or godzilla virtual machines. They also provide different storage options and scale options on IOPS depending on the size of the storage. Now with the upcoming support with VMware will also provide a whole new level of infrastructure solutions (the service is now available but still limited to certain regions in the US)

AWS also provides multiple management tools to make things easier such as AWS systems manager (which can also target on-premises virtual machines), and they even provide their own AWS Managed Services where they manage the IaaS solutions for you . AWS also has a service called OpsWorks which provides automation solution based upon Puppet and Chef as a managed service which can be then used to deliver configuration management against your own enviroment in AWS. AWS also has CloudWatch and CloudTrail to track events, logs and activity and API usage across AWS subscriptions.

AWS also has multiple options when it comes to GPU offerings such as P2 and G2 series which comes with a dedicated GPU card, or use the flexible GPU which is a software-defined GPU offering which allows us to add a GPU card to almost any type of instance.

Summary:
Now the fun part is that most providers are now delivering more and more services to help with automation and system management, such as managed container engine cluster and also different advisor roles which can detect cost or security issues. This can be to check for best-pratices according to the cloud provider.

Now the interesting part is mainly around the container solutions that most providers are now fighting about. Both Microsoft and AWS have their own Container instance solution, where you just provision a container based upon an image and don’t have to worry about the infrastructure beneath (AWS Fargate and Azure Container Instance) and both of them also provide other container solutions such as Amazon Container Engine and Azure Container Engine. The fun part is that all 4 providers supports Kubernetes as the container orchestration engine and have supported features to build upon it, this can be a container registry solution or CI/CD solutions.

Technical Comparison: So the intention here is to have a short table to compare some of the different infrastrucutre services from each vendor, it does not measure the quality of service but just defines that they have a service and service name.

Provider

Microsoft

Google

Amazon

IBM

High Performance Computing Services

Azure Batch

 

Amazon Batch

 

Reserve Capacity instances

Low Priority VM’s

Preemptible instance

Spot instances

 

Reserved Instances

Reserved Instances

Committed use

EC2 Reserved Instances

 

Dedicated instances

   

EC2 Dedicated Instances

 

Bare Metal hosts

   

Yes (Announced)

Yes

Burstable Instances

Yes

Yes

Yes

No

VM Metadata support

Yes

Yes

Yes

Yes

Custom Instance Sizes

 

Yes

 

Yes

Compute Service Identity

Yes

Yes

Yes

No

 

High performance disk

Premium Disk

SSD persistent disk, Local SSD

SSD EBS

SSD Octane

GPU-instances

N-series (NV, NC, ND)

Flexible GPU

P2 instances / Flexible GPU

Only as bare metal

Nested virtualization support Yes Yes (Beta) Yes  
Hybrid Story Azure Stack Nutanix VMware VMware

GPU cards support

M60, K80, P40, P100

K80, P100, AMD S9300

M60, Custom GPU, V100

P100, M60, K80

Desktop as a service

Third Party

Third party

Workspaces & AppStream

Third Party

Scale set

VM Scale Set

Instance Group

Auto Scaling

Auto Scale

Godzilla VM

Standard_M128 128vCPU, 3800 GB

N1-highmem 96vCPU, 1433 GB

X1.32large 128vCPU, 4 TB

56 vCPU, 242 GB Memory (Other Bare Metal)

Skylake support Yes Yes Yes  

VMware support

Yes (Announced)

 

Yes (Limited to the US)

Yes

Billing for VM

Per minute

Per Second

Per Second (For some)

Per Hour

Deployment & Automation service

Azure Resource Manager

Google Deployment Manager

Cloudformation

IBM Cloud Schematics

CLI

PowerShell, AzureCLI

GCloud CLI, Cloud Tools for PowerShell

AWS CLI, AWS Tools for PowerShell

Bluemix CLI

Monitoring & Logging

Microsoft Log Analytics, Azure Monitor

StackDriver

CloudWatch, Cloudtrail

Monitoring and Analytics

Optimization

Azure Advisor

Native Service in UI

Trusted Advisor

 

Automation tools

Azure Automation

 

Amazon CloudOps for Chef and Puppet

Cloud Automation, Workload Scheduler

Support for third party configuration and infrastructure tools

Chef, Puppet, Terraform, Ansible, SaltStack

Chef, Puppet, Terraform, Ansible, SaltStack

Chef, Puppet, Terraform, Ansible, SaltStack

Terraform

Cloud Shell support

Yes

Yes

   

EDI Tools

Azure Logic Apps

     

In the next blog post I will take a closer look at some price comparison and comparing apples and apples in some benchmarks which measures speed of deployment using the different deployment tools and in VM spped on different levels.

Leave a Reply

Your email address will not be published. Required fields are marked *