When moving to public cloud, I often find that the people I talk with misunderstand what they actually get when they buy a service in, for instance, Azure, AWS or GCP. That is of course something I often catch myself doing as well, by not actually reading the different service descriptions. One common misconception is that redundancy == backup. All vendors provide built-in redundancy for their services, for instance by keeping a number of copies of each storage object.
Redundancy allows our VMs to continue to run in case of, for instance, a physical drive failure, but it does not give us the ability to get back old data if we manage to overwrite or delete a file which we really needed.
And this is exactly the issue when running virtual machines at any of the large cloud providers: they provide the redundancy, but backup of the data is completely our own responsibility.
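To make the difference concrete, here is a minimal toy model in Python. It is only a sketch: `ReplicatedStore` and its methods are names made up for this illustration and have nothing to do with any vendor API. It shows that replicas survive a drive failure, but faithfully replicate a delete as well, while a point-in-time copy kept outside the replicas does not.

```python
class ReplicatedStore:
    """Toy model (hypothetical): every write or delete hits all healthy replicas."""

    def __init__(self, replicas=3):
        self.replicas = [{} for _ in range(replicas)]

    def _healthy(self):
        return [r for r in self.replicas if r is not None]

    def write(self, name, data):
        for replica in self._healthy():
            replica[name] = data

    def delete(self, name):
        # The delete is replicated just as reliably as a write.
        for replica in self._healthy():
            replica.pop(name, None)

    def read(self, name):
        # Any healthy replica can serve the read.
        for replica in self._healthy():
            if name in replica:
                return replica[name]
        return None

    def fail_replica(self, index):
        # Simulate a physical drive failure.
        self.replicas[index] = None

    def snapshot(self):
        # Point-in-time copy that lives outside the live replicas.
        return dict(self._healthy()[0])


store = ReplicatedStore()
store.write("report.docx", "version 1")
backup = store.snapshot()

store.fail_replica(0)
survives_drive_failure = store.read("report.docx")  # still readable

store.delete("report.docx")
after_delete = store.read("report.docx")  # gone from every replica
restored = backup.get("report.docx")      # only the backup still has it
```

Redundancy answers the question "what if hardware fails", while backup answers "what if we, or our software, do the wrong thing".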
So what options do we have?
Amazon Web Services
When deploying virtual machines on Amazon Web Services, the virtual hard drives are deployed on a service called EBS (Elastic Block Store), a block-based storage solution that can be used for structured data, such as databases, or unstructured data, such as files in a file system on the volume.
Amazon EBS provides the ability to create snapshots (backups) of any Amazon EBS volume. It takes a copy of the volume and places it in Amazon S3, where it is stored redundantly in multiple Availability Zones. The first snapshot is a full copy of the volume; ongoing snapshots store incremental block-level changes only.
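The idea of one full snapshot followed by incremental block-level changes can be sketched in a few lines of Python. This is a simplified model for illustration only, not how EBS is implemented internally; `SnapshotChain` is a hypothetical name, and the sketch assumes a volume with a fixed set of blocks.

```python
class SnapshotChain:
    """Hypothetical model of full + incremental block-level snapshots."""

    def __init__(self):
        self.snapshots = []    # each entry holds only the blocks that changed
        self._last_state = {}  # volume contents at the previous snapshot

    def take(self, volume):
        # Store only blocks that differ from the previous snapshot.
        # On the first call everything differs, so it is a full copy.
        changed = {
            index: data
            for index, data in volume.items()
            if self._last_state.get(index) != data
        }
        self.snapshots.append(changed)
        self._last_state = dict(volume)
        return len(changed)

    def restore(self, snapshot_index):
        # Rebuild the volume by layering snapshots from the first one
        # up to and including the requested one.
        state = {}
        for snapshot in self.snapshots[: snapshot_index + 1]:
            state.update(snapshot)
        return state


volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
chain = SnapshotChain()

blocks_in_first = chain.take(volume)   # full copy: every block is stored
volume[1] = b"data-v2"
blocks_in_second = chain.take(volume)  # incremental: only block 1 is stored

old_volume = chain.restore(0)
new_volume = chain.restore(1)
```

Note that a restore always rebuilds the whole volume, which is exactly the limited granularity discussed next.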
There are some issues with this solution.
Since this is a pure snapshot of an EBS volume which does not utilize VSS (Volume Shadow Copy Service), it is a pretty blunt tool. It is not a direct backup service, and it only provides simple restore capabilities, which are limited to a full restore of a snapshot with limited granularity.
Google Cloud Compute
Google has pretty much the same approach as AWS. Google also provides snapshots of persistent disks running in GCP; however, GCP has one tiny advantage in that it is able to do VSS-based disk snapshots directly from a persistent disk.
But again it suffers from the same downsides as AWS. It is still only a snapshot solution, with limited restore options, no file-level granularity and, at the time of writing, no ability to define a backup schedule or retention policy.
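When the platform only gives you raw snapshots, the schedule and retention logic ends up in your own tooling. As a rough sketch of what such a retention policy could look like, the function below decides which snapshots to delete. All names and the "keep everything for N days, then one per week" rule are assumptions made for this example, not part of any AWS or GCP API; the actual delete calls against the cloud provider are deliberately left out.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(taken_at, now, keep_days=7, keep_weeks=4):
    """Return the snapshot timestamps that fall outside the retention policy.

    Policy (an assumption for this sketch):
    - keep every snapshot from the last `keep_days` days
    - for older snapshots within `keep_weeks` weeks, keep the newest per ISO week
    - delete everything else
    """
    daily_cutoff = now - timedelta(days=keep_days)
    weekly_cutoff = now - timedelta(weeks=keep_weeks)

    keep = set()
    newest_per_week = {}
    for ts in taken_at:
        if ts >= daily_cutoff:
            keep.add(ts)
        elif ts >= weekly_cutoff:
            week = tuple(ts.isocalendar()[:2])  # (ISO year, ISO week number)
            if week not in newest_per_week or ts > newest_per_week[week]:
                newest_per_week[week] = ts
    keep.update(newest_per_week.values())

    return sorted(ts for ts in taken_at if ts not in keep)


# One snapshot per day for the last 40 days.
now = datetime(2017, 6, 30)
taken_at = [now - timedelta(days=d) for d in range(40)]
to_delete = snapshots_to_delete(taken_at, now)
```

A scheduled job would run this against the list of existing snapshots and then remove whatever is returned.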
Microsoft Azure
Microsoft has gotten a lot further in this space with its backup offerings. Microsoft is the only vendor of the three which has backup and disaster recovery solutions as finished services. Azure Backup provides backup capabilities for on-premises solutions, which could be clients, physical servers and virtualized workloads.
But they also have a pure backup solution for virtual machines in Azure, where we have the ability to define a backup schedule and retention policy, and which allows for a more granular level of restore, for instance item-level restore (see https://azure.microsoft.com/nb-no/blog/instant-file-recovery-from-cloud-backups-using-azure-backup/).
In terms of billing, the backup service is based upon protected instances and storage cost, and it allows for full application-consistent backup of virtual machines running in Azure (both Windows and Linux).
It is still missing some capabilities which I feel should be there.
* Simple file-level restore directly in the Azure portal, to have a native experience for finding files and folders
* Better notification options! Many of the services in Azure have webhooks or Azure Automation jobs which can be triggered in case of failure; this should include Azure Backup as well.
* Item-level restore for certain applications. Microsoft should have better integration options with, for instance, Active Directory running in-guest, to restore an AD database directly from within the Azure portal
* Integration with Office 365! Today we need to use third-party tools to back up Office 365. The focus here is mostly on IaaS, but I still needed to mention it.
* Storage policy to move backup data to cold tiers
But compared to the two others, Microsoft is a long way ahead when it comes to a native and good backup solution for Cloud workloads.
In-guest VM Agent – Veeam
All three vendors have one thing in common, regardless of how good their backup or snapshot solution is, and that is that the service is contained within the platform. I can for instance use the Windows agent for Azure Backup on instances running in GCP or AWS, but that agent is pretty limited in terms of capabilities.
Veeam, however, which so far has been best known for its availability suite for virtualized environments, is now well under way with agent support for Windows. Since you cannot have direct access to the underlying hypervisor at any cloud vendor, you are limited to agent-based backup solutions (or to VSS snapshots combined with storage access API reads). The Veeam agent is far from limited when it comes to capabilities: it will have full application-aware processing for Microsoft Exchange, Active Directory, SQL Server, SharePoint, Oracle and file servers.
It will also be able to integrate with a Cloud Connect service provider to do centralized backup for all customers running in the cloud (GCP, Azure, AWS) to a specific region where the Cloud Connect repository is placed. It will also provide guest file indexing, to make it easier to locate file items directly from within the UI. You can even combine it with regular Veeam Backup & Replication to restore VMs directly to Azure, which provides a kind of Instant-On capability.
Here is a summary of the list of features contained in the different editions
This summarizes the backup solutions of the different cloud vendors and what they provide in terms of backup services. One thing is certain: none of the cloud providers delivers backup by default, and it is your responsibility to provide a backup solution for your infrastructure.