This is the next post in my blog series on Microsoft Azure and Security Best Practices. My first post, focusing on Identity, can be found here –> http://msandbu.org/microsoft-azure-and-security-best-pratices-part-1-identity/
One of the main concerns about moving workloads to the public cloud is security. You can now manage your entire infrastructure from a self-service portal or API from anywhere, as long as you have the correct access to the infrastructure. So of course identity is the first puzzle you need to solve, to ensure that people do not have too much access to the cloud platform. The next piece is ensuring that the infrastructure runs securely, that you can lock down access, and that the correct security measures are in place.
With the infrastructure in the cloud, however, you might need to rethink how you secure your environment. Many of the same concepts still apply, just in other shapes. First off, we will take a closer look at what we need to do on a platform level and not so much inside the operating system itself.
- Best-practices for IaaS and Control with Azure Policies
When setting up infrastructure in Microsoft Azure you should always ensure that resources are set up according to company guidelines, such as using only supported compute instance types and deploying only in the correct region. This helps to control cost as well as ensuring that no one deploys in the wrong region. This is where Azure Policies come in, which are essentially ARM policies. If we try to deploy an instance that is prohibited by an Azure Policy, the ARM deployment will stop because of a deny effect. Note, however, that we can also use Azure Policies for auditing purposes, so we can see the status of our deployments, for instance whether resources are being deployed in the wrong regions, whether people are not using managed disks, and so on. All Azure Policies are assigned at a subscription level or optionally at a resource group level. A simple Azure Policy definition can, for example, require a tag to be in place on resources; if the tag is missing, the deployment of the resource will stop.
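To make the tag requirement more concrete, here is a minimal sketch in Python of what such a policy rule can look like and how it behaves. The rule follows the if/then shape of an Azure Policy definition, but the tag name costCenter and the evaluate_policy function are my own illustrative assumptions; the real evaluation happens inside ARM, and this toy evaluator only understands a single tags.<name> field with an exists condition.

```python
# Shape of a policy rule that denies resources lacking a given tag.
# The tag name "costCenter" is a hypothetical example.
POLICY_RULE = {
    "if": {"field": "tags.costCenter", "exists": "false"},
    "then": {"effect": "deny"},
}


def evaluate_policy(policy_rule, resource):
    """Toy evaluator (not the ARM engine): supports only 'tags.<name>' + 'exists'."""
    condition = policy_rule["if"]
    tag_name = condition["field"].split(".", 1)[1]
    tag_present = tag_name in resource.get("tags", {})
    expected_present = condition["exists"] == "true"
    if tag_present == expected_present:
        # The "if" condition matched, so the effect applies.
        return policy_rule["then"]["effect"]
    return "compliant"


if __name__ == "__main__":
    untagged_vm = {"name": "vm1", "tags": {}}
    tagged_vm = {"name": "vm2", "tags": {"costCenter": "1234"}}
    print(evaluate_policy(POLICY_RULE, untagged_vm))
    print(evaluate_policy(POLICY_RULE, tagged_vm))
```

Swapping the effect from deny to audit in the same rule is how you would start out when you only want to see the impact of a policy before enforcing it.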
There are currently five effects that are supported in a policy definition.
- Append
- Audit
- AuditIfNotExists
- Deny
- DeployIfNotExists (only available to built-in policies)
You can also group multiple policies within an Initiative definition, which is a collection of policy definitions that can also be assigned at a resource group or subscription level. Azure Policies can be useful to ensure, for instance, that the Azure Security (OMS) agent is installed on a Windows machine running in Azure, that tags are properly defined on resources, or that proper ACLs are defined on virtual machines within a resource group. When creating policy definitions in your environment, I recommend starting with an audit effect, as opposed to a deny effect, to keep track of the impact of your policy definition on the resources in your environment.
Policies are configured under Subscription –> Azure Policies
You can also use Azure Policies with the DeployIfNotExists effect (which sadly is only available for the built-in policies) to, for instance, deploy Log Analytics agents on virtual machines in Azure that do not have the agent installed. It should be noted that this specific Azure Policy only checks whether the Azure extension for Log Analytics is installed; it does not actually check inside the guest operating system.
NOTE: Use Azure Policies to ensure compliance within your Azure subscription. Use them for audit purposes, and implement some deny policies to ensure proper tagging and location usage.
- ACLs
Now one of the biggest issues I see with deployments is the lack of control when it comes to configuring ACLs, that is, using Network Security Groups (NSG) in Azure. NSGs are useful for defining ACLs either at a network interface level or at a subnet level. If we define an NSG at both the subnet level and the vNIC level, both sets of rules apply to a virtual machine that resides in that subnet.
NSG rules are processed in priority order, with lower numbers processed before higher numbers, because lower numbers have higher priority. Once traffic matches a rule, processing stops. As a result, any rules with lower priorities (higher numbers) that have the same attributes as rules with higher priorities are never processed.
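The priority behaviour described above can be sketched in a few lines of Python. This is only a model of the first-match logic, not how Azure implements it: the Rule class, the rule names, and matching on destination port alone are simplifying assumptions of mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int  # lower number = higher priority
    port: int
    action: str  # "Allow" or "Deny"


def first_match(rules, port, default_action="Deny"):
    """Return (rule name, action) of the first rule that matches, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port == port:
            return rule.name, rule.action  # processing stops at the first match
    return "default", default_action


# Hypothetical rule set: the deny rule at priority 200 is shadowed by the allow at 100.
RULES = [
    Rule("deny-rdp", 200, 3389, "Deny"),
    Rule("allow-rdp", 100, 3389, "Allow"),
]

if __name__ == "__main__":
    print(first_match(RULES, 3389))
    print(first_match(RULES, 22))
```

In this example the deny-rdp rule never takes effect, which is exactly the kind of shadowed rule that is worth looking for when reviewing an NSG.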
Now the issue with a traditional NSG is that you cannot group a set of virtual machines that should share the same predefined firewall policy, such as database servers or web servers, since an NSG only applies at the subnet or interface level. Luckily Microsoft has created a feature called Application Security Groups, which allows us to do that.
Application Security Groups allow us to group virtual machines with a similar function together, and then we can use those groups as the source or destination of rules in an NSG.
So first we need to assign a virtual machine to an application security group, and then we can change the NSG rules being applied to the subnet itself so that they reference the group.
Now we can easily apply the same defined rules to multiple virtual machines instead of doing it individually.
So next time we don’t need to assign ACLs directly to a virtual machine; we just need to add it to a predefined group instead.
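As a rough illustration of why this is easier to manage, here is a small Python model of rules that reference groups instead of individual machines. All the names (the NIC names, asg-web, asg-db, and the is_allowed function) are hypothetical, and the model ignores priorities and protocols to keep the focus on group membership.

```python
# Hypothetical membership: network interface -> application security groups.
ASG_MEMBERS = {
    "web01-nic": {"asg-web"},
    "web02-nic": {"asg-web"},
    "sql01-nic": {"asg-db"},
}

# One rule per role instead of one rule per virtual machine.
RULES = [
    {"name": "web-to-db", "source": "asg-web", "destination": "asg-db",
     "port": 1433, "action": "Allow"},
]


def is_allowed(src_nic, dst_nic, port):
    """Allow traffic only if a rule matches the groups of both endpoints."""
    src_groups = ASG_MEMBERS.get(src_nic, set())
    dst_groups = ASG_MEMBERS.get(dst_nic, set())
    for rule in RULES:
        if (rule["source"] in src_groups
                and rule["destination"] in dst_groups
                and rule["port"] == port):
            return rule["action"] == "Allow"
    return False  # nothing matched: treat as denied in this sketch


if __name__ == "__main__":
    print(is_allowed("web01-nic", "sql01-nic", 1433))
    # A new web server only needs group membership; the rules stay untouched.
    ASG_MEMBERS["web03-nic"] = {"asg-web"}
    print(is_allowed("web03-nic", "sql01-nic", 1433))
```

The point of the design is in the last lines: onboarding another web server changes the membership table, not the rule set.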
NOTE: Define ACLs for common services, such as RDP, at the subnet level. Remember to keep ACL openings for basic infrastructure services such as DHCP, DNS, and health monitoring, which are provided through the virtualized host IP addresses 168.63.129.16 and 169.254.169.254. Define service ACLs for roles such as web servers, database servers, or RDP servers using Application Security Groups. Try to avoid using firewall rules directly associated with a virtual machine.
- Using Azure Firewall
Azure Firewall is a new service that came into preview a couple of weeks ago. It provides more functionality than NSGs, which only filter on the 5-tuple (source and destination IP, source and destination port, and protocol) and cannot filter on fully qualified domain names.
Azure Firewall acts as an appliance but doesn’t require any traditional load balancing, and it scales on demand. You can limit outbound HTTP/S traffic to a specified list of fully qualified domain names (FQDN), including wildcards. Also, all events are integrated with Azure Monitor.
Azure Firewall doesn’t provide features such as IDS/IPS that you get from NGFW appliances. At this preview stage it is aimed at one thing: protecting outgoing traffic from IaaS in Azure.
To start using Azure Firewall in the preview phase, you need to register the preview features on the resource provider.
Code Snippet for PowerShell
Register-AzureRmProviderFeature -FeatureName AllowRegionalGatewayManagerForSecureGateway -ProviderNamespace Microsoft.Network
Register-AzureRmProviderFeature -FeatureName AllowAzureFirewall -ProviderNamespace Microsoft.Network
The registration might take up to 30 minutes to complete. After that, you need to register the resource provider itself:
Register-AzureRmResourceProvider -ProviderNamespace Microsoft.Network
Once that is done we can deploy the Azure Firewall appliance to a new or existing virtual network.
NOTE: You need to have a separate subnet called AzureFirewallSubnet in your VNET, or else the deployment will fail.
Deployment of Azure Firewall can be done through the portal or through an ARM template, which can be found here –> https://github.com/Azure/azure-quickstart-templates/tree/master/101-azurefirewall-sandbox
Once the deployment is done you need to change the default routes of the subnets to point to the Azure Firewall appliance so that traffic is routed through the Azure Firewall.
One thing that you will notice inside the firewall service is that you now have two different rule sets that can be defined for the Azure Firewall: network rules (which are basically NSG rules, but defined within the firewall) and application rules, where we can define FQDNs such as facebook.com or wildcards such as *.facebook.com or *.microsoft.com
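To show how wildcard application rules behave in principle, here is a small Python sketch of FQDN matching against an allow list. This is my own model, not Azure Firewall's implementation: the function names are made up, and I am assuming that *.microsoft.com covers subdomains but not the bare domain microsoft.com, and that anything not on the list is denied.

```python
def fqdn_matches(pattern, fqdn):
    """Match an FQDN against an exact name or a leading-wildcard pattern."""
    pattern = pattern.lower().rstrip(".")
    fqdn = fqdn.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # "*.microsoft.com" -> ".microsoft.com"
        # Assumption: the wildcard covers subdomains only, not the bare domain.
        return fqdn.endswith(suffix) and len(fqdn) > len(suffix)
    return fqdn == pattern


def outbound_allowed(fqdn, allowed_patterns):
    """Allow the request only if some pattern matches; otherwise deny."""
    return any(fqdn_matches(p, fqdn) for p in allowed_patterns)


# Hypothetical application rule collection.
ALLOWED = ["*.microsoft.com", "facebook.com"]

if __name__ == "__main__":
    for name in ["docs.microsoft.com", "microsoft.com",
                 "evilmicrosoft.com", "facebook.com", "example.org"]:
        print(name, outbound_allowed(name, ALLOWED))
```

Note that evilmicrosoft.com does not match, because the comparison includes the leading dot of the suffix. That detail is easy to get wrong when writing similar filters by hand.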
Now, as mentioned, Azure Firewall is a good start for handling outgoing connections from servers, but it doesn’t provide packet inspection of any kind. If you need that, you might be better off deploying Network Virtual Appliances in an HA scenario, which can handle DPI.
- Enabling diagnostics for Network services (LB, NSG and so on) and visualization using Traffic Analytics
One of the things I typically find is that many do not enable diagnostics for NSGs (and lately Azure Firewall also supports the same). NSG diagnostics allow for granular troubleshooting on the firewall, and we also have the ability to enable flow logs. Flow logs are a feature of Network Watcher that allows you to view information about ingress and egress IP traffic through an NSG. Flow logs are written in JSON format and show outbound and inbound flows on a per-rule basis, the network interface (NIC) the flow applies to, 5-tuple information about the flow (source/destination IP, source/destination port, and protocol), and whether the traffic was allowed or denied. So you can compare flow logs to IPFIX or NetFlow.
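To give an idea of what the 5-tuple information looks like once you pull it out of the JSON, here is a small Python sketch that parses a single flow tuple. I am assuming the version 1 flow tuple layout (timestamp, source IP, destination IP, source port, destination port, protocol, direction, decision, separated by commas); the sample tuple and the parse_flow_tuple function are made up for illustration, so check the layout against your own logs.

```python
from datetime import datetime, timezone

PROTOCOLS = {"T": "TCP", "U": "UDP"}
DIRECTIONS = {"I": "Inbound", "O": "Outbound"}
DECISIONS = {"A": "Allowed", "D": "Denied"}


def parse_flow_tuple(flow_tuple):
    """Split one comma-separated flow tuple (assumed v1 layout) into named fields."""
    (timestamp, src_ip, dst_ip, src_port, dst_port,
     protocol, direction, decision) = flow_tuple.split(",")[:8]
    return {
        "time": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
        "source_ip": src_ip,
        "destination_ip": dst_ip,
        "source_port": int(src_port),
        "destination_port": int(dst_port),
        "protocol": PROTOCOLS.get(protocol, protocol),
        "direction": DIRECTIONS.get(direction, direction),
        "decision": DECISIONS.get(decision, decision),
    }


if __name__ == "__main__":
    # Made-up sample: an inbound RDP attempt that was denied.
    sample = "1500000000,203.0.113.10,10.0.0.4,51529,3389,T,I,D"
    for key, value in parse_flow_tuple(sample).items():
        print(f"{key}: {value}")
```

In practice you would let Traffic Analytics or Log Analytics do this work, but parsing a few tuples by hand is a quick way to sanity-check what an NSG is actually denying.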
To collect flow logs and be able to analyze them, we use Traffic Analytics. You need to enable Network Watcher on your subscription first.
1: This can be configured in the Azure Portal, under Network Watcher, where you can enable:
- Flow Logs
- Diagnostics Logs (Entries are logged for which NSG rules are applied to VMs, based on MAC address. The status for these rules is collected every 60 seconds)
Enabling this for services and gathering the information in Log Analytics allows us to collect all changes happening to either an NSG or, for instance, a load balancer.
2: Re-register the resource providers for Network and Insights to get access to the new services.
Register-AzureRmResourceProvider -ProviderNamespace "Microsoft.Network"
Register-AzureRmResourceProvider -ProviderNamespace Microsoft.Insights
Oh, and yes, Traffic Analytics is a paid feature: https://azure.microsoft.com/en-us/pricing/details/network-watcher/
NOTE: If data does not appear right away under Traffic Analytics, wait an hour or so for the service to process the flow logs properly.
- Using Azure ATP for monitoring Active Directory
Azure ATP is the cloud version of ATA, which is essentially a monitoring tool for Active Directory. So as long as you are running regular Active Directory, ATA (part of EMS E3) or Azure ATP (part of EMS E5) should be part of how you monitor the security of Active Directory. Once you have signed up for a trial or set up a subscription, you can sign in to https://portal.atp.azure.com/
The architecture is pretty simple since it is a cloud service. You have the ATP sensor, which you can install directly on a domain controller and which sends events to the cloud service. Of course, this requires that your domain controllers have a direct internet connection. Alternatively, you can download the ATP agent and install a standalone sensor, which collects events using WEF and port mirroring (this requires, for instance, RSPAN on the physical switches).
Before doing a deployment, please look through the capacity planning guide –> https://docs.microsoft.com/en-us/azure-advanced-threat-protection/atp-capacity-planning A normal AD environment generates a lot of activity that needs to be processed, so proper planning is needed to determine which type of sensors you need to process the traffic.
The first thing you need to configure is an ATP workspace. Then install the sensor on your domain controllers, using the setup package and the access key.
NOTE: If you want to install the sensor silently, you can use the following parameters
"Azure ATP sensor Setup.exe" /quiet NetFrameworkCommandLineArguments="/q" AccessKey="3WlO0uKW7lY6Lk0+dfkfkJQ0qZV6aSq5WxLf71+fuBhggCl/BMs9JxfAwi7oy9vYGviazUS1EPpzte7z8s4grw=="
NOTE: If the service does not start, ensure that you entered the proper credentials. You can also debug further using the logs under –> C:\Program Files\Azure Advanced Threat Protection Sensor\2.42.5155.30364\Logs
After installing the sensor, you also have the option to configure a domain controller as a domain synchronizer candidate. A domain synchronizer candidate can be responsible for synchronization between Azure ATP and your Active Directory domain. Depending on the size of the domain, the initial synchronization might take some time and is resource-intensive. By default, only Azure ATP standalone sensors are set as domain synchronizer candidates, as listed in the install guide here –> https://docs.microsoft.com/en-us/azure-advanced-threat-protection/install-atp-step5
This allows us to easily search through Active Directory resources and correlate between activities. You can also use this ATA Playbook to simulate attacks to see how ATP/ATA works as well –> https://gallery.technet.microsoft.com/ATA-Playbook-ef0a8e38
As an example from my test environment
So, for a short summary of best practices:
- Ensure that you have proper ACLs in place, and group them properly based on services and roles.
- Ensure that you have a proper baseline in place for the naming of resources, and policies to ensure tags and agent installs.
- Monitor ACLs and traffic flow (I’ll come back to that in a later piece).
- Use ATA or Azure ATP to monitor the security of Active Directory.
So this was the first piece on security for infrastructure. Part 3 of this series will continue focusing on IaaS and cover some of the missing pieces, such as:
- Using Windows Defender ATP for Detection and integration with Azure ATP
- Using Log Analytics and Azure Security Center for monitoring and logging of events
- Using Azure Security Center to provide JIT and Adaptive Controls
- Patch management and Change Management
- Encryption of virtual machines