Threat Hunting in Microsoft Azure

A while back, a customer asked me to help investigate an Azure environment that had been compromised and used to launch a ransomware attack. Unfortunately, the environment also had a VPN connection between Azure and the customer's existing on-premises data center, which meant that their entire infrastructure was eventually compromised.

The entry point was later found to be a server with a public IP address that was brute-forced over RDP. In this blog post, I show some of the mechanisms for monitoring abnormal traffic patterns in Azure and for checking whether recent changes led to the compromised server.

Within Azure, there are numerous tools and services that can monitor what’s going on within a virtual machine, including network traffic. So, what kind of data can we collect?

It is important to note that all of these detection mechanisms depend on using Log Analytics / Azure Sentinel to collect the data.

NOTE: There is also Microsoft 365 Defender, which is aimed more at endpoints but also supports virtual machines running a server OS and collects a large set of data from the VM. However, this data is stored in a separate data source and is not directly accessible from Azure Sentinel / Log Analytics unless you use the new preview connector from Microsoft.

Security Data Sources:

NOTE: Most of these data sources are also available if you are using Azure Arc for virtual machines.
The only log sources that are not available in that case are NSG Flow logs, Traffic Analytics and Azure Firewall.

  • Azure Automation (Not enabled by default)
    • Change Tracking (Looks at changes to Registry/Software/Services)
    • File Integrity / File Monitoring (Can be defined to look at changes to files)
    • Update Management (Handles updates to virtual machines)
  • Activity Log (Not collected into Log Analytics by default)
    • Collect Azure Resource Manager changes (By default stored for 30 days)
  • Network Security Group Flow Logs (Not enabled by default)
    • Traffic Analytics (NSG Flow logs that are enriched with Microsoft Threat Intelligence)
  • Azure AD Sign-in and Audit logs (Not collected into Log Analytics by default)
    • AuditLogs
    • SignInLogs
    • NonInteractiveUserSignInLogs
    • ServicePrincipalSignInLogs
    • ManagedIdentitySignInLogs
    • ProvisioningLogs
    • ADFSSignInLogs
    • RiskyUsers
    • UserRiskEvents
  • Azure Firewall (Not enabled by default)
    • AzureFirewallApplicationRule – Azure Firewall Application Rule
    • AzureFirewallNetworkRule – Azure Firewall Network Rule
  • Azure Application Gateway (Not enabled by default)
    • ApplicationGatewayAccessLog – Application Gateway Access Log
    • ApplicationGatewayFirewallLog – Application Gateway Firewall Log
    • ApplicationGatewayPerformanceLog – Application Gateway Performance Log
  • Windows Security Events (Using Sentinel or Azure Defender for Cloud) (Not enabled by default)
  • Service Map (Not enabled by default)
    • Virtual Machine Connections
    • Virtual Machine Processes
  • Security Center (Not enabled by default)
    • Protection Status

            This is an overview of the data that can be collected using the different built-in services within Microsoft Azure. To get complete insight into what kind of traffic or events are occurring within your virtual infrastructure or against your services, you need to have data collection enabled.

             

            1: Understand if there have been some changes to the environment

            Sometimes you need to check whether a sudden change to the environment had a negative impact on availability or security, or whether a malicious actor has made changes to the environment.

            All changes within Microsoft Azure are reflected as changes to JSON attributes. This means that a change to a firewall rule, storage or security setting in Azure is made by changing a JSON value. All these changes are logged in the Azure Activity Log. This log can be viewed by going into Azure Monitor –> Activity Log, where for each activity you can see the JSON attribute that has been changed.

            The problem is that this requires you to understand the attribute references and how they affect the resource. Secondly, by default the activity log is only stored for 30 days. That means that if changes were made a while back and you do not store this log data within Log Analytics, the data is lost.

            You also have the option to use the Change Analysis feature, which presents the same information in a UI that makes it much easier to see what has been changed.

            If you do not have access to this log and the change was made a while back (over 30 days ago), you can still view the deployment information, provided the resources were deployed using Azure Resource Manager, either through the portal or using templates. (NOTE: This is not generated if you are using Terraform to do deployment)

            This allows you to view the deployments and the changes that were made to resources within a resource group.

            When you view a deployment made to a resource group, you can also see the attributes and parameters that were defined and the changes made to the given resource as part of that deployment.

            This approach, however, only shows what changes have been made to an Azure resource. It does not give insight into changes made inside the actual virtual machine, apart from cases such as a guest agent being installed outside of using extensions.

            Regardless, ensure that the Azure Activity Log is collected into a centralized Log Analytics workspace. This also makes it easier to monitor usage of and changes to the Azure environment.

            With a simple Kusto query, you can use this data to get information about where activity is coming from. The query below summarizes activity by the user that made the changes and that user's IP address.

            AzureActivity 
            | summarize count() by CallerIpAddress, Caller
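
The same table can be narrowed down to one kind of change. The following is a minimal sketch, assuming the Activity Log is already collected into the workspace, that lists successful writes to NSG security rules, which is the kind of change that could expose RDP to the internet. The operation name, the "Success" status value and the 30-day window are assumptions to adjust for your own environment.

```kusto
// Sketch: who changed NSG security rules, and from which IP address.
// Assumptions: the operation name and the "Success" status value match
// what your workspace records; the 30-day window is illustrative.
AzureActivity
| where TimeGenerated > ago(30d)
| where OperationNameValue =~ "Microsoft.Network/networkSecurityGroups/securityRules/write"
| where ActivityStatusValue =~ "Success"
| project TimeGenerated, Caller, CallerIpAddress, ResourceGroup, _ResourceId
| sort by TimeGenerated desc
```

If nothing comes back, check which values OperationNameValue actually holds before concluding that no rules were changed.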

            Now if we have enabled Azure Automation and Change Tracking, we can also monitor changes within a virtual machine. This service can monitor changes to:

            • Windows Software
            • Linux software (packages)
            • Windows and Linux files
            • Windows registry keys
            • Windows services
            • Linux daemons

            Here are some example Kusto queries to monitor changes to Windows virtual machines.

            1: Monitor changes, other than changes to Windows services, where the publisher is not Microsoft.

            ConfigurationChange 
            | where ConfigChangeType <> "WindowsServices"
            | where Publisher <> "Microsoft Corporation"

            You can also configure how often it should be collecting info.

            2: Monitor changes to file systems. Note that this feature uses the FIM engine in Windows Defender (and requires Azure Defender for Cloud), and that you must define which files should be monitored.

            ConfigurationData 
            | where ConfigDataType == "Files"

            Note that it does not collect all file changes; you have to specify which folders it should monitor.

            2: Following the breadcrumbs

            So how can we monitor what kind of traffic is hitting our virtual infrastructure in Azure (or on-prem)? That depends on what kind of network architecture is being used and which Azure services are in use.

            Let me use an example where we have a hub and spoke network with Azure Firewall in combination with different security services.

            NOTE: To get visibility, it is important to enable diagnostics for each service wherever possible so that you have log data for the services. If you do not enable diagnostics, you will have extremely limited visibility.

            Traffic coming from an external source is aimed at an internal virtual machine. First, traffic needs to be processed by:

            Azure Firewall –> Network Flow –> NSG at the subnet, at the NIC, or both –> Virtual Machine

            NOTE: NSG Flow logs will only get into Log Analytics if used together with Traffic Analytics. The Azure Firewall could be replaced by an Application Gateway if the service is publicly available, but the traffic flow is similar.

            NOTE: The Azure Firewall service runs multiple load-balanced instances, so traffic might be sourced from multiple IP addresses. When looking into the log data from Azure Firewall, we therefore need to filter traffic based on the subnet and not on a specific source IP.

            The Azure Firewall logs all requests coming into the packet engine, so we can see what kind of traffic is going through, which actions were taken and which rules were applied.

            However, all log data from Azure Firewall will be placed within the AzureDiagnostics table using different categories. To view traffic going to a specific IP, we can use the following query.

            AzureDiagnostics 
            | where Category == "AzureFirewallNetworkRule"
            | where msg_s contains "10.200.1.4"
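
To filter on a subnet rather than a single address, the message text can be split into columns first. This is a sketch under stated assumptions: it assumes the network rule message follows the "<protocol> request from <ip>:<port> to <ip>:<port>" layout, and 10.200.1.0/24 is a hypothetical range used only for illustration. The column names SourceIp, TargetIp and TargetPort are introduced here and are not part of the table.

```kusto
// Sketch: match Azure Firewall network rule logs against a subnet range.
// Assumptions: msg_s has the "<proto> request from <ip>:<port> to <ip>:<port>" layout;
// 10.200.1.0/24 is a hypothetical subnet, replace it with your own.
AzureDiagnostics
| where Category == "AzureFirewallNetworkRule"
| parse msg_s with Protocol " request from " SourceIp ":" SourcePort:int " to " TargetIp ":" TargetPort:int *
| where ipv4_is_in_range(SourceIp, "10.200.1.0/24") or ipv4_is_in_range(TargetIp, "10.200.1.0/24")
| summarize Requests = count() by SourceIp, TargetIp, TargetPort
```

Rows whose message does not match the assumed layout end up with empty address columns and are dropped by the range filter.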
            
            

            The NSG flow logs will only be collected into Log Analytics if Traffic Analytics is enabled. By default, Microsoft enriches the data with information about the Azure subscription, such as whether traffic is coming from or going to another Azure service. The tables that are used are described in the Microsoft Docs article “Azure traffic analytics schema”.

            For traffic that is hitting a virtual machine, we also have some options to monitor network traffic within the virtual machine itself. This lets us see which process or service is receiving the traffic.

            One option is a solution called VM Insights, which starts collecting a new set of data from the virtual machine, such as:

            • VMProcess – Shows running processes on a virtual machine
            • VMConnection – Shows a running connection and attached process (DNS and IP)
            • VMComputer – Shows Computer information
            • VMBoundPort – Shows services and bound ports on OS
            • InsightsMetrics – Shows virtual machine metrics
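
As a small illustration of what these tables can answer, the following sketch lists which machines have a process listening on the RDP port. It assumes VM Insights is enabled and that RDP runs on the default port 3389; the 24-hour window is arbitrary.

```kusto
// Sketch: which machines have a process bound to the RDP port.
// Assumptions: VM Insights is populating VMBoundPort and RDP uses port 3389.
VMBoundPort
| where TimeGenerated >= ago(24h)
| where Port == 3389
| distinct Computer, ProcessName
```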

            If we combine this with SecurityEvent data (which can be collected using Azure Sentinel or Azure Defender for Servers), we can also use it to see traffic going against RDP services, for instance.

            Now the problem with Log Analytics and Sentinel is that each table has its own set of attributes, where one table might store the address in a column called IPAddress while another might store it in a column called Src_IPAddress. So how do we search across the different tables?

            As an example, we can combine data from two tables, SecurityEvent and SigninLogs, which store the address in differently named columns (IpAddress and IPAddress respectively).

            Here is an example of a Kusto query that does the following:

            1: Define the IP address that I’m looking for, along with filters such as failed login attempts. The address is defined in a variable, which can hold one or multiple IP addresses.
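
For a brute-force scenario like the one described at the start of this post, a first pass over the security events could look like the sketch below. It counts failed logons (event ID 4625) per source address and machine. The threshold of 50 and the 7-day window are assumptions chosen only for illustration, not recommended values.

```kusto
// Sketch: source addresses with many failed logons per machine.
// Assumptions: the threshold (50) and the time window (7d) are illustrative.
SecurityEvent
| where TimeGenerated > ago(7d)
| where EventID == 4625
| summarize FailedLogons = count(), Accounts = dcount(Account) by IpAddress, Computer
| where FailedLogons > 50
| sort by FailedLogons desc
```

A high number of distinct accounts from one address usually points to password guessing rather than a single user mistyping a password.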

            2: Map attributes to a common attribute for each table

            let IP = "8.8.8.8";
            (union isfuzzy=true
            (SecurityEvent 
            | where EventID == 4625
            | where IpAddress in (IP)
            | extend Ip = IpAddress, User = Account
            ),
            (SigninLogs
            | where IPAddress in (IP)
            | extend Ip = IPAddress, User = UserPrincipalName
            ))

            Using this query, the data from both tables (SecurityEvent and SigninLogs) is shown with the IP address and user in common columns called Ip and User. You can use a similar query to collect all the IP addresses that are connecting to the different services.

            (union isfuzzy=true
            (SecurityEvent 
            | where EventID == 4625
            | extend Ip = IpAddress, User = Account
            ),
            (SigninLogs
            | extend Ip = IPAddress, User = UserPrincipalName
            ))
            | summarize count() by Ip, User

            Now let’s try to map activity across all our sources against external “known” bad IP addresses. The list can come either from external data sources or from other known threat intelligence.

            let IP = (externaldata(ip:string)
            [@"https://rules.emergingthreats.net/blockrules/compromised-ips.txt",
            @"https://raw.githubusercontent.com/stamparm/ipsum/master/levels/5.txt",
            @"https://cinsscore.com/list/ci-badguys.txt",
            @"https://infosec.cert-pa.it/analyze/listip.txt",
            @"https://feodotracker.abuse.ch/downloads/ipblocklist_recommended.txt"
            ]
            with(format="csv")
            | where ip matches regex "(^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$)"
            | distinct ip
            );
            (union isfuzzy=true
            (SecurityEvent 
            | where EventID == 4625
            | where IpAddress in (IP)
            | extend Ip = IpAddress, User = Account
            ),
            (SigninLogs
            | where IPAddress in (IP)
            | extend Ip = IPAddress, User = UserPrincipalName
            ))

            This can use as many tables as you want; just make sure to have some filters in place so the query doesn’t have to traverse all the tables unfiltered, which can also consume a lot of resources. This is a simple query that allows us to easily scan traffic against known bad IP addresses.

            Now if we want to find incoming traffic over RDP, we can use the information from the VMConnection table, if we have any.

            let IP = (externaldata(ip:string)
            [@"https://rules.emergingthreats.net/blockrules/compromised-ips.txt",
            @"https://raw.githubusercontent.com/stamparm/ipsum/master/levels/5.txt",
            @"https://cinsscore.com/list/ci-badguys.txt",
            @"https://infosec.cert-pa.it/analyze/listip.txt",
            @"https://feodotracker.abuse.ch/downloads/ipblocklist_recommended.txt"
            ]
            with(format="csv")
            | where ip matches regex "(^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$)"
            | distinct ip
            );
            (union isfuzzy=true
            (SecurityEvent 
            | where EventID == 4625
            | where IpAddress in (IP)
            | extend Ip = IpAddress, User = Account
            ),
            (VMConnection
            | where SourceIp in (IP)
            | extend Ip = SourceIp
            | where DestinationPort == 3389
            | where LinksLive == 1
            ),
            (SigninLogs
            | where IPAddress in (IP)
            | extend Ip = IPAddress, User = UserPrincipalName
            ))

            You can also combine the “known” bad IP addresses with this compiled list from Microsoft (which was generated when the Log4j vulnerability was discovered) https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Sample Data/Feeds/Log4j_IOC_List.csv or even use them together with Watchlists within Sentinel, where you might have some known bad IP addresses of your own. You then reference the watchlist by name in the query.

            Here I’m referencing a Watchlist called TorExit that is precreated in Sentinel and projecting its TorIPAddress column, which I then check SecurityEvent against.

            let IPWatch = (_GetWatchlist('TorExit') | project TorIPAddress);
            let IP = (externaldata(ip:string)
            [@"https://rules.emergingthreats.net/blockrules/compromised-ips.txt",
            @"https://raw.githubusercontent.com/stamparm/ipsum/master/levels/5.txt",
            @"https://cinsscore.com/list/ci-badguys.txt",
            @"https://infosec.cert-pa.it/analyze/listip.txt",
            @"https://feodotracker.abuse.ch/downloads/ipblocklist_recommended.txt"
            ]
            with(format="csv")
            | where ip matches regex "(^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$)"
            | distinct ip
            );
            (union isfuzzy=true
            (SecurityEvent 
            | where EventID == 4625
            | where IpAddress in (IPWatch) or IpAddress in (IP)
            | extend Ip = IpAddress, User = Account
            ))

            If a VM within Azure was compromised, and you have VM Insights, are collecting Security Events and have enabled Traffic Analytics, you have a lot of data that can show what has happened.

            A good start is to map inbound network traffic to a specific VM or a specific port such as RDP, filtering on allowed inbound flows that do not come from inside the internal virtual network.

            AzureNetworkAnalytics_CL 
            | where DestPort_d == 3389
            | where AllowedInFlows_d >= 1
            | where FlowType_s <> "InterVNet"
            | summarize count() by SrcIP_s

            Here we might find some source IP addresses that we need to take a closer look at. One example is correlating the IPs seen in Network Analytics with Security Events to see if someone has been trying to get access to our network.

            let IP = (AzureNetworkAnalytics_CL 
            | where DestPort_d == 3389
            | where AllowedInFlows_d >= 1
            | where FlowType_s <> "InterVNet"
            | summarize count() by SrcIP_s);
            SecurityEvent
            | where IpAddress in (IP)
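
A possible next step, sketched here under the same assumptions as the query above, is to check whether any of those source addresses both failed and later succeeded in logging on. Event ID 4624 is a successful logon and 4625 a failed one. The names SuspectIps, Failed and Succeeded are introduced for this example only.

```kusto
// Sketch: addresses seen on allowed inbound RDP flows that have both
// failed (4625) and successful (4624) logons on the same machine.
// Assumption: the flow filters are the same as in the query above.
let SuspectIps = AzureNetworkAnalytics_CL
| where DestPort_d == 3389
| where AllowedInFlows_d >= 1
| where FlowType_s != "InterVNet"
| distinct SrcIP_s;
SecurityEvent
| where EventID in (4624, 4625)
| where IpAddress in (SuspectIps)
| summarize Failed = countif(EventID == 4625), Succeeded = countif(EventID == 4624) by IpAddress, Computer
| where Failed > 0 and Succeeded > 0
```

An address with many failures followed by a success is worth a closer look, but it can also be a legitimate user, so treat the result as a lead rather than proof.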

            In this blog post, we looked closer at using tools and services to do threat hunting and to look at abnormal traffic across different data sets. In the next blog post, we will dig deeper into other data sources within a virtual machine that we can use, in combination with Sentinel, to spot abnormal traffic.

             

             

             

             

             

             

             

             

             
