Azure Monitoring alerting rule to notify on non-compliant resources

When you use Azure Policy as part of your governance framework, there is one thing that has always bugged me: alerting on non-compliant resources. In the portal you get a list of non-compliant resources as part of the Policy view. However, there is no alerting mechanism built into Azure Policy.

There is, however, a way around this. All Azure Policy activities are logged in Azure Monitor under Policy activities. Using diagnostic settings in Azure Monitor, we can forward these logs to Azure Log Analytics. From there we can define an action group to trigger email notifications, webhooks, or whatever else is required.

Diagnostics integration is configured in Azure Monitor under Diagnostic settings, where you can forward logs to a central Log Analytics workspace.
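The same forwarding can also be scripted. The following is a minimal sketch, assuming a recent version of the Az.Monitor module (which provides New-AzSubscriptionDiagnosticSetting and New-AzDiagnosticSettingSubscriptionLogSettingsObject) and an already authenticated session. The function name Enable-PolicyLogForwarding and the default setting name are made up for illustration.

```powershell
# Sketch only: forwards the Policy category of the subscription Activity Log
# to a Log Analytics workspace. Assumes a recent Az.Monitor module and an
# authenticated session (Connect-AzAccount). The function name and the
# default setting name are hypothetical.
function Enable-PolicyLogForwarding {
    param(
        [Parameter(Mandatory)] [string] $SubscriptionId,
        # Full ARM resource ID of the target Log Analytics workspace
        [Parameter(Mandatory)] [string] $WorkspaceResourceId,
        [string] $SettingName = 'policy-to-log-analytics'
    )

    # Enable only the Policy category of the Activity Log
    $policyLog = New-AzDiagnosticSettingSubscriptionLogSettingsObject `
        -Category 'Policy' -Enabled $true

    New-AzSubscriptionDiagnosticSetting `
        -Name $SettingName `
        -SubscriptionId $SubscriptionId `
        -WorkspaceId $WorkspaceResourceId `
        -Log $policyLog
}
```

Nothing is changed until the function is called with a subscription ID and the resource ID of your own workspace.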

Once that is configured, all future events will be stored in Log Analytics. How long it takes before logs start appearing in Log Analytics depends on what triggers the policy evaluation:

  • A policy or initiative is newly assigned to a scope. It takes around 30 minutes for the assignment to be applied to the defined scope.
  • A resource is deployed to a scope with an assignment via Azure Resource Manager, REST, Azure CLI, or Azure PowerShell. In this scenario, the effect event (append, audit, deny, deploy) and the compliance status information for the individual resource become available in the portal and SDKs around 15 minutes later.
  • Standard compliance evaluation cycle. Once every 24 hours, assignments are automatically reevaluated.

Once we have the log data in Log Analytics, we can run queries against it and define an action. All Azure Policy log data is collected in the AzureActivity table in Log Analytics.

AzureActivity
| where CategoryValue == "Policy"

Going a bit deeper, we can define a more granular Kusto query that looks at non-compliant resource activity logs in Log Analytics.

AzureActivity
| where CategoryValue == "Policy"
| where parse_json(Properties).isComplianceCheck == "False"
| extend resource_ = tostring(Properties_d.resource)

This Kusto query selects AzureActivity logs in the Policy category, parses the JSON in Properties, and keeps the entries where isComplianceCheck is False. It also extracts the resource name so that it is easier to structure an alert.
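If you want to try the query from PowerShell before turning it into an alert, a small helper can build it with a lookback window. This is a minimal sketch: the function name New-PolicyNonComplianceQuery and the LookbackHours parameter are made up for illustration, and the TimeGenerated filter and the summarize line are my additions to the query above.

```powershell
# Sketch only: builds the non-compliance query with a configurable lookback
# window. The function name and parameter are hypothetical.
function New-PolicyNonComplianceQuery {
    param([ValidateRange(1, 720)] [int] $LookbackHours = 24)

    @(
        'AzureActivity'
        "| where TimeGenerated > ago($($LookbackHours)h)"
        '| where CategoryValue == "Policy"'
        '| where parse_json(Properties).isComplianceCheck == "False"'
        '| extend resource_ = tostring(Properties_d.resource)'
        '| summarize Events = count() by resource_'
    ) -join "`n"
}

$query = New-PolicyNonComplianceQuery -LookbackHours 24
Write-Output $query

# To run it against a workspace (needs Az.OperationalInsights and a login):
# $result = Invoke-AzOperationalInsightsQuery -WorkspaceId '<workspace-guid>' -Query $query
# $result.Results | Format-Table
```

The workspace GUID in the commented lines is a placeholder for your own workspace ID.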

These queries are useful if your Azure Policies use the Audit, AuditIfNotExists, or Deny effects. With DeployIfNotExists policies, however, you will run into issues, because you get multiple log entries: first a Failure, then an Accept, and then a Success entry. The resource itself is compliant once the remediation has been deployed, but based on the log entries you will be notified about non-compliant resources until the remediation task has run. This means you will get alerts even if the resource has already been remediated.

Therefore we need something that reports the actual compliance status of resources rather than relying on log entries. This is where Az.PolicyInsights comes in, a PowerShell module whose Get-AzPolicyState cmdlet collects the actual status from Azure Policy.

For instance, using the cmdlet we can get the current status of non-compliant resources.

Get-AzPolicyState -filter "ComplianceState eq 'NonCompliant'" -Apply "groupby((ResourceId))"
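The grouped result gives you resource IDs, which are long and not very readable in an alert. As a minimal sketch, a helper like the one below can split an ID into a resource group and a resource name. The function name ConvertFrom-ResourceId and the sample ID are made up for illustration.

```powershell
# Sketch only: splits an ARM resource ID into the parts that are useful in an
# alert message. The function name is hypothetical.
function ConvertFrom-ResourceId {
    param([Parameter(Mandatory)] [string] $ResourceId)

    $segments = $ResourceId.Trim('/') -split '/'

    # -match is case-insensitive, so lowercased IDs work as well
    $resourceGroup = $null
    if ($ResourceId -match '/resourceGroups/([^/]+)') { $resourceGroup = $Matches[1] }

    [pscustomobject]@{
        ResourceId    = $ResourceId
        ResourceGroup = $resourceGroup
        ResourceName  = $segments[-1]   # last segment of the ID
    }
}

# Example with a made-up resource ID
$sample = '/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/rg-demo/providers/microsoft.storage/storageaccounts/stdemo01'
ConvertFrom-ResourceId -ResourceId $sample

# With real data (needs Az.PolicyInsights and a login):
# Get-AzPolicyState -Filter "ComplianceState eq 'NonCompliant'" -Apply "groupby((ResourceId))" |
#     ForEach-Object { ConvertFrom-ResourceId -ResourceId $_.ResourceId }
```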

So we need an automated way to collect resource compliance on a given schedule and then create an alert or log entry if a resource is still non-compliant after a policy with AuditIfNotExists has been configured. One way to do this is to use Azure Automation in conjunction with Logic Apps (in this example I’ve used Logic Apps to feed data back into Log Analytics so I can reuse existing Action Groups).

First off, I’m using the same Azure Policy definition. Then I have an Azure Automation runbook which does the following.

# Azure Automation script
# Keep the Azure context local to this runbook job
Disable-AzContextAutosave -Scope Process | Out-Null
$connection = Get-AutomationConnection -Name AzureRunAsConnection

# Retry the service principal login until it succeeds or the attempts run out
$logonAttempt = 0
$connectionResult = $null
while (!($connectionResult) -and ($logonAttempt -le 10))
{
    $logonAttempt++
    # Logging in to Azure...
    $connectionResult = Connect-AzAccount `
                            -ServicePrincipal `
                            -Tenant $connection.TenantID `
                            -ApplicationId $connection.ApplicationID `
                            -CertificateThumbprint $connection.CertificateThumbprint
    Start-Sleep -Seconds 10
}

# Collect one record per non-compliant resource
$output = Get-AzPolicyState -Filter "ComplianceState eq 'NonCompliant'" -Apply "groupby((ResourceId))"

# Report the count, or confirm that everything is compliant
$count = @($output).Count
if ($count -gt 0) { Write-Output "Non-Compliant Resources" $count }
else { Write-Output "All Resources Compliant" }

The script then checks whether there are more than zero non-compliant resources. If so, it writes Non-Compliant Resources followed by the count. I then have a Logic App which runs on a 24-hour schedule.

# Logic App Playbook

The Logic App creates a job that runs the runbook, collects the job output, and evaluates a condition against the Content returned by the job.

So if the content contains Non-Compliant, it triggers an action using the Log Analytics Send Data connector. Note that this connector action requires the data in JSON format.

This forwards the data to a new table within Log Analytics called AzurePolicy_CL (all data sent to Log Analytics through the REST API is stored in a custom table with the suffix _CL).

Since this now runs on a 24-hour schedule, I can check that data is being collected into Log Analytics.

And now I can reuse the same action groups to send alerts based upon the non-compliant resources as well.

NOTE: With this setup, no entry is generated if the script reports that all resources are compliant. There are better ways to structure the output, but this serves as an example.
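As one possible way to structure the output better, the runbook could emit a JSON summary that is always present, including when everything is compliant, which also suits a connector that expects JSON. This is a minimal sketch, not what the runbook above does: the function name ConvertTo-PolicyComplianceJson and the field names are made up for illustration.

```powershell
# Sketch only: turns a list of non-compliant resource IDs into a JSON summary
# that always contains a record, even when nothing is non-compliant.
# The function name and the field names are hypothetical.
function ConvertTo-PolicyComplianceJson {
    param([string[]] $NonCompliantResourceId = @())

    # Drop null or empty entries so the count reflects real resources only
    [string[]] $ids = @($NonCompliantResourceId | Where-Object { $_ })
    $status = if ($ids.Count -gt 0) { 'NonCompliant' } else { 'Compliant' }

    [pscustomobject]@{
        CheckedAtUtc          = (Get-Date).ToUniversalTime().ToString('o')
        Status                = $status
        NonCompliantCount     = $ids.Count
        NonCompliantResources = $ids
    } | ConvertTo-Json -Depth 3
}

# Example with made-up resource IDs
ConvertTo-PolicyComplianceJson -NonCompliantResourceId '/subscriptions/0000/resourceGroups/rg-demo/providers/Microsoft.Storage/storageAccounts/stdemo01'

# In the runbook this could replace the Write-Output lines:
# ConvertTo-PolicyComplianceJson -NonCompliantResourceId @($output).ResourceId
```

With a record on every run, a missing entry in Log Analytics then points to a failed job rather than to a compliant environment.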
