Azure Firewall monitoring data reference

This article contains all the monitoring reference information for this service.

See Monitor Azure Firewall for details on the data you can collect for Azure Firewall and how to use it.

Metrics

This section lists all the automatically collected platform metrics for this service. These metrics are also part of the global list of all platform metrics supported in Azure Monitor.

For information on metric retention, see Azure Monitor Metrics overview.

Supported metrics for Microsoft.Network/azureFirewalls

The following table lists the metrics available for the Microsoft.Network/azureFirewalls resource type.

  • All columns might not be present in every table.

Table headings

  • Category - The metrics group or classification.
  • Metric - The metric display name as it appears in the Azure portal.
  • Name in REST API - The metric name as referred to in the REST API.
  • Unit - Unit of measure.
  • Aggregation - The default aggregation type. Valid values: Average (Avg), Minimum (Min), Maximum (Max), Total (Sum), Count.
  • Dimensions - Dimensions available for the metric.
  • Time Grains - Intervals at which the metric is sampled. For example, PT1M indicates that the metric is sampled every minute, PT30M every 30 minutes, PT1H every hour, and so on.
  • DS Export - Whether the metric is exportable to Azure Monitor Logs via diagnostic settings. For information on exporting metrics, see Create diagnostic settings in Azure Monitor.

| Metric | Description | Name in REST API | Unit | Aggregation | Dimensions | Time Grains | DS Export |
|---|---|---|---|---|---|---|---|
| Application rules hit count | Number of times Application rules were hit | ApplicationRuleHit | Count | Total (Sum) | Status, Reason, Protocol | PT1M | Yes |
| Data processed | Total amount of data processed by this firewall | DataProcessed | Bytes | Total (Sum) | <none> | PT1M | Yes |
| Firewall health state | Indicates the overall health of this firewall | FirewallHealth | Percent | Average | Status, Reason | PT1M | Yes |
| Latency Probe | Estimate of the average latency of the Firewall as measured by latency probe | FirewallLatencyPng | Milliseconds | Average | <none> | PT1M | Yes |
| Network rules hit count | Number of times Network rules were hit | NetworkRuleHit | Count | Total (Sum) | Status, Reason, Protocol | PT1M | Yes |
| SNAT port utilization | Percentage of outbound SNAT ports currently in use | SNATPortUtilization | Percent | Average, Maximum | Protocol | PT1M | Yes |
| Throughput | Throughput processed by this firewall | Throughput | BitsPerSecond | Average | <none> | PT1M | No |

Firewall health state

In the preceding table, the Firewall health state metric has two dimensions:

  • Status: Possible values are Healthy, Degraded, Unhealthy.
  • Reason: Indicates the reason for the corresponding status of the firewall.

If more than 95% of SNAT ports are in use, they're considered exhausted, and the health is 50% with status=Degraded and reason=SNAT port. The firewall keeps processing traffic and existing connections aren't affected. However, new connections might intermittently fail to be established.

If less than 95% of SNAT ports are in use, the firewall is considered healthy and health is shown as 100%.

If no SNAT port usage is reported, health is shown as 0%.
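
The three cases above can be read as a small decision rule. The following Python sketch is illustrative only and isn't part of Azure Firewall or any Azure SDK. The function name and the return shape are hypothetical, and treating exactly 95% as healthy is an assumption, because the text above doesn't say which side of the threshold that value falls on.

```python
from typing import Optional, Tuple

SNAT_EXHAUSTION_THRESHOLD = 95.0  # percent, taken from the text above


def firewall_health(
    snat_utilization: Optional[float],
) -> Tuple[int, Optional[str], Optional[str]]:
    """Return (health percent, status, reason) for one SNAT utilization sample.

    Pass None when no SNAT port usage is reported.
    """
    if snat_utilization is None:
        # The text gives only the health value for this case, not a status.
        return 0, None, None
    if snat_utilization > SNAT_EXHAUSTION_THRESHOLD:
        return 50, "Degraded", "SNAT port"
    # Assumption: exactly 95% counts as healthy; the source leaves this open.
    return 100, "Healthy", None


print(firewall_health(97.5))
print(firewall_health(40.0))
print(firewall_health(None))
```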

SNAT port utilization

For the SNAT port utilization metric, when you add more public IP addresses to your firewall, more SNAT ports are available, which reduces SNAT port utilization. Additionally, when the firewall scales out for different reasons (for example, CPU or throughput), more SNAT ports also become available.

In effect, SNAT port utilization might go down without you adding any public IP addresses, just because the service scaled out. You can directly control the number of public IP addresses to increase the ports available on your firewall, but you can't directly control firewall scaling.

If your firewall is running into SNAT port exhaustion, you should add at least five public IP addresses. This increases the number of SNAT ports available. For more information, see Azure Firewall features.
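
To see why both levers lower the percentage, consider a simplified model in which the total number of SNAT ports grows with the number of public IP addresses and with the number of firewall instances. The Python sketch below is a rough illustration under that assumption. The multiplicative model, the function name, and the value of HYPOTHETICAL_PORTS_PER_IP_PER_INSTANCE are placeholders chosen for easy arithmetic, not Azure figures. See Azure Firewall features for the actual port counts.

```python
# Placeholder value for illustration only; not an Azure Firewall figure.
HYPOTHETICAL_PORTS_PER_IP_PER_INSTANCE = 1000


def snat_utilization_percent(
    ports_in_use: int,
    public_ips: int,
    instances: int,
    ports_per_ip_per_instance: int = HYPOTHETICAL_PORTS_PER_IP_PER_INSTANCE,
) -> float:
    """Percentage of SNAT ports in use under the simplified model."""
    total_ports = public_ips * instances * ports_per_ip_per_instance
    if total_ports <= 0:
        raise ValueError("at least one public IP address and one instance are required")
    return 100.0 * ports_in_use / total_ports


# Same port demand in all three cases.
print(snat_utilization_percent(1900, public_ips=1, instances=2))  # starting point
print(snat_utilization_percent(1900, public_ips=6, instances=2))  # five more public IP addresses
print(snat_utilization_percent(1900, public_ips=1, instances=4))  # service scaled out on its own
```

Only the second case is something you control directly. The third happens when the service decides to scale.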

AZFW Latency Probe

The AZFW Latency Probe metric measures the overall or average latency of Azure Firewall in milliseconds. Administrators can use this metric for the following purposes:

  • Diagnose whether Azure Firewall is the cause of latency in the network.
  • Monitor and alert on latency or performance issues, so IT teams can proactively engage.

High latency in Azure Firewall can have various causes, for example, high CPU utilization, high throughput, or a possible networking issue.

What the AZFW Latency Probe Metric Measures (and Doesn't):

  • What it measures: The latency of the Azure Firewall within the Azure platform
  • What it doesn't measure: The metric doesn't capture end-to-end latency for the entire network path. It reflects the performance within the firewall rather than how much latency Azure Firewall introduces into the network.
  • Error reporting: If the latency metric isn't functioning correctly, it reports a value of 0 in the metrics dashboard, indicating a probe failure or interruption.

Factors that impact latency:

  • High CPU utilization
  • High throughput or traffic load
  • Networking issues within the Azure platform

Latency Probes: From ICMP to TCP

The latency probe currently uses Microsoft's Ping Mesh technology, which is based on ICMP (Internet Control Message Protocol). ICMP is suitable for quick health checks, like ping requests, but it might not accurately represent real-world application traffic, which typically relies on TCP. In addition, ICMP probes are prioritized differently across the Azure platform, which can result in variation across SKUs. To reduce these discrepancies, Azure Firewall plans to transition to TCP-based probes.

  • Latency spikes: With ICMP probes, intermittent spikes are normal and are part of the host network's standard behavior. These should not be misinterpreted as firewall issues unless they are persistent.
  • Average latency: On average, the latency of Azure Firewall is expected to range from 1 ms to 10 ms, depending on the Firewall SKU and deployment size.

Best Practices for Monitoring Latency

  • Set a baseline: Establish a latency baseline under light traffic conditions for accurate comparisons during normal or peak usage.

  • Monitor for patterns: Expect occasional latency spikes as part of normal operations. If high latency persists beyond these normal variations, it may indicate a deeper issue requiring investigation.

  • Recommended latency threshold: As a guideline, latency should not exceed 3x the baseline. If it does, investigate further.

  • Check the rule limit: Ensure that the network rules are within the 20K rule limit. Exceeding this limit can affect performance.

  • New application onboarding: Check for any newly onboarded applications that could be adding significant load or causing latency issues.

  • Support request: If you observe continuous latency degradation that does not align with expected behavior, consider filing a support ticket for further assistance.

    Screenshot showing the Azure Firewall Latency Probe metric.
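
The baseline and threshold practices above can be combined into a simple check over exported metric samples. The following Python sketch is one possible interpretation, not an Azure feature. The function names are hypothetical, using the median as the baseline is an assumption, and so is the rule that five consecutive samples above 3x the baseline count as persistent. Samples equal to 0 are skipped because the metric reports 0 on probe failure.

```python
from statistics import median
from typing import Sequence


def latency_baseline_ms(light_traffic_samples_ms: Sequence[float]) -> float:
    """Median of the non-zero samples collected under light traffic."""
    valid = [s for s in light_traffic_samples_ms if s > 0]  # 0 means probe failure
    if not valid:
        raise ValueError("no valid latency samples to build a baseline from")
    return median(valid)


def is_persistently_high(
    samples_ms: Sequence[float],
    baseline_ms: float,
    multiplier: float = 3.0,   # the 3x guideline from the list above
    min_consecutive: int = 5,  # assumption: how many samples in a row count as persistent
) -> bool:
    """True if latency stays above multiplier * baseline for min_consecutive samples."""
    threshold_ms = multiplier * baseline_ms
    run = 0
    for sample in samples_ms:
        if sample == 0:
            continue  # probe failure: neither extends nor resets the run
        if sample > threshold_ms:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0  # an isolated spike ended; start counting again
    return False


baseline = latency_baseline_ms([2.0, 2.2, 0, 1.8])
print(baseline)
print(is_persistently_high([2.1, 9.0, 2.0, 2.3], baseline))            # isolated spike
print(is_persistently_high([7.0, 7.5, 0, 8.0, 7.2, 9.1], baseline))    # sustained
```

Requiring a run of consecutive samples keeps the occasional ICMP spike from raising an alert while still catching sustained degradation.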

Metric dimensions

For information about what metric dimensions are, see Multi-dimensional metrics.

This service has the following dimensions associated with its metrics.

  • Protocol
  • Reason
  • Status

Resource logs

This section lists the types of resource logs you can collect for this service. The section pulls from the list of all resource logs category types supported in Azure Monitor.

Supported resource logs for Microsoft.Network/azureFirewalls

| Category | Category display name | Log table | Supports basic log plan | Supports ingestion-time transformation | Example queries | Costs to export |
|---|---|---|---|---|---|---|
| AZFWApplicationRule | Azure Firewall Application Rule | AzureDiagnostics | No | No | Queries | Yes |
| AZFWApplicationRuleAggregation | Azure Firewall Network Rule Aggregation (Policy Analytics) | AzureDiagnostics | No | No | Queries | Yes |
| AZFWDnsQuery | Azure Firewall DNS query | AzureDiagnostics | No | No | Queries | Yes |
| AZFWFatFlow | Azure Firewall Fat Flow Log | AzureDiagnostics | No | No | Queries | Yes |
| AZFWFlowTrace | Azure Firewall Flow Trace Log | AzureDiagnostics | No | No | Queries | Yes |
| AZFWFqdnResolveFailure | Azure Firewall FQDN Resolution Failure | AzureDiagnostics | No | No | Queries | Yes |
| AZFWIdpsSignature | Azure Firewall IDPS Signature | AzureDiagnostics | No | No | Queries | Yes |
| AZFWNatRule | Azure Firewall Nat Rule | AzureDiagnostics | No | No | Queries | Yes |
| AZFWNatRuleAggregation | Azure Firewall Nat Rule Aggregation (Policy Analytics) | AzureDiagnostics | No | No | Queries | Yes |
| AZFWNetworkRule | Azure Firewall Network Rule | AzureDiagnostics | No | No | Queries | Yes |
| AZFWNetworkRuleAggregation | Azure Firewall Application Rule Aggregation (Policy Analytics) | AzureDiagnostics | No | No | Queries | Yes |
| AZFWThreatIntel | Azure Firewall Threat Intelligence | AzureDiagnostics | No | No | Queries | Yes |
| AzureFirewallApplicationRule | Azure Firewall Application Rule (Legacy Azure Diagnostics) | AzureDiagnostics | No | No | Queries | No |
| AzureFirewallDnsProxy | Azure Firewall DNS Proxy (Legacy Azure Diagnostics) | AzureDiagnostics | No | No | Queries | No |
| AzureFirewallNetworkRule | Azure Firewall Network Rule (Legacy Azure Diagnostics) | AzureDiagnostics | No | No | Queries | No |

The AzureDiagnostics table contains logs from multiple Azure resources.

Azure Firewall has two new diagnostic logs that can help monitor your firewall, but these logs currently do not show application rule details.

  • Top flows
  • Flow trace

Top flows

The top flows log is known in the industry as the fat flow log and appears in the preceding table as Azure Firewall Fat Flow Log. It shows the connections that contribute the most throughput through the firewall.

Tip

Activate Top flows logs only when troubleshooting a specific issue to avoid excessive CPU usage of Azure Firewall.

The flow rate is defined as the data transmission rate in megabits per second (Mbps). It's a measure of the amount of digital data that can be transmitted over a network through the firewall in a period of time. The Top Flows protocol runs every three minutes. The minimum threshold to be considered a Top Flow is 1 Mbps.
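
As a rough illustration of the threshold, the following Python sketch converts a byte count observed in one three-minute window into Mbps and keeps only flows at or above 1 Mbps. It isn't how Azure Firewall computes the log internally. The Flow type, its field names, and the assumption that the rate is averaged over the whole window are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

INTERVAL_SECONDS = 180     # the log runs every three minutes
MIN_TOP_FLOW_MBPS = 1.0    # minimum rate to be considered a Top Flow


@dataclass(frozen=True)
class Flow:
    """Hypothetical record: one connection and the bytes it moved in one window."""
    source: str
    destination: str
    bytes_transferred: int


def flow_rate_mbps(bytes_transferred: int, interval_seconds: int = INTERVAL_SECONDS) -> float:
    """Average rate in megabits per second (1 Mbps = 1,000,000 bits per second)."""
    return bytes_transferred * 8 / interval_seconds / 1_000_000


def top_flows(flows: Sequence[Flow], min_mbps: float = MIN_TOP_FLOW_MBPS) -> List[Tuple[Flow, float]]:
    """Flows at or above the threshold, highest rate first."""
    rated = [(flow, flow_rate_mbps(flow.bytes_transferred)) for flow in flows]
    qualifying = [(flow, rate) for flow, rate in rated if rate >= min_mbps]
    return sorted(qualifying, key=lambda pair: pair[1], reverse=True)


sample = [
    Flow("10.0.0.4", "203.0.113.10", 22_500_000),  # exactly 1 Mbps over 180 s
    Flow("10.0.0.5", "203.0.113.20", 45_000_000),  # 2 Mbps
    Flow("10.0.0.6", "203.0.113.30", 1_000_000),   # well below the threshold
]
for flow, rate in top_flows(sample):
    print(flow.source, flow.destination, rate)
```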

Enable the Top flows log using the following Azure PowerShell commands:

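# Select the subscription that contains the firewall.
Set-AzContext -SubscriptionName <SubscriptionName>
# Get the firewall, turn on Top flows (fat flow) logging, and apply the change.
$firewall = Get-AzFirewall -ResourceGroupName <ResourceGroupName> -Name <FirewallName>
$firewall.EnableFatFlowLogging = $true
Set-AzFirewall -AzureFirewall $firewall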

To disable the log, run the same Azure PowerShell commands with EnableFatFlowLogging set to $false.

For example:

Set-AzContext -SubscriptionName <SubscriptionName>
$firewall = Get-AzFirewall -ResourceGroupName <ResourceGroupName> -Name <FirewallName>
$firewall.EnableFatFlowLogging = $false
Set-AzFirewall -AzureFirewall $firewall

There are a few ways to verify that the update succeeded. One way is to go to the firewall Overview page and select JSON view in the upper-right corner. Here's an example:

Screenshot of JSON showing additional log verification.

To create a diagnostic setting and enable Resource Specific Table, see Create diagnostic settings in Azure Monitor.

Flow trace

The firewall logs show traffic through the firewall in the first attempt of a TCP connection, known as the SYN packet. However, such an entry doesn't show the full journey of the packet in the TCP handshake. As a result, it's difficult to troubleshoot whether a packet was dropped or asymmetric routing occurred. The Azure Firewall Flow Trace Log addresses this concern.

Tip

To avoid excessive disk usage caused by Flow trace logs in Azure Firewall with many short-lived connections, activate the logs only when troubleshooting a specific issue for diagnostic purposes.

The following properties can be added:

  • SYN-ACK: ACK flag that indicates acknowledgment of SYN packet.

  • FIN: Finished flag of the original packet flow. No more data is transmitted in the TCP flow.

  • FIN-ACK: ACK flag that indicates acknowledgment of FIN packet.

  • RST: The reset flag indicates that the original sender doesn't receive more data.

  • INVALID (flows): Indicates that the packet can't be identified or doesn't have any state.

    For example:

    • A TCP packet lands on a Virtual Machine Scale Sets instance, which doesn't have any prior history for this packet
    • Bad CheckSum packets
    • The connection tracking table is full and new connections can't be accepted
    • Overly delayed ACK packets
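
One way to use these flags when troubleshooting is to compare the flows that logged a SYN with the flags later recorded for the same flow. The following Python sketch lists flows that never show a SYN-ACK, which can point to a dropped packet or asymmetric routing. It's a minimal illustration only. The FlowKey tuple layout and the shape of the input records are hypothetical and don't match the actual log schema.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Set, Tuple

# Hypothetical flow identity: (source IP, source port, destination IP, destination port)
FlowKey = Tuple[str, int, str, int]


def flags_by_flow(trace_records: Iterable[Tuple[FlowKey, str]]) -> Dict[FlowKey, Set[str]]:
    """Group the flags seen in flow trace records by flow."""
    seen: Dict[FlowKey, Set[str]] = defaultdict(set)
    for key, flag in trace_records:
        seen[key].add(flag)
    return dict(seen)


def flows_missing_syn_ack(
    syn_flows: Sequence[FlowKey],
    trace_records: Iterable[Tuple[FlowKey, str]],
) -> List[FlowKey]:
    """Flows that logged a SYN but never a SYN-ACK, in their original order."""
    seen = flags_by_flow(trace_records)
    return [key for key in syn_flows if "SYN-ACK" not in seen.get(key, set())]


web = ("10.0.0.4", 50123, "203.0.113.10", 443)
db = ("10.0.0.5", 50999, "198.51.100.7", 1433)

syn_flows = [web, db]  # flows whose SYN appears in the rule logs
trace = [(web, "SYN-ACK"), (web, "FIN"), (web, "FIN-ACK"), (db, "INVALID")]

print(flows_missing_syn_ack(syn_flows, trace))
```

In this made-up sample, only the second flow is reported, because its trace contains an INVALID entry and no SYN-ACK.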

Enable the Flow trace log by using the following Azure PowerShell commands, or go to the portal and search for Enable TCP Connection Logging:

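# Sign in and select the subscription that contains the firewall (ID or name).
Connect-AzAccount
Select-AzSubscription -Subscription <subscription_id_or_name>
# Register the TCP connection logging feature, then re-register the resource provider.
Register-AzProviderFeature -FeatureName AFWEnableTcpConnectionLogging -ProviderNamespace Microsoft.Network
Register-AzResourceProvider -ProviderNamespace Microsoft.Network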

It can take several minutes for this change to take effect. Once the feature is registered, consider performing an update on Azure Firewall for the change to take effect immediately.

To check the status of the feature registration, run the following Azure PowerShell command:

Get-AzProviderFeature -FeatureName "AFWEnableTcpConnectionLogging" -ProviderNamespace "Microsoft.Network"

To disable the log, you can unregister it using the following command or select unregister in the previous portal example.

Unregister-AzProviderFeature -FeatureName AFWEnableTcpConnectionLogging -ProviderNamespace Microsoft.Network

To create a diagnostic setting and enable Resource Specific Table, see Create diagnostic settings in Azure Monitor.

Azure Monitor Logs tables

This section lists the Azure Monitor Logs tables relevant to this service, which are available for query by Log Analytics using Kusto queries. The tables contain resource log data and possibly more depending on what is collected and routed to them.

Azure Firewall Microsoft.Network/azureFirewalls

Activity log

The linked table lists the operations that can be recorded in the activity log for this service. These operations are a subset of all the possible resource provider operations in the activity log.

For more information on the schema of activity log entries, see Activity Log schema.