Application performance monitoring and troubleshooting solutions for Azure VMware Solution
A key objective of Azure VMware Solution is to maintain the performance and security of applications and services across VMware on Azure and on-premises. Getting there requires visibility into complex infrastructures and quickly pinpointing the root cause of service disruptions across the hybrid cloud.
Microsoft solutions
Microsoft recommends Application Insights, a feature of Azure Monitor, to maximize the availability and performance of your applications and services.
Learn how modern monitoring with Azure Monitor can transform your business by reviewing the product overview, features, getting started guide and more.
Azure Resource Health for Azure VMware Solution Private Cloud (Public preview)
In this article, you learn how Azure Resource Health helps you diagnose and get support for service problems that affect your Private Cloud resources. Azure Resource Health reports on the current and past health of your Private Cloud Infrastructure resources and provides you with a personalized dashboard of the health of the infrastructure resources. Azure Resource Health allows you to report on historical events and can identify every time a service is unavailable and if Service Level Agreement (SLA) is violated.
Preview Enablement
You are required to register yourself for the feature preview under Preview Features of Azure VMware Solution in Azure portal. Customers should first register themselves to "Microsoft.AVS/ResourceHealth" preview flag from Azure portal and once registered, all the preconfigured alerts related to Host replacement, vCenter, and other critical alarms will start to surface in the Resource Health of Azure VMware Solution (AVS) User Interface (UI).
Benefits of enabling Resource Health
Resource Health feature enablement adds significant value to your monitoring capabilities. You get notified about unplanned maintenance that took place in your private cloud infrastructure.
Resource Health gives you a personalized dashboard of the health of your resources. Resource Health shows all the time that your resources have been unavailable which makes it easy for you to check if SLA was violated.
For the Public Preview, a group of critical alerts are enabled which notifies you about Host replacements, storage critical alarms and also about the Network health of your private cloud.
The alerts are updated to have all the necessary information for better reporting and triage purposes.
Resource Health uses Azure Action groups that allow you to configure Email/SMS/Webhook/ITSM and get notified via communication method of your choice.
Once Enabled the health of your private cloud infrastructure reflects following statuses
Available
Unavailable
Unknown
Degraded
Available
Available means that there are no events detected that affect the health of the resource. In cases where the resource recovered from unplanned downtime during the last 24 hours, you see a "Recently resolved" notification
Unavailable
Unavailable means that the service detected an ongoing platform or nonplatform event that affects the health of the resource.
Unknown
Unknown means that Resource Health hasn't received information about the resource for more than 10 minutes. You may see this status under two different conditions:
Your subscription is not enabled for Resource Health metrics, and you need to register yourself for the preview.
If the resource is running as expected, the status of the resource will change to Available after a few minutes. If you experience problems with the resource, the Unknown health status might mean that an event in the private cloud is affecting the resource.
Degraded
Degraded means that Resource Health detected a loss in performance in either one or more private cloud resources, although it's still available for use. Different resources have their own criteria for when they report that they are degraded.
Pre-configured Alarms enabled in Azure Resource Health
Alert Name | Remediation Mode |
---|---|
Physical Disk Health Alarm | System Remediation |
System Board Health Alarm | System Remediation |
Memory Health Alarm | System Remediation |
Storage Health Alarm | System Remediation |
Temperature Health Alarm | System Remediation |
Host Connection State Alarm | System Remediation |
High Availability (HA) host Status | System Remediation |
Network Connectivity Lost Alarm | System Remediation |
Virtual Storage (vSAN) Host Disk Error Alarm | System Remediation |
Voltage Health Alarm | System Remediation |
Processor Health Alarm | System Remediation |
Fan Health Alarm | System Remediation |
High pNIC error rate detected | System Remediation |
iDRAC critical alerts if there are hardware faults (CPU/DIMM/PCI bus/Voltage issues) | System Remediation |
vSphere HA restarted a virtual machine | System Remediation |
Virtual Storage (vSAN) High Disk Utilization | Customer Intervention Required |
Replacement Start and Stop Notification | System Remediation |
Repair Service notification to customers (Host reboot and Restart of Management services) | System Remediation |
Notification to customer when a Virtual Machine is configured to use an external device that prevents a maintenance operation | Customer Intervention Required |
Customer notification when CD-ROM is mounted on the Virtual Machine and its ISO image isn't accessible and blocks maintenance operation | Customer Intervention Required |
Notification to customer when an external Datastore mounted becomes inaccessible and will block maintenance operations | Customer Intervention Required |
Notification to customer when connected network adapter becomes inaccessible and blocks any maintenance operations | Customer Intervention Required |
VMware Network (NSX –T) alarms (Customer notification about License expiration) | Customer Intervention Required |
Next Steps
Now that you have configured an alert rule for your Azure VMware Solution private cloud, you can learn more about:
You can also continue with one of the other Azure VMware Solution how-to guides
Third-party solutions
Our application performance monitoring and troubleshooting partners have industry-leading solutions in VMware-based environments that assure the availability, reliability, and responsiveness of applications and services. You can adopt many of the solutions integrated with VMware NSX-T Data Center for their on-premises deployments. As one of our key principles, we want to enable you to continue to use your investments and VMware solutions running on Azure. Many of the Independent Software Vendors (ISV) already validated their solutions with Azure VMware Solution.
You can find more information about these solutions here: