Supported metrics for Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments
The following table lists the metrics available for the Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments resource type.
Table headings
Metric - The metric display name as it appears in the Azure portal.
Name in REST API - Metric name as referred to in the REST API.
Unit - Unit of measure.
Aggregation - The default aggregation type. Valid values: Average, Minimum, Maximum, Total, Count.
Dimensions - Dimensions available for the metric.
Time Grains - Intervals at which the metric is sampled. For example, PT1M indicates that the metric is sampled every minute, PT30M every 30 minutes, PT1H every hour, and so on.
DS Export - Whether the metric is exportable to Azure Monitor Logs via Diagnostic Settings.
For information on exporting metrics, see Metrics export using data collection rules and Create diagnostic settings in Azure Monitor.
For information on metric retention, see Azure Monitor Metrics overview.
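The time-grain notation described under Table headings is an ISO 8601 duration. As a minimal sketch, the following Python helper converts such a value into a `timedelta`. The function name `parse_time_grain` is hypothetical, and the parser deliberately handles only the time-only subset (hours, minutes, seconds) that covers the examples on this page, not the full ISO 8601 duration grammar.

```python
import re
from datetime import timedelta

# Matches only the time-only subset of ISO 8601 durations, e.g. PT1M, PT30M, PT1H.
# Date components (such as P1D) are intentionally not handled in this sketch.
_GRAIN_RE = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_time_grain(grain: str) -> timedelta:
    """Convert a time grain such as 'PT1M' into a timedelta (hypothetical helper)."""
    match = _GRAIN_RE.match(grain)
    # Reject strings that do not match, and a bare 'PT' with no components.
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported time grain: {grain!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    for grain in ("PT1M", "PT30M", "PT1H"):
        print(grain, "->", parse_time_grain(grain))
```

Every metric in the table for this resource type uses the PT1M grain, so in practice the helper is mostly useful for validating user input before passing an interval to a query.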
Category | Metric | Name in REST API | Unit | Aggregation | Dimensions | Time Grains | DS Export |
---|---|---|---|---|---|---|---|
Resource | CPU Memory Utilization Percentage: Percentage of memory utilization on an instance. Utilization is reported at one-minute intervals. | CpuMemoryUtilizationPercentage | Percent | Minimum, Maximum, Average | instanceId | PT1M | Yes |
Resource | CPU Utilization Percentage: Percentage of CPU utilization on an instance. Utilization is reported at one-minute intervals. | CpuUtilizationPercentage | Percent | Minimum, Maximum, Average | instanceId | PT1M | Yes |
Resource | Data Collection Errors Per Minute: The number of data collection events dropped per minute. | DataCollectionErrorsPerMinute | Count | Minimum, Maximum, Average | instanceId, reason, type | PT1M | No |
Resource | Data Collection Events Per Minute: The number of data collection events processed per minute. | DataCollectionEventsPerMinute | Count | Minimum, Maximum, Average | instanceId, type | PT1M | No |
Resource | Deployment Capacity: The number of instances in the deployment. | DeploymentCapacity | Count | Minimum, Maximum, Average | instanceId, State | PT1M | No |
Resource | Disk Utilization: Percentage of disk utilization on an instance. Utilization is reported at one-minute intervals. | DiskUtilization | Percent | Minimum, Maximum, Average | instanceId, disk | PT1M | Yes |
Resource | GPU Energy in Joules: Interval energy in Joules on a GPU node. Energy is reported at one-minute intervals. | GpuEnergyJoules | Count | Minimum, Maximum, Average | instanceId | PT1M | No |
Resource | GPU Memory Utilization Percentage: Percentage of GPU memory utilization on an instance. Utilization is reported at one-minute intervals. | GpuMemoryUtilizationPercentage | Percent | Minimum, Maximum, Average | instanceId | PT1M | Yes |
Resource | GPU Utilization Percentage: Percentage of GPU utilization on an instance. Utilization is reported at one-minute intervals. | GpuUtilizationPercentage | Percent | Minimum, Maximum, Average | instanceId | PT1M | Yes |
Traffic | Request Latency P50: The average P50 request latency aggregated by all request latency values collected over the selected time period. | RequestLatency_P50 | Milliseconds | Average | <none> | PT1M | Yes |
Traffic | Request Latency P90: The average P90 request latency aggregated by all request latency values collected over the selected time period. | RequestLatency_P90 | Milliseconds | Average | <none> | PT1M | Yes |
Traffic | Request Latency P95: The average P95 request latency aggregated by all request latency values collected over the selected time period. | RequestLatency_P95 | Milliseconds | Average | <none> | PT1M | Yes |
Traffic | Request Latency P99: The average P99 request latency aggregated by all request latency values collected over the selected time period. | RequestLatency_P99 | Milliseconds | Average | <none> | PT1M | Yes |
Traffic | Requests Per Minute: The number of requests sent to the online deployment within a minute. | RequestsPerMinute | Count | Average | envoy_response_code | PT1M | No |
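
The REST API names and aggregations in the table above are the values passed when querying metrics programmatically. The following Python sketch builds a query URL for one metric and rejects aggregations the table does not list for it. It assumes the Azure Monitor metrics list endpoint (`/providers/Microsoft.Insights/metrics`) and the `2018-01-01` API version; confirm both against the current REST API reference before relying on them. The function name `build_metrics_url`, the `SUPPORTED_AGGREGATIONS` dictionary, and the placeholder resource ID are hypothetical, and only a few rows of the table are transcribed.

```python
from urllib.parse import urlencode

# A few rows transcribed from the table above: REST API name -> listed aggregations.
SUPPORTED_AGGREGATIONS = {
    "CpuUtilizationPercentage": {"Minimum", "Maximum", "Average"},
    "DiskUtilization": {"Minimum", "Maximum", "Average"},
    "RequestLatency_P95": {"Average"},
    "RequestsPerMinute": {"Average"},
}


def build_metrics_url(resource_id, metric, aggregation, timespan,
                      api_version="2018-01-01"):
    """Build a metrics query URL (hypothetical helper; api_version is an assumption)."""
    allowed = SUPPORTED_AGGREGATIONS.get(metric)
    if allowed is None:
        raise KeyError(f"metric not in the transcribed table: {metric}")
    if aggregation not in allowed:
        raise ValueError(f"{metric} does not list aggregation {aggregation}")
    query = urlencode({
        "api-version": api_version,
        "metricnames": metric,
        "aggregation": aggregation,
        "interval": "PT1M",   # the only time grain listed for this resource type
        "timespan": timespan,  # ISO 8601 interval: start/end
    })
    return (f"https://management.azure.com{resource_id}"
            f"/providers/Microsoft.Insights/metrics?{query}")


if __name__ == "__main__":
    # Placeholder resource ID; substitute real subscription, workspace, and names.
    deployment_id = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.MachineLearningServices/workspaces/<workspace>"
        "/onlineEndpoints/<endpoint>/deployments/<deployment>"
    )
    print(build_metrics_url(deployment_id, "RequestLatency_P95", "Average",
                            "2024-01-01T00:00:00Z/2024-01-01T01:00:00Z"))
```

The sketch only constructs the URL; sending the request additionally requires an Azure Resource Manager bearer token, which is outside the scope of this example.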