Supported metrics for Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments

02/18/2025

The following table lists the metrics available for the Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments resource type.

Table headings

Metric - The metric display name as it appears in the Azure portal.
Name in REST API - The metric name as referred to in the REST API.
Unit - Unit of measure.
Aggregation - The default aggregation type. Valid values: Average, Minimum, Maximum, Total, Count.
Dimensions - Dimensions available for the metric.
Time Grains - Intervals at which the metric is sampled. For example, PT1M indicates that the metric is sampled every minute, PT30M every 30 minutes, PT1H every hour, and so on.
DS Export - Whether the metric is exportable to Azure Monitor Logs via Diagnostic Settings.
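
Time grain values such as PT1M are ISO 8601 durations. As a rough illustration of how they map to sampling intervals, the sketch below converts them to `timedelta` objects. The helper name `parse_time_grain` is hypothetical, and the parser assumes only the hour, minute, and second forms need to be handled.

```python
import re
from datetime import timedelta

# Matches the time-only subset of ISO 8601 durations, such as PT1M or PT1H30M.
# Assumption: day, month, and year components are not needed here.
_GRAIN = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_time_grain(grain: str) -> timedelta:
    """Convert a time grain string (for example 'PT1M') to a timedelta."""
    match = _GRAIN.match(grain)
    # Reject strings that do not match, and a bare 'PT' with no components.
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported time grain: {grain!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    for grain in ("PT1M", "PT30M", "PT1H"):
        print(grain, "->", parse_time_grain(grain))
```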

For information on exporting metrics, see Metrics export using data collection rules and Create diagnostic settings in Azure Monitor.

For information on metric retention, see Azure Monitor Metrics overview.

Category: Resource

Metric	Name in REST API	Unit	Aggregation	Dimensions	Time Grains	DS Export
CPU Memory Utilization Percentage: Percentage of memory utilization on an instance. Utilization is reported at one-minute intervals.	`CpuMemoryUtilizationPercentage`	Percent	Minimum, Maximum, Average	`instanceId`	PT1M	Yes
CPU Utilization Percentage: Percentage of CPU utilization on an instance. Utilization is reported at one-minute intervals.	`CpuUtilizationPercentage`	Percent	Minimum, Maximum, Average	`instanceId`	PT1M	Yes
Data Collection Errors Per Minute: The number of data collection events dropped per minute.	`DataCollectionErrorsPerMinute`	Count	Minimum, Maximum, Average	`instanceId`, `reason`, `type`	PT1M	No
Data Collection Events Per Minute: The number of data collection events processed per minute.	`DataCollectionEventsPerMinute`	Count	Minimum, Maximum, Average	`instanceId`, `type`	PT1M	No
Deployment Capacity: The number of instances in the deployment.	`DeploymentCapacity`	Count	Minimum, Maximum, Average	`instanceId`, `State`	PT1M	No
Disk Utilization: Percentage of disk utilization on an instance. Utilization is reported at one-minute intervals.	`DiskUtilization`	Percent	Minimum, Maximum, Average	`instanceId`, `disk`	PT1M	Yes
GPU Energy in Joules: Interval energy in Joules on a GPU node. Energy is reported at one-minute intervals.	`GpuEnergyJoules`	Count	Minimum, Maximum, Average	`instanceId`	PT1M	No
GPU Memory Utilization Percentage: Percentage of GPU memory utilization on an instance. Utilization is reported at one-minute intervals.	`GpuMemoryUtilizationPercentage`	Percent	Minimum, Maximum, Average	`instanceId`	PT1M	Yes
GPU Utilization Percentage: Percentage of GPU utilization on an instance. Utilization is reported at one-minute intervals.	`GpuUtilizationPercentage`	Percent	Minimum, Maximum, Average	`instanceId`	PT1M	Yes
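
The REST API names in the table are the values passed to the Azure Monitor metrics endpoint. The sketch below only assembles a request URL and sends nothing over the network. The resource ID is a placeholder, the helper `build_metrics_url` is hypothetical, and the `api-version` value is an assumption to verify against the Azure Monitor REST reference. A real call also needs an Azure AD bearer token.

```python
from urllib.parse import quote, urlencode

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2018-01-01"  # assumption: confirm the version you intend to use


def build_metrics_url(resource_id, metric_names, interval="PT1M",
                      aggregation="Average", dimension_filter=None):
    """Assemble a metrics query URL for one deployment resource."""
    params = {
        "api-version": API_VERSION,
        "metricnames": ",".join(metric_names),
        "interval": interval,
        "aggregation": aggregation,
    }
    if dimension_filter:
        # Example filter: split the result by the instanceId dimension.
        params["$filter"] = dimension_filter
    query = urlencode(params, safe="$,", quote_via=quote)
    path = quote(resource_id, safe="/")
    return f"{ARM_ENDPOINT}{path}/providers/Microsoft.Insights/metrics?{query}"


if __name__ == "__main__":
    # Placeholder resource ID; substitute your own subscription and names.
    deployment_id = (
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/my-rg/providers/Microsoft.MachineLearningServices"
        "/workspaces/my-ws/onlineEndpoints/my-endpoint/deployments/blue"
    )
    print(build_metrics_url(
        deployment_id,
        ["CpuUtilizationPercentage", "CpuMemoryUtilizationPercentage"],
        dimension_filter="instanceId eq '*'",
    ))
```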

Category: Traffic

Metric	Name in REST API	Unit	Aggregation	Dimensions	Time Grains	DS Export
Request Latency P50: The average P50 request latency, aggregated across all request latency values collected over the selected time period.	`RequestLatency_P50`	Milliseconds	Average	<none>	PT1M	Yes
Request Latency P90: The average P90 request latency, aggregated across all request latency values collected over the selected time period.	`RequestLatency_P90`	Milliseconds	Average	<none>	PT1M	Yes
Request Latency P95: The average P95 request latency, aggregated across all request latency values collected over the selected time period.	`RequestLatency_P95`	Milliseconds	Average	<none>	PT1M	Yes
Request Latency P99: The average P99 request latency, aggregated across all request latency values collected over the selected time period.	`RequestLatency_P99`	Milliseconds	Average	<none>	PT1M	Yes
Requests Per Minute: The number of requests sent to the online deployment within a minute.	`RequestsPerMinute`	Count	Average	`envoy_response_code`	PT1M	No
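
One way to use these tables programmatically is to check a planned query before sending it. The sketch below transcribes the Aggregation and Dimensions columns into a lookup and reports mismatches. The names `METRICS` and `validate_query` are hypothetical, and treating the Aggregation column as the full set of allowed aggregations is an assumption of this example.

```python
RESOURCE_AGGS = ("Minimum", "Maximum", "Average")
TRAFFIC_AGGS = ("Average",)

# REST API name -> (aggregations, dimensions), copied from the two tables.
METRICS = {
    "CpuMemoryUtilizationPercentage": (RESOURCE_AGGS, ("instanceId",)),
    "CpuUtilizationPercentage": (RESOURCE_AGGS, ("instanceId",)),
    "DataCollectionErrorsPerMinute": (RESOURCE_AGGS, ("instanceId", "reason", "type")),
    "DataCollectionEventsPerMinute": (RESOURCE_AGGS, ("instanceId", "type")),
    "DeploymentCapacity": (RESOURCE_AGGS, ("instanceId", "State")),
    "DiskUtilization": (RESOURCE_AGGS, ("instanceId", "disk")),
    "GpuEnergyJoules": (RESOURCE_AGGS, ("instanceId",)),
    "GpuMemoryUtilizationPercentage": (RESOURCE_AGGS, ("instanceId",)),
    "GpuUtilizationPercentage": (RESOURCE_AGGS, ("instanceId",)),
    "RequestLatency_P50": (TRAFFIC_AGGS, ()),
    "RequestLatency_P90": (TRAFFIC_AGGS, ()),
    "RequestLatency_P95": (TRAFFIC_AGGS, ()),
    "RequestLatency_P99": (TRAFFIC_AGGS, ()),
    "RequestsPerMinute": (TRAFFIC_AGGS, ("envoy_response_code",)),
}


def validate_query(metric, aggregation, dimensions=()):
    """Return a list of problems; an empty list means the query matches the tables."""
    if metric not in METRICS:
        return [f"unknown metric: {metric}"]
    allowed_aggs, allowed_dims = METRICS[metric]
    problems = []
    if aggregation not in allowed_aggs:
        problems.append(f"{metric} does not list aggregation {aggregation}")
    for dim in dimensions:
        if dim not in allowed_dims:
            problems.append(f"{metric} does not list dimension {dim}")
    return problems


if __name__ == "__main__":
    print(validate_query("RequestsPerMinute", "Average", ["envoy_response_code"]))
    print(validate_query("RequestLatency_P95", "Maximum", ["instanceId"]))
```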
