AksEndpointDeploymentConfiguration Class
Note
This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Represents deployment configuration information for a service deployed on Azure Kubernetes Service.
Create an AksEndpointDeploymentConfiguration object using the deploy_configuration method of the AksEndpoint class.
Initialize a configuration object for deploying an Endpoint to an AKS compute target.
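As a minimal sketch of the workflow described above, the following collects a few of the documented parameters in a dictionary and passes them to the deploy_configuration method of AksEndpoint. The values, the version name `version-a`, and the helper name `build_deployment_config` are illustrative assumptions; calling the helper requires the azureml-core package (SDK v1) to be installed.

```python
# Illustrative settings; the keys are parameter names documented for this class.
settings = {
    "cpu_cores": 0.5,             # may be a decimal
    "memory_gb": 1.0,             # may be a decimal
    "autoscale_enabled": True,
    "autoscale_min_replicas": 1,
    "autoscale_max_replicas": 4,
    "auth_enabled": True,
    "version_name": "version-a",  # hypothetical version name
    "traffic_percentile": 20,     # amount of endpoint traffic for this version
}


def build_deployment_config(options):
    """Return a deployment configuration built from `options`.

    The SDK import is deferred so that the settings can be inspected
    on machines where azureml-core is not installed.
    """
    from azureml.core.webservice.aks import AksEndpoint

    return AksEndpoint.deploy_configuration(**options)
```

Keeping the settings in a plain dictionary makes them easy to review or log before the configuration object is created.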
- Inheritance
-
AksEndpointDeploymentConfiguration
Constructor
AksEndpointDeploymentConfiguration(autoscale_enabled, autoscale_min_replicas, autoscale_max_replicas, autoscale_refresh_seconds, autoscale_target_utilization, collect_model_data, auth_enabled, cpu_cores, memory_gb, enable_app_insights, scoring_timeout_ms, replica_max_concurrent_requests, max_request_wait_time, num_replicas, primary_key, secondary_key, tags, properties, description, gpu_cores, period_seconds, initial_delay_seconds, timeout_seconds, success_threshold, failure_threshold, namespace, token_auth_enabled, version_name, traffic_percentile, compute_target_name, cpu_cores_limit, memory_gb_limit)
Parameters
Name | Description |
---|---|
autoscale_enabled Required | Whether or not to enable autoscaling for this Webservice. Defaults to True if num_replicas is None. |
autoscale_min_replicas Required | The minimum number of containers to use when autoscaling this Webservice. Defaults to 1. |
autoscale_max_replicas Required | The maximum number of containers to use when autoscaling this Webservice. Defaults to 10. |
autoscale_refresh_seconds Required | How often (in seconds) the autoscaler should attempt to scale this Webservice. Defaults to 1. |
autoscale_target_utilization Required | The target utilization (in percent out of 100) the autoscaler should attempt to maintain for this Webservice. Defaults to 70. |
collect_model_data Required | Whether or not to enable model data collection for this Webservice. Defaults to False. |
auth_enabled Required | Whether or not to enable auth for this Webservice. Defaults to True. |
cpu_cores Required | The number of CPU cores to allocate for this Webservice. Can be a decimal. Defaults to 0.1. |
memory_gb Required | The amount of memory (in GB) to allocate for this Webservice. Can be a decimal. Defaults to 0.5. |
enable_app_insights Required | Whether or not to enable Application Insights logging for this Webservice. Defaults to False. |
scoring_timeout_ms Required | A timeout (in milliseconds) to enforce for scoring calls to this Webservice. Defaults to 60000. |
replica_max_concurrent_requests Required | The maximum number of concurrent requests per replica to allow for this Webservice. Defaults to 1. Do not change this setting from the default value of 1 unless instructed by Microsoft Technical Support or a member of the Azure Machine Learning team. |
max_request_wait_time Required | The maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. Defaults to 500. |
num_replicas Required | The number of containers to allocate for this Webservice. No default; if this parameter is not set, the autoscaler is enabled by default. |
primary_key Required | A primary auth key to use for this Webservice. |
secondary_key Required | A secondary auth key to use for this Webservice. |
tags Required | Dictionary of key-value tags to give this Webservice. |
properties Required | Dictionary of key-value properties to give this Webservice. These properties cannot be changed after deployment; however, new key-value pairs can be added. |
description Required | A description to give this Webservice. |
gpu_cores Required | The number of GPU cores to allocate for this Webservice. Defaults to 0. |
period_seconds Required | How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1. |
initial_delay_seconds Required | The number of seconds after the container has started before liveness probes are initiated. Defaults to 310. |
timeout_seconds Required | The number of seconds after which the liveness probe times out. Defaults to 2 seconds. Minimum value is 1. |
success_threshold Required | The minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1. |
failure_threshold Required | When a Pod starts and the liveness probe fails, Kubernetes will try failure_threshold times before giving up. |
namespace Required | The Kubernetes namespace in which to deploy this Webservice: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens. |
token_auth_enabled Required | Whether or not to enable Azure Active Directory auth for this Webservice. If this is enabled, users can access this Webservice by fetching an access token using their Azure Active Directory credentials. Defaults to False. |
version_name Required | The name of the version in an endpoint. |
traffic_percentile Required | The amount of traffic the version takes in an endpoint. |
compute_target_name Required | The name of the compute target to deploy to. |
cpu_cores_limit Required | The max number of CPU cores this Webservice is allowed to use. Can be a decimal. |
memory_gb_limit Required | The max amount of memory (in GB) this Webservice is allowed to use. Can be a decimal. |
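Several of the parameter descriptions state constraints, such as the namespace character rules and the minimum value of 1 for some probe settings. The sketch below shows one way to check such constraints locally before building a configuration. The helper name `find_problems` is hypothetical and not part of the SDK, and the rule that autoscale_min_replicas should not exceed autoscale_max_replicas is an assumption rather than something this page states.

```python
import re

# Lowercase alphanumerics and hyphens, 1 to 63 characters,
# with no hyphen in the first or last position.
NAMESPACE_PATTERN = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def find_problems(settings):
    """Return a list of human-readable problems found in `settings`."""
    problems = []

    namespace = settings.get("namespace")
    if namespace is not None and not NAMESPACE_PATTERN.fullmatch(namespace):
        problems.append("namespace: invalid Kubernetes namespace")

    # These probe settings are documented with a minimum value of 1.
    for name in ("period_seconds", "timeout_seconds", "success_threshold"):
        value = settings.get(name)
        if value is not None and value < 1:
            problems.append(f"{name}: minimum value is 1")

    # Assumed sanity check, not taken from the parameter descriptions.
    low = settings.get("autoscale_min_replicas")
    high = settings.get("autoscale_max_replicas")
    if low is not None and high is not None and low > high:
        problems.append("autoscale_min_replicas exceeds autoscale_max_replicas")

    return problems
```

A check like this only covers what the descriptions spell out; the SDK's own validation remains the authority on what is accepted.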
Variables
Name | Description |
---|---|
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.autoscale_enabled | Whether or not to enable autoscaling for this Webservice. Defaults to True if num_replicas is None. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.autoscale_min_replicas | The minimum number of containers to use when autoscaling this Webservice. Defaults to 1. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.autoscale_max_replicas | The maximum number of containers to use when autoscaling this Webservice. Defaults to 10. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.autoscale_refresh_seconds | How often (in seconds) the autoscaler should attempt to scale this Webservice. Defaults to 1. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.autoscale_target_utilization | The target utilization (in percent out of 100) the autoscaler should attempt to maintain for this Webservice. Defaults to 70. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.collect_model_data | Whether or not to enable model data collection for this Webservice. Defaults to False. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.auth_enabled | Whether or not to enable auth for this Webservice. Defaults to True. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.cpu_cores | The number of CPU cores to allocate for this Webservice. Can be a decimal. Defaults to 0.1. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.memory_gb | The amount of memory (in GB) to allocate for this Webservice. Can be a decimal. Defaults to 0.5. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.enable_app_insights | Whether or not to enable Application Insights logging for this Webservice. Defaults to False. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.scoring_timeout_ms | A timeout (in milliseconds) to enforce for scoring calls to this Webservice. Defaults to 60000. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.replica_max_concurrent_requests | The maximum number of concurrent requests per replica to allow for this Webservice. Defaults to 1. Do not change this setting from the default value of 1 unless instructed by Microsoft Technical Support or a member of the Azure Machine Learning team. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.max_request_wait_time | The maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. Defaults to 500. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.num_replicas | The number of containers to allocate for this Webservice. No default; if this parameter is not set, the autoscaler is enabled by default. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.primary_key | A primary auth key to use for this Webservice. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.secondary_key | A secondary auth key to use for this Webservice. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.tags | Dictionary of key-value tags to give this Webservice. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.properties | Dictionary of key-value properties to give this Webservice. These properties cannot be changed after deployment; however, new key-value pairs can be added. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.description | A description to give this Webservice. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.gpu_cores | The number of GPU cores to allocate for this Webservice. Defaults to 0. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.period_seconds | How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.initial_delay_seconds | The number of seconds after the container has started before liveness probes are initiated. Defaults to 310. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.timeout_seconds | The number of seconds after which the liveness probe times out. Defaults to 2 seconds. Minimum value is 1. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.success_threshold | The minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.failure_threshold | When a Pod starts and the liveness probe fails, Kubernetes will try failure_threshold times before giving up. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.namespace | The Kubernetes namespace in which to deploy this Webservice: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.token_auth_enabled | Whether or not to enable Azure Active Directory auth for this Webservice. If this is enabled, users can access this Webservice by fetching an access token using their Azure Active Directory credentials. Defaults to False. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.version_name | The name of the version in an endpoint. |
azureml.core.webservice.aks.AksEndpointDeploymentConfiguration.traffic_percentile | The amount of traffic the version takes in an endpoint. |
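The liveness probe settings interact: the first probe runs after initial_delay_seconds, later probes follow every period_seconds, and each probe can take up to timeout_seconds. As a rough estimate under those simplifying assumptions (real Kubernetes timing can differ slightly), the time from container start until a run of timed-out probes exhausts failure_threshold can be sketched as follows. The function name `seconds_until_restart` is hypothetical.

```python
def seconds_until_restart(initial_delay_seconds, period_seconds,
                          timeout_seconds, failure_threshold):
    """Estimate seconds from container start until repeated probe
    timeouts exhaust failure_threshold."""
    first_probe = initial_delay_seconds
    # Consecutive probes are spaced period_seconds apart.
    last_probe = first_probe + (failure_threshold - 1) * period_seconds
    # The final probe is only counted as failed once it times out.
    return last_probe + timeout_seconds


# Documented defaults for the delay (310), period (10) and timeout (2);
# failure_threshold=3 is an assumed value chosen for illustration.
estimate = seconds_until_restart(310, 10, 2, 3)
print(estimate)  # 332
```

An estimate like this is mainly useful for choosing initial_delay_seconds so that slow-starting containers are not restarted before they finish loading.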
Methods
validate_endpoint_configuration | Check that the specified configuration values are valid. Raises a WebserviceException if validation fails. |
validate_endpoint_configuration
Check that the specified configuration values are valid.
Will raise a WebserviceException if validation fails.
validate_endpoint_configuration()
Exceptions
Type | Description |
---|---|
WebserviceException | Raised if validation fails. |
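Because validate_endpoint_configuration signals failure by raising a WebserviceException, callers usually wrap it in a try/except block. The sketch below shows that pattern. `FakeConfig` and its range check are hypothetical stand-ins so the example runs without a workspace, and the fallback exception class is only used when azureml-core is not installed; the SDK's real validation rules may differ.

```python
try:
    from azureml.exceptions import WebserviceException
except ImportError:
    class WebserviceException(Exception):
        """Stand-in used only when azureml-core is not installed."""


def is_valid(config):
    """Return True if `config` validates, printing the reason otherwise."""
    try:
        config.validate_endpoint_configuration()
    except WebserviceException as error:
        print(f"Invalid endpoint configuration: {error}")
        return False
    return True


class FakeConfig:
    """Hypothetical stand-in for a configuration object, for demonstration."""

    def __init__(self, traffic_percentile):
        self.traffic_percentile = traffic_percentile

    def validate_endpoint_configuration(self):
        # Assumed rule for this demo only; the SDK's real checks may differ.
        if not 0 <= self.traffic_percentile <= 100:
            raise WebserviceException(
                "traffic_percentile must be between 0 and 100"
            )
```

With a real configuration object, `is_valid(config)` can be called in the same way before deploying.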