Microsoft.MachineLearningServices workspaces/inferencePools 2024-04-01-preview

Bicep resource definition

The workspaces/inferencePools resource type can be deployed with operations that target:

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/inferencePools resource, add the following Bicep to your template.

resource symbolicname 'Microsoft.MachineLearningServices/workspaces/inferencePools@2024-04-01-preview' = {
  parent: resourceSymbolicName
  identity: {
    type: 'string'
    userAssignedIdentities: {
      {customized property}: {}
    }
  }
  kind: 'string'
  location: 'string'
  name: 'string'
  properties: {
    codeConfiguration: {
      codeId: 'string'
      scoringScript: 'string'
    }
    description: 'string'
    environmentConfiguration: {
      environmentId: 'string'
      environmentVariables: {
        {customized property}: 'string'
      }
      livenessProbe: {
        failureThreshold: int
        initialDelay: 'string'
        period: 'string'
        successThreshold: int
        timeout: 'string'
      }
      readinessProbe: {
        failureThreshold: int
        initialDelay: 'string'
        period: 'string'
        successThreshold: int
        timeout: 'string'
      }
      startupProbe: {
        failureThreshold: int
        initialDelay: 'string'
        period: 'string'
        successThreshold: int
        timeout: 'string'
      }
    }
    modelConfiguration: {
      modelId: 'string'
    }
    nodeSkuType: 'string'
    properties: {
      {customized property}: 'string'
    }
    requestConfiguration: {
      maxConcurrentRequestsPerInstance: int
      requestTimeout: 'string'
    }
  }
  sku: {
    capacity: int
    family: 'string'
    name: 'string'
    size: 'string'
    tier: 'string'
  }
  tags: {
    {customized property}: 'string'
  }
}

Property values

CodeConfiguration

Name Description Value
codeId ARM resource ID of the code asset. string
scoringScript [Required] The script to execute on startup. eg. "score.py" string

Constraints:
Min length = 1
Pattern = [a-zA-Z0-9_] (required)

InferencePoolProperties

Name Description Value
codeConfiguration Code configuration for the inference pool. CodeConfiguration
description Description of the resource. string
environmentConfiguration EnvironmentConfiguration for the inference pool. PoolEnvironmentConfiguration
modelConfiguration ModelConfiguration for the inference pool. PoolModelConfiguration
nodeSkuType [Required] Compute instance type. string

Constraints:
Min length = 1
Pattern = [a-zA-Z0-9_] (required)
properties Property dictionary. Properties can be added, but not removed or altered. PropertiesBaseProperties
requestConfiguration Request configuration for the inference pool. RequestConfiguration

ManagedServiceIdentity

Name Description Value
type Type of managed service identity (where both SystemAssigned and UserAssigned types are allowed). 'None'
'SystemAssigned'
'SystemAssigned,UserAssigned'
'UserAssigned' (required)
userAssignedIdentities The set of user assigned identities associated with the resource. The userAssignedIdentities dictionary keys will be ARM resource ids in the form: '/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identityName}. The dictionary values can be empty objects ({}) in requests. UserAssignedIdentities

Microsoft.MachineLearningServices/workspaces/inferencePools

Name Description Value
identity Managed service identity (system assigned and/or user assigned identities) ManagedServiceIdentity
kind Metadata used by portal/tooling/etc to render different UX experiences for resources of the same type. string
location The geo-location where the resource lives string (required)
name The resource name string

Constraints:
Pattern = ^[a-zA-Z0-9][a-zA-Z0-9\-_]{0,254}$ (required)
parent In Bicep, you can specify the parent resource for a child resource. You only need to add this property when the child resource is declared outside of the parent resource.

For more information, see Child resource outside parent resource.
Symbolic name for resource of type: workspaces
properties [Required] Additional attributes of the entity. InferencePoolProperties (required)
sku Sku details required for ARM contract for Autoscaling. Sku
tags Resource tags Dictionary of tag names and values. See Tags in templates

PoolEnvironmentConfiguration

Name Description Value
environmentId ARM resource ID of the environment specification for the inference pool. string
environmentVariables Environment variables configuration for the inference pool. PoolEnvironmentConfigurationEnvironmentVariables
livenessProbe Liveness probe monitors the health of the container regularly. ProbeSettings
readinessProbe Readiness probe validates if the container is ready to serve traffic. The properties and defaults are the same as liveness probe. ProbeSettings
startupProbe This verifies whether the application within a container is started. Startup probes run before any other probe, and, unless it finishes successfully, disables other probes. ProbeSettings

PoolEnvironmentConfigurationEnvironmentVariables

Name Description Value

PoolModelConfiguration

Name Description Value
modelId The URI path to the model. string

ProbeSettings

Name Description Value
failureThreshold The number of failures to allow before returning an unhealthy status. int
initialDelay The delay before the first probe in ISO 8601 format. string
period The length of time between probes in ISO 8601 format. string
successThreshold The number of successful probes before returning a healthy status. int
timeout The probe timeout in ISO 8601 format. string

PropertiesBaseProperties

Name Description Value

RequestConfiguration

Name Description Value
maxConcurrentRequestsPerInstance The number of maximum concurrent requests per node allowed per deployment. Defaults to 1. int
requestTimeout The scoring timeout in ISO 8601 format.
Defaults to 5000ms.
string

Sku

Name Description Value
capacity If the SKU supports scale out/in then the capacity integer should be included. If scale out/in is not possible for the resource this may be omitted. int
family If the service has different generations of hardware, for the same SKU, then that can be captured here. string
name The name of the SKU. Ex - P3. It is typically a letter+number code string (required)
size The SKU size. When the name field is the combination of tier and some other value, this would be the standalone code. string
tier This field is required to be implemented by the Resource Provider if the service has more than one tier, but is not required on a PUT. 'Basic'
'Free'
'Premium'
'Standard'

TrackedResourceTags

Name Description Value

UserAssignedIdentities

Name Description Value

UserAssignedIdentity

Name Description Value

ARM template resource definition

The workspaces/inferencePools resource type can be deployed with operations that target:

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/inferencePools resource, add the following JSON to your template.

{
  "type": "Microsoft.MachineLearningServices/workspaces/inferencePools",
  "apiVersion": "2024-04-01-preview",
  "name": "string",
  "identity": {
    "type": "string",
    "userAssignedIdentities": {
      "{customized property}": {
      }
    }
  },
  "kind": "string",
  "location": "string",
  "properties": {
    "codeConfiguration": {
      "codeId": "string",
      "scoringScript": "string"
    },
    "description": "string",
    "environmentConfiguration": {
      "environmentId": "string",
      "environmentVariables": {
        "{customized property}": "string"
      },
      "livenessProbe": {
        "failureThreshold": "int",
        "initialDelay": "string",
        "period": "string",
        "successThreshold": "int",
        "timeout": "string"
      },
      "readinessProbe": {
        "failureThreshold": "int",
        "initialDelay": "string",
        "period": "string",
        "successThreshold": "int",
        "timeout": "string"
      },
      "startupProbe": {
        "failureThreshold": "int",
        "initialDelay": "string",
        "period": "string",
        "successThreshold": "int",
        "timeout": "string"
      }
    },
    "modelConfiguration": {
      "modelId": "string"
    },
    "nodeSkuType": "string",
    "properties": {
      "{customized property}": "string"
    },
    "requestConfiguration": {
      "maxConcurrentRequestsPerInstance": "int",
      "requestTimeout": "string"
    }
  },
  "sku": {
    "capacity": "int",
    "family": "string",
    "name": "string",
    "size": "string",
    "tier": "string"
  },
  "tags": {
    "{customized property}": "string"
  }
}

Property values

CodeConfiguration

Name Description Value
codeId ARM resource ID of the code asset. string
scoringScript [Required] The script to execute on startup. eg. "score.py" string

Constraints:
Min length = 1
Pattern = [a-zA-Z0-9_] (required)

InferencePoolProperties

Name Description Value
codeConfiguration Code configuration for the inference pool. CodeConfiguration
description Description of the resource. string
environmentConfiguration EnvironmentConfiguration for the inference pool. PoolEnvironmentConfiguration
modelConfiguration ModelConfiguration for the inference pool. PoolModelConfiguration
nodeSkuType [Required] Compute instance type. string

Constraints:
Min length = 1
Pattern = [a-zA-Z0-9_] (required)
properties Property dictionary. Properties can be added, but not removed or altered. PropertiesBaseProperties
requestConfiguration Request configuration for the inference pool. RequestConfiguration

ManagedServiceIdentity

Name Description Value
type Type of managed service identity (where both SystemAssigned and UserAssigned types are allowed). 'None'
'SystemAssigned'
'SystemAssigned,UserAssigned'
'UserAssigned' (required)
userAssignedIdentities The set of user assigned identities associated with the resource. The userAssignedIdentities dictionary keys will be ARM resource ids in the form: '/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identityName}. The dictionary values can be empty objects ({}) in requests. UserAssignedIdentities

Microsoft.MachineLearningServices/workspaces/inferencePools

Name Description Value
apiVersion The api version '2024-04-01-preview'
identity Managed service identity (system assigned and/or user assigned identities) ManagedServiceIdentity
kind Metadata used by portal/tooling/etc to render different UX experiences for resources of the same type. string
location The geo-location where the resource lives string (required)
name The resource name string

Constraints:
Pattern = ^[a-zA-Z0-9][a-zA-Z0-9\-_]{0,254}$ (required)
properties [Required] Additional attributes of the entity. InferencePoolProperties (required)
sku Sku details required for ARM contract for Autoscaling. Sku
tags Resource tags Dictionary of tag names and values. See Tags in templates
type The resource type 'Microsoft.MachineLearningServices/workspaces/inferencePools'

PoolEnvironmentConfiguration

Name Description Value
environmentId ARM resource ID of the environment specification for the inference pool. string
environmentVariables Environment variables configuration for the inference pool. PoolEnvironmentConfigurationEnvironmentVariables
livenessProbe Liveness probe monitors the health of the container regularly. ProbeSettings
readinessProbe Readiness probe validates if the container is ready to serve traffic. The properties and defaults are the same as liveness probe. ProbeSettings
startupProbe This verifies whether the application within a container is started. Startup probes run before any other probe, and, unless it finishes successfully, disables other probes. ProbeSettings

PoolEnvironmentConfigurationEnvironmentVariables

Name Description Value

PoolModelConfiguration

Name Description Value
modelId The URI path to the model. string

ProbeSettings

Name Description Value
failureThreshold The number of failures to allow before returning an unhealthy status. int
initialDelay The delay before the first probe in ISO 8601 format. string
period The length of time between probes in ISO 8601 format. string
successThreshold The number of successful probes before returning a healthy status. int
timeout The probe timeout in ISO 8601 format. string

PropertiesBaseProperties

Name Description Value

RequestConfiguration

Name Description Value
maxConcurrentRequestsPerInstance The number of maximum concurrent requests per node allowed per deployment. Defaults to 1. int
requestTimeout The scoring timeout in ISO 8601 format.
Defaults to 5000ms.
string

Sku

Name Description Value
capacity If the SKU supports scale out/in then the capacity integer should be included. If scale out/in is not possible for the resource this may be omitted. int
family If the service has different generations of hardware, for the same SKU, then that can be captured here. string
name The name of the SKU. Ex - P3. It is typically a letter+number code string (required)
size The SKU size. When the name field is the combination of tier and some other value, this would be the standalone code. string
tier This field is required to be implemented by the Resource Provider if the service has more than one tier, but is not required on a PUT. 'Basic'
'Free'
'Premium'
'Standard'

TrackedResourceTags

Name Description Value

UserAssignedIdentities

Name Description Value

UserAssignedIdentity

Name Description Value

Terraform (AzAPI provider) resource definition

The workspaces/inferencePools resource type can be deployed with operations that target:

  • Resource groups

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/inferencePools resource, add the following Terraform to your template.

resource "azapi_resource" "symbolicname" {
  type = "Microsoft.MachineLearningServices/workspaces/inferencePools@2024-04-01-preview"
  name = "string"
  identity = {
    type = "string"
    userAssignedIdentities = {
      {customized property} = {
      }
    }
  }
  kind = "string"
  location = "string"
  sku = {
    capacity = int
    family = "string"
    name = "string"
    size = "string"
    tier = "string"
  }
  tags = {
    {customized property} = "string"
  }
  body = jsonencode({
    properties = {
      codeConfiguration = {
        codeId = "string"
        scoringScript = "string"
      }
      description = "string"
      environmentConfiguration = {
        environmentId = "string"
        environmentVariables = {
          {customized property} = "string"
        }
        livenessProbe = {
          failureThreshold = int
          initialDelay = "string"
          period = "string"
          successThreshold = int
          timeout = "string"
        }
        readinessProbe = {
          failureThreshold = int
          initialDelay = "string"
          period = "string"
          successThreshold = int
          timeout = "string"
        }
        startupProbe = {
          failureThreshold = int
          initialDelay = "string"
          period = "string"
          successThreshold = int
          timeout = "string"
        }
      }
      modelConfiguration = {
        modelId = "string"
      }
      nodeSkuType = "string"
      properties = {
        {customized property} = "string"
      }
      requestConfiguration = {
        maxConcurrentRequestsPerInstance = int
        requestTimeout = "string"
      }
    }
  })
}

Property values

CodeConfiguration

Name Description Value
codeId ARM resource ID of the code asset. string
scoringScript [Required] The script to execute on startup. eg. "score.py" string

Constraints:
Min length = 1
Pattern = [a-zA-Z0-9_] (required)

InferencePoolProperties

Name Description Value
codeConfiguration Code configuration for the inference pool. CodeConfiguration
description Description of the resource. string
environmentConfiguration EnvironmentConfiguration for the inference pool. PoolEnvironmentConfiguration
modelConfiguration ModelConfiguration for the inference pool. PoolModelConfiguration
nodeSkuType [Required] Compute instance type. string

Constraints:
Min length = 1
Pattern = [a-zA-Z0-9_] (required)
properties Property dictionary. Properties can be added, but not removed or altered. PropertiesBaseProperties
requestConfiguration Request configuration for the inference pool. RequestConfiguration

ManagedServiceIdentity

Name Description Value
type Type of managed service identity (where both SystemAssigned and UserAssigned types are allowed). 'None'
'SystemAssigned'
'SystemAssigned,UserAssigned'
'UserAssigned' (required)
userAssignedIdentities The set of user assigned identities associated with the resource. The userAssignedIdentities dictionary keys will be ARM resource ids in the form: '/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identityName}. The dictionary values can be empty objects ({}) in requests. UserAssignedIdentities

Microsoft.MachineLearningServices/workspaces/inferencePools

Name Description Value
identity Managed service identity (system assigned and/or user assigned identities) ManagedServiceIdentity
kind Metadata used by portal/tooling/etc to render different UX experiences for resources of the same type. string
location The geo-location where the resource lives string (required)
name The resource name string

Constraints:
Pattern = ^[a-zA-Z0-9][a-zA-Z0-9\-_]{0,254}$ (required)
parent_id The ID of the resource that is the parent for this resource. ID for resource of type: workspaces
properties [Required] Additional attributes of the entity. InferencePoolProperties (required)
sku Sku details required for ARM contract for Autoscaling. Sku
tags Resource tags Dictionary of tag names and values.
type The resource type "Microsoft.MachineLearningServices/workspaces/inferencePools@2024-04-01-preview"

PoolEnvironmentConfiguration

Name Description Value
environmentId ARM resource ID of the environment specification for the inference pool. string
environmentVariables Environment variables configuration for the inference pool. PoolEnvironmentConfigurationEnvironmentVariables
livenessProbe Liveness probe monitors the health of the container regularly. ProbeSettings
readinessProbe Readiness probe validates if the container is ready to serve traffic. The properties and defaults are the same as liveness probe. ProbeSettings
startupProbe This verifies whether the application within a container is started. Startup probes run before any other probe, and, unless it finishes successfully, disables other probes. ProbeSettings

PoolEnvironmentConfigurationEnvironmentVariables

Name Description Value

PoolModelConfiguration

Name Description Value
modelId The URI path to the model. string

ProbeSettings

Name Description Value
failureThreshold The number of failures to allow before returning an unhealthy status. int
initialDelay The delay before the first probe in ISO 8601 format. string
period The length of time between probes in ISO 8601 format. string
successThreshold The number of successful probes before returning a healthy status. int
timeout The probe timeout in ISO 8601 format. string

PropertiesBaseProperties

Name Description Value

RequestConfiguration

Name Description Value
maxConcurrentRequestsPerInstance The number of maximum concurrent requests per node allowed per deployment. Defaults to 1. int
requestTimeout The scoring timeout in ISO 8601 format.
Defaults to 5000ms.
string

Sku

Name Description Value
capacity If the SKU supports scale out/in then the capacity integer should be included. If scale out/in is not possible for the resource this may be omitted. int
family If the service has different generations of hardware, for the same SKU, then that can be captured here. string
name The name of the SKU. Ex - P3. It is typically a letter+number code string (required)
size The SKU size. When the name field is the combination of tier and some other value, this would be the standalone code. string
tier This field is required to be implemented by the Resource Provider if the service has more than one tier, but is not required on a PUT. 'Basic'
'Free'
'Premium'
'Standard'

TrackedResourceTags

Name Description Value

UserAssignedIdentities

Name Description Value

UserAssignedIdentity

Name Description Value