Microsoft.MachineLearningServices workspaces/onlineEndpoints/deployments 2024-01-01-preview

12/09/2024

Bicep resource definition

The workspaces/onlineEndpoints/deployments resource type can be deployed with operations that target:

Resource groups - See resource group deployment commands

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments resource, add the following Bicep to your template.

resource symbolicname 'Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments@2024-01-01-preview' = {
  parent: resourceSymbolicName
  identity: {
    type: 'string'
    userAssignedIdentities: {
      {customized property}: {}
    }
  }
  kind: 'string'
  location: 'string'
  name: 'string'
  properties: {
    appInsightsEnabled: bool
    codeConfiguration: {
      codeId: 'string'
      scoringScript: 'string'
    }
    dataCollector: {
      collections: {
        {customized property}: {
          clientId: 'string'
          dataCollectionMode: 'string'
          dataId: 'string'
          samplingRate: int
        }
      }
      requestLogging: {
        captureHeaders: [
          'string'
        ]
      }
      rollingRate: 'string'
    }
    description: 'string'
    egressPublicNetworkAccess: 'string'
    environmentId: 'string'
    environmentVariables: {
      {customized property}: 'string'
    }
    instanceType: 'string'
    livenessProbe: {
      failureThreshold: int
      initialDelay: 'string'
      period: 'string'
      successThreshold: int
      timeout: 'string'
    }
    model: 'string'
    modelMountPath: 'string'
    properties: {
      {customized property}: 'string'
    }
    readinessProbe: {
      failureThreshold: int
      initialDelay: 'string'
      period: 'string'
      successThreshold: int
      timeout: 'string'
    }
    requestSettings: {
      maxConcurrentRequestsPerInstance: int
      maxQueueWait: 'string'
      requestTimeout: 'string'
    }
    scaleSettings: {
      scaleType: 'string'
      // For remaining properties, see OnlineScaleSettings objects
    }
    endpointComputeType: 'string'
    // For remaining properties, see OnlineDeploymentProperties objects
  }
  sku: {
    capacity: int
    family: 'string'
    name: 'string'
    size: 'string'
    tier: 'string'
  }
  tags: {
    {customized property}: 'string'
  }
}
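
As an illustration of the format above, the following is a minimal sketch of a managed deployment declared under an existing endpoint. The parameter names (workspaceName, endpointName, modelId, environmentId), the deployment name 'blue', the instance type 'Standard_DS3_v2', and the sku values are assumptions chosen for the example; they do not come from this reference.

```bicep
// Hypothetical parameters; supply values that match your own workspace.
param workspaceName string
param endpointName string
param location string = resourceGroup().location
param modelId string        // ARM resource ID of a registered model (assumed to exist)
param environmentId string  // ARM resource ID of an environment (assumed to exist)

// Reference the parent resources, which are assumed to be deployed already.
resource workspace 'Microsoft.MachineLearningServices/workspaces@2024-01-01-preview' existing = {
  name: workspaceName
}

resource endpoint 'Microsoft.MachineLearningServices/workspaces/onlineEndpoints@2024-01-01-preview' existing = {
  parent: workspace
  name: endpointName
}

resource deployment 'Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments@2024-01-01-preview' = {
  parent: endpoint
  name: 'blue'                       // illustrative deployment name
  location: location
  sku: {
    name: 'Default'                  // assumed sku name
    capacity: 1                      // assumed instance count
  }
  properties: {
    endpointComputeType: 'Managed'
    model: modelId
    environmentId: environmentId
    instanceType: 'Standard_DS3_v2'  // assumed VM size
    appInsightsEnabled: true
    scaleSettings: {
      scaleType: 'Default'
    }
    // codeConfiguration is omitted here; a model that needs a scoring
    // script would also set codeId and scoringScript.
  }
}
```

Setting parent to the endpoint's symbolic name lets the deployment name stay short ('blue') instead of spelling out the full workspace/endpoint/deployment path.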

OnlineScaleSettings objects

Set the scaleType property to specify the type of object.

For Default, use:

{
  scaleType: 'Default'
}

For TargetUtilization, use:

{
  maxInstances: int
  minInstances: int
  pollingInterval: 'string'
  scaleType: 'TargetUtilization'
  targetUtilizationPercentage: int
}
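
A hedged sketch of the TargetUtilization shape with concrete values follows. All numbers are illustrative, and the pollingInterval value assumes the service expects an ISO 8601 duration string, which this section does not state.

```bicep
// Illustrative values only; tune them for your own workload.
var targetUtilizationScale = {
  scaleType: 'TargetUtilization'
  minInstances: 1
  maxInstances: 5
  targetUtilizationPercentage: 70
  pollingInterval: 'PT10S' // assumption: ISO 8601 duration (10 seconds)
}
```

The variable can then be assigned to properties.scaleSettings in the deployment resource.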

OnlineDeploymentProperties objects

Set the endpointComputeType property to specify the type of object.

For Kubernetes, use:

{
  containerResourceRequirements: {
    containerResourceLimits: {
      cpu: 'string'
      gpu: 'string'
      memory: 'string'
    }
    containerResourceRequests: {
      cpu: 'string'
      gpu: 'string'
      memory: 'string'
    }
  }
  endpointComputeType: 'Kubernetes'
}

For Managed, use:

{
  endpointComputeType: 'Managed'
}
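
As a sketch of the Kubernetes variant, the fragment below fills in containerResourceRequirements. The quantities ('500m', '1', '512Mi', '1Gi') are assumed example values written in Kubernetes resource-quantity notation; gpu is left out because the example assumes a CPU-only workload.

```bicep
// Illustrative Kubernetes deployment properties; quantities are assumptions.
var kubernetesDeploymentProperties = {
  endpointComputeType: 'Kubernetes'
  containerResourceRequirements: {
    containerResourceRequests: {
      cpu: '500m'     // half a vCPU requested
      memory: '512Mi'
    }
    containerResourceLimits: {
      cpu: '1'        // upper bound of one vCPU
      memory: '1Gi'
    }
  }
}
```

In a template, these keys would sit alongside the shared properties (model, environmentId, and so on) inside the deployment's properties object.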

Property Values

CodeConfiguration

Name	Description	Value
codeId	ARM resource ID of the code asset.	string
scoringScript	[Required] The script to execute on startup, e.g., "score.py".	string Constraints: Min length = 1 Pattern = `[a-zA-Z0-9_]` (required)

Collection

Name	Description	Value
clientId	The MSI client ID used to collect logging to blob storage. If it's null, the backend will pick a registered endpoint identity to authenticate.	string
dataCollectionMode	Enable or disable data collection.	'Disabled' 'Enabled'
dataId	The ARM resource ID of the data asset. The client side ensures that the data asset points to the blob storage, and the backend collects data to that blob storage.	string
samplingRate	The sampling rate for collection. Sampling rate 1.0 means we collect 100% of data by default.	int

ContainerResourceRequirements

Name	Description	Value
containerResourceLimits	Container resource limit info:	ContainerResourceSettings
containerResourceRequests	Container resource request info:	ContainerResourceSettings

ContainerResourceSettings

Name	Description	Value
cpu	Number of vCPUs request/limit for container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/	string
gpu	Number of Nvidia GPU cards request/limit for container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/	string
memory	Memory size request/limit for container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/	string

DataCollector

Name	Description	Value
collections	[Required] The collection configuration. Each collection has its own configuration to collect model data, and the name of a collection can be an arbitrary string. The model data collector can be used for payload logging, custom logging, or both. The collections named request and response are reserved for payload logging; all others are for custom logging.	DataCollectorCollections (required)
requestLogging	The request logging configuration for the model data collector (MDC). It includes advanced logging settings for all collections. It's optional.	RequestLogging
rollingRate	When model data is collected to blob storage, the data needs to be rolled to different paths to avoid logging all of it in a single blob file. If the rolling rate is hour, all data will be collected in the blob path /yyyy/MM/dd/HH/. If it's day, all data will be collected in the blob path /yyyy/MM/dd/. The other benefit of a rolling path is that the model monitoring UI can select a time range of data very quickly.	'Day' 'Hour' 'Minute' 'Month' 'Year'
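
The DataCollector and Collection properties above can be combined as in the following sketch. It enables payload logging through the reserved request and response collections; the sampling rate, the captured header name, and the rolling rate are assumed example values, and the optional clientId and dataId properties are left unset.

```bicep
// Illustrative data collector settings; all values are assumptions.
var dataCollectorSettings = {
  collections: {
    // 'request' and 'response' are the names reserved for payload logging.
    request: {
      dataCollectionMode: 'Enabled'
      samplingRate: 1
    }
    response: {
      dataCollectionMode: 'Enabled'
      samplingRate: 1
    }
  }
  requestLogging: {
    captureHeaders: [
      'Content-Type' // hypothetical header to capture
    ]
  }
  rollingRate: 'Hour' // data lands under /yyyy/MM/dd/HH/
}
```

The variable would be assigned to properties.dataCollector in the deployment resource.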

DataCollectorCollections

Name	Description	Value

DefaultScaleSettings

Name	Description	Value
scaleType	[Required] Type of deployment scaling algorithm	'Default' (required)

EndpointDeploymentPropertiesBaseEnvironmentVariables

Name	Description	Value

EndpointDeploymentPropertiesBaseProperties

Name	Description	Value

KubernetesOnlineDeployment

Name	Description	Value
containerResourceRequirements	The resource requirements for the container (cpu and memory).	ContainerResourceRequirements
endpointComputeType	[Required] The compute type of the endpoint.	'Kubernetes' (required)

ManagedOnlineDeployment

Name	Description	Value
endpointComputeType	[Required] The compute type of the endpoint.	'Managed' (required)

ManagedServiceIdentity

Name	Description	Value
type	Type of managed service identity (where both SystemAssigned and UserAssigned types are allowed).	'None' 'SystemAssigned' 'SystemAssigned,UserAssigned' 'UserAssigned' (required)
userAssignedIdentities	The set of user assigned identities associated with the resource. The userAssignedIdentities dictionary keys will be ARM resource IDs in the form: '/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identityName}'. The dictionary values can be empty objects ({}) in requests.	UserAssignedIdentities
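
A minimal sketch of a user assigned identity block follows. The parameter identityName, the symbolic name uai, and the API version used for the identity resource are assumptions for illustration. Interpolating the resource ID as the dictionary key is one common way to produce the key format described above.

```bicep
// Hypothetical parameter naming an existing user assigned identity.
param identityName string

resource uai 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' existing = {
  name: identityName
}

// The dictionary key is the identity's ARM resource ID; the value is an empty object.
var deploymentIdentity = {
  type: 'UserAssigned'
  userAssignedIdentities: {
    '${uai.id}': {}
  }
}
```

The variable would be assigned to the identity property of the deployment resource.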