Use an Azure Resource Manager template to create a workspace for Azure Machine Learning

In this article, you learn several ways to create an Azure Machine Learning workspace using Azure Resource Manager templates. A Resource Manager template makes it easy to create resources as a single, coordinated operation. A template is a JSON document that defines the resources that are needed for a deployment. It might also specify deployment parameters. Parameters are used to provide input values when using the template.

For more information, see Deploy an application with Azure Resource Manager template.

Prerequisites

Limitations

  • When you create a new workspace, you can either automatically create services needed by the workspace or use existing services. If you want to use existing services from a different Azure subscription than the workspace, you must register the Azure Machine Learning namespace in the subscription that contains those services. For example, if you create a workspace in subscription A that uses a storage account in subscription B, the Azure Machine Learning namespace must be registered in subscription B before the workspace can use the storage account.

    The resource provider for Azure Machine Learning is Microsoft.MachineLearningServices. For information on seeing whether it's registered or registering it, see Azure resource providers and types.

    Important

    This information applies only to resources provided during workspace creation: Azure Storage Accounts, Azure Container Registry, Azure Key Vault, and Application Insights.

  • The example template might not always use the latest API version for Azure Machine Learning. Before using the template, we recommend modifying it to use the latest API versions. For information on the latest API versions for Azure Machine Learning, see the Azure Machine Learning REST API.

    Tip

    Each Azure service has its own set of API versions. For information on the API for a specific service, check the service information in the Azure REST API reference.

    To update the API version, find the "apiVersion": "YYYY-MM-DD" entry for the resource type and update it to the latest version. The following example is an entry for Azure Machine Learning:

    "type": "Microsoft.MachineLearningServices/workspaces",
    "apiVersion": "2023-10-01",
    

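One way to see which API versions the resource provider currently exposes is to query it with the Azure CLI. The following is a minimal sketch, assuming the Azure CLI is installed and signed in. The helper names list_workspace_api_versions and pick_latest_stable are hypothetical and aren't part of the template or the CLI.

```shell
# List the API versions the resource provider reports for the
# workspaces resource type, one per line.
# Assumes the Azure CLI is installed and you are signed in.
list_workspace_api_versions() {
    az provider show \
        --namespace "Microsoft.MachineLearningServices" \
        --query "resourceTypes[?resourceType=='workspaces'].apiVersions[]" \
        --output tsv
}

# Read API versions from stdin (one per line), drop preview versions,
# and print the most recent one. Versions start with YYYY-MM-DD, so a
# reverse lexical sort puts the newest first.
pick_latest_stable() {
    grep -v -i 'preview' | sort -r | head -n 1
}

# Usage (requires the Azure CLI):
#   list_workspace_api_versions | pick_latest_stable
```

A newer API version might change the resource schema, so review the template properties after updating the apiVersion entry.
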
Multiple workspaces in the same virtual network

The template doesn't support multiple Azure Machine Learning workspaces deployed in the same virtual network. This limitation is because the template creates new DNS zones during deployment.

If you want to create a template that deploys multiple workspaces in the same virtual network, set it up manually (using the Azure portal or CLI). Then use the Azure portal to generate a template.

About the Azure Resource Manager template

The Azure Resource Manager template used throughout this document can be found in the microsoft.machinelearningservices/machine-learning-workspace-vnet directory of the Azure Quickstart Templates GitHub repository.

This template creates the following Azure services:

  • Azure Storage Account
  • Azure Key Vault
  • Azure Application Insights
  • Azure Container Registry
  • Azure Machine Learning workspace

The resource group is the container that holds the services. The Azure Machine Learning workspace uses these services for functionality such as storing data, secrets, logging, and Docker images.

The example template has two required parameters:

  • The location where the resources are created.

    The template uses the location you select for most resources. The exception is the Application Insights service, which isn't available in all of the locations that the other services are. If you select a location where it isn't available, the service is created in the South Central US location.

  • The workspaceName, which is the friendly name of the Azure Machine Learning workspace.

    Note

    The workspace name is case-insensitive.

    The names of the other services are generated randomly.

Tip

While the template associated with this document creates a new Azure Container Registry, you can also create a new workspace without creating a container registry. One is created the first time you perform an operation that requires a container registry, such as training or deploying a model.

You can also reference an existing container registry or storage account in the Azure Resource Manager template, instead of creating a new one. When doing so, you must either use a managed identity (preview), or enable the admin account for the container registry.

Warning

Once an Azure Container Registry is created for a workspace, don't delete it. Doing so breaks your Azure Machine Learning workspace.

For more information on templates, see the following articles:

Deploy template

To deploy your template, you have to create a resource group.

See the Azure portal section if you prefer using the graphical user interface.

az group create --name "examplegroup" --location "eastus"

Once your resource group is successfully created, deploy the template with the following command:

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" location="eastus"

By default, all of the resources created as part of the template are new. However, you can use existing resources by providing other parameters to the template. For example, to use an existing storage account, set the storageAccountOption value to existing and provide the name of your storage account in the storageAccountName parameter.

Important

If you want to use an existing Azure Storage account, it cannot be a premium account (Premium_LRS or Premium_GRS). It also cannot have a hierarchical namespace (used with Azure Data Lake Storage Gen2). Neither premium storage nor hierarchical namespaces are supported with the default storage account of the workspace. You can use premium storage or hierarchical namespaces with non-default storage accounts.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      storageAccountOption="existing" \
      storageAccountName="existingstorageaccountname"
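
The restrictions on an existing storage account can be checked before you deploy. The following is a minimal sketch, assuming the Azure CLI is installed and signed in, and assuming that az storage account show reports the hierarchical namespace setting in a property named isHnsEnabled. The function names are hypothetical helpers, not part of the template.

```shell
# Decide whether a storage account can serve as the workspace's default
# storage, based on the two restrictions described above.
#   $1: SKU name, for example Standard_LRS or Premium_LRS
#   $2: hierarchical namespace flag (true, false, or empty)
is_supported_default_storage() {
    case "$1" in
        Premium_*) return 1 ;;   # premium accounts aren't supported
    esac
    case "$2" in
        true|True) return 1 ;;   # hierarchical namespace isn't supported
    esac
    return 0
}

# Look up the SKU and hierarchical namespace flag of an existing account
# and run the check. Assumes the Azure CLI is installed and signed in.
#   $1: storage account name, $2: resource group
check_existing_storage() {
    sku=$(az storage account show --name "$1" --resource-group "$2" \
        --query "sku.name" --output tsv)
    hns=$(az storage account show --name "$1" --resource-group "$2" \
        --query "isHnsEnabled" --output tsv)
    is_supported_default_storage "$sku" "$hns"
}

# Usage (requires the Azure CLI):
#   check_existing_storage "existingstorageaccountname" "examplegroup" \
#       || echo "This account can't be the default storage account" >&2
```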

Deploy an encrypted workspace

The following example template demonstrates how to create a workspace with three settings:

  • Enable high confidentiality settings for the workspace. This configuration creates a new Azure Cosmos DB instance.
  • Enable encryption for the workspace.
  • Use an existing Azure Key Vault to retrieve customer-managed keys. Customer-managed keys are used to create a new Azure Cosmos DB instance for the workspace.

Important

Once a workspace has been created, you cannot change the settings for confidential data, encryption, key vault ID, or key identifiers. To change these values, you must create a new workspace using the new values.

For more information, see Customer-managed keys.

Important

There are some specific requirements your subscription must meet before using this template:

  • You must have an existing Azure Key Vault that contains an encryption key.
  • The Azure Key Vault must be in the same region where you plan to create the Azure Machine Learning workspace.
  • You must specify the ID of the Azure Key Vault and the URI of the encryption key.

For steps on creating the vault and key, see Configure customer-managed keys.

To get the values for the cmk_keyvault (ID of the Key Vault) and the resource_cmk_uri (key URI) parameters needed by this template, use the following steps:

  1. To get the Key Vault ID, use the following command:

    az keyvault show --name <keyvault-name> --query 'id' --output tsv    
    

    This command returns a value similar to /subscriptions/{subscription-guid}/resourceGroups/<resource-group-name>/providers/Microsoft.KeyVault/vaults/<keyvault-name>.

  2. To get the URI of the customer-managed key, use the following command:

    az keyvault key show --vault-name <keyvault-name> --name <key-name> --query 'key.kid' --output tsv    
    

This command returns a value similar to https://mykeyvault.vault.azure.net/keys/mykey/{guid}.
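
The two lookups can be combined with a check for the same-region requirement listed earlier. The following is a minimal sketch, assuming the Azure CLI is installed and signed in, and assuming that a region's display name maps to its short name by lowercasing it and removing spaces (for example, East US and eastus). The function names are hypothetical helpers.

```shell
# Normalize a region name so that "East US" and "eastus" compare equal.
normalize_region() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d ' '
}

# Succeed only when the Key Vault is in the region planned for the workspace.
#   $1: Key Vault name, $2: planned workspace location
keyvault_in_region() {
    kv_location=$(az keyvault show --name "$1" --query "location" --output tsv)
    [ "$(normalize_region "$kv_location")" = "$(normalize_region "$2")" ]
}

# Usage (requires the Azure CLI; <keyvault-name> and <key-name> are placeholders):
#   keyvault_in_region "<keyvault-name>" "eastus" || echo "Region mismatch" >&2
#   cmk_keyvault=$(az keyvault show --name "<keyvault-name>" \
#       --query "id" --output tsv)
#   resource_cmk_uri=$(az keyvault key show --vault-name "<keyvault-name>" \
#       --name "<key-name>" --query "key.kid" --output tsv)
```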

To enable use of customer-managed keys, set the following parameters when deploying the template:

  • encryption_status to Enabled.
  • cmk_keyvault to the Key Vault ID obtained in the previous steps.
  • resource_cmk_uri to the key URI obtained in the previous steps.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      encryption_status="Enabled" \
      cmk_keyvault="/subscriptions/{subscription-guid}/resourceGroups/<resource-group-name>/providers/Microsoft.KeyVault/vaults/<keyvault-name>" \
      resource_cmk_uri="https://mykeyvault.vault.azure.net/keys/mykey/{guid}"

When you use a customer-managed key, Azure Machine Learning creates a secondary resource group which contains the Azure Cosmos DB instance. For more information, see Encryption at rest in Azure Cosmos DB.

Another configuration you can provide for your data is to set the confidential_data parameter to true. Doing so enables the following behavior:

  • Starts encrypting the local scratch disk for Azure Machine Learning compute clusters, providing you haven't created any previous clusters in your subscription. If you had previously created a cluster in the subscription, open a support ticket to have encryption of the scratch disk enabled for your compute clusters.

  • Cleans up the local scratch disk between jobs.

  • Securely passes credentials for the storage account, container registry, and SSH account from the execution layer to your compute clusters by using key vault.

  • Enables IP filtering to ensure that no external services other than AzureMachineLearningService can call the underlying batch pools.

    Important

    Once a workspace has been created, you cannot change the settings for confidential data, encryption, key vault ID, or key identifiers. To change these values, you must create a new workspace using the new values.

    For more information, see encryption at rest.
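
A deployment that sets confidential_data might look like the following minimal sketch. It assumes the template accepts confidential_data alongside the workspaceName and location parameters used in the earlier examples. The function name deploy_confidential_workspace is a hypothetical wrapper, not part of the template or the CLI.

```shell
TEMPLATE_URI="https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json"

# Deploy the workspace template with confidential_data set to true.
# Assumes the Azure CLI is installed and you are signed in.
#   $1: resource group, $2: workspace name, $3: location
deploy_confidential_workspace() {
    az deployment group create \
        --name "exampledeployment" \
        --resource-group "$1" \
        --template-uri "$TEMPLATE_URI" \
        --parameters workspaceName="$2" \
          location="$3" \
          confidential_data="true"
}

# Usage (requires the Azure CLI):
#   deploy_confidential_workspace "examplegroup" "exampleworkspace" "eastus"
```

Because the setting can't be changed after the workspace is created, confirm the value before running the deployment.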

Deploy workspace behind a virtual network

By setting the vnetOption parameter value to either new or existing, you're able to create the resources used by a workspace behind a virtual network.

Important

For container registry, only the 'Premium' SKU is supported.

Important

Application Insights does not support deployment behind a virtual network.

Only deploy workspace behind private endpoint

If your associated resources aren't behind a virtual network, you can set the privateEndpointType parameter to AutoApproval or ManualApproval to deploy the workspace behind a private endpoint. This setting can be used for both new and existing workspaces. When updating an existing workspace, fill in the template parameters with the information from the existing workspace.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      privateEndpointType="AutoApproval"

Use a new virtual network

To deploy a resource behind a new virtual network, set the vnetOption to new along with the virtual network settings for the respective resource. The following example shows how to deploy a workspace with the storage account resource behind a new virtual network.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="new" \
      vnetName="examplevnet" \
      storageAccountBehindVNet="true" \
      privateEndpointType="AutoApproval"

Alternatively, you can deploy multiple or all dependent resources behind a virtual network.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="new" \
      vnetName="examplevnet" \
      storageAccountBehindVNet="true" \
      keyVaultBehindVNet="true" \
      containerRegistryBehindVNet="true" \
      containerRegistryOption="new" \
      containerRegistrySku="Premium" \
      privateEndpointType="AutoApproval"

Use an existing virtual network & resources

To deploy a workspace with existing resources, set the vnetOption parameter to existing and provide the subnet parameters. Before deployment, you need to create service endpoints in the virtual network for each of the resources. As with new virtual network deployments, you can have one or all of your resources behind a virtual network.

Important

The subnet should have the Microsoft.Storage service endpoint enabled.

Important

Subnets do not allow creation of private endpoints. Disable private endpoint to enable subnet.

  1. Enable service endpoints for the resources.

    az network vnet subnet update --resource-group "examplegroup" --vnet-name "examplevnet" --name "examplesubnet" --service-endpoints "Microsoft.Storage" "Microsoft.KeyVault" "Microsoft.ContainerRegistry"
    
  2. Deploy the workspace.

    az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="existing" \
      vnetName="examplevnet" \
      vnetResourceGroupName="examplegroup" \
      storageAccountBehindVNet="true" \
      keyVaultBehindVNet="true" \
      containerRegistryBehindVNet="true" \
      containerRegistryOption="new" \
      containerRegistrySku="Premium" \
      subnetName="examplesubnet" \
      subnetOption="existing" \
      privateEndpointType="AutoApproval"
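
Before deploying to an existing virtual network, it can help to confirm that the subnet has the service endpoints the deployment relies on. The following is a minimal sketch, assuming the Azure CLI is installed and signed in. The function names are hypothetical helpers, and the list of required endpoints mirrors the three services used in step 1.

```shell
# Print each required service endpoint that is absent from the list of
# endpoints already enabled on a subnet.
#   $1: space-separated required endpoints
#   $2: space- or newline-separated endpoints currently enabled
missing_service_endpoints() {
    for required in $1; do
        found="no"
        for present in $2; do
            [ "$required" = "$present" ] && found="yes"
        done
        [ "$found" = "yes" ] || printf '%s\n' "$required"
    done
}

# Query the subnet and report any endpoint that still needs to be enabled.
# Assumes the Azure CLI is installed and you are signed in.
#   $1: resource group, $2: virtual network name, $3: subnet name
subnet_missing_endpoints() {
    enabled=$(az network vnet subnet show \
        --resource-group "$1" --vnet-name "$2" --name "$3" \
        --query "serviceEndpoints[].service" --output tsv)
    missing_service_endpoints \
        "Microsoft.Storage Microsoft.KeyVault Microsoft.ContainerRegistry" \
        "$enabled"
}

# Usage (requires the Azure CLI):
#   subnet_missing_endpoints "examplegroup" "examplevnet" "examplesubnet"
```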
    

Use the Azure portal

  1. Follow the steps in Deploy resources from custom template. When you arrive at the Custom deployment screen, choose the Quickstart template entry.

  2. In the dropdown for Quickstart templates, select the microsoft.machinelearningservices/machine-learning-workspace-vnet entry. Finally, use Select template.

  3. When the template appears, provide the following required information and any other parameters depending on your deployment scenario.

    • Subscription: Select the Azure subscription to use for these resources.
    • Resource group: Select or create the resource group that is to contain the services.
    • Region: Select the Azure region where the resources are to be created.
    • Workspace name: The name to use for the Azure Machine Learning workspace to be created. The workspace name must be between 3 and 33 characters. It can only contain alphanumeric characters and '-'.
    • Location: Select the location where the resources are to be created.
  4. Select Review + create.

  5. In the Review + create screen, agree to the listed terms and conditions and select Create.

For more information, see Deploy resources from custom template.
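
The workspace name rules listed in step 3 can be checked locally before you submit a deployment. The following is a minimal sketch of those two rules only (length and allowed characters). The function name is_valid_workspace_name is a hypothetical helper, and the service might enforce other rules that aren't covered here.

```shell
# Check a candidate workspace name against the rules listed above:
# 3 to 33 characters, alphanumeric characters and '-' only.
is_valid_workspace_name() {
    name=$1
    length=${#name}
    [ "$length" -ge 3 ] && [ "$length" -le 33 ] || return 1
    case "$name" in
        *[!A-Za-z0-9-]*) return 1 ;;   # contains a disallowed character
    esac
    return 0
}

# Usage:
#   is_valid_workspace_name "exampleworkspace" && echo "Name looks valid"
```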

Troubleshooting

Resource provider errors

When creating an Azure Machine Learning workspace, or a resource used by the workspace, you may receive an error similar to the following messages:

  • No registered resource provider found for location {location}
  • The subscription is not registered to use namespace {resource-provider-namespace}

Most resource providers are automatically registered, but not all. If you receive this message, you need to register the provider mentioned.

The following table contains a list of the resource providers required by Azure Machine Learning:

Resource provider                  Why it's needed
Microsoft.MachineLearningServices  Creating the Azure Machine Learning workspace.
Microsoft.Storage                  Azure Storage Account is used as the default storage for the workspace.
Microsoft.ContainerRegistry        Azure Container Registry is used by the workspace to build Docker images.
Microsoft.KeyVault                 Azure Key Vault is used by the workspace to store secrets.
Microsoft.Notebooks                Integrated notebooks on Azure Machine Learning compute instance.
Microsoft.ContainerService         If you plan on deploying trained models to Azure Kubernetes Service.

If you plan on using a customer-managed key with Azure Machine Learning, then the following resource providers must be registered:

Resource provider     Why it's needed
Microsoft.DocumentDB  Azure Cosmos DB instance that logs metadata for the workspace.
Microsoft.Search      Azure Search provides indexing capabilities for the workspace.

If you plan on using a managed virtual network with Azure Machine Learning, then the Microsoft.Network resource provider must be registered. This resource provider is used by the workspace when creating private endpoints for the managed virtual network.

For information on registering resource providers, see Resolve errors for resource provider registration.
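
Registration can also be checked and performed from the Azure CLI. The following is a minimal sketch, assuming the Azure CLI is installed and signed in to the subscription that needs the provider. The function name ensure_provider_registered is a hypothetical helper.

```shell
# Register a resource provider in the current subscription unless it is
# already registered. Assumes the Azure CLI is installed and signed in.
#   $1: resource provider namespace
ensure_provider_registered() {
    state=$(az provider show --namespace "$1" \
        --query "registrationState" --output tsv)
    if [ "$state" = "Registered" ]; then
        echo "$1: already registered"
    else
        echo "$1: registering (current state: ${state:-unknown})"
        az provider register --namespace "$1" --wait
    fi
}

# Usage (requires the Azure CLI):
#   for ns in Microsoft.MachineLearningServices Microsoft.Storage \
#             Microsoft.ContainerRegistry Microsoft.KeyVault; do
#       ensure_provider_registered "$ns"
#   done
```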

Azure Key Vault access policy and Azure Resource Manager templates

This issue can occur when you use an Azure Resource Manager template to create the workspace and associated resources (including Azure Key Vault) multiple times, for example, by running the template with the same parameters as part of a continuous integration and deployment pipeline.

Most resource creation operations through templates are idempotent, but Key Vault clears the access policies each time the template is used. Clearing the access policies breaks access to the Key Vault for any existing workspace that is using it. For example, Stop/Create functionalities of Azure Notebooks VM might fail.

To avoid this problem, we recommend one of the following approaches:

  • Don't deploy the template more than once for the same parameters. Or delete the existing resources before using the template to recreate them.

  • Examine the Key Vault access policies and then use these policies to set the accessPolicies property of the template. To view the access policies, use the following Azure CLI command:

    az keyvault show --name mykeyvault --resource-group myresourcegroup --query properties.accessPolicies
    

    For more information on using the accessPolicies section of the template, see the AccessPolicyEntry object reference.

  • Check if the Key Vault resource already exists. If it does, don't recreate it through the template. For example, to use the existing Key Vault instead of creating a new one, make the following changes to the template:

    • Add a parameter that accepts the ID of an existing Key Vault resource:

      "keyVaultId":{
        "type": "string",
        "metadata": {
          "description": "Specify the existing Key Vault ID."
        }
      }
      
    • Remove the section that creates a Key Vault resource:

      {
        "type": "Microsoft.KeyVault/vaults",
        "apiVersion": "2018-02-14",
        "name": "[variables('keyVaultName')]",
        "location": "[parameters('location')]",
        "properties": {
          "tenantId": "[variables('tenantId')]",
          "sku": {
            "name": "standard",
            "family": "A"
          },
          "accessPolicies": [
          ]
        }
      },
      
    • Remove the "[resourceId('Microsoft.KeyVault/vaults', variables('keyVaultName'))]", line from the dependsOn section of the workspace. Also change the keyVault entry in the properties section of the workspace to reference the keyVaultId parameter:

      {
        "type": "Microsoft.MachineLearningServices/workspaces",
        "apiVersion": "2019-11-01",
        "name": "[parameters('workspaceName')]",
        "location": "[parameters('location')]",
        "dependsOn": [
          "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]",
          "[resourceId('Microsoft.Insights/components', variables('applicationInsightsName'))]"
        ],
        "identity": {
          "type": "systemAssigned"
        },
        "sku": {
          "tier": "[parameters('sku')]",
          "name": "[parameters('sku')]"
        },
        "properties": {
          "friendlyName": "[parameters('workspaceName')]",
          "keyVault": "[parameters('keyVaultId')]",
          "applicationInsights": "[resourceId('Microsoft.Insights/components',variables('applicationInsightsName'))]",
          "storageAccount": "[resourceId('Microsoft.Storage/storageAccounts/',variables('storageAccountName'))]"
        }
      }
      

    After these changes, you can specify the ID of the existing Key Vault resource when running the template. The template then reuses the Key Vault by setting the keyVault property of the workspace to its ID.

    To get the ID of the Key Vault, you can reference the output of the original template job or use the Azure CLI. The following command is an example of using the Azure CLI to get the Key Vault resource ID:

    az keyvault show --name mykeyvault --resource-group myresourcegroup --query id
    

    This command returns a value similar to the following text:

    /subscriptions/{subscription-guid}/resourceGroups/myresourcegroup/providers/Microsoft.KeyVault/vaults/mykeyvault
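
In a pipeline, the existence check can be scripted so that the Key Vault ID is passed to the template only when the vault is already there. The following is a minimal sketch, assuming the Azure CLI is installed and signed in, and assuming the template has been modified to accept the keyVaultId parameter as described above. The function name existing_keyvault_id is a hypothetical helper.

```shell
# Print the resource ID of a Key Vault if it exists; print nothing otherwise.
# Errors from the CLI (for example, vault not found) are suppressed.
#   $1: Key Vault name, $2: resource group
existing_keyvault_id() {
    az keyvault show --name "$1" --resource-group "$2" \
        --query "id" --output tsv 2>/dev/null || true
}

# Usage (requires the Azure CLI):
#   keyvault_id=$(existing_keyvault_id "mykeyvault" "myresourcegroup")
#   if [ -n "$keyvault_id" ]; then
#       echo "Reuse the vault: pass keyVaultId=$keyvault_id to the modified template"
#   else
#       echo "No existing vault found: deploy the original template"
#   fi
```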