Network isolation with Azure Machine Learning registries

In this article, you learn how to secure an Azure Machine Learning registry by using Azure Virtual Network and private endpoints.

Private endpoints on Azure provide network isolation by enabling Azure services to be accessed through a private IP address within a virtual network (VNet). The VNet secures connections between Azure resources and prevents exposure of sensitive data to the public internet.

Network isolation with private endpoints keeps network traffic off the public internet and brings the Azure Machine Learning registry service into your virtual network. When private endpoints are used, all network traffic flows over Azure Private Link.

Prerequisites

Securing Azure Machine Learning registry

Note

For simplicity, this article refers to the workspace, its associated resources, and the virtual network they're part of as the secure workspace configuration. The article explores how to add Azure Machine Learning registries to that existing configuration.

The following diagram shows a basic network configuration and how the Azure Machine Learning registry fits in. If you're already using an Azure Machine Learning workspace and have a secure workspace configuration where all the resources are part of a virtual network, you can create a private endpoint from the existing virtual network to the Azure Machine Learning registry and its associated resources (storage and ACR).

If you don't have a secure workspace configuration, you can create it using the Create a secure workspace in Azure portal article, Bicep template, or Terraform template.

Diagram of registry connected to Virtual network containing workspace and associated resources using private endpoint.

Limitations

If you use an Azure Machine Learning registry with network isolation, you can view model assets in Azure Machine Learning studio, but you can't view other types of assets. You also can't perform any operations on the registry, or on the assets under it, from studio. Use the Azure Machine Learning CLI or SDK instead.

Scenario: workspace configuration is secure and Azure Machine Learning registry is public

This section describes the scenarios and required network configuration if you have a secure workspace configuration but are using a public registry.

Create assets in registry from local files

The identity (for example, a data scientist's Microsoft Entra user identity) used to create assets in the registry must be assigned the AzureML Registry User, Owner, or Contributor role in Azure role-based access control. For more information, see the Manage access to Azure Machine Learning article.

Share assets from workspace to registry

Note

Sharing a component from Azure Machine Learning workspace to Azure Machine Learning registry is not supported currently.

Due to data exfiltration protection, it isn't possible to share an asset from a secure workspace to a public registry if the storage account containing the asset has public access disabled. To enable asset sharing from workspace to registry:

  • Go to the Networking section of the storage account attached to the workspace (the workspace from which you want to share assets to the registry).
  • Set Public network access to Enabled from selected virtual networks and IP addresses.
  • Scroll down to the Resource instances section. Set Resource type to Microsoft.MachineLearningServices/registries and set Instance name to the name of the Azure Machine Learning registry resource that you want to share to from the workspace.
  • Check that the rest of the settings match your network configuration.
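
The portal steps above can also be scripted. The following is a minimal sketch with the Azure CLI, assuming that `az storage account update` and `az storage account network-rule add` are available in your CLI version. The function names and every argument value are hypothetical placeholders, not part of the original article.

```shell
# Build the ARM resource ID of a registry (standard ARM ID layout).
# $1 = subscription ID, $2 = resource group that holds the registry, $3 = registry name
registry_resource_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.MachineLearningServices/registries/%s\n' "$1" "$2" "$3"
}

# Allow the registry to reach the workspace storage account as a resource instance.
# $1 = resource group of the storage account, $2 = storage account name,
# $3 = registry resource ID, $4 = Microsoft Entra tenant ID
allow_registry_on_storage() {
  # Assumption: "Enabled from selected virtual networks and IP addresses" in the
  # portal corresponds to public access Enabled with a default action of Deny.
  az storage account update --resource-group "$1" --name "$2" \
    --public-network-access Enabled --default-action Deny || return 1
  # Add the registry under "Resource instances".
  az storage account network-rule add --resource-group "$1" --account-name "$2" \
    --resource-id "$3" --tenant-id "$4"
}

# Example call (placeholder values):
#   id=$(registry_resource_id "<subscription-id>" "<registry-rg>" "contosoreg")
#   allow_registry_on_storage "<workspace-rg>" "<storage-account>" "$id" "<tenant-id>"
```

Review the resulting network rules in the portal afterward, because the sketch doesn't touch any virtual network or IP rules that are already configured.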

Use assets from registry in workspace

Example operations:

  • Submit a job that uses an asset from registry.
  • Use a component from registry in a pipeline.
  • Use an environment from registry in a component.

Using assets from a registry in a secure workspace requires configuring outbound access to the registry.

Deploy a model from registry to workspace

To deploy a model from a registry to a secure managed online endpoint, the deployment must have egress_public_network_access=disabled set. Azure Machine Learning creates the necessary private endpoints to the registry during endpoint deployment. For more information, see Create secure managed online endpoints.
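
As an illustration of the setting described above, the following sketch writes a managed online deployment definition and creates the deployment with the Azure CLI. It assumes an MLflow model, which can be deployed without a scoring script. Other model types would also need `environment` and `code_configuration` entries. The deployment name `blue`, the instance type, and the function names are hypothetical.

```shell
# Write a managed online deployment definition that references a registry model.
# $1 = output file, $2 = endpoint name, $3 = registry name,
# $4 = model name, $5 = model version
write_deployment_yaml() {
  cat > "$1" <<EOF
name: blue
endpoint_name: $2
model: azureml://registries/$3/models/$4/versions/$5
instance_type: Standard_DS3_v2
instance_count: 1
egress_public_network_access: disabled
EOF
}

# Create the deployment in the secure workspace.
# $1 = YAML file, $2 = resource group of the workspace, $3 = workspace name
create_deployment() {
  az ml online-deployment create --file "$1" \
    --resource-group "$2" --workspace-name "$3"
}

# Example call (placeholder values):
#   write_deployment_yaml deployment.yml my-endpoint contosoreg my-model 1
#   create_deployment deployment.yml "<workspace-rg>" "<workspace-name>"
```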

Outbound network configuration to access any Azure Machine Learning registry

Service tag | Protocol and ports | Purpose
AzureMachineLearning | TCP: 443, 877, 18881; UDP: 5831 | Using Azure Machine Learning services.
Storage.<region> | TCP: 443 | Access data stored in the Azure Storage Account for compute clusters and compute instances. This outbound can be used to exfiltrate data. For more information, see Data exfiltration protection.
MicrosoftContainerRegistry.<region> | TCP: 443 | Access Docker images provided by Microsoft.
AzureContainerRegistry.<region> | TCP: 443 | Access Docker images for environments.
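
If outbound traffic is controlled by a network security group (NSG), the service tags listed above could be allowed with rules like the following. This is a sketch under stated assumptions: the rule names, priorities, and the `VirtualNetwork` source prefix are illustrative choices, and the region must be written the way service tags spell it (for example, `EastUS`).

```shell
# Create one outbound allow rule.
# $1 = resource group, $2 = NSG name, $3 = rule name, $4 = priority,
# $5 = protocol (Tcp or Udp), $6 = destination service tag, $7... = destination ports
add_outbound_rule() {
  rg=$1 nsg=$2 name=$3 priority=$4 protocol=$5 tag=$6
  shift 6
  az network nsg rule create \
    --resource-group "$rg" --nsg-name "$nsg" --name "$name" \
    --priority "$priority" --direction Outbound --access Allow \
    --protocol "$protocol" \
    --source-address-prefixes VirtualNetwork \
    --destination-address-prefixes "$tag" \
    --destination-port-ranges "$@"
}

# Add one rule per service tag and protocol.
# $1 = resource group, $2 = NSG name, $3 = region as used in service tags
allow_registry_outbound() {
  add_outbound_rule "$1" "$2" AllowAzureML-Tcp 200 Tcp AzureMachineLearning 443 877 18881
  add_outbound_rule "$1" "$2" AllowAzureML-Udp 210 Udp AzureMachineLearning 5831
  add_outbound_rule "$1" "$2" AllowStorage 220 Tcp "Storage.$3" 443
  add_outbound_rule "$1" "$2" AllowMCR 230 Tcp "MicrosoftContainerRegistry.$3" 443
  add_outbound_rule "$1" "$2" AllowACR 240 Tcp "AzureContainerRegistry.$3" 443
}

# Example call (placeholder values):
#   allow_registry_outbound "<network-rg>" "<nsg-name>" EastUS
```

Pick priorities that don't collide with rules already defined on the NSG.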

Scenario: workspace configuration is secure and Azure Machine Learning registry is connected to virtual networks using private endpoints

This section describes the scenarios and required network configuration if you have a secure workspace configuration and your Azure Machine Learning registries are connected to a virtual network through a private endpoint.

Azure Machine Learning registry has associated storage/ACR service instances. These service instances can also be connected to the VNet using private endpoints to secure the configuration. For more information, see the How to create a private endpoint section.

How to find the Azure Storage Account and Azure Container Registry used by your registry

The storage account and ACR used by your Azure Machine Learning registry are created under a managed resource group in your Azure subscription. The name of the managed resource group follows the pattern of azureml-rg-<name-of-your-registry>_<GUID>. The GUID is a randomly generated string. For example, if the name of your registry is "contosoreg", the name of the managed resource group would be azureml-rg-contosoreg_<GUID>.

In the Azure portal, you can find this resource group by searching for azureml-rg-<name-of-your-registry>. All the storage and ACR resources for your registry are available under this resource group.
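
The same lookup can be sketched with the Azure CLI. The example assumes the naming pattern described above and uses a JMESPath `starts_with` filter. The function names are hypothetical.

```shell
# Name prefix of the managed resource group for a registry.
# $1 = registry name
managed_rg_prefix() {
  printf 'azureml-rg-%s_\n' "$1"
}

# List resource groups in the current subscription whose name starts with the prefix.
# $1 = registry name
find_registry_managed_rg() {
  prefix=$(managed_rg_prefix "$1")
  az group list --query "[?starts_with(name, '$prefix')].name" --output tsv
}

# Show the storage account and ACR (and anything else) in the managed resource group.
# $1 = managed resource group name
list_registry_backing_resources() {
  az resource list --resource-group "$1" \
    --query "[].{name:name, type:type}" --output table
}

# Example call (placeholder values):
#   rg=$(find_registry_managed_rg contosoreg)
#   list_registry_backing_resources "$rg"
```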

Create assets in registry from local files

Note

Creating an environment asset isn't supported in a private registry where the associated ACR has public access disabled. As a workaround, you can create an environment in an Azure Machine Learning workspace and share it to the Azure Machine Learning registry.

Clients need to be connected to the VNet to which the registry is connected with a private endpoint.

Securely connect to your registry

To connect to a registry that's secured behind a VNet, use one of the following methods:

  • Azure VPN gateway - Connects on-premises networks to the VNet over an encrypted connection that travels across the public internet. There are two types of VPN gateways that you might use:

    • Point-to-site: Each client computer uses a VPN client to connect to the VNet.

    • Site-to-site: A VPN device connects the VNet to your on-premises network.

  • ExpressRoute - Connects on-premises networks into the cloud over a private connection. Connection is made using a connectivity provider.

  • Azure Bastion - In this scenario, you create an Azure Virtual Machine (sometimes called a jump box) inside the VNet. You then connect to the VM using Azure Bastion. Bastion allows you to connect to the VM using either an RDP or SSH session from your local web browser. You then use the jump box as your development environment. Since it is inside the VNet, it can directly access the registry.

Share assets from workspace to registry

Note

Sharing a component from Azure Machine Learning workspace to Azure Machine Learning registry is not supported currently.

Due to data exfiltration protection, it isn't possible to share an asset from a secure workspace to a private registry if the storage account containing the asset has public access disabled. To enable asset sharing from workspace to registry:

  • Go to the Networking section of the storage account attached to the workspace (the workspace from which you want to share assets to the registry).
  • Set Public network access to Enabled from selected virtual networks and IP addresses.
  • Scroll down to the Resource instances section. Set Resource type to Microsoft.MachineLearningServices/registries and set Instance name to the name of the Azure Machine Learning registry resource that you want to share to from the workspace.
  • Check that the rest of the settings match your network configuration.

Use assets from registry in workspace

Example operations:

  • Submit a job that uses an asset from registry.
  • Use a component from registry in a pipeline.
  • Use an environment from registry in a component.

Create a private endpoint to the registry, storage, and ACR from the VNet of the workspace. If you're connecting to multiple registries, create a private endpoint for each registry and for its associated storage and ACR. For more information, see the How to create a private endpoint section.

Deploy a model from registry to workspace

To deploy a model from a registry to a secure managed online endpoint, the deployment must have egress_public_network_access=disabled set. Azure Machine Learning creates the necessary private endpoints to the registry during endpoint deployment. For more information, see Create secure managed online endpoints.

How to create a private endpoint

Use the following steps to add a private endpoint to an existing registry in the Azure portal:

  1. In the Azure portal, search for Private endpoint, and then select the Private endpoints entry to go to the Private link center.

  2. On the Private link center overview page, select + Create.

  3. Provide the requested information. For the Region field, select the same region as your Azure Virtual Network. Select Next.

  4. From the Resource tab, when selecting Resource type, select Microsoft.MachineLearningServices/registries. Set the Resource field to your Azure Machine Learning registry name, then select Next.

  5. From the Virtual network tab, select the virtual network and subnet for your Azure Machine Learning resources. Select Next to continue.

  6. From the DNS tab, leave the default values unless you have specific private DNS integration requirements. Select Next to continue.

  7. From the Review + Create tab, select Create to create the private endpoint.

  8. If you want to disable public network access for the registry, use the following command. Confirm that the storage account and ACR have public network access disabled as well.

    az ml registry update --set publicNetworkAccess=Disabled --name <name-of-registry>
    

How to find the Azure Storage Account and Azure Container Registry used by your registry

The storage account and ACR used by your Azure Machine Learning registry are created under a managed resource group in your Azure subscription. The name of the managed resource group follows the pattern of azureml-rg-<name-of-your-registry>_<GUID>. The GUID is a randomly generated string. For example, if the name of your registry is "contosoreg", the name of the managed resource group would be azureml-rg-contosoreg_<GUID>.

In the Azure portal, you can find this resource group by searching for azureml-rg-<name-of-your-registry>. All the storage and ACR resources for your registry are available under this resource group.

How to create a private endpoint for the Azure Storage Account

To create a private endpoint for the storage account used by your registry, use the following steps:

  1. In the Azure portal, search for Private endpoint, and then select the Private endpoints entry to go to the Private link center.
  2. On the Private link center overview page, select + Create.
  3. Provide the requested information. For the Region field, select the same region as your Azure Virtual Network. Select Next.
  4. From the Resource tab, when selecting Resource type, select Microsoft.Storage/storageAccounts. Set the Resource field to the storage account name. Set the Sub-resource to Blob, then select Next.
  5. From the Virtual network tab, select the virtual network and subnet for your Azure Machine Learning resources. Select Next to continue.
  6. From the DNS tab, leave the default values unless you have specific private DNS integration requirements. Select Next to continue.
  7. From the Review + Create tab, select Create to create the private endpoint.
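
For scripted setups, the portal steps above roughly correspond to one `az network private-endpoint create` call. The sketch below is an assumption-laden outline, not the documented procedure: the endpoint and connection names are invented, and it doesn't configure private DNS integration (the DNS tab in the portal), which you would still need to handle separately.

```shell
# Create a private endpoint for a target resource.
# $1 = resource group for the endpoint, $2 = endpoint name, $3 = VNet name,
# $4 = subnet name, $5 = resource ID of the target, $6 = target sub-resource (group ID)
create_private_endpoint() {
  az network private-endpoint create \
    --resource-group "$1" --name "$2" \
    --vnet-name "$3" --subnet "$4" \
    --private-connection-resource-id "$5" \
    --group-id "$6" \
    --connection-name "$2-connection"
}

# Private endpoint for the registry's storage account, using the Blob sub-resource.
# $1 = resource group for the endpoint, $2 = VNet name, $3 = subnet name,
# $4 = managed resource group of the registry, $5 = storage account name
create_storage_private_endpoint() {
  storage_id=$(az storage account show --resource-group "$4" --name "$5" \
    --query id --output tsv) || return 1
  create_private_endpoint "$1" "pe-$5-blob" "$2" "$3" "$storage_id" blob
}

# Example call (placeholder values):
#   create_storage_private_endpoint "<network-rg>" "<vnet>" "<subnet>" \
#     "azureml-rg-contosoreg_<GUID>" "<storage-account>"
```

The generic `create_private_endpoint` helper could be reused for the registry and the ACR, provided you pass the sub-resource name that the portal shows for each resource type.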

Data exfiltration protection

For a user-created Azure Machine Learning registry, we recommend using a private endpoint for the registry, the managed storage account, and the managed ACR.

For a system registry, we recommend creating a Service Endpoint Policy for the Storage account using the /services/Azure/MachineLearning alias. For more information, see Configure data exfiltration prevention.

How to find the registry's fully qualified domain name

Note

Make sure your DNS can resolve the registry's private FQDN, which has the format <registry-guid>.registry.<region>.privatelink.api.azureml.ms. There's no public, resource-specific FQDN that Azure DNS resolves recursively.
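
One way to check resolution is to build the private FQDN from the format above and look it up from a machine inside the VNet. This is a small sketch, assuming `nslookup` is installed. The function names are hypothetical.

```shell
# Build the private FQDN of a registry from its GUID and region.
# $1 = registry GUID, $2 = region
registry_private_fqdn() {
  printf '%s.registry.%s.privatelink.api.azureml.ms\n' "$1" "$2"
}

# Resolve the private FQDN; run this from a client connected to the VNet.
# $1 = registry GUID, $2 = region
check_registry_dns() {
  nslookup "$(registry_private_fqdn "$1" "$2")"
}

# Example call (placeholder values):
#   check_registry_dns "<registry-guid>" "<region>"
```

If the lookup succeeds, the answer should be a private IP address from the subnet that hosts the private endpoint.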

You can use the discovery URL to get the fully qualified domain name (FQDN) of your registry. When calling the discovery URL, you must provide an Azure access token in the request header. The following examples show how to get an access token and call the discovery URL:

Tip

The format for the discovery URL is https://<region>.api.azureml.ms/registrymanagement/v1.0/registries/<registry_name>/discovery, where <region> is the region where your registry is located and <registry_name> is the name of your registry. To call the URL, make a GET request:

    GET https://<region>.api.azureml.ms/registrymanagement/v1.0/registries/<registry_name>/discovery

Azure PowerShell

    $region = "<region>"
    $registryName = "<registry_name>"
    $accessToken = (az account get-access-token | ConvertFrom-Json).accessToken
    (Invoke-RestMethod -Method Get `
                       -Uri "https://$region.api.azureml.ms/registrymanagement/v1.0/registries/$registryName/discovery" `
                       -Headers @{ Authorization="Bearer $accessToken" }).registryFqdns

REST API

Note

For more information on using Azure REST APIs, see the Azure REST API reference.

  1. Get the Azure access token. You can use the following Azure CLI command to get a token. The --output tsv option prints the token without the surrounding JSON quotes:

    az account get-access-token --query accessToken --output tsv
    
  2. Use a REST client such as curl to make a GET request to the discovery URL, and use the access token from the previous step for authorization. In the following example, replace <region> with the region where your registry is located, <registry_name> with the name of your registry, and <token> with the access token:

    curl -X GET "https://<region>.api.azureml.ms/registrymanagement/v1.0/registries/<registry_name>/discovery" -H "Authorization: Bearer <token>" -H "Content-Type: application/json"
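
The two REST steps above can be combined into one small script. The following is a sketch, assuming the Azure CLI and curl are both installed and that you're already signed in with `az login`. The function names are hypothetical, and the response is printed as raw JSON.

```shell
# Build the discovery URL for a registry.
# $1 = region, $2 = registry name
discovery_url() {
  printf 'https://%s.api.azureml.ms/registrymanagement/v1.0/registries/%s/discovery\n' "$1" "$2"
}

# Get an access token and call the discovery URL.
# $1 = region, $2 = registry name
get_registry_discovery() {
  # --output tsv returns the token without JSON quotes.
  token=$(az account get-access-token --query accessToken --output tsv) || return 1
  # --fail makes curl return a non-zero exit code on HTTP errors.
  curl --silent --show-error --fail -X GET "$(discovery_url "$1" "$2")" \
    -H "Authorization: Bearer $token" \
    -H "Content-Type: application/json"
}

# Example call (placeholder values):
#   get_registry_discovery "<region>" "<registry_name>"
```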
    

Next step

Learn how to Share models, components, and environments across workspaces with registries.