How to implement integrated vectorization using models from Azure AI Foundry

Important

This feature is in public preview under Supplemental Terms of Use. The 2024-05-01-Preview REST API supports this feature.

In this article, learn how to access the embedding models in the Azure AI Foundry model catalog for vector conversions during indexing and in queries in Azure AI Search.

The workflow includes model deployment steps. The model catalog includes embedding models from Azure OpenAI, Cohere, Facebook, and OpenAI. Deploying a model is billable per the billing structure of each provider.

After the model is deployed, you can use it for integrated vectorization during indexing, or with the AI Foundry vectorizer for queries.

Deploy an embedding model from the Azure AI Foundry model catalog

  1. Open the Azure AI Foundry model catalog.

  2. Apply a filter to show just the embedding models. Under Inference tasks, select Embeddings:

    Screenshot of the Azure AI Foundry model catalog page highlighting how to filter by embeddings models.

  3. Select the model you want to use to vectorize your content. Then select Deploy and pick a deployment option.

    Screenshot of deploying an endpoint via the Azure AI Foundry model catalog.

  4. Fill in the requested details. Select an existing AI project or create a new one, and then select Deploy. The deployment details vary depending on which model you select.

  5. Wait for the model to finish deploying by monitoring the Provisioning State. It should change from "Provisioning" to "Updating" to "Succeeded". You might need to select Refresh every few minutes to see the status update.

  6. Copy the URL, Primary key, and Model ID fields and set them aside for later. You need these values for the vectorizer definition in a search index, and for the skillset that calls the model endpoints during indexing.

    Optionally, you can change your endpoint to use Token authentication instead of Key authentication. If you enable token authentication, you only need to copy the URL and Model ID, and also make a note of which region the model is deployed to.

    Screenshot of a deployed endpoint in AI Foundry portal highlighting the fields to copy and save for later.

  7. You can now configure a search index and indexer to use the deployed model.

Sample AML skill payloads

When you deploy embedding models from the Azure AI Foundry model catalog, you connect to them using the AML skill in Azure AI Search for indexing workloads.

This section describes the AML skill definition and index mappings. It includes sample payloads that are already configured to work with their corresponding deployed endpoints. For more technical details on how these payloads work, read about the Skill context and input annotation language.

This AML skill payload works with the following models from AI Foundry:

  • OpenAI-CLIP-Image-Text-Embeddings-vit-base-patch32
  • OpenAI-CLIP-Image-Text-Embeddings-ViT-Large-Patch14-336

It assumes that you're chunking your content using the Text Split skill and that the text to be vectorized is in the /document/pages/* path. If your text comes from a different path, update all references to the /document/pages/* path accordingly.

The URI and key are generated when you deploy the model from the catalog. For more information about these values, see How to deploy large language models with Azure AI Foundry.

{
  "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
  "context": "/document/pages/*",
  "uri": "<YOUR_MODEL_URL_HERE>",
  "key": "<YOUR_MODEL_KEY_HERE>",
  "inputs": [
    {
      "name": "input_data",
      "sourceContext": "/document/pages/*",
      "inputs": [
        {
          "name": "columns",
          "source": "=['image', 'text']"
        },
        {
          "name": "index",
          "source": "=[0]"
        },
        {
          "name": "data",
          "source": "=[['', $(/document/pages/*)]]"
        }
      ]
    }
  ],
  "outputs": [
    {
      "name": "text_features"
    }
  ]
}
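
If your chunks live somewhere other than /document/pages/*, every reference to that path in the payload has to change together. The following Python sketch shows one way to generate the skill definition from a single path argument so the references stay in sync. The function name build_clip_aml_skill and the /document/chunks/* path are hypothetical, chosen only for illustration; the payload shape is copied from the sample above.

```python
import json


def build_clip_aml_skill(uri: str, key: str, text_path: str = "/document/pages/*") -> dict:
    """Return the sample AML skill definition with every path set to text_path.

    Hypothetical helper: it only fills in the sample payload shown above.
    """
    return {
        "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
        "context": text_path,
        "uri": uri,
        "key": key,
        "inputs": [
            {
                "name": "input_data",
                "sourceContext": text_path,
                "inputs": [
                    {"name": "columns", "source": "=['image', 'text']"},
                    {"name": "index", "source": "=[0]"},
                    # The image column is left empty; only the text chunk is sent.
                    {"name": "data", "source": f"=[['', $({text_path})]]"},
                ],
            }
        ],
        "outputs": [{"name": "text_features"}],
    }


# Example with an assumed alternative path for the chunked text.
skill = build_clip_aml_skill(
    "<YOUR_MODEL_URL_HERE>", "<YOUR_MODEL_KEY_HERE>", "/document/chunks/*"
)
print(json.dumps(skill, indent=2))
```

Calling the function without the third argument reproduces the sample payload unchanged.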

Sample AI Foundry vectorizer payload

The AI Foundry vectorizer, unlike the AML skill, works only with embedding models that are deployable through the AI Foundry model catalog. The main difference is that you don't have to define the request and response payload. You do have to provide the modelName, which corresponds to the "Model ID" that you copied after deploying the model in the AI Foundry portal.

The following sample payload shows how to configure the vectorizer in your index definition, using the properties copied from AI Foundry.

For Cohere models, do NOT add the /v1/embed path to the end of your URL, as you would for the skill.

"vectorizers": [
    {
        "name": "<YOUR_VECTORIZER_NAME_HERE>",
        "kind": "aml",
        "amlParameters": {
            "uri": "<YOUR_URL_HERE>",
            "key": "<YOUR_PRIMARY_KEY_HERE>",
            "modelName": "<YOUR_MODEL_ID_HERE>"
        }
    }
]
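
As a minimal sketch, the vectorizer entry can also be assembled in Python from the values copied from the portal. The helper name build_aml_vectorizer, the vectorizer name, and the example URL are hypothetical. The helper drops a trailing /v1/embed from the URL, following the Cohere note above; it assumes that removing this suffix is safe for any URL that happens to end with it.

```python
import json


def build_aml_vectorizer(name: str, uri: str, key: str, model_name: str) -> dict:
    """Return a vectorizer entry shaped like the sample above (hypothetical helper)."""
    suffix = "/v1/embed"
    trimmed = uri.rstrip("/")
    # The vectorizer URL should not carry the /v1/embed path used by the skill.
    if trimmed.endswith(suffix):
        uri = trimmed[: -len(suffix)]
    return {
        "name": name,
        "kind": "aml",
        "amlParameters": {"uri": uri, "key": key, "modelName": model_name},
    }


# The URL below is a made-up placeholder, not a real endpoint.
vectorizer = build_aml_vectorizer(
    "my-vectorizer",
    "https://example-endpoint.example.com/v1/embed",
    "<YOUR_PRIMARY_KEY_HERE>",
    "<YOUR_MODEL_ID_HERE>",
)
print(json.dumps({"vectorizers": [vectorizer]}, indent=4))
```

Serializing with json.dumps also guards against hand-editing mistakes such as trailing commas, which the index definition does not accept.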

Connect using token authentication

If you can't use key-based authentication, you can configure the AML skill and AI Foundry vectorizer connection for token authentication through Azure role-based access control. The search service must have a system-assigned or user-assigned managed identity, and that identity must have Owner or Contributor permissions on your AML project workspace. You can then remove the key field from your skill and vectorizer definitions and replace it with the resourceId field. If your AML project and search service are in different regions, also provide the region field.

"uri": "<YOUR_URL_HERE>",
"resourceId": "subscriptions/<YOUR_SUBSCRIPTION_ID_HERE>/resourceGroups/<YOUR_RESOURCE_GROUP_NAME_HERE>/providers/Microsoft.MachineLearningServices/workspaces/<YOUR_AML_WORKSPACE_NAME_HERE>/onlineendpoints/<YOUR_AML_ENDPOINT_NAME_HERE>",
"region": "westus", // Only needed if the AML project is in a different region from the search service

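The switch from key to token authentication described in this section can be sketched as a small transformation on the connection properties. The function name use_token_auth and its region parameters are hypothetical; the sketch assumes the connection properties are held in a Python dict, such as the skill definition itself or the amlParameters object of a vectorizer.

```python
import copy


def use_token_auth(connection: dict, resource_id: str,
                   endpoint_region: str, search_region: str) -> dict:
    """Return a copy of the connection that uses resourceId instead of key.

    Hypothetical helper illustrating the field changes described above.
    """
    updated = copy.deepcopy(connection)
    updated.pop("key", None)  # key-based authentication is no longer used
    updated["resourceId"] = resource_id
    # region is only needed when the endpoint and search service regions differ
    if endpoint_region.lower() != search_region.lower():
        updated["region"] = endpoint_region
    return updated


key_based = {"uri": "<YOUR_URL_HERE>", "key": "<YOUR_PRIMARY_KEY_HERE>"}
resource_id = (
    "subscriptions/<YOUR_SUBSCRIPTION_ID_HERE>"
    "/resourceGroups/<YOUR_RESOURCE_GROUP_NAME_HERE>"
    "/providers/Microsoft.MachineLearningServices"
    "/workspaces/<YOUR_AML_WORKSPACE_NAME_HERE>"
    "/onlineendpoints/<YOUR_AML_ENDPOINT_NAME_HERE>"
)

# Example regions are assumptions for illustration only.
token_based = use_token_auth(key_based, resource_id, "westus", "eastus")
print(token_based)
```

The original dict is left untouched, so the key-based and token-based variants can be compared side by side before updating the skillset or index.
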
Next steps