
Quickstart: Generative search (RAG) with grounding data from Azure AI Search

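This quickstart shows how to send queries to a chat completion model for a conversational search experience over content indexed in Azure AI Search. You use the Azure portal to set up the resources, then run Python code to call the APIs.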

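Prerequisites

To meet the same-region requirement, first check the regions where the chat model you want to use is available. Once you've picked a region, confirm that Azure AI Search is available in that same region.

Make sure you know the name of the deployed model and have the endpoints of both Azure resources (Azure AI Search and Azure OpenAI) at hand. You provide this information in later steps.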

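Download the file

Download a Jupyter notebook from GitHub to send the requests described in this quickstart. For more information, see "Downloading files from GitHub".

Alternatively, start a new file on your local system and create the requests manually by following the instructions in this article.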

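Configure access

Requests to the search endpoint must be authenticated and authorized. You can use API keys or roles for this task. Keys are easier to start with, but roles are more secure. This quickstart uses roles.

You're setting up two clients, so you need permissions on both resources.

Azure AI Search receives the query request from your local system. If the hotels sample index already exists, assign yourself the Search Index Data Reader role. If it doesn't exist, assign yourself the Search Service Contributor and Search Index Data Contributor roles so that you can create and query the index.

Azure OpenAI receives the query and the search results from your local system. Assign yourself the Cognitive Services OpenAI User role on Azure OpenAI.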

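  1. Sign in to the Azure portal.

  2. Configure Azure AI Search for role-based access:

    1. In the Azure portal, find your Azure AI Search service.

    2. On the left menu, select Settings > Keys, then select either Role-based access control or Both.

  3. Assign roles:

    1. On the left menu, select Access control (IAM).

    2. On Azure AI Search, select the roles needed to create, load, and query a search index, then assign them to your Microsoft Entra ID user identity:

      • Search Index Data Contributor
      • Search Service Contributor
    3. On Azure OpenAI, select Access control (IAM), then assign yourself this role:

      • Cognitive Services OpenAI User

It can take several minutes for permissions to take effect.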

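Create an index

A search index provides the grounding data for the chat model. We recommend hotels-sample-index, which can be created in minutes and runs on any search service tier. This index is created from built-in sample data.

  1. In the Azure portal, find your search service.

  2. On the Overview home page, select Import data to start the wizard.

  3. On the Connect to your data page, select Samples from the dropdown list.

  4. Choose hotels-sample.

  5. Select Next through the remaining pages, accepting the default values.

  6. After the index is created, select Search management > Indexes from the left menu to open the index.

  7. Select Edit JSON.

  8. Scroll to the end of the index, where you can find placeholders for constructs that can be added to an index.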

    "analyzers": [],
    "tokenizers": [],
    "tokenFilters": [],
    "charFilters": [],
    "normalizers": [],
    
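  9. On a new line after "normalizers", paste in the following semantic configuration. This example specifies a "defaultConfiguration", which is important for running this quickstart.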

    "semantic":{
       "defaultConfiguration":"semantic-config",
       "configurations":[
          {
             "name":"semantic-config",
             "prioritizedFields":{
                "titleField":{
                   "fieldName":"HotelName"
                },
                "prioritizedContentFields":[
                   {
                      "fieldName":"Description"
                   }
                ],
                "prioritizedKeywordsFields":[
                   {
                      "fieldName":"Category"
                   },
                   {
                      "fieldName":"Tags"
                   }
                ]
             }
          }
       ]
    },
    
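  10. Save your changes.

  11. Run the following query in Search Explorer to test your index: complimentary breakfast

    Output should look similar to the following example. Results returned directly from the search engine consist of fields and their verbatim values, along with metadata such as the search score and, if you use the semantic ranker, the semantic ranking score and captions. We used a select statement to return just the HotelName, Description, and Tags fields.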

    {
    "@odata.count": 18,
    "@search.answers": [],
    "value": [
       {
          "@search.score": 2.2896252,
          "@search.rerankerScore": 2.506816864013672,
          "@search.captions": [
          {
             "text": "Head Wind Resort. Suite. coffee in lobby\r\nfree wifi\r\nview. The best of old town hospitality combined with views of the river and cool breezes off the prairie. Our penthouse suites offer views for miles and the rooftop plaza is open to all guests from sunset to 10 p.m. Enjoy a **complimentary continental breakfast** in the lobby, and free Wi-Fi throughout the hotel..",
             "highlights": ""
          }
          ],
          "HotelName": "Head Wind Resort",
          "Description": "The best of old town hospitality combined with views of the river and cool breezes off the prairie. Our penthouse suites offer views for miles and the rooftop plaza is open to all guests from sunset to 10 p.m. Enjoy a complimentary continental breakfast in the lobby, and free Wi-Fi throughout the hotel.",
          "Tags": [
          "coffee in lobby",
          "free wifi",
          "view"
          ]
       },
       {
          "@search.score": 2.2158256,
          "@search.rerankerScore": 2.288334846496582,
          "@search.captions": [
          {
             "text": "Swan Bird Lake Inn. Budget. continental breakfast\r\nfree wifi\r\n24-hour front desk service. We serve a continental-style breakfast each morning, featuring a variety of food and drinks. Our locally made, oh-so-soft, caramel cinnamon rolls are a favorite with our guests. Other breakfast items include coffee, orange juice, milk, cereal, instant oatmeal, bagels, and muffins..",
             "highlights": ""
          }
          ],
          "HotelName": "Swan Bird Lake Inn",
          "Description": "We serve a continental-style breakfast each morning, featuring a variety of food and drinks. Our locally made, oh-so-soft, caramel cinnamon rolls are a favorite with our guests. Other breakfast items include coffee, orange juice, milk, cereal, instant oatmeal, bagels, and muffins.",
          "Tags": [
          "continental breakfast",
          "free wifi",
          "24-hour front desk service"
          ]
       },
       {
          "@search.score": 0.92481667,
          "@search.rerankerScore": 2.221315860748291,
          "@search.captions": [
          {
             "text": "White Mountain Lodge & Suites. Resort and Spa. continental breakfast\r\npool\r\nrestaurant. Live amongst the trees in the heart of the forest. Hike along our extensive trail system. Visit the Natural Hot Springs, or enjoy our signature hot stone massage in the Cathedral of Firs. Relax in the meditation gardens, or join new friends around the communal firepit. Weekend evening entertainment on the patio features special guest musicians or poetry readings..",
             "highlights": ""
          }
          ],
          "HotelName": "White Mountain Lodge & Suites",
          "Description": "Live amongst the trees in the heart of the forest. Hike along our extensive trail system. Visit the Natural Hot Springs, or enjoy our signature hot stone massage in the Cathedral of Firs. Relax in the meditation gardens, or join new friends around the communal firepit. Weekend evening entertainment on the patio features special guest musicians or poetry readings.",
          "Tags": [
          "continental breakfast",
          "pool",
          "restaurant"
          ]
       },
       . . .
    ]}
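
As a rough illustration of the response shape described above, the following Python sketch walks a response with the same structure and pulls out the hotel name, both scores, and the first caption. The sample_response dictionary is a trimmed, hand-copied excerpt of the output above (captions shortened), and summarize_results is a hypothetical helper name, not part of any Azure SDK.

```python
import json

# Trimmed excerpt of the response shown above (captions shortened, Description omitted).
sample_response = json.loads("""
{
  "@odata.count": 18,
  "@search.answers": [],
  "value": [
    {
      "@search.score": 0.92481667,
      "@search.rerankerScore": 2.221315860748291,
      "@search.captions": [{"text": "White Mountain Lodge & Suites. Resort and Spa.", "highlights": ""}],
      "HotelName": "White Mountain Lodge & Suites",
      "Tags": ["continental breakfast", "pool", "restaurant"]
    },
    {
      "@search.score": 2.2896252,
      "@search.rerankerScore": 2.506816864013672,
      "@search.captions": [{"text": "Head Wind Resort. Suite.", "highlights": ""}],
      "HotelName": "Head Wind Resort",
      "Tags": ["coffee in lobby", "free wifi", "view"]
    }
  ]
}
""")


def summarize_results(response: dict) -> list[dict]:
    """Return one summary per result, ordered by semantic reranker score (highest first)."""
    rows = []
    for doc in response.get("value", []):
        captions = doc.get("@search.captions") or []
        rows.append({
            "hotel": doc["HotelName"],
            "score": doc["@search.score"],
            # The reranker score is present only when the semantic ranker ran.
            "reranker_score": doc.get("@search.rerankerScore"),
            "caption": captions[0]["text"] if captions else None,
        })
    # Results without a reranker score sort last.
    return sorted(rows, key=lambda r: r["reranker_score"] or 0.0, reverse=True)


for row in summarize_results(sample_response):
    print(f'{row["hotel"]}: score={row["score"]:.2f}, reranker={row["reranker_score"]:.2f}')
```

In this excerpt, White Mountain Lodge & Suites has a much lower keyword score than Head Wind Resort (about 0.92 versus 2.29) but a comparable reranker score, which shows how semantic ranking can change the ordering.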
    

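Get service endpoints

In the remaining sections, you set up API calls to Azure OpenAI and Azure AI Search. Get the service endpoints so that you can provide them as variables in your code.

  1. Sign in to the Azure portal.

  2. Find your search service.

  3. On the Overview home page, copy the URL. An example endpoint might look like https://example.search.windows.net

  4. Find your Azure OpenAI service.

  5. On the Overview home page, select the link to view the endpoints. Copy the URL. An example endpoint might look like https://example.openai.azure.com/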
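
Copy-paste mistakes in endpoints are a common cause of "Resource not found" errors later on, so it can help to sanity-check the two URLs before using them. The sketch below is a minimal example under stated assumptions: check_endpoint is a hypothetical helper, and the host suffixes are inferred from the example endpoints above, so they apply to the Azure public cloud only. Endpoints in other clouds or on custom domains use different suffixes.

```python
from urllib.parse import urlparse

# Host suffixes inferred from the example endpoints above (assumption: Azure public cloud).
EXPECTED_SUFFIXES = {
    "search": ".search.windows.net",
    "openai": ".openai.azure.com",
}


def check_endpoint(url: str, kind: str) -> str:
    """Validate an endpoint URL and return it without a path or trailing slash."""
    parsed = urlparse(url.strip())
    if parsed.scheme != "https":
        raise ValueError(f"Expected an https:// URL, got: {url!r}")
    host = parsed.hostname or ""
    if not host.endswith(EXPECTED_SUFFIXES[kind]):
        raise ValueError(f"{url!r} doesn't look like an Azure {kind} endpoint")
    return f"https://{host}"


print(check_endpoint("https://example.search.windows.net", "search"))
print(check_endpoint("https://example.openai.azure.com/", "openai"))
```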

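Create a virtual environment

In this step, switch back to your local system and Visual Studio Code. We recommend creating a virtual environment so that you can install the dependencies in isolation.

  1. In Visual Studio Code, open the folder containing Quickstart-RAG.ipynb.

  2. Press Ctrl-Shift-P to open the command palette, search for "Python: Create Environment", and then select Venv to create a virtual environment in the current workspace.

  3. Select Quickstart-RAG\requirements.txt for the dependencies.

It takes several minutes to create the environment. When the environment is ready, continue to the next step.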

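Sign in to Azure

You use Microsoft Entra ID and role assignments for the connection. Make sure you're signed in to the same tenant and subscription as Azure AI Search and Azure OpenAI. You can use the Azure CLI on the command line to show current properties, change properties, and sign in. For more information, see "Connect without keys".

Run each of the following commands in sequence.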

az account show

az account set --subscription <PUT YOUR SUBSCRIPTION ID HERE>

az login --tenant <PUT YOUR TENANT ID HERE>

You should now be signed in to Azure from your local device.

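Set up the query and chat thread

This section uses Visual Studio Code and Python to call the chat completion APIs on Azure OpenAI.

  1. Start Visual Studio Code and open the .ipynb file or create a new Python file.

  2. Install the following Python packages.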

    ! pip install azure-search-documents==11.6.0b5 --quiet
    ! pip install azure-identity==1.16.1 --quiet
    ! pip install openai --quiet
    ! pip install aiohttp --quiet
    ! pip install ipykernel --quiet
    
  3. Set the following variables, substituting the placeholders with the endpoints you collected in the previous step.

     AZURE_SEARCH_SERVICE: str = "PUT YOUR SEARCH SERVICE ENDPOINT HERE"
     AZURE_OPENAI_ACCOUNT: str = "PUT YOUR AZURE OPENAI ENDPOINT HERE"
     AZURE_DEPLOYMENT_MODEL: str = "gpt-4o"
    
  4. Set up the clients, the prompt, the query, and the response.

    For the Azure Government cloud, change the API endpoint on the token provider to "https://cognitiveservices.azure.us/.default"

    # Set up the query for generating responses
     from azure.identity import DefaultAzureCredential
     from azure.identity import get_bearer_token_provider
     from azure.search.documents import SearchClient
     from openai import AzureOpenAI
    
     credential = DefaultAzureCredential()
     token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")
     openai_client = AzureOpenAI(
         api_version="2024-06-01",
         azure_endpoint=AZURE_OPENAI_ACCOUNT,
         azure_ad_token_provider=token_provider
     )
    
     search_client = SearchClient(
         endpoint=AZURE_SEARCH_SERVICE,
         index_name="hotels-sample-index",
         credential=credential
     )
    
     # This prompt provides instructions to the model
     GROUNDED_PROMPT="""
     You are a friendly assistant that recommends hotels based on activities and amenities.
     Answer the query using only the sources provided below in a friendly and concise bulleted manner.
     Answer ONLY with the facts listed in the list of sources below.
     If there isn't enough information below, say you don't know.
     Do not generate answers that don't use the sources below.
     Query: {query}
     Sources:\n{sources}
     """
    
     # Query is the question being asked. It's sent to the search engine and the chat model
     query="Can you recommend a few hotels with complimentary breakfast?"
    
     # Search results are created by the search client.
     # They contain the top 5 matches to the query, limited to the fields selected from the index.
     search_results = search_client.search(
         search_text=query,
         top=5,
         select="Description,HotelName,Tags"
     )
     sources_formatted = "\n".join([f'{document["HotelName"]}:{document["Description"]}:{document["Tags"]}' for document in search_results])
    
     # Send the search results and the query to the LLM to generate a response based on the prompt.
     response = openai_client.chat.completions.create(
         messages=[
             {
                 "role": "user",
                 "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
             }
         ],
         model=AZURE_DEPLOYMENT_MODEL
     )
    
     # Here is the response from the chat model.
     print(response.choices[0].message.content)
    

    Output is from Azure OpenAI, and it consists of recommendations for several hotels. Here's an example of what the output might look like:

    Sure! Here are a few hotels that offer complimentary breakfast:
    
    - **Head Wind Resort**
    - Complimentary continental breakfast in the lobby
    - Free Wi-Fi throughout the hotel
    
    - **Double Sanctuary Resort**
    - Continental breakfast included
    
    - **White Mountain Lodge & Suites**
    - Continental breakfast available
    
    - **Swan Bird Lake Inn**
    - Continental-style breakfast each morning with a variety of food and drinks 
     such as caramel cinnamon rolls, coffee, orange juice, milk, cereal, 
     instant oatmeal, bagels, and muffins
    

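    If you get a "Forbidden" error message, check the Azure AI Search configuration to make sure role-based access is enabled.

    If you get an "Authorization failed" error message, wait a few minutes and try again. It can take several minutes for role assignments to become operational.

    If you get a "Resource not found" error message, check the resource URIs and make sure the API version on the chat model is valid.

    Otherwise, to experiment further, change the query and rerun the last step to better understand how the model works with the grounding data.

    You can also modify the prompt to change the tone or structure of the output.

    You might also try the query without semantic ranking by setting use_semantic_reranker=False in the query parameters step. Semantic ranking can noticeably improve the relevance of query results and the ability of the LLM to return useful information. Experimentation can help you decide whether it makes a difference for your content.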
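
The code in this step doesn't define a use_semantic_reranker variable, so one way to read the suggestion above is as a local flag that decides which keyword arguments are passed to search_client.search. The following sketch is an assumption along those lines, not the quickstart's own code: build_search_kwargs is a hypothetical helper, query_type and semantic_configuration_name are keyword arguments of SearchClient.search in azure-search-documents, and "semantic-config" is the configuration name added to the index earlier.

```python
def build_search_kwargs(query: str, use_semantic_reranker: bool = True) -> dict:
    """Build keyword arguments for SearchClient.search, with or without semantic ranking."""
    kwargs = {
        "search_text": query,
        "top": 5,
        "select": "Description,HotelName,Tags",
    }
    if use_semantic_reranker:
        # Ask the service to rerank results using the semantic configuration
        # that was added to hotels-sample-index earlier.
        kwargs["query_type"] = "semantic"
        kwargs["semantic_configuration_name"] = "semantic-config"
    return kwargs


query = "Can you recommend a few hotels with complimentary breakfast?"
print(build_search_kwargs(query, use_semantic_reranker=True))
print(build_search_kwargs(query, use_semantic_reranker=False))

# With the clients from this step, the call would look like:
# search_results = search_client.search(**build_search_kwargs(query, use_semantic_reranker=False))
```

Running the chat request once with each setting and comparing the two answers is a simple way to see whether semantic ranking changes which hotels reach the model.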

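Send a complex RAG query

Azure AI Search supports complex types for nested JSON structures. In hotels-sample-index, Address is an example of a complex type, consisting of Address.StreetAddress, Address.City, Address.StateProvince, Address.PostalCode, and Address.Country. The index also has a complex collection of Rooms for each hotel.

If your index has complex types, your query can provide those fields as long as you first convert the search results output to JSON and then pass the JSON to the chat model. The following example adds complex types to the request. The formatting instructions include a JSON specification.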

import json

# Query is the question being asked. It's sent to the search engine and the LLM.
query = (
    "Can you recommend a few hotels that offer complimentary breakfast? "
    "Tell me their description, address, tags, and the rate for one room that sleeps 4 people."
)

# Set up the search results and the chat thread.
# Retrieve the selected fields from the search index related to the question.
selected_fields = ["HotelName","Description","Address","Rooms","Tags"]
search_results = search_client.search(
    search_text=query,
    top=5,
    select=selected_fields,
    query_type="semantic"
)
sources_filtered = [{field: result[field] for field in selected_fields} for result in search_results]
sources_formatted = "\n".join([json.dumps(source) for source in sources_filtered])

response = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
        }
    ],
    model=AZURE_DEPLOYMENT_MODEL
)

print(response.choices[0].message.content)

Output is from Azure OpenAI, and it adds content from the complex types.

Here are a few hotels that offer complimentary breakfast and have rooms that sleep 4 people:

1. **Head Wind Resort**
   - **Description:** The best of old town hospitality combined with views of the river and 
   cool breezes off the prairie. Enjoy a complimentary continental breakfast in the lobby, 
   and free Wi-Fi throughout the hotel.
   - **Address:** 7633 E 63rd Pl, Tulsa, OK 74133, USA
   - **Tags:** Coffee in lobby, free Wi-Fi, view
   - **Room for 4:** Suite, 2 Queen Beds (Amenities) - $254.99

2. **Double Sanctuary Resort**
   - **Description:** 5-star Luxury Hotel - Biggest Rooms in the city. #1 Hotel in the area 
   listed by Traveler magazine. Free WiFi, Flexible check in/out, Fitness Center & espresso 
   in room. Offers continental breakfast.
   - **Address:** 2211 Elliott Ave, Seattle, WA 98121, USA
   - **Tags:** View, pool, restaurant, bar, continental breakfast
   - **Room for 4:** Suite, 2 Queen Beds (Amenities) - $254.99

3. **Swan Bird Lake Inn**
   - **Description:** Continental-style breakfast featuring a variety of food and drinks. 
   Locally made caramel cinnamon rolls are a favorite.
   - **Address:** 1 Memorial Dr, Cambridge, MA 02142, USA
   - **Tags:** Continental breakfast, free Wi-Fi, 24-hour front desk service
   - **Room for 4:** Budget Room, 2 Queen Beds (City View) - $85.99

4. **Gastronomic Landscape Hotel**
   - **Description:** Known for its culinary excellence under the management of William Dough, 
   offers continental breakfast.
   - **Address:** 3393 Peachtree Rd, Atlanta, GA 30326, USA
   - **Tags:** Restaurant, bar, continental breakfast
   - **Room for 4:** Budget Room, 2 Queen Beds (Amenities) - $66.99
...
   - **Tags:** Pool, continental breakfast, free parking
   - **Room for 4:** Budget Room, 2 Queen Beds (Amenities) - $60.99

Enjoy your stay! Let me know if you need any more information.
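
Sending every room of every hotel to the model can make the prompt long. As an optional variation that isn't part of the quickstart itself, the sketch below trims each document's Rooms collection to rooms that sleep at least four people before serializing it to JSON. The sample document is invented for illustration, the SleepsCount, BaseRate, and Type sub-field names are assumed from the hotels sample schema, and trim_rooms is a hypothetical helper.

```python
import json


def trim_rooms(document: dict, min_sleeps: int) -> dict:
    """Return a copy of the document that keeps only rooms sleeping at least min_sleeps people."""
    trimmed = dict(document)
    trimmed["Rooms"] = [
        room for room in document.get("Rooms", [])
        if room.get("SleepsCount", 0) >= min_sleeps
    ]
    return trimmed


# Invented sample document with the same nested shape as a search result.
sample_document = {
    "HotelName": "Example Hotel",
    "Address": {"City": "Example City", "Country": "USA"},
    "Rooms": [
        {"Type": "Budget Room", "BaseRate": 60.99, "SleepsCount": 2},
        {"Type": "Suite", "BaseRate": 254.99, "SleepsCount": 4},
    ],
}

source = trim_rooms(sample_document, min_sleeps=4)
# json.dumps keeps the nesting, so the model sees Address and Rooms as structured data.
print(json.dumps(source, indent=2))
```

In the query code above, such a helper could be applied to each entry of sources_filtered before the json.dumps call.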

Troubleshoot errors

To debug authentication errors, insert the following code before the step that calls the search engine and the LLM.

import sys
import logging

# Set the logging level for the azure.identity library
logger = logging.getLogger('azure.identity')
logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler(stream=sys.stdout)
formatter = logging.Formatter('[%(levelname)s %(name)s] %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)

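Rerun the query script. You should now get INFO and DEBUG statements in the output that provide more detail about the issue.

If you see output messages related to ManagedIdentityCredential and token acquisition failures, it could be that you have multiple tenants and your Azure sign-in is using a tenant that doesn't have your search service. To get your tenant ID, search the Azure portal for "tenant properties" or run az account tenant list

After you have your tenant ID, run az login --tenant <YOUR-TENANT-ID> at a command prompt, and then rerun the script.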
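
When the logging snippet above is rerun in a notebook, each run attaches another handler to the same logger, so every message is printed multiple times. The following variation is a minimal sketch that guards against this. The enable_debug_logging helper and the _quickstart_debug marker attribute are hypothetical names introduced for illustration; the logging calls themselves come from the Python standard library.

```python
import logging
import sys


def enable_debug_logging(logger_name: str = "azure.identity") -> logging.Logger:
    """Send DEBUG output for one logger to stdout, adding the handler only once."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    already_attached = any(
        getattr(handler, "_quickstart_debug", False) for handler in logger.handlers
    )
    if not already_attached:
        handler = logging.StreamHandler(stream=sys.stdout)
        handler.setFormatter(logging.Formatter("[%(levelname)s %(name)s] %(message)s"))
        handler._quickstart_debug = True  # marker so reruns can detect this handler
        logger.addHandler(handler)
    return logger


# Calling it twice, as happens when a notebook cell is rerun, still leaves one handler.
enable_debug_logging()
logger = enable_debug_logging()
logger.debug("debug logging is on")
```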

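Clean up

When you're working in your own subscription, it's a good idea at the end of a project to decide whether you still need the resources you created. Resources left running can cost you money. You can delete resources individually or delete the resource group to delete the entire set of resources.

You can find and manage resources in the Azure portal by using the All resources or Resource groups link in the leftmost pane.

See also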