你当前正在访问 Microsoft Azure Global Edition 技术文档网站。如果需要访问由世纪互联运营的 Microsoft Azure 中国技术文档网站，请访问 https://docs.azure.cn。

使用 Semantic Kernel 和 Azure AI Foundry 开发应用程序

项目
12/18/2024

在本文中，你将了解如何将 Semantic Kernel 与从 Azure AI Foundry 门户中的 Azure AI 模型目录部署的模型配合使用。

先决条件

一个 Azure 订阅。
Azure AI 项目，如在 Azure AI Foundry 门户中创建项目中所述。
已部署一个支持 Azure AI 模型推理 API 的模型。在此示例中，我们使用 Mistral-Large 部署，但你可以使用自己偏好的任何模型。若要使用 LlamaIndex 中的嵌入功能，你需要一个像 cohere-embed-v3-multilingual 这样的嵌入模型。
- 可以按照将模型部署为无服务器 API 中的说明进行操作。
已安装 Python 3.10 或更高版本，包括 pip。
已安装 Semantic Kernel。可以使用以下方法执行此操作：
```
pip install semantic-kernel
```
在此示例中，我们使用 Azure AI 模型推理 API，因此我们安装相关的 Azure 依赖项。可以使用以下方法执行此操作：
```
pip install semantic-kernel[azure]
```

配置环境

若要使用部署在 Azure AI Foundry 门户中的 LLM，你需要终结点和凭据来连接到它。请按照以下步骤从想要使用的模型中获取所需的信息：

转到 Azure AI Foundry 门户。
打开部署模型的项目（如果尚未打开）。
转到“模型 + 终结点”，并根据先决条件选择已部署的模型。
复制终结点 URL 和密钥。

提示

如果模型部署了 Microsoft Entra ID 支持，则不需要密钥。

在这种情况下，我们将终结点 URL 和密钥都放在以下环境变量中：

export AZURE_AI_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"
export AZURE_AI_INFERENCE_API_KEY="<your-key-goes-here>"

配置完成后，创建一个客户端来连接到终结点：

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(ai_model_id="<deployment-name>")

提示

客户端会自动读取环境变量 AZURE_AI_INFERENCE_ENDPOINT 和 AZURE_AI_INFERENCE_API_KEY 来连接到模型。但是，还可通过构造函数上的 endpoint 和 api_key 参数直接将终结点和密钥传递给客户端。

也可在终结点支持 Microsoft Entra ID 的情况下，使用以下代码来创建客户端：

export AZURE_AI_INFERENCE_ENDPOINT="<your-model-endpoint-goes-here>"

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(ai_model_id="<deployment-name>")

注意

使用 Microsoft Entra ID 时，请确保已通过该身份验证方法部署终结点，并且你拥有调用它所需的权限。

Azure OpenAI 模型

如果使用 Azure OpenAI 模型，可使用以下代码创建客户端：

from azure.ai.inference.aio import ChatCompletionsClient
from azure.identity.aio import DefaultAzureCredential

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatCompletion

chat_completion_service = AzureAIInferenceChatCompletion(
    ai_model_id="<deployment-name>",
    client=ChatCompletionsClient(
        endpoint=f"{str(<your-azure-open-ai-endpoint>).strip('/')}/openai/deployments/{<deployment_name>}",
        credential=DefaultAzureCredential(),
        credential_scopes=["https://cognitiveservices.azure.com/.default"],
    ),
)

推理参数

可以使用 AzureAIInferenceChatPromptExecutionSettings 类配置推理的执行方式：

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceChatPromptExecutionSettings

execution_settings = AzureAIInferenceChatPromptExecutionSettings(
    max_tokens=100,
    temperature=0.5,
    top_p=0.9,
    # extra_parameters={...},    # model-specific parameters
)

调用服务

让我们先使用简单的历史聊天记录调用聊天完成服务：

提示

Semantic Kernel 是一个异步库，因此你需要使用 asyncio 库来运行代码。

import asyncio

async def main():
    ...

if __name__ == "__main__":
    asyncio.run(main())

from semantic_kernel.contents.chat_history import ChatHistory

chat_history = ChatHistory()
chat_history.add_user_message("Hello, how are you?")

response = await chat_completion_service.get_chat_message_content(
    chat_history=chat_history,
    settings=execution_settings,
)
print(response)

或者，可以从服务流式传输响应：

chat_history = ChatHistory()
chat_history.add_user_message("Hello, how are you?")

response = chat_completion_service.get_streaming_chat_message_content(
    chat_history=chat_history,
    settings=execution_settings,
)

chunks = []
async for chunk in response:
    chunks.append(chunk)
    print(chunk, end="")

full_response = sum(chunks[1:], chunks[0])

创建长时间运行的对话

可以使用循环创建长时间运行的对话：

while True:
    response = await chat_completion_service.get_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )
    print(response)
    chat_history.add_message(response)
    chat_history.add_user_message(user_input = input("User:> "))

如果要流式传输响应，可使用以下代码：

while True:
    response = chat_completion_service.get_streaming_chat_message_content(
        chat_history=chat_history,
        settings=execution_settings,
    )

    chunks = []
    async for chunk in response:
        chunks.append(chunk)
        print(chunk, end="")

    full_response = sum(chunks[1:], chunks[0])
    chat_history.add_message(full_response)
    chat_history.add_user_message(user_input = input("User:> "))

使用嵌入模型

配置环境与上述步骤类似，但使用 AzureAIInferenceEmbeddings 类：

from semantic_kernel.connectors.ai.azure_ai_inference import AzureAIInferenceTextEmbedding

embedding_generation_service = AzureAIInferenceTextEmbedding(ai_model_id="<deployment-name>")

以下代码演示如何从服务中获取嵌入项：

embeddings = await embedding_generation_service.generate_embeddings(
    texts=["My favorite color is blue.", "I love to eat pizza."],
)

for embedding in embeddings:
    print(embedding)

通过

使用 Semantic Kernel 和 Azure AI Foundry 开发应用程序

先决条件

配置环境

Azure OpenAI 模型

推理参数

调用服务

创建长时间运行的对话

使用嵌入模型

反馈

其他资源

通过

使用 Semantic Kernel 和 Azure AI Foundry 开发应用程序

先决条件

配置环境

Azure OpenAI 模型

推理参数

调用服务

创建长时间运行的对话

使用嵌入模型

相关内容

反馈

其他资源