
Learn how to generate embeddings with Azure OpenAI

Article
11/13/2024

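An embedding is a special format of data representation that machine learning models and algorithms can easily use. It is an information-dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating-point numbers, such that the distance between two embeddings in the vector space correlates with the semantic similarity between the two inputs in their original format. For example, if two texts are similar, their vector representations should also be similar. Embeddings power vector similarity search in Azure databases such as Azure Cosmos DB for MongoDB vCore, Azure SQL Database, and Azure Database for PostgreSQL - Flexible Server.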
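To make the idea of distance between embeddings concrete, the following is a minimal sketch of cosine similarity, one common way to compare two vectors. The three-dimensional toy vectors and the helper names `cosine_similarity` and `rank_by_similarity` are illustrative assumptions, not part of the Azure OpenAI API; real embeddings have hundreds or thousands of dimensions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (label, score) pairs sorted from most to least similar to query."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (made-up values).
toy = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.85, 0.15, 0.05],
    "invoice": [0.0, 0.2, 0.95],
}
others = {label: vec for label, vec in toy.items() if label != "cat"}
ranking = rank_by_similarity(toy["cat"], others)
print(ranking)
```

With these toy values, "kitten" ranks above "invoice" for the query "cat". A vector database performs the same kind of comparison at scale, usually with an approximate index rather than this brute-force loop.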

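How to get embeddings

To obtain an embedding vector for a piece of text, make a request to the embeddings endpoint, as shown in the following code snippets: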

curl "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings?api-version=2024-02-01" \
  -H 'Content-Type: application/json' \
  -H 'api-key: YOUR_API_KEY' \
  -d '{"input": "Sample Document goes here"}'

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-06-01",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

response = client.embeddings.create(
    input="Your text string goes here",
    # Pass your deployment name; this sample assumes it matches the model name.
    model="text-embedding-3-large",
)

print(response.model_dump_json(indent=2))

Note

OpenAI Python library version 0.28.1 is deprecated. We recommend using 1.x. For information on moving from 0.28.1 to 1.x, see our migration guide.

import openai

openai.api_type = "azure"
openai.api_key = "YOUR_API_KEY"
openai.api_base = "https://YOUR_RESOURCE_NAME.openai.azure.com"
openai.api_version = "2024-06-01"

response = openai.Embedding.create(
    input="Your text string goes here",
    engine="YOUR_DEPLOYMENT_NAME"
)
embeddings = response['data'][0]['embedding']
print(embeddings)

using Azure;
using Azure.AI.OpenAI;

Uri oaiEndpoint = new ("https://YOUR_RESOURCE_NAME.openai.azure.com");
string oaiKey = "YOUR_API_KEY";

AzureKeyCredential credentials = new (oaiKey);

OpenAIClient openAIClient = new (oaiEndpoint, credentials);

EmbeddingsOptions embeddingOptions = new()
{
    DeploymentName = "text-embedding-3-large",
    Input = { "Your text string goes here" },
};

var returnValue = openAIClient.GetEmbeddings(embeddingOptions);

foreach (float item in returnValue.Value.Data[0].Embedding.ToArray())
{
    Console.WriteLine(item);
}

# Azure OpenAI metadata variables
$openai = @{
    api_key     = $Env:AZURE_OPENAI_API_KEY
    api_base    = $Env:AZURE_OPENAI_ENDPOINT # your endpoint should look like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
    api_version = '2024-02-01' # this may change in the future
    name        = 'YOUR-DEPLOYMENT-NAME-HERE' #This will correspond to the custom name you chose for your deployment when you deployed a model.
}

$headers = [ordered]@{
    'api-key' = $openai.api_key
}

$text = 'Your text string goes here'

$body = [ordered]@{
    input = $text
} | ConvertTo-Json

$url = "$($openai.api_base)/openai/deployments/$($openai.name)/embeddings?api-version=$($openai.api_version)"

$response = Invoke-RestMethod -Uri $url -Headers $headers -Body $body -Method Post -ContentType 'application/json'
return $response.data.embedding
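Whichever client you use, the vectors arrive in the `data` array of the response, as the samples above show. The following is a minimal sketch of pulling them out of a parsed JSON body. The sample payload is hand-written with made-up values, and the `index` and `usage` fields reflect the response shape as assumed here; check the API reference for the version you call. The helper name `extract_embeddings` is hypothetical.

```python
import json
from typing import Any, Dict, List


def extract_embeddings(payload: Dict[str, Any]) -> List[List[float]]:
    """Return the embedding vectors from a parsed response, ordered by 'index'."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Hand-written sample shaped like an embeddings response; values are made up.
sample_json = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.4, 0.5, 0.6]},
    {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]}
  ],
  "model": "text-embedding-3-large",
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
"""

payload = json.loads(sample_json)
vectors = extract_embeddings(payload)
print(len(vectors), "vectors of dimension", len(vectors[0]))
print("tokens used:", payload["usage"]["total_tokens"])
```

Sorting by `index` keeps each vector aligned with the position of its input when a request carries an array of inputs.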

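Best practices

Verify that inputs don't exceed the maximum length

The maximum length of input text for the latest embedding models is 8,192 tokens. Verify that your inputs don't exceed this limit before making a request.
When sending an array of inputs in a single embedding request, the maximum array size is 2,048.
When sending an array of inputs in a single request, remember that the tokens per minute consumed by your requests must stay below the quota limit assigned to the model deployment. By default, the latest generation 3 embedding models are subject to a limit of 350K TPM per region.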
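The limits above can be enforced on the client before any request is sent. The following is a minimal sketch, not a definitive implementation. It assumes a pluggable token counter: the default `rough_token_count` only counts whitespace-separated words and is a placeholder, so substitute a tokenizer that matches your model for real checks. The per-request budget `max_tokens_per_request` is a hypothetical parameter to derive from your own quota; the sketch does not implement per-minute rate limiting.

```python
from typing import Callable, Iterator, List

MAX_TOKENS_PER_INPUT = 8192    # per-input limit stated above
MAX_INPUTS_PER_REQUEST = 2048  # array-size limit stated above


def rough_token_count(text: str) -> int:
    """Crude placeholder: counts whitespace-separated words, NOT real tokens."""
    return len(text.split())


def batch_inputs(
    texts: List[str],
    count_tokens: Callable[[str], int] = rough_token_count,
    max_tokens_per_request: int = 100_000,  # hypothetical budget; set from your quota
) -> Iterator[List[str]]:
    """Yield batches that respect the per-input, array-size, and token-budget limits."""
    batch: List[str] = []
    batch_tokens = 0
    for text in texts:
        n = count_tokens(text)
        if n > MAX_TOKENS_PER_INPUT:
            raise ValueError(f"input has {n} tokens; limit is {MAX_TOKENS_PER_INPUT}")
        full = len(batch) >= MAX_INPUTS_PER_REQUEST
        over_budget = batch_tokens + n > max_tokens_per_request
        # Close the current batch before adding an input that would break a limit.
        if batch and (full or over_budget):
            yield batch
            batch, batch_tokens = [], 0
        batch.append(text)
        batch_tokens += n
    if batch:
        yield batch


docs = ["short text"] * 5000
batches = list(batch_inputs(docs))
print([len(b) for b in batches])
```

Each yielded batch can then be passed as the `input` of one embeddings request. Rejecting an oversized input with an error is one design choice; truncating or splitting the text into chunks are alternatives, depending on the application.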

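Limitations and risks

Our embedding models may be unreliable or pose social risks in certain cases, and may cause harm in the absence of mitigations. Review our Responsible AI content for more information on how to use these models responsibly.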

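Next steps

Learn more about using Azure OpenAI and embeddings to perform document search with our embeddings tutorial.
Learn more about the underlying models that power Azure OpenAI.
Store your embeddings and perform vector (similarity) search using your choice of service: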

Through