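Generate vector embeddings with Azure OpenAI in Azure Database for PostgreSQL - Flexible Server

Article
01/13/2025

Applies to: Azure Database for PostgreSQL - Flexible Server

Invoke Azure OpenAI embeddings to get a vector representation of the input, which can then be used in vector similarity searches and consumed by machine learning models.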

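Prerequisites

Enable and configure the azure_ai extension.
Create an OpenAI account and request access to Azure OpenAI Service.
Grant access to Azure OpenAI in the desired subscription.
Grant permissions to create Azure OpenAI resources and to deploy models.
Create and deploy an Azure OpenAI service resource and a model, for example the embeddings model text-embedding-ada-002. Copy the deployment name, because it is required to create embeddings.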

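Configure OpenAI endpoint and key

In the Azure OpenAI resource, under Resource Management > Keys and Endpoint, you can find the endpoint and the keys for your Azure OpenAI resource. To invoke the model deployment, configure the azure_ai extension with the endpoint and one of the keys.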

select azure_ai.set_setting('azure_openai.endpoint', 'https://<endpoint>.openai.azure.com'); 
select azure_ai.set_setting('azure_openai.subscription_key', '<API Key>');
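
To confirm that a value was stored, the setting can be read back. This is a minimal sketch that assumes the extension's `azure_ai.get_setting` function, which this article does not describe; only the endpoint is read here, so the key is not echoed to the session.

```sql
-- Assumption: azure_ai.get_setting(key) returns the value stored by azure_ai.set_setting.
select azure_ai.get_setting('azure_openai.endpoint');
```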

`azure_openai.create_embeddings`

Invokes the Azure OpenAI API to create embeddings for the given input, using the provided deployment.

azure_openai.create_embeddings(deployment_name text, input text, timeout_ms integer DEFAULT 3600000, throw_on_error boolean DEFAULT true, max_attempts integer DEFAULT 1, retry_delay_ms integer DEFAULT 1000)
azure_openai.create_embeddings(deployment_name text, input text[], batch_size integer DEFAULT 100, timeout_ms integer DEFAULT 3600000, throw_on_error boolean DEFAULT true, max_attempts integer DEFAULT 1, retry_delay_ms integer DEFAULT 1000)

Arguments

`deployment_name`

text. Name of the deployment in Azure OpenAI Studio that contains the model.

`input`

text or text[]. A single text or an array of texts for which embeddings are created, depending on the overload of the function used.

`dimensions`

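integer DEFAULT NULL. The number of dimensions the resulting output embeddings should have. Supported only in text-embedding-3 and later models. Available in versions 1.1.0 and later of the azure_ai extension.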
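
As a sketch of how `dimensions` might be passed, the call below uses PostgreSQL named notation so that the argument's position in the signature doesn't matter. The deployment name `text-embedding-3-small` is hypothetical: it assumes a text-embedding-3 model has been deployed under that name and that azure_ai 1.1.0 or later is installed.

```sql
-- Hypothetical deployment name; requires a text-embedding-3 (or later) model deployment.
SELECT array_length(
         azure_openai.create_embeddings(
           'text-embedding-3-small',
           'Learn about building intelligent applications',
           dimensions => 256),
         1) AS embedding_dimensions;
```

If the request is accepted, the reported length should match the requested value, and a column that stores such embeddings would need a matching declaration such as vector(256).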

`batch_size`

integer DEFAULT 100. Number of records to process at a time (available only for the overload of the function whose input parameter is of type text[]).
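
The array overload can be sketched as follows. It passes two inputs in a single call and sets `batch_size` explicitly with named notation; the input strings and the value 50 are illustrative. The alias `e(embedding)` names the single result column explicitly.

```sql
-- One row is expected per input text, each holding a real[] embedding.
SELECT array_length(e.embedding, 1) AS embedding_dimensions
FROM azure_openai.create_embeddings(
       'text-embedding-ada-002',
       ARRAY['Gen AI with PostgreSQL', 'Storage engine internals'],
       batch_size => 50) AS e(embedding);
```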

`timeout_ms`

integer DEFAULT 3600000. Timeout in milliseconds after which the operation is stopped.

`throw_on_error`

boolean DEFAULT true. On error, whether the function throws an exception, which rolls back the wrapping transaction.

`max_attempts`

integer DEFAULT 1. Number of times the extension retries Azure OpenAI embedding creation if it fails with a retryable error.

`retry_delay_ms`

integer DEFAULT 1000. Amount of time, in milliseconds, that the extension waits before calling the Azure OpenAI endpoint again when embedding creation fails with a retryable error.
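
The timeout and retry arguments can be combined in one call. The sketch below uses named notation; the values are illustrative and are not recommendations from this article.

```sql
SELECT azure_openai.create_embeddings(
         'text-embedding-ada-002',
         'Session to learn about building chatbots',
         timeout_ms     => 30000,  -- stop the operation after 30 seconds
         max_attempts   => 3,      -- applies to retryable errors only
         retry_delay_ms => 2000    -- wait 2 seconds before calling the endpoint again
       ) AS embedding;
```

Leaving `throw_on_error` at its default of true means a call that still fails raises an exception and rolls back the wrapping transaction.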

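Return type

real[] or TABLE(embedding real[]). A single element or a single-column table, depending on the overload of the function used, containing the vector representations of the input text as processed by the selected deployment.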
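
Because the scalar overload returns real[], the result can be cast to pgvector's vector type before it is stored or compared. The sketch below assumes the pgvector extension is installed. text-embedding-ada-002 produces 1,536-dimensional embeddings, so the cast to vector(1536) is expected to succeed, while a cast to a different dimension would raise an error.

```sql
-- Assumes pgvector is installed; it provides the vector type and the real[] to vector cast.
SELECT azure_openai.create_embeddings(
         'text-embedding-ada-002',
         'sample text')::vector(1536) AS embedding;
```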

Use OpenAI to create embeddings and store them in a vector data type

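-- Enable pgvector, which provides the vector type (assumes the extension is allowlisted on the server)
CREATE EXTENSION IF NOT EXISTS vector;

-- Create tables and populate data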
DROP TABLE IF EXISTS conference_session_embeddings;
DROP TABLE IF EXISTS conference_sessions;

CREATE TABLE conference_sessions(
  session_id int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  title text,
  session_abstract text,
  duration_minutes integer,
  publish_date timestamp
);

-- Create a table to store embeddings with a vector column.
CREATE TABLE conference_session_embeddings(
  session_id integer NOT NULL REFERENCES conference_sessions(session_id),
  session_embedding vector(1536)
);

-- Insert a row into the sessions table
INSERT INTO conference_sessions
    (title,session_abstract,duration_minutes,publish_date) 
VALUES
    ('Gen AI with Azure Database for PostgreSQL flexible server'
    ,'Learn about building intelligent applications with azure_ai extension and pg_vector' 
    , 60, current_timestamp)
    ,('Deep Dive: PostgreSQL database storage engine internals'
    ,'We will dig deep into storage internals'
    , 30, current_timestamp)
    ;

-- Get an embedding for the Session Abstract
SELECT
     pg_typeof(azure_openai.create_embeddings('text-embedding-ada-002', c.session_abstract)) as embedding_data_type
    ,azure_openai.create_embeddings('text-embedding-ada-002', c.session_abstract)
  FROM
    conference_sessions c LIMIT 10;

-- Insert embeddings 
INSERT INTO conference_session_embeddings
    (session_id, session_embedding)
SELECT
    c.session_id, (azure_openai.create_embeddings('text-embedding-ada-002', c.session_abstract))
FROM
    conference_sessions as c  
LEFT OUTER JOIN
    conference_session_embeddings e ON e.session_id = c.session_id
WHERE
    e.session_id IS NULL;

-- Create a HNSW index
CREATE INDEX ON conference_session_embeddings USING hnsw (session_embedding vector_ip_ops);


-- Retrieve top similarity match
SELECT
    c.*
FROM
    conference_session_embeddings e
INNER JOIN
    conference_sessions c ON c.session_id = e.session_id
ORDER BY
    e.session_embedding <#> azure_openai.create_embeddings('text-embedding-ada-002', 'Session to learn about building chatbots')::vector
LIMIT 1;
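
The `<#>` operator in pgvector returns the negative inner product, so ascending order puts the closest match first. The variant below is a sketch that computes the query embedding once in a CTE and also returns the inner product as a score. The CTE name `q` and the column aliases are illustrative.

```sql
WITH q AS (
  -- Embed the search text once and reuse it for scoring and ordering.
  SELECT azure_openai.create_embeddings(
           'text-embedding-ada-002',
           'Session to learn about building chatbots')::vector AS query_embedding
)
SELECT
    c.title,
    (e.session_embedding <#> q.query_embedding) * -1 AS inner_product
FROM
    conference_session_embeddings e
INNER JOIN
    conference_sessions c ON c.session_id = e.session_id
CROSS JOIN
    q
ORDER BY
    e.session_embedding <#> q.query_embedding
LIMIT 3;
```

Whether the planner still uses the HNSW index for this form depends on the chosen plan, so it is worth checking with EXPLAIN on larger tables.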
