Tutorial: Create external model endpoints to query OpenAI models

10/30/2024

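This article provides step-by-step instructions for using the MLflow Deployments SDK to configure and query an external model endpoint that serves OpenAI models for completions, chat, and embeddings. Learn more about external models.

If you prefer to use the Serving UI to accomplish this task, see Create an external model serving endpoint.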

Requirements

Databricks Runtime 13.0 ML or above.
MLflow 2.9 or above.
OpenAI API keys.
Install the Databricks CLI version 0.205 or above.

(Optional) Step 0: Store the OpenAI API key using the Databricks Secrets CLI

You can provide your API key either as a plaintext string when you create the endpoint in Step 2 or by using Azure Databricks Secrets.

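To store the OpenAI API key as a secret, you can use the Databricks Secrets CLI (version 0.205 and above). You can also use the REST API for secrets.

The following commands create a secret scope named my_openai_secret_scope, and then create the secret openai_api_key in that scope.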

databricks secrets create-scope my_openai_secret_scope
databricks secrets put-secret my_openai_secret_scope openai_api_key
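
The endpoint configurations in Step 2 point at this secret with a `{{secrets/<scope>/<key>}}` reference instead of the key itself. As a minimal sketch of how that reference string is assembled, the snippet below uses `secret_reference`, a hypothetical helper introduced here for illustration; it is not part of MLflow or the Databricks CLI.

```python
def secret_reference(scope: str, key: str) -> str:
    """Build the '{{secrets/<scope>/<key>}}' string used in endpoint configs.

    Hypothetical helper for illustration only.
    """
    if not scope or not key:
        raise ValueError("scope and key must be non-empty")
    return "{{secrets/" + scope + "/" + key + "}}"


# Scope and key names match the CLI commands above.
reference = secret_reference("my_openai_secret_scope", "openai_api_key")
print(reference)
```

Building the string in one place makes it harder to mistype the scope or key name when several endpoints share the same secret.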

Step 1: Install MLflow with external models support

Use the following command to install an MLflow version with external models support:

%pip install "mlflow[genai]>=2.9.0"
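
After installation, it can be useful to confirm that the environment meets the MLflow 2.9 minimum listed under Requirements. The following is a rough sketch, not an official check: `meets_minimum` is a hypothetical helper, and the lookup assumes the package is installed under the distribution name `mlflow`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(version_string: str, minimum: tuple = (2, 9)) -> bool:
    """Compare the leading 'major.minor' of a version string to a minimum.

    Hypothetical helper; it ignores patch and pre-release suffixes.
    """
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


try:
    # Assumption: the distribution is named "mlflow".
    installed = version("mlflow")
    status = "ok" if meets_minimum(installed) else "older than 2.9"
    print(f"mlflow {installed}: {status}")
except PackageNotFoundError:
    print("mlflow is not installed in this environment")
```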

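Step 2: Create and manage an external model endpoint

Important

The code examples in this section demonstrate usage of the Public Preview MLflow Deployments CRUD SDK.

To create an external model endpoint for a large language model (LLM), use the create_endpoint() method from the MLflow Deployments SDK. You can also create external model endpoints in the Serving UI.

The following code snippet creates a completions endpoint for OpenAI gpt-3.5-turbo-instruct, as specified in the served_entities section of the configuration. For your endpoint, make sure to populate name and openai_api_key with your own unique values.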

import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")
client.create_endpoint(
    name="openai-completions-endpoint",
    config={
        "served_entities": [{
            "name": "openai-completions",
            "external_model": {
                "name": "gpt-3.5-turbo-instruct",
                "provider": "openai",
                "task": "llm/v1/completions",
                "openai_config": {
                    "openai_api_key": "{{secrets/my_openai_secret_scope/openai_api_key}}"
                }
            }
        }]
    }
)

The following code snippet shows an alternative way to create the same completions endpoint as above, providing the OpenAI API key as a plaintext string.

import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")
client.create_endpoint(
    name="openai-completions-endpoint",
    config={
        "served_entities": [{
            "name": "openai-completions",
            "external_model": {
                "name": "gpt-3.5-turbo-instruct",
                "provider": "openai",
                "task": "llm/v1/completions",
                "openai_config": {
                    "openai_api_key_plaintext": "sk-yourApiKey"
                }
            }
        }]
    }
)
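
The two snippets above differ only in how the key is supplied: `openai_api_key` takes a secret reference, while `openai_api_key_plaintext` takes the key itself. A minimal sketch of a builder that produces either variant follows; `completions_config` and its parameter names are hypothetical and only mirror the dictionaries shown above.

```python
def completions_config(model, secret_ref=None, api_key=None,
                       entity_name="openai-completions"):
    """Build the `config` dict for an OpenAI completions endpoint.

    Hypothetical helper: pass exactly one of `secret_ref` or `api_key`.
    """
    if (secret_ref is None) == (api_key is None):
        raise ValueError("provide exactly one of secret_ref or api_key")
    if secret_ref is not None:
        # Key stored as a secret, referenced by '{{secrets/<scope>/<key>}}'.
        openai_config = {"openai_api_key": secret_ref}
    else:
        # Key passed directly as a plaintext string.
        openai_config = {"openai_api_key_plaintext": api_key}
    return {
        "served_entities": [{
            "name": entity_name,
            "external_model": {
                "name": model,
                "provider": "openai",
                "task": "llm/v1/completions",
                "openai_config": openai_config,
            },
        }]
    }


config = completions_config(
    "gpt-3.5-turbo-instruct",
    secret_ref="{{secrets/my_openai_secret_scope/openai_api_key}}",
)
print(config["served_entities"][0]["external_model"]["openai_config"])
# The result can then be passed on, for example:
# client.create_endpoint(name="openai-completions-endpoint", config=config)
```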

If you are using Azure OpenAI, you can also specify the Azure OpenAI deployment name, endpoint URL, and API version in the openai_config section of the configuration.

client.create_endpoint(
    name="openai-completions-endpoint",
    config={
        "served_entities": [{
            "name": "openai-completions",
            "external_model": {
                "name": "gpt-3.5-turbo-instruct",
                "provider": "openai",
                "task": "llm/v1/completions",
                "openai_config": {
                    "openai_api_type": "azure",
                    "openai_api_key": "{{secrets/my_openai_secret_scope/openai_api_key}}",
                    "openai_api_base": "https://my-azure-openai-endpoint.openai.azure.com",
                    "openai_deployment_name": "my-gpt-35-turbo-deployment",
                    "openai_api_version": "2023-05-15"
                }
            }
        }]
    }
)

To update an endpoint, use update_endpoint(). The following code snippet demonstrates how to update an endpoint's rate limits to 20 calls per minute per user.

client.update_endpoint(
    endpoint="openai-completions-endpoint",
    config={
        "rate_limits": [
            {
                "key": "user",
                "renewal_period": "minute",
                "calls": 20
            }
        ],
    },
)
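
If the limit changes often, the call above can be wrapped in a small function. This is a sketch under stated assumptions: `set_user_rate_limit` is a hypothetical wrapper, and `RecordingClient` is a stand-in that only records the call so the example runs without a workspace. In a notebook you would pass the real `client` from `get_deploy_client("databricks")` instead; the stand-in's return value is invented for the demo and says nothing about what the real method returns.

```python
def set_user_rate_limit(client, endpoint, calls_per_minute):
    """Limit each user to `calls_per_minute` calls on `endpoint`.

    Hypothetical wrapper around the update_endpoint() call shown above.
    """
    if calls_per_minute <= 0:
        raise ValueError("calls_per_minute must be positive")
    config = {
        "rate_limits": [
            {"key": "user", "renewal_period": "minute", "calls": calls_per_minute}
        ]
    }
    return client.update_endpoint(endpoint=endpoint, config=config)


class RecordingClient:
    """Stand-in for the deployments client; it only records the call."""

    def __init__(self):
        self.calls = []

    def update_endpoint(self, endpoint, config):
        self.calls.append((endpoint, config))
        # Invented return shape, used only by this demo.
        return {"name": endpoint, "config": config}


demo_client = RecordingClient()
result = set_user_rate_limit(demo_client, "openai-completions-endpoint", 20)
print(result["config"]["rate_limits"])
```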

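Step 3: Send requests to an external model endpoint

Important

The code examples in this section demonstrate usage of the experimental MLflow Deployments SDK predict() method.

You can send chat, completions, and embeddings requests to an external model endpoint using the MLflow Deployments SDK's predict() method.

The following example sends a request to gpt-3.5-turbo-instruct hosted by OpenAI.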

completions_response = client.predict(
    endpoint="openai-completions-endpoint",
    inputs={
        "prompt": "What is the capital of France?",
        "temperature": 0.1,
        "max_tokens": 10,
        "n": 2
    }
)
completions_response == {
    "id": "cmpl-8QW0hdtUesKmhB3a1Vel6X25j2MDJ",
    "object": "text_completion",
    "created": 1701330267,
    "model": "gpt-3.5-turbo-instruct",
    "choices": [
        {
            "text": "The capital of France is Paris.",
            "index": 0,
            "finish_reason": "stop",
            "logprobs": None
        },
        {
            "text": "Paris is the capital of France",
            "index": 1,
            "finish_reason": "stop",
            "logprobs": None
        },
    ],
    "usage": {
        "prompt_tokens": 7,
        "completion_tokens": 16,
        "total_tokens": 23
    }
}
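
Because the request sets `n` to 2, the response carries two entries under `choices`. A minimal sketch for pulling out just the generated text is shown below; `completion_texts` is a hypothetical helper, and `sample_response` is a trimmed copy of the response above so the snippet runs without calling an endpoint.

```python
def completion_texts(response):
    """Return the generated texts from a completions response, ordered by index."""
    choices = sorted(response.get("choices", []), key=lambda c: c["index"])
    return [choice["text"] for choice in choices]


# Trimmed copy of the response shown above.
sample_response = {
    "object": "text_completion",
    "model": "gpt-3.5-turbo-instruct",
    "choices": [
        {"text": "The capital of France is Paris.", "index": 0,
         "finish_reason": "stop", "logprobs": None},
        {"text": "Paris is the capital of France", "index": 1,
         "finish_reason": "stop", "logprobs": None},
    ],
    "usage": {"prompt_tokens": 7, "completion_tokens": 16, "total_tokens": 23},
}

for text in completion_texts(sample_response):
    print(text)
print("total tokens:", sample_response["usage"]["total_tokens"])
```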

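Step 4: Compare models from a different provider

Model serving supports many external model providers, including OpenAI, Anthropic, Cohere, Amazon Bedrock, Google Cloud Vertex AI, and more. You can compare LLMs across providers in the AI Playground, which helps you optimize the accuracy, speed, and cost of your applications.

The following example creates an endpoint for Anthropic claude-2 and compares its response to a question with the response from OpenAI gpt-3.5-turbo-instruct. Both responses have the same standard format, which makes them easy to compare.

Create an endpoint for Anthropic claude-2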

import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")

client.create_endpoint(
    name="anthropic-completions-endpoint",
    config={
        "served_entities": [
            {
                "name": "claude-completions",
                "external_model": {
                    "name": "claude-2",
                    "provider": "anthropic",
                    "task": "llm/v1/completions",
                    "anthropic_config": {
                        "anthropic_api_key": "{{secrets/my_anthropic_secret_scope/anthropic_api_key}}"
                    },
                },
            }
        ],
    },
)

Compare the responses from each endpoint


openai_response = client.predict(
    endpoint="openai-completions-endpoint",
    inputs={
        "prompt": "How is Pi calculated? Be very concise."
    }
)
anthropic_response = client.predict(
    endpoint="anthropic-completions-endpoint",
    inputs={
        "prompt": "How is Pi calculated? Be very concise."
    }
)
openai_response["choices"] == [
    {
        "text": "Pi is calculated by dividing the circumference of a circle by its diameter."
                " This constant ratio of 3.14159... is then used to represent the relationship"
                " between a circle's circumference and its diameter, regardless of the size of the"
                " circle.",
        "index": 0,
        "finish_reason": "stop",
        "logprobs": None
    }
]
anthropic_response["choices"] == [
    {
        "text": "Pi is calculated by approximating the ratio of a circle's circumference to"
                " its diameter. Common approximation methods include infinite series, infinite"
                " products, and computing the perimeters of polygons with more and more sides"
                " inscribed in or around a circle.",
        "index": 0,
        "finish_reason": "stop",
        "logprobs": None
    }
]
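
Since both endpoints return the same `choices` structure, one function can collect the answers side by side. The sketch below is illustrative only: `first_choice_by_endpoint` is a hypothetical helper, and the two input dictionaries are placeholder stand-ins rather than real model output. In practice you would pass `openai_response` and `anthropic_response` from above.

```python
def first_choice_by_endpoint(responses):
    """Map each endpoint name to the text of its first choice."""
    return {
        name: response["choices"][0]["text"]
        for name, response in responses.items()
    }


# Placeholder stand-ins with the same shape as the responses above.
stand_in_responses = {
    "openai-completions-endpoint": {
        "choices": [{"text": "<OpenAI answer>", "index": 0}]
    },
    "anthropic-completions-endpoint": {
        "choices": [{"text": "<Anthropic answer>", "index": 0}]
    },
}

for name, text in first_choice_by_endpoint(stand_in_responses).items():
    print(f"{name}: {text}")
```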

Additional resources

External models in Mosaic AI Model Serving.
