查詢基礎模型

發行項
03/11/2025

在本文中，您將瞭解如何格式化基礎模型的查詢要求，並將其傳送至您的模型服務端點。您可以查詢 Databricks 所裝載的基礎模型，以及裝載於 Databricks 外部的基礎模型。

如需傳統 ML 或 Python 模型查詢要求，請參閱自訂模型的查詢服務端點。

馬賽克 AI 模型服務支援基礎模型 API 及外部模型，以便存取基礎模型。模型服務使用統一的 OpenAI 相容 API 和 SDK 進行查詢。這可讓您針對支援雲端和提供者的生產環境進行實驗和自定義基礎模型。

Mosaic AI 模型服務提供下列選項，可將評分要求傳送至為基礎模型或外部模型提供服務的端點：

方法	詳細資料
OpenAI 用戶端	使用 OpenAI 用戶端查詢 Mosaic AI 模型服務端點所裝載的模型。指定模型服務端點名稱作為 `model` 輸入。支援透過基礎模型 API 或外部模型提供的聊天、嵌入和完成模型。
SQL 函數	使用 `ai_query` SQL 函數直接從 SQL 叫用模型推斷。請參閱範例：查詢基礎模型。
提供介面	從 [服務端點] 頁面選取 [查詢端點]。插入 JSON 格式模型輸入資料，然後按下傳送要求。如果模型已記錄輸入範例，請使用 [顯示範例] 加以載入。
REST API	使用 REST API 呼叫和查詢模型。如需詳細資料，請參閱 POST/serving-endpoints/{name}/invocations。如需為多個模型提供服務的端點的評分要求，請參閱查詢端點背後的各個模型。
MLflow 部署 SDK	使用 MLflow 部署 SDK 的 predict() 函數來查詢模型。
Databricks Python SDK	Databricks Python SDK 是 REST API 之上的一層。可處理低階詳細資料 (例如驗證)，讓您更輕鬆地與模型互動。

需求

模型服務端點。
支援區域中的 Databricks 工作區。
- 基礎模型 API 區域
- 外部模型區域
若要透過 OpenAI 用戶端、REST API 或 MLflow 部署 SDK 傳送評分要求，您必須具有 Databricks API 權杖。

重要

作為生產案例的安全最佳做法，Databricks 建議您在生產期間使用機器對機器 OAuth 權杖進行驗證。

對於測試和開發，Databricks 建議使用屬於服務主體而不是工作區使用者的個人存取權杖。若要建立服務主體的權杖，請參閱管理服務主體的權杖。

安裝套件

選取查詢方法之後，您必須先將適當的套件安裝至叢集。

OpenAI 用戶端

若要使用 OpenAI 用戶端，需要將 databricks-sdk[openai] 套件安裝在叢集上。 Databricks SDK 提供包裝函式來建構 OpenAI 用戶端，並自動設定授權來查詢產生 AI 模型。在筆記本或本機終端機中執行下列命令：

!pip install databricks-sdk[openai]>=0.35.0

只有在 Databricks Notebook 上安裝套件時，才需要下列項目

dbutils.library.restartPython()

REST API

可在 Databricks Runtime for Machine Learning 中存取服務 REST API。

MLflow 部署 SDK

!pip install mlflow

只有在 Databricks Notebook 上安裝套件時，才需要下列項目

dbutils.library.restartPython()

Databricks Python SDK

適用於 Python 的 Databricks SDK 已安裝在使用 Databricks Runtime 13.3 LTS 或更新版本的所有 Azure Databricks 叢集上。對於使用 Databricks Runtime 12.2 LTS 及更低版本的 Azure Databricks 叢集，您必須先安裝適用於 Python 的 Databricks SDK。請參閱適用於 Python 的 Databricks SDK。

查詢聊天完成模型

以下是查詢聊天模型的範例。此範例適用於查詢使用模型服務功能所提供的聊天模型：基礎模型 API 或外部模型。

如需批次推斷範例，請參閱使用 AI Functions 執行批次 LLM 推斷。

OpenAI 用戶端

以下是 DBRX Instruct 模型的聊天要求，由您工作區中的 Foundation Model 的 API 中按代幣付費的端點 databricks-dbrx-instruct 提供。

若要使用 OpenAI 用戶端，請指定模型服務端點名稱作為 model 輸入。


from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
openai_client = w.serving_endpoints.get_open_ai_client()

response = openai_client.chat.completions.create(
    model="databricks-dbrx-instruct",
    messages=[
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "What is a mixture of experts model?",
      }
    ],
    max_tokens=256
)

若要在工作區外部查詢基礎模型，您必須直接使用 OpenAI 用戶端。您也需要 Databricks 工作區執行個體，才能將 OpenAI 用戶端連線至 Databricks。下列範例假設您具有 Databricks API 令牌，並且已將 openai 安裝在您的運算環境中。


import os
import openai
from openai import OpenAI

client = OpenAI(
    api_key="dapi-your-databricks-token",
    base_url="https://example.staging.cloud.databricks.com/serving-endpoints"
)

response = client.chat.completions.create(
    model="databricks-dbrx-instruct",
    messages=[
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "What is a mixture of experts model?",
      }
    ],
    max_tokens=256
)

REST API

重要

下列範例會使用 REST API 參數來查詢服務基礎模型的端點。這些參數公眾預覽，且定義可能會改變。請參閱 POST /serving-endpoints/{name}/invocations。

以下是 DBRX Instruct 模型的聊天要求，由工作區中的基礎模型 API 按權杖付費端點 databricks-dbrx-instruct 提供。

curl \
-u token:$DATABRICKS_TOKEN \
-X POST \
-H "Content-Type: application/json" \
-d '{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": " What is a mixture of experts model?"
    }
  ]
}' \
https://<workspace_host>.databricks.com/serving-endpoints/databricks-dbrx-instruct/invocations \

MLflow 的部署 SDK

重要

下列範例使用來自 MLflow 部署 SDK 的 predict() API。

以下是在您的工作區中，透過基礎模型 API 中的按字元付費端點 databricks-dbrx-instruct 提供的 DBRX Instruct 模型聊天請求。


import mlflow.deployments

# Only required when running this example outside of a Databricks Notebook
export DATABRICKS_HOST="https://<workspace_host>.databricks.com"
export DATABRICKS_TOKEN="dapi-your-databricks-token"

client = mlflow.deployments.get_deploy_client("databricks")

chat_response = client.predict(
    endpoint="databricks-dbrx-instruct",
    inputs={
        "messages": [
            {
              "role": "user",
              "content": "Hello!"
            },
            {
              "role": "assistant",
              "content": "Hello! How can I assist you today?"
            },
            {
              "role": "user",
              "content": "What is a mixture of experts model??"
            }
        ],
        "temperature": 0.1,
        "max_tokens": 20
    }
)

Databricks Python SDK

以下是 DBRX Instruct 模型的對話請求，由您的工作區中的基礎模型 API 按權杖付費端點 databricks-dbrx-instruct 提供。

此程式碼必須在工作區的筆記本中執行。請參閱在 Azure Databricks 筆記本中使用 Python 的 Databricks SDK。

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

w = WorkspaceClient()
response = w.serving_endpoints.query(
    name="databricks-dbrx-instruct",
    messages=[
        ChatMessage(
            role=ChatMessageRole.SYSTEM, content="You are a helpful assistant."
        ),
        ChatMessage(
            role=ChatMessageRole.USER, content="What is a mixture of experts model?"
        ),
    ],
    max_tokens=128,
)
print(f"RESPONSE:\n{response.choices[0].message.content}")

LangChain

若要使用 LangChain 查詢基礎模型端點，可以使用 ChatDatabricks ChatModel 類別並指定 endpoint。

下列範例使用 LangChain 中的 ChatDatabricks ChatModel 類別來查詢基礎模型 API 按權杖付費端點 databricks-dbrx-instruct。

%pip install databricks-langchain

from langchain_core.messages import HumanMessage, SystemMessage
from databricks_langchain import ChatDatabricks

messages = [
    SystemMessage(content="You're a helpful assistant"),
    HumanMessage(content="What is a mixture of experts model?"),
]

llm = ChatDatabricks(endpoint_name="databricks-dbrx-instruct")
llm.invoke(messages)

SQL

重要

下列範例使用內建 SQL 函數 ai_query。此函數為公開預覽，而且定義可能會變更。

以下是 meta-llama-3-1-70b-instruct 的聊天要求，由工作區中的基礎模型 API 按權杖付費端點 databricks-meta-llama-3-1-70b-instruct 提供。

注意

ai_query() 函數不支援為 DBRX 或 DBRX Instruct 模型提供服務的查詢端點。

SELECT ai_query(
    "databricks-meta-llama-3-1-70b-instruct",
    "Can you explain AI in ten words?"
  )

例如，以下是使用 REST API 時聊天模型的預期要求格式。針對外部模型，您可以包含適用於指定提供者和端點組態的其他參數。請參閱其他查詢參數。

{
  "messages": [
    {
      "role": "user",
      "content": "What is a mixture of experts model?"
    }
  ],
  "max_tokens": 100,
  "temperature": 0.1
}

以下是使用 REST API 提出要求的預期回應格式：

{
  "model": "databricks-dbrx-instruct",
  "choices": [
    {
      "message": {},
      "index": 0,
      "finish_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 74,
    "total_tokens": 81
  },
  "object": "chat.completion",
  "id": null,
  "created": 1698824353
}

查詢內嵌模型

以下是基礎模型 API 所提供 gte-large-en 模型的內嵌要求。此範例適用於查詢使用模型服務功能之一提供的內嵌模型：基礎模型 API 或外部模型。

OpenAI 用戶端

若要使用 OpenAI 用戶端，請指定模型服務端點名稱作為 model 輸入。


from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
openai_client = w.serving_endpoints.get_open_ai_client()

response = openai_client.embeddings.create(
  model="databricks-gte-large-en",
  input="what is databricks"
)

若要在工作區外部查詢基礎模型，您必須直接使用 OpenAI 用戶端，如下所示。下列範例假設您有一個 Databricks API 令牌，並在您的運算設備上安裝了 OpenAI。您也需要 Databricks 工作區執行個體，才能將 OpenAI 用戶端連線至 Databricks。


import os
import openai
from openai import OpenAI

client = OpenAI(
    api_key="dapi-your-databricks-token",
    base_url="https://example.staging.cloud.databricks.com/serving-endpoints"
)

response = client.embeddings.create(
  model="databricks-gte-large-en",
  input="what is databricks"
)

REST API

重要

下列範例會使用 REST API 參數來查詢服務基礎模型或外部模型的端點。這些參數公眾預覽，且定義可能會改變。請參閱 POST /serving-endpoints/{name}/invocations。


curl \
-u token:$DATABRICKS_TOKEN \
-X POST \
-H "Content-Type: application/json" \
-d  '{ "input": "Embed this sentence!"}' \
https://<workspace_host>.databricks.com/serving-endpoints/databricks-gte-large-en/invocations

MLflow 部署 SDK

重要

下列範例使用來自 MLflow 部署 SDK 的 predict() API。


import mlflow.deployments

export DATABRICKS_HOST="https://<workspace_host>.databricks.com"
export DATABRICKS_TOKEN="dapi-your-databricks-token"

client = mlflow.deployments.get_deploy_client("databricks")

embeddings_response = client.predict(
    endpoint="databricks-gte-large-en",
    inputs={
        "input": "Here is some text to embed"
    }
)

Databricks Python SDK


from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

w = WorkspaceClient()
response = w.serving_endpoints.query(
    name="databricks-gte-large-en",
    input="Embed this sentence!"
)
print(response.data[0].embedding)

LangChain

若要使用 LangChain 中的 Databricks 基礎模型 API 模型作為內嵌模型，請匯入 DatabricksEmbeddings 類別並指定 endpoint 參數，如下所示：

%pip install databricks-langchain

from databricks_langchain import DatabricksEmbeddings

embeddings = DatabricksEmbeddings(endpoint="databricks-gte-large-en")
embeddings.embed_query("Can you explain AI in ten words?")

SQL

重要

下列範例使用內建 SQL 函數 ai_query。此函數為公開預覽，而且定義可能會變更。


SELECT ai_query(
    "databricks-gte-large-en",
    "Can you explain AI in ten words?"
  )

以下是內嵌模型的預期要求格式。針對外部模型，您可以包含適用於指定提供者和端點組態的其他參數。請參閱其他查詢參數。


{
  "input": [
    "embedding text"
  ]
}

以下是預期的回應格式：

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": []
    }
  ],
  "model": "text-embedding-ada-002-v2",
  "usage": {
    "prompt_tokens": 2,
    "total_tokens": 2
  }
}

檢查內嵌是否已正規化

使用下列命令來檢查模型所產生的內嵌是否已正規化。


  import numpy as np

  def is_normalized(vector: list[float], tol=1e-3) -> bool:
      magnitude = np.linalg.norm(vector)
      return abs(magnitude - 1) < tol

查詢文字完成模型

OpenAI 用戶端

重要

使用 OpenAI 用戶端來查詢基礎模型 API 的文字完成模型並不支援按令牌付費。本節僅支援使用 OpenAI 用戶端查詢外部模型。

若要使用 OpenAI 用戶端，請指定模型服務端點名稱作為 model 輸入。下列範例會使用 OpenAI 用戶端查詢 Anthropic 所託管的 claude-2 完成模型。如果要使用 OpenAI 用戶端，請使用託管你要查詢之模型的模型服務端點名稱填入 model 欄位。

此範例會使用先前建立的、為存取來自 Anthropic 模型提供者的外部模型而設定的端點 anthropic-completions-endpoint。請參閱如何建立外部模型端點。

如需您可以查詢的其他模型及其提供者，請參閱支援的模型。


from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
openai_client = w.serving_endpoints.get_open_ai_client()

completion = openai_client.completions.create(
model="anthropic-completions-endpoint",
prompt="what is databricks",
temperature=1.0
)
print(completion)

REST API

以下是查詢使用外部模型提供之完成模型的完成要求。

重要

下列範例會使用 REST API 參數來查詢服務外部模型的端點。這些參數公眾預覽，且定義可能會改變。請參閱 POST /serving-endpoints/{name}/invocations。


curl \
-u token:$DATABRICKS_TOKEN \
-X POST \
-H "Content-Type: application/json" \
-d '{"prompt": "What is a quoll?", "max_tokens": 64}' \
https://<workspace_host>.databricks.com/serving-endpoints/<completions-model-endpoint>/invocations

MLflow 部署 SDK

以下是查詢使用外部模型提供之完成模型的完成要求。

重要

下列範例使用 MLflow Deployments SDK 中的 predict() API。


import os
import mlflow.deployments

# Only required when running this example outside of a Databricks Notebook

os.environ['DATABRICKS_HOST'] = "https://<workspace_host>.databricks.com"
os.environ['DATABRICKS_TOKEN'] = "dapi-your-databricks-token"

client = mlflow.deployments.get_deploy_client("databricks")

completions_response = client.predict(
    endpoint="<completions-model-endpoint>",
    inputs={
        "prompt": "What is the capital of France?",
        "temperature": 0.1,
        "max_tokens": 10,
        "n": 2
    }
)

# Print the response
print(completions_response)

Databricks Python SDK

以下是查詢使用外部模型提供之完成模型的完成要求。

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

w = WorkspaceClient()
response = w.serving_endpoints.query(
    name="<completions-model-endpoint>",
    prompt="Write 3 reasons why you should train an AI model on domain specific data sets."
)
print(response.choices[0].text)

SQL

重要

下列範例使用內建 SQL 函數 ai_query。此函數為公開預覽，而且定義可能會變更。

SELECT ai_query(
    "<completions-model-endpoint>",
    "Can you explain AI in ten words?"
  )

以下是完成模型的預期要求格式。針對外部模型，您可以包含適用於指定提供者和端點組態的其他參數。請參閱其他查詢參數。

{
  "prompt": "What is mlflow?",
  "max_tokens": 100,
  "temperature": 0.1,
  "stop": [
    "Human:"
  ],
  "n": 1,
  "stream": false,
  "extra_params":
  {
    "top_p": 0.9
  }
}

以下是預期的回應格式：

{
  "id": "cmpl-8FwDGc22M13XMnRuessZ15dG622BH",
  "object": "text_completion",
  "created": 1698809382,
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {
      "text": "MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. It provides tools for tracking experiments, managing and deploying models, and collaborating on projects. MLflow also supports various machine learning frameworks and languages, making it easier to work with different tools and environments. It is designed to help data scientists and machine learning engineers streamline their workflows and improve the reproducibility and scalability of their models.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 83,
    "total_tokens": 88
  }
}

使用 AI 遊樂場與支援的 LLM 聊天

您可以使用 AI 遊樂場與支援的大型語言模型互動。 AI 遊樂場是類似聊天的互動環境，您可以在 Azure Databricks 工作區內對大型語言模型（LLM）進行測試、輸入指令並進行比較。

AI 遊樂場

共用方式為

查詢基礎模型

需求

安裝套件

OpenAI 用戶端

REST API

MLflow 部署 SDK

Databricks Python SDK

查詢聊天完成模型

OpenAI 用戶端

REST API

MLflow 的部署 SDK

Databricks Python SDK

LangChain

SQL

查詢內嵌模型

OpenAI 用戶端

REST API

MLflow 部署 SDK

Databricks Python SDK

LangChain

SQL

檢查內嵌是否已正規化

查詢文字完成模型

OpenAI 用戶端

REST API

MLflow 部署 SDK

Databricks Python SDK

SQL

使用 AI 遊樂場與支援的 LLM 聊天

其他資源

意見反應

其他資源