Log and register AI agents

Article
03/09/2025

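Log AI agents using Mosaic AI Agent Framework. Logging an agent is the foundation of the development process. Logging captures a "point in time" snapshot of the agent's code and configuration so that you can evaluate the quality of that configuration.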

Requirements

Databricks recommends installing the latest version of databricks-sdk.

%pip install databricks-sdk

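Code-based logging

Databricks recommends using MLflow's Models from Code functionality when logging agents.

In this approach, the agent's code is captured as a Python file, and the Python environment is captured as a list of packages. When the agent is deployed, the Python environment is restored and the agent's code is executed to load the agent into memory, so that it can be invoked when the endpoint is called.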
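To make the load-then-invoke lifecycle concrete, the following is a simplified simulation that uses only the Python standard library. It is not MLflow's implementation: the `MODEL` variable, the `EchoAgent` class, and the `load_agent_from_code` helper are hypothetical stand-ins, and a real agent file would call `mlflow.models.set_model(...)` instead of assigning `MODEL`.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical stand-in for an agent file. A real agent file would call
# mlflow.models.set_model(...) rather than assigning MODEL.
AGENT_SOURCE = '''
class EchoAgent:
    def predict(self, request):
        last = request["messages"][-1]["content"]
        return {"content": "echo: " + last}

MODEL = EchoAgent()
'''


def load_agent_from_code(path):
    """Execute the file and return the object it exposes as MODEL."""
    namespace = runpy.run_path(str(path))
    return namespace["MODEL"]


with tempfile.TemporaryDirectory() as tmp:
    agent_file = Path(tmp) / "agent.py"
    agent_file.write_text(AGENT_SOURCE)

    # Load time: the file is executed once and the agent object is kept in memory.
    agent = load_agent_from_code(agent_file)

    # Request time: the in-memory agent is invoked with the request payload.
    reply = agent.predict({"messages": [{"role": "user", "content": "hi"}]})

print(reply)  # prints {'content': 'echo: hi'}
```

The point of the sketch is that the file runs once at load time, and the resulting object is reused for every request.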

You can combine this approach with pre-deployment validation APIs such as mlflow.models.predict() to make sure the agent runs reliably when it is deployed for serving.
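A minimal sketch of such a pre-deployment check follows. The helper name `validate_before_deploy` is hypothetical, and the choice of `env_manager="virtualenv"` is an assumption; use whichever environment manager your workspace supports. The JSON check at the top is an extra guard added for this sketch, not something MLflow requires.

```python
import json


def validate_before_deploy(model_uri, input_example, env_manager="virtualenv"):
    """Run one prediction against a logged agent in a freshly restored environment."""
    # Catch inputs that could not be sent as a JSON request before doing any work.
    json.dumps(input_example)

    # Imported here so the helper can be defined without MLflow installed.
    import mlflow

    # Restores the model's logged dependencies, then runs the prediction.
    # Raises if the environment cannot be built or the agent fails.
    mlflow.models.predict(
        model_uri=model_uri,
        input_data=input_example,
        env_manager=env_manager,
    )
```

In a driver notebook, you would call this with the model URI returned by `log_model` and the same input example that was used for logging.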

For an example of code-based logging, see the ChatAgent authoring example notebooks.

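Infer the Model Signature during logging

Note

Databricks recommends authoring agents with the ChatAgent interface. If you use ChatAgent, you can skip this section; MLflow automatically infers a valid signature for your agent.

If you do not use the ChatAgent interface, you must specify your agent's MLflow Model Signature at logging time using one of the following methods:

Manually define the signature.
Use MLflow's Model Signature inference to automatically generate the agent's signature from an input example that you provide. This approach is more convenient than defining the signature manually.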

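The MLflow Model Signature validates inputs and outputs so that the agent interacts correctly with downstream tools such as AI Playground and the review app. It also guides other applications on how to use the agent effectively.

The LangChain and PyFunc examples below use Model Signature inference.

If you would rather define the Model Signature explicitly at logging time, see MLflow docs - How to log models with signatures.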
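Because the inferred signature depends entirely on the input example, a malformed example produces a misleading signature. The following is a minimal sketch of a check to run before logging, assuming the chat-style `messages` format used by the examples in this article. `check_chat_input` and `ALLOWED_ROLES` are hypothetical names for this sketch and are not part of MLflow.

```python
# Assumption for this sketch: only these roles appear in an input example.
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_chat_input(example):
    """Return a list of problems found in a chat-style input example."""
    messages = example.get("messages") if isinstance(example, dict) else None
    if not isinstance(messages, list) or not messages:
        return ['expected a dict with a non-empty "messages" list']

    problems = []
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"messages[{i}] is not a dict")
            continue
        if message.get("role") not in ALLOWED_ROLES:
            problems.append(f"messages[{i}] has an unexpected role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            problems.append(f"messages[{i}] content is not a string")
    return problems


input_example = {
    "messages": [
        {"role": "user", "content": "What is Retrieval-augmented Generation?"}
    ]
}
print(check_chat_input(input_example))  # prints []
```

An empty list means the example has the expected shape; anything else is worth fixing before the example is passed to `log_model`.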

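Code-based logging with LangChain

The following instructions and code sample show how to log an agent with LangChain.

Create a notebook or Python file that contains your code. In this example, the notebook or file is named agent.py. It must contain a LangChain agent, referred to here as lc_agent.
Include mlflow.models.set_model(lc_agent) in the notebook or file.
Create a new notebook to serve as the driver notebook (called driver.py in this example).
In the driver notebook, use the following code to run agent.py and log the result as an MLflow model: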
```
mlflow.langchain.log_model(lc_model="/path/to/agent.py", resources=list_of_databricks_resources)
```
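The resources parameter declares the Databricks-managed resources needed to serve the agent, such as a vector search index or a serving endpoint that serves a foundation model. For more information, see Specify resources for automatic authentication passthrough.
Deploy the model. See Deploy an agent for generative AI applications.
When the serving environment is loaded, agent.py is executed.
When a serving request comes in, lc_agent.invoke(...) is called.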


import mlflow

code_path = "/Workspace/Users/first.last/agent.py"
config_path = "/Workspace/Users/first.last/config.yml"

# Input example used by MLflow to infer Model Signature
input_example = {
  "messages": [
    {
      "role": "user",
      "content": "What is Retrieval-augmented Generation?",
    }
  ]
}

# example using langchain
with mlflow.start_run():
  logged_agent_info = mlflow.langchain.log_model(
    lc_model=code_path,
    model_config=config_path, # If you specify this parameter, this configuration is used by agent code. The development_config is overwritten.
    artifact_path="agent", # This string is used as the path inside the MLflow model where artifacts are stored
    input_example=input_example, # Must be a valid input to the agent
    example_no_conversion=True, # Required
  )

print(f"MLflow Run: {logged_agent_info.run_id}")
print(f"Model URI: {logged_agent_info.model_uri}")

# To verify that the model has been logged correctly, load the agent and call `invoke`:
model = mlflow.langchain.load_model(logged_agent_info.model_uri)
model.invoke(input_example)

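Code-based logging with PyFunc

The following instructions and code sample show how to log an agent with PyFunc.

Create a notebook or Python file that contains your code. In this example, the notebook or file is named agent.py. It must contain a PyFunc class named PyFuncClass.
Include mlflow.models.set_model(PyFuncClass) in the notebook or file.
Create a new notebook to serve as the driver notebook (called driver.py in this example).
In the driver notebook, use the following code to run agent.py and log the result as an MLflow model: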
```
mlflow.pyfunc.log_model(python_model="/path/to/agent.py", resources=list_of_databricks_resources)
```
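The resources parameter declares the Databricks-managed resources needed to serve the agent, such as a vector search index or a serving endpoint that serves a foundation model. For more information, see Specify resources for automatic authentication passthrough.
Deploy the model. See Deploy an agent for generative AI applications.
When the serving environment is loaded, agent.py is executed.
When a serving request comes in, PyFuncClass.predict(...) is called.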

import mlflow
from mlflow.models.resources import (
    DatabricksServingEndpoint,
    DatabricksVectorSearchIndex,
)

code_path = "/Workspace/Users/first.last/agent.py"
config_path = "/Workspace/Users/first.last/config.yml"

# Input example used by MLflow to infer Model Signature
input_example = {
  "messages": [
    {
      "role": "user",
      "content": "What is Retrieval-augmented Generation?",
    }
  ]
}

with mlflow.start_run():
  logged_agent_info = mlflow.pyfunc.log_model(
    python_model=code_path,  # Path to the agent code file defined above
    artifact_path="agent",
    input_example=input_example,
    example_no_conversion=True,
    # Databricks-managed resources the agent needs at serving time
    resources=[
      DatabricksServingEndpoint(endpoint_name="databricks-mixtral-8x7b-instruct"),
      DatabricksVectorSearchIndex(index_name="prod.agents.databricks_docs_index"),
    ],
  )

print(f"MLflow Run: {logged_agent_info.run_id}")
print(f"Model URI: {logged_agent_info.model_uri}")

# To verify that the model has been logged correctly, load the agent and call `predict`:
model = mlflow.pyfunc.load_model(logged_agent_info.model_uri)
model.predict(input_example)

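Specify resources for automatic authentication passthrough

AI agents often need to authenticate to other resources to complete their tasks. For example, an agent might need to access a vector search index to query unstructured data.

As described in Authentication for dependent resources, Model Serving supports authenticating to both Databricks-managed and external resources when you deploy the agent.

For the most common Databricks resource types, Databricks supports and recommends declaring the agent's resource dependencies up front, during logging. This enables automatic authentication passthrough when you deploy the agent: Databricks automatically provisions, rotates, and manages short-lived credentials so that the agent endpoint can securely access these resource dependencies.

To enable automatic authentication passthrough, specify the dependent resources using the resources parameter of the log_model() API, as shown in the following code.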

import mlflow
from mlflow.models.resources import (
  DatabricksVectorSearchIndex,
  DatabricksServingEndpoint,
  DatabricksSQLWarehouse,
  DatabricksFunction,
  DatabricksGenieSpace,
  DatabricksTable,
  DatabricksUCConnection
)

# code_path and input_example are defined as in the PyFunc example earlier in this article.
with mlflow.start_run():
  logged_agent_info = mlflow.pyfunc.log_model(
    python_model=code_path,
    artifact_path="agent",
    input_example=input_example,
    example_no_conversion=True,
    # Specify resources for automatic authentication passthrough
    resources=[
      DatabricksVectorSearchIndex(index_name="prod.agents.databricks_docs_index"),
      DatabricksServingEndpoint(endpoint_name="databricks-mixtral-8x7b-instruct"),
      DatabricksServingEndpoint(endpoint_name="databricks-bge-large-en"),
      DatabricksSQLWarehouse(warehouse_id="your_warehouse_id"),
      DatabricksFunction(function_name="ml.tools.python_exec"),
      DatabricksGenieSpace(genie_space_id="your_genie_space_id"),
      DatabricksTable(table_name="your_table_name"),
      DatabricksUCConnection(connection_name="your_connection_name"),
    ]
  )

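Databricks recommends that you manually specify resources for all agent flavors.

Note

If you do not specify resources when logging a LangChain agent with mlflow.langchain.log_model(...), MLflow makes a best-effort attempt to infer the resources automatically. However, this might not capture every dependency, which results in authorization errors when serving or querying the agent.

The following table lists the Databricks resources that support automatic authentication passthrough and the minimum mlflow version required to log each resource.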

Resource type	Minimum `mlflow` version required to log the resource
Vector Search index	`mlflow` 2.13.1 or above
Model Serving endpoint	`mlflow` 2.13.1 or above
SQL warehouse	`mlflow` 2.16.1 or above
Unity Catalog function	`mlflow` 2.16.1 or above
Genie space	`mlflow` 2.17.1 or above
Unity Catalog table	`mlflow` 2.18.0 or above
Unity Catalog connection	`mlflow` 2.17.1 or above
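The version requirements above can be checked before logging. The following is a small sketch with the minimum versions copied from the table. The mapping from table rows to the resource class names in `mlflow.models.resources` is an assumption based on the code sample earlier in this section, and `parse_version` and `unsupported_resources` are hypothetical helper names.

```python
import re

# Minimum mlflow versions from the table, keyed by resource class name.
MIN_MLFLOW_VERSION = {
    "DatabricksVectorSearchIndex": "2.13.1",
    "DatabricksServingEndpoint": "2.13.1",
    "DatabricksSQLWarehouse": "2.16.1",
    "DatabricksFunction": "2.16.1",
    "DatabricksGenieSpace": "2.17.1",
    "DatabricksTable": "2.18.0",
    "DatabricksUCConnection": "2.17.1",
}


def parse_version(version):
    """Turn '2.16.1' into (2, 16, 1); trailing suffixes such as '.dev0' are ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def unsupported_resources(resource_types, installed_version):
    """Return the resource types that need a newer mlflow than the installed one."""
    installed = parse_version(installed_version)
    return [
        name
        for name in resource_types
        if installed < parse_version(MIN_MLFLOW_VERSION[name])
    ]


print(unsupported_resources(["DatabricksServingEndpoint", "DatabricksTable"], "2.17.2"))
# prints ['DatabricksTable']
```

In a notebook, pass `mlflow.__version__` as the installed version to check the environment you are actually logging from.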

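Automatic authentication for OpenAI clients

If your agent uses the OpenAI client, use the Databricks SDK to authenticate automatically during deployment. The Databricks SDK provides a wrapper that constructs the OpenAI client with authorization configured automatically. Run the following in your notebook:

%pip install databricks-sdk[openai]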

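from databricks.sdk import WorkspaceClient

def openai_client():
  # WorkspaceClient picks up credentials from the current environment.
  w = WorkspaceClient()
  # Returns an OpenAI client with Databricks authorization already configured.
  return w.serving_endpoints.get_open_ai_client()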

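Then, specify the Model Serving endpoint as part of resources so that authentication happens automatically at deployment time.

Register the agent to Unity Catalog

Before you deploy the agent, you must register it to Unity Catalog. Registering the agent packages it as a model in Unity Catalog. As a result, you can use Unity Catalog permissions to authorize access to resources in the agent.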

import mlflow

mlflow.set_registry_uri("databricks-uc")

catalog_name = "test_catalog"
schema_name = "schema"
model_name = "agent_name"

# Unity Catalog models use a three-level name: <catalog>.<schema>.<model>
uc_model_name = f"{catalog_name}.{schema_name}.{model_name}"
uc_model_info = mlflow.register_model(model_uri=logged_agent_info.model_uri, name=uc_model_name)
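
After registration, later steps usually refer to a specific registered version. The following sketch builds a `models:/` URI from the three-level Unity Catalog name and a version number. `uc_model_uri` is a hypothetical helper, and its name check is a simplification that only rejects empty parts and parts containing a period.

```python
def uc_model_uri(catalog, schema, model, version):
    """Build a models:/ URI for one version of a Unity Catalog model."""
    for part in (catalog, schema, model):
        # Simplified check: each level must be non-empty and contain no period.
        if not part or "." in part:
            raise ValueError(f"Invalid name part: {part!r}")
    return f"models:/{catalog}.{schema}.{model}/{version}"


print(uc_model_uri("test_catalog", "schema", "agent_name", 1))
# prints models:/test_catalog.schema.agent_name/1
```

With the registration code above, the version would typically come from the `version` attribute of the object returned by `mlflow.register_model`.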
