Author AI agents in code

This article shows how to author an AI agent in Python using Mosaic AI Agent Framework and popular agent-authoring libraries like LangGraph, PyFunc, and OpenAI.

Requirements

Databricks recommends installing the latest version of the MLflow Python client when developing agents.

To author and deploy agents using the approach in this article, you must have the following minimum package versions:

  • databricks-agents version 0.16.0 and above
  • mlflow version 2.20.2 and above

%pip install -U -qqqq databricks-agents>=0.16.0 mlflow>=2.20.2

Databricks also recommends installing Databricks AI Bridge integration packages when authoring agents. These integration packages (such as databricks-langchain, databricks-openai) provide a shared layer of APIs to interact with Databricks AI features, such as Databricks AI/BI Genie and Vector Search, across agent authoring frameworks and SDKs.

LangChain/LangGraph

%pip install -U -qqqq databricks-langchain

OpenAI

%pip install -U -qqqq databricks-openai

Pure Python agents

%pip install -U -qqqq databricks-ai-bridge
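
For example, with databricks-langchain installed, you can create a LangChain-compatible chat model backed by a Databricks model serving endpoint. This is a minimal sketch; the endpoint name below is a placeholder, so substitute an endpoint available in your workspace.

from databricks_langchain import ChatDatabricks

# Placeholder endpoint name; replace with a foundation model or custom
# serving endpoint that exists in your workspace.
llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")

# The client behaves like any other LangChain chat model.
print(llm.invoke("What is Mosaic AI Agent Framework?").content)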

Use ChatAgent to author agents

Databricks recommends using MLflow's ChatAgent interface to author production-grade agents. This chat schema specification is designed for agent scenarios and is similar to, but not strictly compatible with, the OpenAI ChatCompletion schema. ChatAgent also adds functionality for multi-turn, tool-calling agents.

Authoring your agent using ChatAgent provides the following benefits:

  • Advanced agent capabilities

    • Streaming output: Enable interactive user experiences by streaming output in smaller chunks.
    • Comprehensive tool-calling message history: Return multiple messages, including intermediate tool-calling messages, for improved quality and conversation management.
    • Tool-calling confirmation support
    • Multi-agent system support
  • Streamlined development, deployment, and monitoring

    • Databricks feature integration: Out-of-the-box compatibility with AI Playground, Agent Evaluation, and Agent Monitoring.
    • Typed authoring interfaces: Write agent code using typed Python classes, benefiting from IDE and notebook autocomplete.
    • Automatic signature inference: MLflow automatically infers ChatAgent signatures when logging the agent, simplifying registration and deployment. See Infer Model Signature during logging.
    • AI Gateway-enhanced inference tables: AI Gateway inference tables are automatically enabled for deployed agents, providing access to detailed request log metadata.

To learn how to create a ChatAgent, see the examples in the following section and MLflow documentation - What is the ChatAgent interface.
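
As a point of reference, a minimal ChatAgent might look like the following sketch. It is illustrative only, not a production agent: the EchoAgent class and its reply logic are placeholders, and a real agent would call an LLM and tools inside predict.

import uuid
from typing import Any, Generator, Optional

from mlflow.pyfunc import ChatAgent
from mlflow.types.agent import (
    ChatAgentChunk,
    ChatAgentMessage,
    ChatAgentResponse,
    ChatContext,
)

class EchoAgent(ChatAgent):
    """Toy agent that echoes the last user message back to the caller."""

    def predict(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -> ChatAgentResponse:
        reply = f"You said: {messages[-1].content}"
        return ChatAgentResponse(
            messages=[
                ChatAgentMessage(role="assistant", content=reply, id=str(uuid.uuid4()))
            ]
        )

    def predict_stream(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -> Generator[ChatAgentChunk, None, None]:
        # Stream the same reply word by word as ChatAgentChunk deltas.
        message_id = str(uuid.uuid4())
        for word in f"You said: {messages[-1].content}".split():
            yield ChatAgentChunk(
                delta=ChatAgentMessage(role="assistant", content=word + " ", id=message_id)
            )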

ChatAgent examples

The following notebooks show you how to author streaming and non-streaming ChatAgents using the popular libraries OpenAI and LangGraph.

LangGraph tool-calling agent

Get notebook

OpenAI tool-calling agent

Get notebook

To learn how to expand the capabilities of these agents by adding tools, see AI agent tools.

Design stateless ChatAgent for distributed model serving

Databricks deploys ChatAgent in a distributed environment on Databricks Model Serving, which means that during a multi-turn conversation, the same serving replica may not handle all requests. Pay attention to the following implications for managing agent state:

  • Avoid local caching: When deploying a ChatAgent, don't assume the same replica will handle all requests in a multi-turn conversation. Instead, reconstruct internal state from the incoming ChatAgentRequest payload (its messages and custom_inputs) on each turn, as shown in the sketch after this list.

  • Thread-safe state: Design agent state to prevent conflicts in multi-threaded environments.

  • Initialize state in the predict function: Initialize state each time the predict function is called, not during ChatAgent initialization. Storing state at the ChatAgent level could leak information between conversations and cause conflicts because a single ChatAgent replica could handle requests from multiple conversations.
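
A minimal sketch of this pattern: all per-conversation state is derived from the incoming messages inside predict, and nothing is cached on the agent instance between calls. The reply logic is a placeholder.

import uuid
from typing import Any, Optional

from mlflow.pyfunc import ChatAgent
from mlflow.types.agent import ChatAgentMessage, ChatAgentResponse, ChatContext

class StatelessChatAgent(ChatAgent):
    def predict(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -> ChatAgentResponse:
        # Rebuild conversation state from the request on every call. Do NOT
        # store it on `self`: a different replica may serve the next turn, and
        # one replica may serve many conversations concurrently.
        history = [{"role": m.role, "content": m.content} for m in messages]
        reply = f"This conversation has {len(history)} messages so far."
        return ChatAgentResponse(
            messages=[
                ChatAgentMessage(role="assistant", content=reply, id=str(uuid.uuid4()))
            ]
        )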

Custom inputs and outputs

Some scenarios may require additional agent inputs, such as client_type and session_id, or outputs like retrieval source links that should not be included in the chat history for future interactions.

For these scenarios, MLflow ChatAgent natively supports the fields custom_inputs and custom_outputs.
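
For illustration, a predict implementation could read these fields and return extra metadata as in the following sketch; the session_id input and the returned source link are placeholders, not a prescribed schema.

import uuid
from typing import Any, Optional

from mlflow.pyfunc import ChatAgent
from mlflow.types.agent import ChatAgentMessage, ChatAgentResponse, ChatContext

class CustomIOAgent(ChatAgent):
    def predict(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -> ChatAgentResponse:
        # Read extra, non-chat inputs supplied by the caller.
        session_id = (custom_inputs or {}).get("session_id", "unknown")
        reply = f"Handling request for session {session_id}."
        return ChatAgentResponse(
            messages=[
                ChatAgentMessage(role="assistant", content=reply, id=str(uuid.uuid4()))
            ],
            # Return extra outputs that should not become part of the chat history.
            custom_outputs={"retrieval_source_links": ["https://example.com/doc"]},
        )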

Warning

The Agent Evaluation review app does not currently support rendering traces for agents with additional input fields.

See the following examples to learn how to set custom inputs and outputs for OpenAI/PyFunc and LangGraph agents.

OpenAI + PyFunc custom schema agent notebook

Get notebook

LangGraph custom schema agent notebook

Get notebook

Provide custom_inputs in the AI Playground and agent review app

If your agent accepts additional inputs using the custom_inputs field, you can manually provide these inputs in both the AI Playground and the agent review app.

  1. In either the AI Playground or the Agent Review App, select the gear icon.

  2. Enable custom_inputs.

  3. Provide a JSON object that matches your agent’s defined input schema, as in the example after these steps.

    Provide custom_inputs in the AI playground.
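
For example, for an agent whose custom_inputs schema includes the client_type and session_id fields mentioned earlier, the JSON object might look like this (the values are placeholders):

{
  "client_type": "web",
  "session_id": "1234-5678"
}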

Specify custom retriever schemas

AI agents commonly use retrievers to find and query unstructured data from vector search indices.

To ensure that your retriever is compatible with downstream Databricks features and is handled correctly by them, follow these guidelines when authoring your agent:

  • Trace retrievers with MLflow RETRIEVER spans: Downstream Databricks features look for RETRIEVER spans to handle retriever trace information correctly. By using RETRIEVER spans, you enable functionality like automatically displaying links to source documents in the AI Playground and automatically running retrieval groundedness and relevance judges in Agent Evaluation. See MLflow documentation - RETRIEVER spans.

Note

Databricks recommends using retriever tools provided by Databricks AI Bridge packages like databricks_langchain.VectorSearchRetrieverTool and databricks_openai.VectorSearchRetrieverTool because they already conform to the MLflow retriever schema. See Locally develop Vector Search retriever tools with AI Bridge.

  • Specify custom retriever schema: If your agent includes retriever spans with different schemas, call mlflow.models.set_retriever_schema when you define your agent in code. This maps your retriever's output columns to MLflow's expected fields (primary_key, text_column, doc_uri).

import mlflow
# Define the retriever's schema by providing your column names
mlflow.models.set_retriever_schema(
    # Specify the name of your retriever span
    name="vector_search",
    # Specify the output column name to treat as the primary key (ID) of each retrieved document
    primary_key="chunk_id",
    # Specify the output column name to treat as the text content (page content) of each retrieved document
    text_column="text_column",
    # Specify the output column name to treat as the document URI of each retrieved document
    doc_uri="doc_uri",
)

Note

The doc_uri column is especially important when evaluating the retriever’s performance. doc_uri is the main identifier for documents returned by the retriever, allowing you to compare them against ground truth evaluation sets. See Evaluation sets.

For example retriever tools, see Unstructured retrieval AI agent tools.
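
To illustrate the recommendation in the note above, here is a minimal sketch of creating an AI Bridge retriever tool with databricks-langchain. The index name, tool name, and description are placeholders; because the tool already conforms to the MLflow retriever schema, calling set_retriever_schema is typically not needed.

from databricks_langchain import VectorSearchRetrieverTool

# Placeholder Unity Catalog index name; replace with your own index.
vs_tool = VectorSearchRetrieverTool(
    index_name="ml.docs.databricks_docs_index",
    num_results=5,
    tool_name="docs_retriever",
    tool_description="Retrieves relevant Databricks documentation chunks.",
)

# The tool can be called directly or bound to a tool-calling LLM.
print(vs_tool.invoke("What is Mosaic AI Agent Framework?"))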

Parametrize agent code for deployment across environments

You can parametrize agent code to reuse the same code across different environments.

Parameters are key-value pairs that you define in a Python dictionary or a .yaml file.

To configure the code, create a ModelConfig using either a Python dictionary or a .yaml file. ModelConfig is a set of key-value parameters that allows for flexible configuration management. For example, you can use a dictionary during development and then convert it to a .yaml file for production deployment and CI/CD.

For details about ModelConfig, see the MLflow documentation.

An example ModelConfig is shown below:

llm_parameters:
  max_tokens: 500
  temperature: 0.01
model_serving_endpoint: databricks-dbrx-instruct
vector_search_index: ml.docs.databricks_docs_index
prompt_template: 'You are a hello world bot. Respond with a reply to the user''s
  question that indicates your prompt template came from a YAML file. Your response
  must use the word "YAML" somewhere. User''s question: {question}'
prompt_template_input_vars:
  - question

In your agent code, you can reference a default (development) configuration from the .yaml file or dictionary:

import mlflow
# Example for loading from a .yml file
config_file = "configs/hello_world_config.yml"
model_config = mlflow.models.ModelConfig(development_config=config_file)

# Example of using a dictionary
config_dict = {
    "prompt_template": "You are a hello world bot. Respond with a reply to the user's question that is fun and interesting to the user. User's question: {question}",
    "prompt_template_input_vars": ["question"],
    "model_serving_endpoint": "databricks-dbrx-instruct",
    "llm_parameters": {"temperature": 0.01, "max_tokens": 500},
}

model_config = mlflow.models.ModelConfig(development_config=config_dict)

# Use model_config.get() to retrieve a parameter value, for example the prompt
# template defined above. You can also use model_config.to_dict() to convert
# the loaded config object into a dictionary.
value = model_config.get('prompt_template')

Then, when logging your agent, pass the model_config parameter to log_model to specify a custom set of parameters to use when loading the logged agent. See MLflow documentation - ModelConfig.
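
A sketch of what that might look like, assuming your agent is defined with models-from-code in a file named agent.py (a placeholder path):

import mlflow

with mlflow.start_run():
    logged_agent_info = mlflow.pyfunc.log_model(
        artifact_path="agent",
        # Models-from-code: path to the Python file that defines the agent
        python_model="agent.py",
        # Override the development config, for example with production values
        model_config="configs/hello_world_config.yml",
    )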

Streaming error propagation

Mosaic AI propagates any errors encountered while streaming with the last token under databricks_output.error. It is up to the calling client to properly handle and surface this error.

{
  "delta": …,
  "databricks_output": {
    "trace": {...},
    "error": {
      "error_code": BAD_REQUEST,
      "message": "TimeoutException: Tool XYZ failed to execute."
    }
  }
}
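
As a sketch of what client-side handling might look like, assuming events is an iterable of parsed JSON chunks shaped like the example above (how you obtain them depends on your serving client):

from typing import Iterable

def consume_stream(events: Iterable[dict]) -> None:
    for event in events:
        error = (event.get("databricks_output") or {}).get("error")
        if error:
            # The error arrives with the last streamed token; surface it to the caller.
            raise RuntimeError(f"{error.get('error_code')}: {error.get('message')}")
        delta = event.get("delta")
        if delta:
            # Render the partial output, for example by appending it to the UI.
            print(delta.get("content", ""), end="")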

Next steps