Define an agent’s input and output schema

AI agents must adhere to specific input and output schema requirements to be compatible with other features on Azure Databricks. This article explains how to ensure your AI agent adheres to those requirements and how to customize your agent’s input and output schema while ensuring compatibility.

Mosaic AI uses MLflow Model Signatures to define an agent’s input and output schema requirements. The model signature tells internal and external components how to interact with your agent and validates that they adhere to the schema.

OpenAI chat completion schema (recommended)

Databricks recommends using the OpenAI chat completion schema to define agent input and output. This schema is widely adopted and compatible with many agent frameworks and applications, including those in Databricks.

See OpenAI documentation for more information about the chat completion input schema and output schema.

Note

The OpenAI chat completion schema is simply a standard for structuring agent inputs and outputs. Implementing this schema does not involve using OpenAI’s resources or models.

MLflow provides convenient APIs for LangChain and PyFunc-flavored agents, helping you create agents compatible with the chat completion schema.

Implement chat completion with LangChain

If your agent uses LangChain, use MLflow’s ChatCompletionOutputParser() to format your agent’s final output to be compatible with the chat completion schema. If you use LangGraph, see LangGraph custom schemas.


  from mlflow.langchain.output_parsers import ChatCompletionOutputParser

  chain = (
      {
          "user_query": itemgetter("messages")
          | RunnableLambda(extract_user_query_string),
          "chat_history": itemgetter("messages") | RunnableLambda(extract_chat_history),
      }
      | RunnableLambda(DatabricksChat)
      | ChatCompletionOutputParser()
  )

Implement chat completion with PyFunc

If you use PyFunc, Databricks recommends writing your agent as a subclass of mlflow.pyfunc.ChatModel. This method provides the following benefits:

  • Allows you to write agent code compatible with the chat completion schema using typed Python classes.

  • When logging the agent, MLflow will automatically infer a chat completion-compatible signature, even without an input_example. This simplifies the process of registering and deploying the agent. See Infer Model Signature during logging.

    from dataclasses import dataclass
    from typing import Optional, Dict, List, Generator
    from mlflow.pyfunc import ChatModel
    from mlflow.types.llm import (
        # Non-streaming helper classes
        ChatCompletionRequest,
        ChatCompletionResponse,
        ChatCompletionChunk,
        ChatMessage,
        ChatChoice,
        ChatParams,
        # Helper classes for streaming agent output
        ChatChoiceDelta,
        ChatChunkChoice,
    )
    
    class MyAgent(ChatModel):
        """
        Defines a custom agent that processes ChatCompletionRequests
        and returns ChatCompletionResponses.
        """
        def predict(self, context, messages: list[ChatMessage], params: ChatParams) -> ChatCompletionResponse:
            last_user_question_text = messages[-1].content
            response_message = ChatMessage(
                role="assistant",
                content=(
                    f"I will always echo back your last question. Your last question was: {last_user_question_text}. "
                )
            )
            return ChatCompletionResponse(
                choices=[ChatChoice(message=response_message)]
            )
    
        def _create_chat_completion_chunk(self, content) -> ChatCompletionChunk:
            """Helper for constructing a ChatCompletionChunk instance for wrapping streaming agent output"""
            return ChatCompletionChunk(
                    choices=[ChatChunkChoice(
                        delta=ChatChoiceDelta(
                            role="assistant",
                            content=content
                        )
                    )]
                )
    
        def predict_stream(
            self, context, messages: List[ChatMessage], params: ChatParams
        ) -> Generator[ChatCompletionChunk, None, None]:
            last_user_question_text = messages[-1].content
            yield self._create_chat_completion_chunk(f"Echoing back your last question, word by word.")
            for word in last_user_question_text.split(" "):
                yield self._create_chat_completion_chunk(word)
    
    agent = MyAgent()
    model_input = ChatCompletionRequest(
        messages=[ChatMessage(role="user", content="What is Databricks?")]
    )
    response = agent.predict(context=None, model_input=model_input)
    print(response)
    

Custom inputs and outputs

Databricks recommends adhering to the OpenAI chat completions schema for most agent use cases.

However, some scenarios may require additional inputs, such as client_type and session_id, or outputs like retrieval source links that should not be included in the chat history for future interactions.

For these scenarios, Mosaic AI Agent Framework supports augmenting OpenAI chat completion requests and responses with custom inputs and outputs.

See the following examples to learn how to create custom inputs and outputs for PyFunc and LangGraph agents.

Warning

The Agent Evaluation review app does not currently support rendering traces for agents with additional input fields.

PyFunc custom schemas

The following notebooks show a custom schema example using PyFunc.

PyFunc custom schema agent notebook

Get notebook

PyFunc custom schema driver notebook

Get notebook

LangGraph custom schemas

The following notebooks show a custom schema example using LangGraph. You can modify the wrap_output function in the notebooks to parse and extract information from the message stream.

LangGraph custom schema agent notebook

Get notebook

LangGraph custom schema driver notebook

Get notebook

Provide custom_inputs in the AI Playground and agent review app

If your agent accepts additional inputs using the custom_inputs field, you can manually provide these inputs in both the AI Playground and the agent review app.

  1. In either the AI Playground or the Agent Review App, select the gear icon Gear icon.

  2. Enable custom_inputs.

  3. Provide a JSON object that matches your agent’s defined input schema.

    Provide custom_inputs in the AI playground.

Legacy input and output schemas

The SplitChatMessageRequest input schema and StringResponse output schema have been deprecated. If you are using either of these legacy schemas, Databricks recommends that you migrate to the recommended chat completion schema.

SplitChatMessageRequest input schema (deprecated)

SplitChatMessagesRequest allows you to pass the current query and history separately as agent input.

  question = {
      "query": "What is MLflow",
      "history": [
          {
              "role": "user",
              "content": "What is Retrieval-augmented Generation?"
          },
          {
              "role": "assistant",
              "content": "RAG is"
          }
      ]
  }

StringResponse output schema (deprecated)

StringResponse allows you to return the agent’s response as an object with a single string content field:

{"content": "This is an example string response"}