Deploy models for batch inference and prediction
This article describes what Databricks recommends for batch inference.
For real-time model serving on Azure Databricks, see Deploy models using Mosaic AI Model Serving.
AI Functions for batch inference
Important
This feature is in Public Preview.
AI Functions are built-in functions that you can use to apply AI on your data that is stored on Databricks. You can run batch inference using task-specific AI functions or the general-purpose function, ai_query. For flexibility, Databricks recommends using ai_query for batch inference.
There are two main ways to use ai_query for batch inference:
- Batch inference using ai_query and Databricks-hosted foundation models. When you use this method, Databricks configures a model serving endpoint that scales automatically based on the workload. See which pre-provisioned LLMs are supported.
- Batch inference using ai_query and a model serving endpoint you configure yourself. This method is required for batch inference workflows that use foundation models hosted outside of Databricks, fine-tuned foundation models, or traditional ML models. After deployment, the endpoint can be used directly with ai_query.
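As a rough illustration of either approach, the sketch below builds a SQL statement that calls ai_query(endpoint, request) once per row of a table. The helper name build_ai_query_sql, the table main.demo.reviews, the column review_text, and the endpoint name my-endpoint are all hypothetical placeholders, not names from this article. The helper only assembles the SQL text, so it runs without a Databricks workspace.

```python
import re

# Conservative patterns for names that get interpolated into SQL text.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}")
_ENDPOINT = re.compile(r"[A-Za-z0-9_-]+")


def build_ai_query_sql(table, text_col, endpoint, prompt):
    """Return a SELECT statement that applies ai_query to every row of `table`.

    All argument values used in the example below are hypothetical.
    """
    if not _IDENT.fullmatch(table) or not _IDENT.fullmatch(text_col):
        raise ValueError("table and column must be plain SQL identifiers")
    if not _ENDPOINT.fullmatch(endpoint):
        raise ValueError("unexpected characters in endpoint name")
    if "'" in prompt or "\\" in prompt:
        # Keep the sketch simple: refuse input that would need SQL escaping.
        raise ValueError("prompt must not contain quotes or backslashes")
    return (
        f"SELECT {text_col}, "
        f"ai_query('{endpoint}', CONCAT('{prompt}', {text_col})) AS response "
        f"FROM {table}"
    )


sql = build_ai_query_sql(
    table="main.demo.reviews",        # hypothetical table
    text_col="review_text",           # hypothetical column
    endpoint="my-endpoint",           # hypothetical serving endpoint name
    prompt="Summarize this review: ",
)
print(sql)
```

In a Databricks notebook, where a spark session is already defined, passing the resulting string to spark.sql(sql) would be expected to return a DataFrame with one response per input row. The same statement can also be run directly in a SQL editor.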
Batch inference using a Spark DataFrame
See Perform batch inference using a Spark DataFrame for a step-by-step guide through the model inference workflow using Spark.
For deep learning model inference examples see the following articles:
Structured data extraction and batch inference using Spark UDF
The following example notebook demonstrates how to develop, log, and evaluate a simple agent for structured data extraction, which transforms raw, unstructured data into organized, usable information. It shows how to implement custom agents for batch inference using MLflow's PythonModel class and how to use the logged agent model as a Spark user-defined function (UDF). The notebook also shows how to use Mosaic AI Agent Evaluation to evaluate accuracy against ground truth data.
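The general shape of such a custom model can be sketched as follows. This is not the notebook's code: the class name ExtractionAgent, the helper extract_fields, and the invoice fields are hypothetical, and a pair of regular expressions stands in for the LLM-backed agent so that the example runs anywhere. If MLflow is not installed, the sketch falls back to a plain base class.

```python
import re

try:
    from mlflow.pyfunc import PythonModel
except ImportError:
    # Assumption for this sketch only: fall back to a plain base class so the
    # example still runs where MLflow is not installed.
    PythonModel = object

# Hypothetical extraction rules; a real agent would call an LLM instead.
INVOICE_RE = re.compile(r"invoice\s*#?\s*(\d+)", re.IGNORECASE)
TOTAL_RE = re.compile(r"\$\s*(\d+(?:\.\d{2})?)")


def extract_fields(text):
    """Turn one unstructured string into a small structured record."""
    invoice = INVOICE_RE.search(text)
    total = TOTAL_RE.search(text)
    return {
        "invoice_id": invoice.group(1) if invoice else None,
        "total": float(total.group(1)) if total else None,
    }


class ExtractionAgent(PythonModel):
    """Hypothetical agent exposing the predict(context, model_input) method."""

    def predict(self, context, model_input, params=None):
        # model_input is assumed here to be an iterable of strings.
        return [extract_fields(text) for text in model_input]


agent = ExtractionAgent()
records = agent.predict(None, ["Invoice #1042 is due. Total: $45.60", "no data here"])
print(records)
```

Once a model like this is logged with mlflow.pyfunc.log_model, it can typically be loaded with mlflow.pyfunc.spark_udf and applied to a DataFrame column, which is the batch inference step the notebook covers. The result type to declare for the UDF depends on the fields the agent returns.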