ai_query function
Applies to: Databricks SQL and Databricks Runtime
Important
This feature is in Public Preview.
Invokes an existing Azure Databricks Model Serving endpoint and parses and returns its response.
To use ai_query in production workflows, see Perform batch LLM inference using AI Functions.
Requirements
Note
- In Databricks Runtime 14.2 and above, this function is supported in Databricks notebooks, including notebooks that are run as a task in a Databricks workflow.
- In Databricks Runtime 14.1 and below, this function is not supported in Databricks notebooks.
- This function is not available on Azure Databricks SQL Classic.
- You must enable Azure Private Link to use this feature on pro SQL warehouses.
- Databricks Runtime 15.3 or above is recommended. Using Databricks Runtime 15.2 or below might result in slower performance.
- Your workspace must be in a supported Model Serving region.
- An existing model serving endpoint with your model loaded is required. If you are using a Databricks-hosted foundation model, an endpoint is created for you. Otherwise, see Create custom model serving endpoints or Create foundation model serving endpoints.
- Querying Foundation Model APIs is enabled by default. To query endpoints that serve custom models or external models, enable AI_Query for Custom Models and External Models in the Databricks Previews UI.
The current DLT warehouse channel does not use the latest Databricks Runtime version that supports ai_query(). Set the pipelines.channel table property to 'preview' to use ai_query(). For example:

> CREATE OR REPLACE MATERIALIZED VIEW ai_query_mv
  TBLPROPERTIES ('pipelines.channel' = 'PREVIEW') AS
  SELECT
    ai_query("databricks-dbrx-instruct", text) AS response
  FROM messages
  LIMIT 10;
Syntax
To query an endpoint that serves a foundation model, an external model, or a custom foundation model:
ai_query(endpoint, request)
To query a custom model serving endpoint with a model schema:
ai_query(endpoint, request)
To query a custom model serving endpoint without a model schema:
ai_query(endpoint, request, returnType, failOnError)
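For example, the third form can also be invoked with named arguments. A minimal sketch (the endpoint name, column, and table here are hypothetical):

> SELECT ai_query(
    endpoint => 'my-custom-model-endpoint',
    request => named_struct('features', features),
    returnType => 'FLOAT',
    failOnError => false
  ) AS prediction
  FROM my_feature_table;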
Arguments and returns
Argument | Description
---|---
endpoint | A STRING literal, the name of a Databricks Foundation Model serving endpoint, an external model serving endpoint, or a custom model serving endpoint in the same workspace. The definer must have CAN QUERY permission on the endpoint.
request | An expression, the request used to invoke the endpoint.
returnType | An expression, the expected return type from the endpoint. This is similar to the schema parameter in the from_json function, which accepts both a STRING expression and an invocation of the schema_of_json function.
failOnError | (Optional) A boolean literal defaulting to true. This flag indicates whether to include error status in the ai_query response. See Handle errors using failOnError for an example.
modelParameters | (Optional) A struct field that contains chat, completion, and embedding model parameters for serving foundation models or external models. These model parameters must be constant and not data dependent. When these model parameters are not specified or set to null, the default value is used. With the exception of temperature, which has a default value of 0.0, the default values are the same as those listed in the Foundation model REST API reference. See Configure a model by passing model parameters for an example.
responseFormat | (Optional) A JSON string field that specifies the response format you want the model to follow. Three string types of response format are supported: text, json_object, and json_schema. See Enforce output schema with structured output for an example.
Concatenate the prompt and inference column
There are multiple ways to concatenate the prompt and the inference column, such as using ||, CONCAT(), or format_string():
SELECT
CONCAT('${prompt}', ${input_column_name}) AS concatenated_prompt
FROM ${input_table_name};
Alternatively:
SELECT
'${prompt}' || ${input_column_name} AS concatenated_prompt
FROM ${input_table_name};
Or using format_string():
SELECT
format_string('%s%s', '${prompt}', ${input_column_name}) AS concatenated_prompt
FROM ${input_table_name};
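The concatenated prompt is then passed as the request argument of ai_query. A minimal sketch using the same template variables, assuming ${endpoint} resolves to a foundation model serving endpoint:

SELECT
  ai_query(
    '${endpoint}',
    CONCAT('${prompt}', ${input_column_name})
  ) AS response
FROM ${input_table_name};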
Configure a model by passing model parameters
Customize model behavior by passing specific parameters such as maximum tokens and temperature. For example:
SELECT text, ai_query(
"databricks-meta-llama-3-3-70b-instruct",
"Please summarize the following article: " || text,
modelParameters => named_struct('max_tokens', 100, 'temperature', 0.7)
) AS summary
FROM uc_catalog.schema.table;
Handle errors using failOnError
Use the failOnError argument for ai_query to handle errors. The following example shows how to ensure that an error in one row does not stop the whole query from running. See Arguments and returns for expected behaviors based on how this argument is set.
SELECT text, ai_query(
"databricks-meta-llama-3-3-70b-instruct",
"Summarize the given text comprehensively, covering key points and main ideas concisely while retaining relevant details and examples. Ensure clarity and accuracy without unnecessary repetition or omissions: " || text,
failOnError => false
) AS summary
FROM uc_catalog.schema.table;
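When failOnError is set to false, failed rows return error details instead of failing the whole query, so you can filter them out downstream. As a sketch, assuming the returned struct exposes result and errorMessage fields (an assumption; see Arguments and returns for the exact response schema):

SELECT
  text,
  summary.result AS summary_text -- model output for rows that succeeded (assumed field name)
FROM (
  SELECT text, ai_query(
    "databricks-meta-llama-3-3-70b-instruct",
    "Summarize the given text: " || text,
    failOnError => false
  ) AS summary
  FROM uc_catalog.schema.table
)
WHERE summary.errorMessage IS NULL; -- keep only rows without an error (assumed field name)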
Enforce output schema with structured output
Ensure that the output conforms to a specific schema for easier downstream processing. For example, you can enforce the schema with a DDL-formatted struct string:
SELECT ai_query(
"databricks-meta-llama-3-3-70b-instruct",
"Extract research paper details from the following abstract: " || abstract,
responseFormat => 'STRUCT<research_paper_extraction:STRUCT<title:STRING, authors:ARRAY<STRING>, abstract:STRING, keywords:ARRAY<STRING>>>'
)
FROM research_papers;
Alternatively, using a JSON schema:
SELECT ai_query(
"databricks-meta-llama-3-3-70b-instruct",
"Extract research paper details from the following abstract: " || abstract,
responseFormat => '{
"type": "json_schema",
"json_schema": {
"name": "research_paper_extraction",
"schema": {
"type": "object",
"properties": {
"title": {"type": "string"},
"authors": {"type": "array", "items": {"type": "string"}},
"abstract": {"type": "string"},
"keywords": {"type": "array", "items": {"type": "string"}}
}
}
},
"strict": true
}'
)
FROM research_papers;
An expected output might look like:
{ "title": "Understanding AI Functions in Databricks", "authors": ["Alice Smith", "Bob Jones"], "abstract": "This paper explains how AI functions can be integrated into data workflows.", "keywords": ["Databricks", "AI", "LLM"] }
Use ai_query in user-defined functions
You can wrap a call to ai_query in a UDF, making it easy to use functions across different workflows and share them.
CREATE FUNCTION correct_grammar(text STRING)
RETURNS STRING
RETURN ai_query(
'databricks-meta-llama-3-3-70b-instruct',
CONCAT('Correct this to standard English:\n', text));
GRANT EXECUTE ON FUNCTION correct_grammar TO ds;
SELECT
* EXCEPT (text),
correct_grammar(text) AS text
FROM articles;
Example: Query a foundation model
To query an external model serving endpoint:
> SELECT ai_query(
'my-external-model-openai-chat',
'Describe Databricks SQL in 30 words.'
) AS summary
"Databricks SQL is a cloud-based platform for data analytics and machine learning, providing a unified workspace for collaborative data exploration, analysis, and visualization using SQL queries."
To query a foundation model supported by Databricks Foundation Model APIs:
> SELECT *,
ai_query(
'databricks-meta-llama-3-3-70b-instruct',
"Can you tell me the name of the US state that serves the provided ZIP code? zip code: " || pickup_zip
)
FROM samples.nyctaxi.trips
LIMIT 10
Optionally, you can also wrap a call to ai_query() in a UDF, as shown in Use ai_query in user-defined functions.
Example: Batch inference use case
The following is a batch inference example using failOnError and modelParameters with max_tokens and temperature. This example also shows how to concatenate the prompt for your model and the inference column using CONCAT(). There are multiple ways to perform concatenation, such as using ||, concat(), or format_string().
CREATE OR REPLACE TABLE ${output_table_name} AS (
SELECT
${input_column_name},
AI_QUERY(
"${endpoint}",
CONCAT("${prompt}", ${input_column_name}),
failOnError => false,
modelParameters => named_struct('max_tokens', ${num_output_tokens},'temperature', ${temperature})
) as response
FROM ${input_table_name}
LIMIT ${input_num_rows}
)
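As an illustration, the same query with the template variables filled in (all table, column, and parameter values here are hypothetical):

CREATE OR REPLACE TABLE main.default.article_summaries AS (
  SELECT
    article_text,
    AI_QUERY(
      "databricks-meta-llama-3-3-70b-instruct",
      CONCAT("Summarize the following article: ", article_text),
      failOnError => false,
      modelParameters => named_struct('max_tokens', 100, 'temperature', 0.0)
    ) AS response
  FROM main.default.articles
  LIMIT 1000
)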
Example: Query a traditional ML model
To query a custom model or a traditional ML model serving endpoint:
> SELECT text, ai_query(
endpoint => 'spam-classification-endpoint',
request => named_struct(
'timestamp', timestamp,
'sender', from_number,
'text', text),
returnType => 'BOOLEAN') AS is_spam
FROM messages
LIMIT 10
> SELECT ai_query(
'weekly-forecast',
request => struct(*),
returnType => 'FLOAT') AS predicted_revenue
FROM retail_revenue
> SELECT ai_query(
'custom-llama-chat',
request => named_struct("messages",
ARRAY(named_struct("role", "user", "content", "What is ML?"))),
returnType => 'STRUCT<candidates:ARRAY<STRING>>')
{"candidates":["ML stands for Machine Learning. It's a subfield of Artificial Intelligence that involves the use of algorithms and statistical models to enable machines to learn from data, make decisions, and improve their performance on a specific task over time."]}