ai_query
function
Applies to: Databricks SQL Databricks Runtime
Important
This feature is in Public Preview.
Invokes an existing Azure Databricks Model Serving endpoint and parses and returns its response.
See the following guides for how to use ai_query
for different use cases:
- Perform batch LLM inference using ai_query
- Query an external model with ai_query
- Query a served model with ai_query
Requirements
- This function is not available on Azure Databricks SQL Classic.
- You must enable Azure Private Link to use this feature on pro SQL warehouses.
- Databricks Runtime 15.3 or above is recommended. Using Databricks Runtime 15.2 or below might result in slower performance speeds.
- The current DLT warehouse channel does not use the latest Databricks Runtime version that supports
ai_query()
. Set thepipelines.channel
in the table properties as'preview'
to useai_query()
. See Examples for a sample query. - An existing model serving endpoint with your model loaded. See Create custom model serving endpoints.
- Your workspace must be in a supported Model Serving region.
- Querying Foundation Model APIs is enabled by default. To query endpoints that serve custom models or external models:
- Enable AI_Query for Custom Models and External Models in the Databricks Previews UI.
- See Query an external model with ai_query.
- See Perform batch LLM inference using ai_query
Note
- In Databricks Runtime 14.2 and above, this function is supported in Databricks notebooks, including notebooks that are run as a task in a Databricks workflow.
- In Databricks Runtime 14.1 and below, this function is not supported in Databricks notebooks.
Syntax
To query an endpoint that serves an external model or a foundation model:
ai_query(endpoint, request)
To query a custom model serving endpoint with a model schema:
ai_query(endpoint, request)
To query a custom model serving endpoint without a model schema:
ai_query(endpoint, request, returnType, failOnError)
Arguments
endpoint
: ASTRING
literal, the name of a Databricks Foundation Model serving endpoint, an external model serving endpoint or a custom model endpoint in the same workspace for invocations. The definer must haveCAN QUERY
permission on the endpoint.request
: An expression, the request used to invoke the endpoint.- If the endpoint is an external model serving endpoint or Databricks Foundation Model APIs endpoint, the request must be a
STRING
. - If the endpoint is a custom model serving endpoint, the request can be a single column or a struct expression. The struct field names should match the input feature names expected by the endpoint.
- If the endpoint is an external model serving endpoint or Databricks Foundation Model APIs endpoint, the request must be a
returnType
: An expression, the expectedreturnType
from the endpoint. This is similar to the schema parameter in from_json function, which accepts both aSTRING
expression or invocation of schema_of_json function.- In Databricks Runtime 14.2 and above, if this expression is not provided,
ai_query()
automatically infers the return type from the model schema of the custom model serving endpoint. - In Databricks Runtime 14.1 and below, this expression is required for querying a custom model serving endpoint.
- In Databricks Runtime 14.2 and above, if this expression is not provided,
failOnError
: (optional) A boolean literal defaulting to true. This flag indicates whether to include error status in theai_query
response.modelParameters
(optional): A struct field that contains chat, completion and embedding model parameters for serving foundation models or external models. These model parameters must be constant parameters and not data dependent. When these model parameters are not specified or set tonull
the default value is used. With the exception oftemperature
which has a default value of0.0
, the default values for these model parameters are the same as those listed in Foundation model REST API reference.responseFormat
(optional): A JSON string field that specifies the response format you want the model to follow. Three string types of response format are supported:text
json_object
json_schema
Returns
The parsed response from the endpoint.
- If
failOnError => true
, the function returns the same result as the existing behavior, which is the parsed response from the endpoint. The data type of the parsed response is inferred from the model type, the model schema endpoint, or thereturnType
parameter in theai_query
function. - If
failOnError => false
, the function returns the parsed response and the error status string as aSTRUCT
object.- If the inference of the row succeeded, the
errorStatus
field isnull
. - If the inference of the row failed due to model endpoint errors, the
response
field isnull
. - If the inference of the row failed due to other errors, the whole query fails.
- If the inference of the row succeeded, the
- Depending on the JSON string type specified in
responseFormat
, the following response is returned:- For
responseFormat => '{“type”, “text”}'
, the response is a string such as,“Here is the response”
. - For
responseFormat => '{“type”, “json_object”}'
, the response is a key-value pair JSON string, such as{“key”: “value”}
. - For
responseFormat => '{“type”, “json_schema”...}'
, the response is a JSON string. See the Examples.
- For
- If
failOnError => false
and you have specifiedresponseFormat
, the function returns the parsed response and the error status string as aSTRUCT
object.
Examples
The following example specifies a json_schema
response format:
SELECT
ai_query(
"llama-3-1-70b",
<request>,
responseFormat =>'{
"type": "json_schema",
"json_schema":
{
"name": "research_paper_extraction",
"schema":
{
"type": "object",
"properties":
{
"title": { "type": "string" }
}
},
"strict": true
}
}'
)
The following is an example output for the specified json_schema
response format:
{"title": "<the title of the paper>"}
The following is a batch inference example using the failOnError
and modelParameters
with max_tokens
and temperature
.
This example also shows how to concatenate the prompt for your model and the inference column using concat()
.
There are multiple ways to perform concatenation, such as using ||
, concat()
, or format_string()
.
CREATE OR REPLACE TABLE ${output_table_name} AS (
SELECT
${input_column_name},
AI_QUERY(
"${endpoint}",
CONCAT("${prompt}", ${input_column_name}),
failOnError => false,
modelParameters => named_struct('max_tokens', ${num_output_tokens},'temperature', ${temperature})
) as response
FROM ${input_table_name}
LIMIT ${input_num_rows}
)
To query an external model serving endpoint:
> SELECT ai_query(
'my-external-model-openai-chat',
'Describe Databricks SQL in 30 words.'
) AS summary
"Databricks SQL is a cloud-based platform for data analytics and machine learning, providing a unified workspace for collaborative data exploration, analysis, and visualization using SQL queries."
To query a foundation model supported by Databricks Foundation Model APIs:
> SELECT *,
ai_query(
'databricks-meta-llama-3-1-70b-instruct',
"Can you tell me the name of the US state that serves the provided ZIP code? zip code: " || pickup_zip
)
FROM samples.nyctaxi.trips
LIMIT 10
Optionally, you can also wrap a call to ai_query()
in a UDF for function calling as follows:
> CREATE FUNCTION correct_grammar(text STRING)
RETURNS STRING
RETURN ai_query(
'databricks-meta-llama-3-1-70b-instruct',
CONCAT('Correct this to standard English:\n', text));
> GRANT EXECUTE ON correct_grammar TO ds;
- DS fixes grammar issues in a batch.
> SELECT
* EXCEPT text,
correct_grammar(text) AS text
FROM articles;
To query a custom model serving endpoint:
> SELECT text, ai_query(
endpoint => 'spam-classification-endpoint',
request => named_struct(
'timestamp', timestamp,
'sender', from_number,
'text', text),
returnType => 'BOOLEAN') AS is_spam
FROM messages
LIMIT 10
> SELECT ai_query(
'weekly-forecast',
request => struct(*),
returnType => 'FLOAT') AS predicted_revenue
FROM retail_revenue
> SELECT ai_query(
'custom-llama-2-7b-chat',
request => named_struct("messages",
ARRAY(named_struct("role", "user", "content", "What is ML?"))),
returnType => 'STRUCT<candidates:ARRAY<STRING>>')
{"candidates":["ML stands for Machine Learning. It's a subfield of Artificial Intelligence that involves the use of algorithms and statistical models to enable machines to learn from data, make decisions, and improve their performance on a specific task over time."]}
Example query for setting the DLT channel to preview:
> create or replace materialized view
ai_query_mv
TBLPROPERTIES('pipelines.channel' = 'PREVIEW') AS
SELECT
ai_query("databricks-dbrx-instruct", text) as response
FROM
messages
LIMIT 10;