AI Functions on Azure Databricks

Important

This feature is in Public Preview.

This article describes Azure Databricks AI Functions, built-in SQL functions that allow you to apply AI to your data directly from SQL.

SQL is widely used for data analysis because it is versatile, efficient, and well suited to retrieving, manipulating, and managing large datasets. Incorporating AI functions into SQL lets analysts apply AI models in the same queries they already use for analysis, so insights can be extracted without moving data to a separate tool.

Integrating AI into analysis workflows gives analysts access to information that was previously inaccessible to them, and helps them make more informed decisions and manage risks.

AI functions using Databricks Foundation Model APIs

Note

  • In Databricks Runtime 15.1 and above, these functions are supported in Databricks notebooks, including notebooks that are run as a task in a Databricks workflow.
  • These functions are powered by Meta-Llama-3.1-70B-Instruct for chat tasks and GTE Large (English) for embedding tasks. These models are limited to US and EU regions. See AI and machine learning.

These functions invoke a state-of-the-art generative AI model from Databricks Foundation Model APIs to perform tasks such as sentiment analysis, classification, and translation. See Analyze customer reviews using AI Functions.
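As a rough sketch of what such calls can look like, the query below applies three task-specific functions to a table of reviews. The table `reviews` and its columns `review_id` and `review_text` are hypothetical. The function names `ai_analyze_sentiment`, `ai_classify`, and `ai_translate` come from the Databricks SQL function reference rather than from this article, so confirm that they are available in your workspace and region before relying on them.

```sql
-- Hypothetical table: reviews(review_id, review_text)
SELECT
  review_id,
  -- Sentiment label for each review
  ai_analyze_sentiment(review_text) AS sentiment,
  -- Assign one of the supplied labels to each review
  ai_classify(review_text, ARRAY('billing', 'shipping', 'product quality')) AS topic,
  -- Translate the review text to Spanish
  ai_translate(review_text, 'es') AS review_es
FROM reviews
LIMIT 10;
```

The `LIMIT` keeps the example small; each row triggers a model invocation, so it is worth testing on a few rows first.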

ai_query

Note

  • In Databricks Runtime 14.2 and above, this function is supported in Databricks notebooks, including notebooks that are run as a task in a Databricks workflow.
  • In Databricks Runtime 14.1 and below, this function is not supported in Databricks notebooks.

The ai_query() function allows you to query machine learning models and large language models served using Mosaic AI Model Serving. To do so, this function invokes an existing Mosaic AI Model Serving endpoint and parses and returns its response. You can use ai_query() to query endpoints that serve custom models, foundation models made available using Foundation Model APIs, and external models.
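A minimal sketch of a single call, assuming a chat-style endpoint already exists in the workspace. The endpoint name `my-chat-endpoint` is a placeholder, not a name taken from this article; replace it with an endpoint you have permission to query.

```sql
-- 'my-chat-endpoint' is a hypothetical Model Serving endpoint name
SELECT ai_query(
  'my-chat-endpoint',
  'Summarize the benefits of columnar storage in one sentence.'
) AS response;
```

The first argument names the serving endpoint and the second is the request sent to it. The shape of the request and of the returned value depends on the model the endpoint serves.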

For use cases with over 100 rows of data, Databricks recommends using ai_query and a provisioned throughput endpoint. See Perform batch LLM inference using ai_query.
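The batch pattern described above might look like the following sketch, in which the prompt is built per row from a column value. The endpoint `my-provisioned-endpoint` and the table `support_tickets` with columns `ticket_id` and `ticket_text` are assumptions made for illustration.

```sql
-- Hypothetical provisioned throughput endpoint and source table
SELECT
  ticket_id,
  ai_query(
    'my-provisioned-endpoint',
    -- Build one prompt per row from the ticket text
    CONCAT('Summarize this support ticket in one sentence: ', ticket_text)
  ) AS summary
FROM support_tickets;
```

Writing the result to a table (for example with `CREATE TABLE ... AS SELECT`) avoids re-running inference each time the summaries are read.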

vector_search

The vector_search() function allows you to search and query a Mosaic AI Vector Search index using SQL.

See vector_search function for more information.
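As an illustration, a query against a vector search index can be sketched as follows. The index name `main.docs.product_index` is hypothetical, and the named arguments shown (`index`, `query`, `num_results`) reflect one documented form of the function; argument names can differ between runtime versions, so check the function reference for your environment.

```sql
-- 'main.docs.product_index' is a hypothetical Vector Search index
SELECT *
FROM vector_search(
  index => 'main.docs.product_index',
  query => 'wireless noise-cancelling headphones',
  num_results => 5
);
```

Because `vector_search()` is used in the `FROM` clause, its results can be filtered or joined like any other table.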

ai_forecast

The ai_forecast() function is a table-valued function designed to extrapolate time series data into the future. In its most general form, ai_forecast() accepts grouped, multivariate, or mixed-granularity data, and forecasts that data up to some horizon in the future.

Important

This functionality is in Public Preview. Reach out to your Databricks account team to participate in the preview.

See ai_forecast function for more information.
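To make the description of ai_forecast() more concrete, here is a hedged sketch of a grouped forecast. The table `daily_sales` and its columns `sale_date`, `store_id`, and `revenue` are hypothetical, as is the horizon date. The parameter names (`horizon`, `time_col`, `value_col`, `group_col`) follow the function reference as commonly documented, and should be verified there because the feature is in preview.

```sql
-- Hypothetical table: daily_sales(sale_date, store_id, revenue)
WITH aggregated AS (
  -- One row per store per day
  SELECT sale_date AS ds, store_id, SUM(revenue) AS revenue
  FROM daily_sales
  GROUP BY sale_date, store_id
)
SELECT *
FROM ai_forecast(
  TABLE(aggregated),
  horizon => '2025-03-31',   -- forecast up to this date (illustrative)
  time_col => 'ds',
  value_col => 'revenue',
  group_col => 'store_id'    -- produce a separate forecast per store
);
```

Aggregating to a single row per group and time step before calling the function keeps the input series regular.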