What's new and planned for Data Warehouse in Microsoft Fabric
Important
The release plans describe functionality that may or may not have been released yet. The delivery timelines and projected functionality may change or may not ship. Refer to Microsoft policy for more information.
Data Warehouse in Microsoft Fabric is the first data warehouse that supports multi-table transactions and natively embraces an open data format. The warehouse is built on the robust SQL Server query optimizer and an enterprise grade distributed query processing engine that eliminates the need for configuration and management. Data Warehouse in Microsoft Fabric seamlessly integrates with Data Factory for data ingestion, Power BI for analysis and reporting, and Synapse Spark for data science and machine learning. It streamlines an organization's analytics investments by converging data lakes and warehouses.
Data warehousing workloads benefit from the rich capabilities of the SQL engine over an open data format, enabling customers to focus on analysis and reporting. They also benefit from accessing data from OneLake, a data lake storage virtualization service.
To learn more, see the documentation.
Investment areas
Query Insights Updates
Estimated release timeline: Q4 2024
Release Type: General availability
Query Insights will now present the amount of data scanned during query execution, spilt up by memory, disk and remote.
Copilot for Data Warehouse: Sidecar Chat
Estimated release timeline: Q4 2024
Release Type: Public preview
Copilot for Data Warehouse is the ultimate paired programmer, and productivity booster for any skill level developer – from data professionals to analysts. It is designed to accelerate warehouse development by leveraging generative AI to simplify and automate Data Warehouse creation, analysis, and management.
This feature introduces the Copilot sidecar chat pane. Use the chat pane to ask questions about your warehouse to Copilot through natural language. Use cases include:
- Natural Language to SQL: Ask Copilot a question about your warehouse data and receive a generated SQL query catered to your warehouse schema.
- Q&A: Ask Copilot a question about your warehouse and receive a docs-based answer and pointers to official documentation.
- Brainstorm: Use Copilot to help you brainstorm relevant trends or patterns to analyze within your data
BULK INSERT
Estimated release timeline: Q1 2025
Release Type: Public preview
The BULK INSERT statement is used to ingest files in Fabric DW (similar to COPY INTO). with the BULK INERT function we are enabling migration from SQL Server to Fabric DW with minimal code changes. The customers would need to rewrite their code and external tools that use BULK INSERT to migrate to COPY INTO as a prerequisite for migrating to Fabric DW. In addition, BULK INSERT support some traditional on-prem options that are not available in COPY INTO.
OPENROWSET
Estimated release timeline: Q1 2025
Release Type: Public preview
Fabric DW enables the users to use the OPENROWSET function to read data from the files in the lake. A simple example of OPENROWSET function is:
SELECT *
FROM OPENROWSET ( BULK ‘<file path>’ )
WITH ( <column definition> )
The OPENROWSET function will read the content of the file(s) at the given <file path>
and return the content of the files.
Thi function enables easy browsing and previewing the files before ingestion.
Result set caching
Estimated release timeline: Q1 2025
Release Type: Public preview
Result set caching saves the results of applicable queries and returns them immediately on subsequent runs, which drastically reduce execution time by bypassing recompilation and recomputation. Cache is automatically managed and requires no manual intervention.
Automatic statistics enhancements
Estimated release timeline: Q1 2025
Release Type: General availability
Various enhancements are planned – shorter execution time of automatic statistic updates, opportunistic support for VARCHAR(MAX) column types, improved storage of intermediate statistic steps, automatic statistics maintenance outside of user query.
SHOWPLAN_XML
Estimated release timeline: Q1 2025
Release Type: General availability
SHOWPLAN_XML returns detailed information about how the statements are going to be executed in the form of a well-defined XML document without executing the T-SQL statement.
Copilot for SQL analytics endpoint
Estimated release timeline: Q1 2025
Release Type: Public preview
Copilot for Data Warehouse is the ultimate paired programmer, and productivity booster for any skill level developer – from data professionals to analysts. It is designed to accelerate warehouse development by leveraging generative AI to simplify and automate Data Warehouse creation, analysis, and management. This feature enables Copilot usage in the SQL analytics endpoint.
Refresh SQL Analytics Endpoint REST API
Estimated release timeline: Q1 2025
Release Type: General availability
Programmatically trigger a refresh of your SQL analytics endpoint to keep tables in sync with any changes made in the parent item.
Code Migration Assistant
Estimated release timeline: Q1 2025
Release Type: Public preview
Designed to accelerate the migration of SQL Server, Synapse dedicated SQL pools, and other warehouses to the Fabric Data Warehouse, users will be able to migrate the code and data from the source database, automatically converting the source schema and code to Fabric Data Warehouse, helping with data migration, and providing AI powered assistance.
Data Clustering
Estimated release timeline: Q1 2025
Release Type: Public preview
Data Clustering enables faster read performance by allowing customers to specify columns for co-locating data on ingestion to enable file skipping on read.
Alter Table - Drop/Rename column
Estimated release timeline: Q1 2025
Release Type: General availability
Alter Table - Drop/Rename will enable customers to alter their table definition by dropping and renaming columns. This is in addition to existing Alter Table functionality which supports adding new column and is in production today.
Temporary tables (session scoped)
Estimated release timeline: Q1 2025
Release Type: General availability
Session scoped, parquet backed temporary tables enable customers to store intermediate result sets for repeated access but are not required to persist permanently. They work as a regular, permament table but disappear when the connection is closed.
MERGE (T-SQL)
Estimated release timeline: Q1 2025
Release Type: General availability
The MERGE T-SQL command for Fabric Data Warehouse brings the power of selection-based DML into a single statement, for all your transformation logic needs.
SQL audit logs
Estimated release timeline: Q1 2025
Release Type: General availability
This feature tracks database events and writes them to an audit log, thereby allowing customers to query the audit file using sys.fn_get_audit_file_v2 for auditing and compliance.
EXECUTE AS
Estimated release timeline: Q1 2025
Release Type: General availability
EXECUTE AS sets the execution context for a session and thereby allows the user to impersonate as another user to validate the necessary permissions that were provided.
BCP
Estimated release timeline: Q1 2025
Release Type: Public preview
Fabric DW is supporting bcp utility and TDS Bulk Load API. Bulk Lod API enables a variety of client tools like bcp, SSIS, ADF, to load data into Fabric DW. And example of bcp command that is loading a content of a file into the DW table is:
bcp gold.artists in "C:\temp\gold_artist.txt" -d TextDW -c -S "<server name>.msit-datawarehouse.fabric.microsoft.com" -G -U theusert@microsoft.com
Shipped feature(s)
VARCHAR(MAX)/VARBINARY(MAX) types
Shipped (Q4 2024)
Release Type: Public preview
Users can define columns with VARCHAR(MAX)/VARBINARY(MAX) types in Data warehouse to store string or binary data up to 1 MB. In SQL endpoint for the Lakehouse, the string types in Delta tables are represented as VARCHAR(MAX) without truncation to 8 KB. The performance differences between the queries that are working with VARCHAR(MAX) and VARCHAR(8000) types are minimized, which enables users to use large types without significant performance penalty.
SQL Analytics Endpoint Improvements
Shipped (Q4 2024)
Release Type: General availability
This feature encompasses updates made to the SQL analytics endpoint experience, including:
- Automatic refresh triggered upon opening the item or connecting to the endpoint
- Improved UI to refresh the SQL analytics endpoint directly in the ribbon
- New property for Last Successful Update in OneLake via the table properties dialog flyout
- Improved error messages
JSON support
Shipped (Q4 2024)
Release Type: General availability
JSON support in Fabric Datawarehouse enables processing of textual data formatted as JSON text. The new JSON features in Fabric DW are:
- FOR JSON query option that formats the query results as JSON text. This was one of the features that is requested on Microsoft Fabric Idea site.
- JSON scalar functions that can also be used in Azure SQL database. In addition to the existing JSON scalar functions (ISJSON, JSON_VALUE, JSON_QUERY, and JSON_MODIFY), Fabric DW is suporting the latest JSON functions that exist in Azure SQL Database – JSON_PATH_EXISTS, JSON_OBJECT, and JSON_ARRAY.
String performance improvements
Shipped (Q4 2024)
Release Type: General availability
Operations on strings (VARCHAR(N)) are common in T-SQL queries. Performance improvements on string functions and operators that are working with strings boosts the performance of the queries that use LIKE predicates, string functions and comparison operators in WHERE predicates, and operators like GROUP BY, ORDER BY, JOIN that are working with string types.
Case insensitive collation support (Warehouse only)
Shipped (Q4 2024)
Release Type: General availability
Using the public REST APIs to create a Data Warehouse includes a new option to set the default collation. This can be used to set a new Case Insensitive Collation default. The two supported collations are Latin1_General_100_CI_AS_KS_WS_SC_UTF8 (which is Case Insensitive) and Latin1_General_100_BIN2_UTF8 (which is Case Sensitive) and continues to be our default.
COLLATE T-SQL clause support is coming soon. This will enable you to utilize the COLLATE command with CREATE or ALTER TABLE to directly specify the collation for your VARCHAR fields.
Nested CTE
Shipped (Q4 2024)
Common Table Expressions (CTE) increases the readability and simplification for complex queries by deconstructing ordinarily complex queries into simple blocks to be used and reused if necessary, instead of rewriting the query. A nested CTE is defined with the definition of another CTE.
T-SQL Notebook integration
Shipped (Q3 2024)
Release Type: Public preview
You can start using T-SQL language support within Notebooks which combines the power of Notebooks and SQL within the same experience - enabling intellisense, autocomplete, cross database queries, richer visualizations and the ability to easily collaborate and share using Notebooks.
TRUNCATE
Shipped (Q3 2024)
The TRUNCATE command quickly removes all rows of data from a table.
ALTER TABLE - Add nullable column
Shipped (Q3 2024)
Support for ALTER TABLE ADD COLUMN to be able to extend already existing tables with new columns that allow NULL values.
Query insights updates
Shipped (Q3 2024)
A historic view of your closed sessions will be made available via Query Insights. This addition it helps you analyze traffic, load, and usage of your DW.
In-place restore within warehouse editor
Shipped (Q2 2024)
You can now easily create restore points and restore the warehouse to a known good state in the event of accidental corruption, using the Warehouse editor experience.
COPY INTO support for secure storage
Shipped (Q2 2024)
Release Type: Public preview
You can now ingest data into your Warehouse using COPY INTO from an external Azure storage account that is protected behind a Firewall.
Copilot
Shipped (Q2 2024)
Release Type: Public preview
Copilot enables developers of any skill level to quickly build and query a warehouse in Fabric. Copilot offers advice and best practices, autocomplete code, help fix and document code, and offer assistance with data prep, modeling, and analysis.
Time travel
Shipped (Q2 2024)
The ability to time travel at the T-SQL statement level empowers users to query historical data from various past timeframes by specifying the timestamp only once for the entire query. Time travel helps save significantly on storage costs by using single copy of data present in One Lake for conducting historical trend analysis, troubleshooting, and data reconciliation. Additionally, it also facilitates achieving stable reporting by upholding the data integrity across various tables within the data warehouse.
Warehouse monitoring experience
Shipped (Q2 2024)
Using the built-in warehouse monitoring experience, you can view both live queries and historical queries, monitor, and troubleshoot performance of their end-to-end solution.