Generate unique identifiers in a warehouse table in Microsoft Fabric
Applies to: ✅ Warehouse in Microsoft Fabric
It's a common requirement in data warehouses to assign a unique identifier to each row of a table. In SQL Server-based environments this is typically done by creating an identity column in a table, however currently this feature isn't supported in a warehouse in Microsoft Fabric. Instead, you need to use a workaround technique. We present two alternatives.
This article describes workaround techniques to generate unique identifiers in a warehouse table.
Method 1
This method is most applicable when you need to create identity values, but the order of the values isn't important (nonsequential values are acceptable).
Unique values are generated in the code that inserts data into the table.
To create unique data using this method, create a table that includes a column that stores unique identifier values. The column data type should be set to bigint. You should also define the column as
NOT NULL
to ensure that every row is assigned an identifier.The following T-SQL code sample creates an example table named
Orders_with_Identifier
in thedbo
schema, where theRow_ID
column serves as a unique key.--Drop a table named 'Orders_with_Identifier' in schema 'dbo', if it exists IF OBJECT_ID('[dbo].[Orders_with_Identifier]', 'U') IS NOT NULL DROP TABLE [dbo].[Orders_with_Identifier]; GO CREATE TABLE [dbo].[Orders_with_Identifier] ( [Row_ID] BIGINT NOT NULL, [O_OrderKey] BIGINT NULL, [O_CustomerKey] BIGINT NULL, [O_OrderStatus] VARCHAR(1) NULL, [O_TotalPrice] DECIMAL(15, 2) NULL, [O_OrderDate] DATE NULL, [O_OrderPriority] VARCHAR(15) NULL, [O_Clerk] VARCHAR (15) NULL, [O_ShipPriority] INT NULL, [O_Comment] VARCHAR (79) NULL );
When you insert rows into the table, via T-SQL scripts or application code or otherwise, generate unique data for
Row_ID
with theNEWID()
function. This function generates a unique value of type uniqueidentifier which can then be cast and stored as a bigint.The following code inserts rows into the
dbo.Orders_with_Identifier
table. The values for theRow_ID
column are computed by converting the values returned by thenewid()
function. The function doesn't require anORDER BY
clause and generates a new value for each record.--Insert new rows with unique identifiers INSERT INTO [dbo].[Orders_with_Identifier] SELECT CONVERT(BIGINT, CONVERT(VARBINARY, CONCAT(NEWID(), GETDATE()))) AS [Row_ID], [src].[O_OrderKey], [src].[O_CustomerKey], [src].[O_OrderStatus], [src].[O_TotalPrice], [src].[O_OrderDate], [src].[O_OrderPriority], [src].[O_Clerk], [src].[O_ShipPriority], [src].[O_Comment] FROM [dbo].[Orders] AS [src];
Method 2
This method is most applicable when you need to create sequential identity values but should be used with caution on larger datasets as it can be slower than alternative methods. Considerations should also be made for multiple processes inserting data simultaneously as this could lead to duplicate values.
To create unique data using this method, create a table that includes a column that stores unique identifier values. The column data type should be set to int or bigint, depending on the volume of data you expect to store. You should also define the column as
NOT NULL
to ensure that every row is assigned an identifier.The following T-SQL code sample creates an example table named
Orders_with_Identifier
in thedbo
schema, where theRow_ID
column serves as a unique key.--Drop a table named 'Orders_with_Identifier' in schema 'dbo', if it exists IF OBJECT_ID('[dbo].[Orders_with_Identifier]', 'U') IS NOT NULL DROP TABLE [dbo].[Orders_with_Identifier]; GO CREATE TABLE [dbo].[Orders_with_Identifier] ( [Row_ID] BIGINT NOT NULL, [O_OrderKey] BIGINT NULL, [O_CustomerKey] BIGINT NULL, [O_OrderStatus] VARCHAR(1) NULL, [O_TotalPrice] DECIMAL(15, 2) NULL, [O_OrderDate] DATE NULL, [O_OrderPriority] VARCHAR(15) NULL, [O_Clerk] VARCHAR (15) NULL, [O_ShipPriority] INT NULL, [O_Comment] VARCHAR (79) NULL ); GO
Before you insert rows into the table, you need to determine the last identifier value stored in the table. You can do that by retrieving the maximum identifier value. This value should be assigned to a variable so you can refer to it when you insert table rows (in the next step).
The following code assigns the last identifier value to a variable named
@MaxID
.--Assign the last identifier value to a variable --If the table doesn't contain any rows, assign zero to the variable DECLARE @MaxID AS BIGINT; IF EXISTS(SELECT * FROM [dbo].[Orders_with_Identifier]) SET @MaxID = (SELECT MAX([Row_ID]) FROM [dbo].[Orders_with_Identifier]); ELSE SET @MaxID = 0;
When you insert rows into the table, unique and sequential numbers are computed by adding the value of the
@MaxID
variable to the values returned by the ROW_NUMBER function. This function is a window function that computes a sequential row number starting with1
.The following T-SQL code—which is run in the same batch as the script in step 2—inserts rows into the
Orders_with_Identifier
table. The values for theRow_ID
column are computed by adding the@MaxID
variable to values returned by theROW_NUMBER
function. The function must have anORDER BY
clause, which defines the logical order of the rows within the result set. However when set toSELECT NULL
, no logical order is imposed, meaning identifier values are arbitrarily assigned. ThisORDER BY
clause results in a faster execution time.--Insert new rows with unique identifiers INSERT INTO [dbo].[Orders_with_Identifier] SELECT @MaxID + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS [Row_ID], [src].[O_OrderKey], [src].[O_CustomerKey], [src].[O_OrderStatus], [src].[O_TotalPrice], [src].[O_OrderDate], [src].[O_OrderPriority], [src].[O_Clerk], [src].[O_ShipPriority], [src].[O_Comment] FROM [dbo].[Orders] AS [src];