Avoid unnecessary updates on MERGE UPSERT

RJ 266 Reputation points
2025-01-09T16:08:24.4733333+00:00

Hi there,

You all might have used UPSERT operation using MERGE. Currently the update section updates all rows once the key is matched. Is there a way to update only rows which have column value changed and not update all rows to avoid unnecessary updates on rows who rows may have same value and have not changed?

Does it improve peformance?

Below is code to reproduce.

When run first time - new rows got added.

User's image

When run 2nd time - Note Run ONLY MERGE on 2nd time run.

same rows got updated even though there is no change. is there a way to detect only rows which have got changed and update it in UPSERT section? so not all rows get updated unnecessarily in the below case?

User's image

 --CREATE STAGING TABLE
CREATE TABLE #STAGING (custno INT,custname VARCHAR(40) , custcity VARCHAR(40)) -- 80columns
 
INSERT INTO #STAGING
VALUES 
(1,'John','Atlanta')  ,
(2,'Joe','Chicago') ,
(3,'Sam','Cleveland')

--CREATE FACT TABLE
CREATE TABLE #fact (custno INT,custname VARCHAR(40) , custcity VARCHAR(40)) -- 80columns

--MERGE 2ND TIME RUN ONLY BELOW THIS 

--DELCARATIONS 
DECLARE   @RowsWritten  INT                 , @RowsUpdated  INT     ; 

DECLARE @MergeOutputTable TABLE
([Action] varchar(20), 
 [SourceSystemUniqueSalesInvoiceId] bigint
)

MERGE INTO #fact AS T 

 USING #STAGING AS S 

 ON T.custno = S.custno

 WHEN MATCHED THEN

     UPDATE SET 
	 T.custname			= S.custname,
	 T.custcity			= S.custcity
	
 WHEN NOT MATCHED BY TARGET THEN

     INSERT (
	   custno
      ,custname
	  ,custcity)

	   VALUES (
	   S.custno
      ,S.custname
	  ,S.custcity)

 OUTPUT $action, INSERTED.[custno]

 INTO @MergeOutputTable;

   SELECT 

     @RowsWritten = SUM(CASE WHEN [action] = 'INSERT' THEN 1 ELSE 0 END),

     @RowsUpdated = SUM(CASE WHEN [action] = 'UPDATE' THEN 1 ELSE 0 END)

 FROM @MergeOutputTable;

--PRINT OPERATIONS
 print 'insert' +cast(@RowsWritten as varchar(10))
 print 'update' +cast(@RowsUpdated as varchar(10))   

--SELECT * FROM #staging  
--SELECT * FROM #fact

Azure SQL Database
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,682 questions
0 comments No comments
{count} votes

Accepted answer
  1. Viorel 119.2K Reputation points
    2025-01-09T16:34:54.2633333+00:00

    Try to add a condition for WHEN MATCHED: WHEN MATCHED and T.custname <> S.custname and T.custcity <> S.custcity THEN ....

    Also consider the case of null values.

    However, the additional check probably needs some processing time too.


1 additional answer

Sort by: Most helpful
  1. Sai Raghunadh M 1,915 Reputation points Microsoft Vendor
    2025-01-09T17:02:19.2+00:00

    Hi @RJ,

    Thanks for the question and using MS Q&A platform.

    In addition to Viorel

    Yes, when using an UPSERT operation with MERGE, you can avoid unnecessary updates by ensuring that only rows with actual changes in their values are updated. This approach reduces the overhead of updating unchanged rows, thereby improving performance, particularly with large datasets.

    To resolve this issue, you can adjust your MERGE statement to update only when the values in the source table (#STAGING) differ from those in the target table (#fact).

    This can be achieved by including a condition in the WHEN MATCHED clause that compares the values in both tables for the columns being updated (custname and custcity).

    Here is the updated code:

    -- MERGE 2ND TIME RUN ONLY BELOW THIS 
    -- DECLARATIONS
    DECLARE   @RowsWritten INT, @RowsUpdated INT;
    DECLARE @MergeOutputTable TABLE
    ([Action] varchar(20), 
     [SourceSystemUniqueSalesInvoiceId] bigint);
    -- MERGE Operation
    MERGE INTO #fact AS T 
    USING #STAGING AS S 
    ON T.custno = S.custno
    -- When matched and data has actually changed
    WHEN MATCHED AND (T.custname <> S.custname OR T.custcity <> S.custcity) THEN
        UPDATE SET 
            T.custname = S.custname,
            T.custcity = S.custcity
    -- When not matched by target, insert new rows
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (custno, custname, custcity)
        VALUES (S.custno, S.custname, S.custcity)
    OUTPUT $action, INSERTED.[custno]
    INTO @MergeOutputTable;
    -- Calculate the number of rows written and updated
    SELECT 
       @RowsWritten = SUM(CASE WHEN [action] = 'INSERT' THEN 1 ELSE 0 END),
       @RowsUpdated = SUM(CASE WHEN [action] = 'UPDATE' THEN 1 ELSE 0 END)
    FROM @MergeOutputTable;
    -- Print the results
    PRINT 'insert ' + CAST(@RowsWritten AS VARCHAR(10));
    PRINT 'update ' + CAST(@RowsUpdated AS VARCHAR(10));
    

    Does it improve peformance?

    By avoiding unnecessary updates, you reduce the number of I/O operations required to update rows.

    In SQL Server, updating rows is recorded in the transaction log. Reducing the number of updates decreases transaction log usage, which can enhance performance.

    Please refer this documentation for more understanding:

    https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-ver16#syntax

    Hope this helps. Do let us know if you any further queries. If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.