Try to add a condition for WHEN MATCHED: WHEN MATCHED and T.custname <> S.custname and T.custcity <> S.custcity THEN ...
.
Also consider the case of null values.
However, the additional check probably needs some processing time too.
This browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
Hi there,
You all might have used UPSERT operation using MERGE. Currently the update section updates all rows once the key is matched. Is there a way to update only rows which have column value changed and not update all rows to avoid unnecessary updates on rows who rows may have same value and have not changed?
Does it improve peformance?
Below is code to reproduce.
When run first time - new rows got added.
When run 2nd time - Note Run ONLY MERGE on 2nd time run.
same rows got updated even though there is no change. is there a way to detect only rows which have got changed and update it in UPSERT section? so not all rows get updated unnecessarily in the below case?
--CREATE STAGING TABLE
CREATE TABLE #STAGING (custno INT,custname VARCHAR(40) , custcity VARCHAR(40)) -- 80columns
INSERT INTO #STAGING
VALUES
(1,'John','Atlanta') ,
(2,'Joe','Chicago') ,
(3,'Sam','Cleveland')
--CREATE FACT TABLE
CREATE TABLE #fact (custno INT,custname VARCHAR(40) , custcity VARCHAR(40)) -- 80columns
--MERGE 2ND TIME RUN ONLY BELOW THIS
--DELCARATIONS
DECLARE @RowsWritten INT , @RowsUpdated INT ;
DECLARE @MergeOutputTable TABLE
([Action] varchar(20),
[SourceSystemUniqueSalesInvoiceId] bigint
)
MERGE INTO #fact AS T
USING #STAGING AS S
ON T.custno = S.custno
WHEN MATCHED THEN
UPDATE SET
T.custname = S.custname,
T.custcity = S.custcity
WHEN NOT MATCHED BY TARGET THEN
INSERT (
custno
,custname
,custcity)
VALUES (
S.custno
,S.custname
,S.custcity)
OUTPUT $action, INSERTED.[custno]
INTO @MergeOutputTable;
SELECT
@RowsWritten = SUM(CASE WHEN [action] = 'INSERT' THEN 1 ELSE 0 END),
@RowsUpdated = SUM(CASE WHEN [action] = 'UPDATE' THEN 1 ELSE 0 END)
FROM @MergeOutputTable;
--PRINT OPERATIONS
print 'insert' +cast(@RowsWritten as varchar(10))
print 'update' +cast(@RowsUpdated as varchar(10))
--SELECT * FROM #staging
--SELECT * FROM #fact
Try to add a condition for WHEN MATCHED: WHEN MATCHED and T.custname <> S.custname and T.custcity <> S.custcity THEN ...
.
Also consider the case of null values.
However, the additional check probably needs some processing time too.
Hi @RJ,
Thanks for the question and using MS Q&A platform.
In addition to Viorel
Yes, when using an UPSERT operation with MERGE, you can avoid unnecessary updates by ensuring that only rows with actual changes in their values are updated. This approach reduces the overhead of updating unchanged rows, thereby improving performance, particularly with large datasets.
To resolve this issue, you can adjust your MERGE statement to update only when the values in the source table (#STAGING) differ from those in the target table (#fact).
This can be achieved by including a condition in the WHEN MATCHED clause that compares the values in both tables for the columns being updated (custname and custcity).
Here is the updated code:
-- MERGE 2ND TIME RUN ONLY BELOW THIS
-- DECLARATIONS
DECLARE @RowsWritten INT, @RowsUpdated INT;
DECLARE @MergeOutputTable TABLE
([Action] varchar(20),
[SourceSystemUniqueSalesInvoiceId] bigint);
-- MERGE Operation
MERGE INTO #fact AS T
USING #STAGING AS S
ON T.custno = S.custno
-- When matched and data has actually changed
WHEN MATCHED AND (T.custname <> S.custname OR T.custcity <> S.custcity) THEN
UPDATE SET
T.custname = S.custname,
T.custcity = S.custcity
-- When not matched by target, insert new rows
WHEN NOT MATCHED BY TARGET THEN
INSERT (custno, custname, custcity)
VALUES (S.custno, S.custname, S.custcity)
OUTPUT $action, INSERTED.[custno]
INTO @MergeOutputTable;
-- Calculate the number of rows written and updated
SELECT
@RowsWritten = SUM(CASE WHEN [action] = 'INSERT' THEN 1 ELSE 0 END),
@RowsUpdated = SUM(CASE WHEN [action] = 'UPDATE' THEN 1 ELSE 0 END)
FROM @MergeOutputTable;
-- Print the results
PRINT 'insert ' + CAST(@RowsWritten AS VARCHAR(10));
PRINT 'update ' + CAST(@RowsUpdated AS VARCHAR(10));
Does it improve peformance?
By avoiding unnecessary updates, you reduce the number of I/O operations required to update rows.
In SQL Server, updating rows is recorded in the transaction log. Reducing the number of updates decreases transaction log usage, which can enhance performance.
Please refer this documentation for more understanding:
Hope this helps. Do let us know if you any further queries. If this answers your query, do click Accept Answer
and Yes
for was this answer helpful. And, if you have any further query do let us know.