Way to optimize query where no union needed

Jonathan Brotto 420 Reputation points
2024-09-10T19:56:39.0933333+00:00

I had to use union on two very similar queries where one is just a copy of the other but the having clause is equal with an additional condition in the where clause. Is there a way to clean up the code?

DECLARE @PO_DATE_FROM date = '2024-09-08', @PO_DATE_TO date = '2024-09-09';


-- The request
SELECT 
tmp.[EDI Doc. Type],
tmp.Direction,
tmp.[Doc. Reference (PO, Invoice Nb, ...)],
tmp.PartnerId,
tmp.[Cust. GS Id],
convert(date,tmp.[Timestamp]) as [Timestamp],
CASE tmp.NbConfirmations
	WHEN 1 THEN 'B2Bi'
	WHEN 2 THEN 'PyEDI'
	ELSE 'COMPASS'
END as [Last Successful Step]
FROM (
SELECT 
EDITransactionType as [EDI Doc. Type],
EDIDirection as [Direction],
TRIM(REPLACE(EDIReference,'-','')) as [Doc. Reference (PO, Invoice Nb, ...)],
PartnerID,
PartnerISAId as [Cust. GS Id],
COUNT(distinct t.ProcessName) as NbConfirmations,
convert(date,d.InsertTimestamp) as [Timestamp]
FROM EDIAudit.Reconciliation.Documents d WITH (NOLOCK)
INNER JOIN EDIAudit.Reconciliation.DocumentsTransaction t WITH (NOLOCK)
ON t.DocumentPKey = d.DocumentPKey
WHERE EDITransactionType = 850 AND EDIDirection = 'inbound'
AND CurrentStatus != 'MANUALLY_CONFIRMED'
AND convert(date, d.InsertTimestamp) BETWEEN @PO_DATE_FROM AND @PO_DATE_TO
GROUP BY EDITransactionType,
EDIDirection,
TRIM(REPLACE(EDIReference,'-','')),
PartnerID,
PartnerISAId,
convert(date,d.InsertTimestamp)
HAVING COUNT(distinct t.ProcessName) < 3) as tmp


UNION

SELECT 
tmp.[EDI Doc. Type],
tmp.Direction,
tmp.[Doc. Reference (PO, Invoice Nb, ...)],
tmp.PartnerId,
tmp.[Cust. GS Id],
convert(date,tmp.[Timestamp]) as [Timestamp],
CASE tmp.NbConfirmations
	WHEN 1 THEN 'B2Bi'
	WHEN 2 THEN 'PyEDI'
	ELSE 'COMPASS'
END as [Last Successful Step]
FROM (
SELECT 
EDITransactionType as [EDI Doc. Type],
EDIDirection as [Direction],
TRIM(REPLACE(EDIReference,'-','')) as [Doc. Reference (PO, Invoice Nb, ...)],
PartnerID,
PartnerISAId as [Cust. GS Id],
COUNT(distinct t.ProcessName) as NbConfirmations,
convert(date,d.InsertTimestamp) as [Timestamp]
FROM EDIAudit.Reconciliation.Documents d WITH (NOLOCK)
INNER JOIN EDIAudit.Reconciliation.DocumentsTransaction t WITH (NOLOCK)
ON t.DocumentPKey = d.DocumentPKey
WHERE EDITransactionType = 850 AND EDIDirection = 'inbound'
AND CurrentStatus != 'MANUALLY_CONFIRMED' AND CurrentStatus = 'RECEIVED'
AND convert(date, d.InsertTimestamp) BETWEEN @PO_DATE_FROM AND @PO_DATE_TO
GROUP BY EDITransactionType,
EDIDirection,
TRIM(REPLACE(EDIReference,'-','')),
PartnerID,
PartnerISAId,
convert(date,d.InsertTimestamp)
HAVING COUNT(distinct t.ProcessName) = 3) as tmp
ORDER BY tmp.Timestamp, tmp.PartnerID, tmp.[Doc. Reference (PO, Invoice Nb, ...)];

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
13,998 questions
SQL Server Transact-SQL
SQL Server Transact-SQL
SQL Server: A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.Transact-SQL: A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
102 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,656 questions
0 comments No comments
{count} votes

3 answers

Sort by: Most helpful
  1. James Bisiar 160 Reputation points
    2024-09-10T21:00:30.51+00:00

    Here is a cleaned-up and optimized version of your SQL query. I’ve eliminated redundancies and improved formatting for better readability while preserving the functionality.

    DECLARE 
        @PO_DATE_FROM date = '2024-09-08', 
        @PO_DATE_TO date = '2024-09-09';
    
    -- The request
    WITH DocumentSummary AS (
        SELECT 
            EDITransactionType AS [EDI Doc. Type],
            EDIDirection AS [Direction],
            TRIM(REPLACE(EDIReference, '-', '')) AS [Doc. Reference (PO, Invoice Nb, ...)],
            PartnerID,
            PartnerISAId AS [Cust. GS Id],
            COUNT(DISTINCT t.ProcessName) AS NbConfirmations,
            CONVERT(date, d.InsertTimestamp) AS [Timestamp]
        FROM 
            EDIAudit.Reconciliation.Documents d WITH (NOLOCK)
        INNER JOIN 
            EDIAudit.Reconciliation.DocumentsTransaction t WITH (NOLOCK)
            ON t.DocumentPKey = d.DocumentPKey
        WHERE 
            EDITransactionType = 850 
            AND EDIDirection = 'inbound'
            AND CurrentStatus != 'MANUALLY_CONFIRMED'
            AND CONVERT(date, d.InsertTimestamp) BETWEEN @PO_DATE_FROM AND @PO_DATE_TO
        GROUP BY 
            EDITransactionType,
            EDIDirection,
            TRIM(REPLACE(EDIReference, '-', '')),
            PartnerID,
            PartnerISAId,
            CONVERT(date, d.InsertTimestamp)
    )
    
    SELECT 
        ds.[EDI Doc. Type],
        ds.Direction,
        ds.[Doc. Reference (PO, Invoice Nb, ...)],
        ds.PartnerID,
        ds.[Cust. GS Id],
        ds.[Timestamp],
        CASE ds.NbConfirmations
            WHEN 1 THEN 'B2Bi'
            WHEN 2 THEN 'PyEDI'
            ELSE 'COMPASS'
        END AS [Last Successful Step]
    FROM 
        DocumentSummary ds
    WHERE 
        ds.NbConfirmations < 3
    
    UNION
    
    SELECT 
        ds.[EDI Doc. Type],
        ds.Direction,
        ds.[Doc. Reference (PO, Invoice Nb, ...)],
        ds.PartnerID,
        ds.[Cust. GS Id],
        ds.[Timestamp],
        CASE ds.NbConfirmations
            WHEN 1 THEN 'B2Bi'
            WHEN 2 THEN 'PyEDI'
            ELSE 'COMPASS'
        END AS [Last Successful Step]
    FROM 
        DocumentSummary ds
    WHERE 
        ds.NbConfirmations = 3 
        AND EXISTS (
            SELECT 1
            FROM EDIAudit.Reconciliation.Documents d
            WHERE d.CurrentStatus = 'RECEIVED'
            AND CONVERT(date, d.InsertTimestamp) BETWEEN @PO_DATE_FROM AND @PO_DATE_TO
        )
    
    ORDER BY 
        ds.[Timestamp], ds.PartnerID, ds.[Doc. Reference (PO, Invoice Nb, ...)];
    

    Key Improvements:

    1. Common Table Expression (CTE): A WITH clause was used to reduce redundancy in your SELECT statements. The logic for counting NbConfirmations and formatting columns is now centralized.
    2. Eliminated Redundant Queries: The core query logic has been reduced, and differences in conditions (NbConfirmations < 3 or = 3) are applied via filtering after the CTE.
    3. Improved Readability: The code is more organized, making it easier to maintain and understand.

    Let me know if you need any further adjustments!


  2. LiHongMSFT-4306 28,041 Reputation points
    2024-09-11T01:42:58.23+00:00

    Hi @Jonathan Brotto

    Please check if this query returns correct result.

    DECLARE @PO_DATE_FROM date = '2024-09-08', @PO_DATE_TO date = '2024-09-09';
    -- The request
    SELECT 
    tmp.[EDI Doc. Type],
    tmp.Direction,
    tmp.[Doc. Reference (PO, Invoice Nb, ...)],
    tmp.PartnerId,
    tmp.[Cust. GS Id],
    convert(date,tmp.[Timestamp]) as [Timestamp],
    CASE tmp.NbConfirmations
    	WHEN 1 THEN 'B2Bi'
    	WHEN 2 THEN 'PyEDI'
    	ELSE 'COMPASS'
    END as [Last Successful Step]
    FROM (
    SELECT 
    EDITransactionType as [EDI Doc. Type],
    EDIDirection as [Direction],
    TRIM(REPLACE(EDIReference,'-','')) as [Doc. Reference (PO, Invoice Nb, ...)],
    PartnerID,
    PartnerISAId as [Cust. GS Id],
    COUNT(distinct t.ProcessName) as NbConfirmations,
    convert(date,d.InsertTimestamp) as [Timestamp]
    FROM EDIAudit.Reconciliation.Documents d WITH (NOLOCK)
    INNER JOIN EDIAudit.Reconciliation.DocumentsTransaction t WITH (NOLOCK)
    ON t.DocumentPKey = d.DocumentPKey
    WHERE EDITransactionType = 850 AND EDIDirection = 'inbound'
    AND CurrentStatus != 'MANUALLY_CONFIRMED'
    AND convert(date, d.InsertTimestamp) BETWEEN @PO_DATE_FROM AND @PO_DATE_TO
    GROUP BY EDITransactionType,
    EDIDirection,
    TRIM(REPLACE(EDIReference,'-','')),
    PartnerID,
    PartnerISAId,
    convert(date,d.InsertTimestamp)
    HAVING COUNT(distinct t.ProcessName) <= 3) as tmp
    ORDER BY tmp.Timestamp, tmp.PartnerID, tmp.[Doc. Reference (PO, Invoice Nb, ...)];
    

    Best regards,

    Cosmog


    If the answer is the right solution, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".


  3. Rodger Kong 360 Reputation points
    2024-09-17T06:40:48.4433333+00:00

    This query is strange. I found the only different between two parts of UNION is "AND CurrentStatus = 'RECEIVED'" in WHERE clause of part 2.

    In part1 it filter CurrentStatus column with condition "!='MANUALLY_CONFIRMED'", means exclude this item from a set. Assume the set include [null, 1, 2, 3, 4, 5], and 'MANUALLY_CONFIRMED' equals 1, 'RECEIVED' equals 5, this condition will get [null, 2, 3, 4, 5].

    In part2 filter is CurrentStatus!='MANUALLY_CONFIRMED' AND CurrentStatus = 'RECEIVED'. According to the assumptions, will get [5].

    It's clearly that result of part 1 include result from part 2, even after grouping. By UNION, same rows will be excluded, so the rows leaved will same as excute only part 1.

    Am I misunderstanding?


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.