Condividi tramite


Optimizing queries with joins and over statements in SQL 2005

I was working on a common scenario and found an interesting side effect. Applications typically have a listing page that get search criteria from a UI and passes to a stored procedure. In the procedure you often find statements like:

SELECT a,b,c,d
FROM MyTable
WHERE c = @SearchC AND d = @SearchD

Often times you need to join information to the main data that is the most current from a historical table. For example in SQL 2005.

SELECT t.*
FROM
(SELECT
ID
, Amt
, AmtDepositDate
, row_number() over (partition by ID order by ID, AmtDepositDate desc) as rn
FROM MoneyTable
) t
WHERE rn=1

I found that when you join the above query to the main search query above a weird optimization occurs in SQL 2005. I suspect (unconfirmed with product group) that a left join can't optmize the query with the over so it ends up do a sub select on all records every row in the main search.

The solution (in my case where there is small sets of data of < 5000 records), I create a table variable and select into the lasted record from the history table. Then I join the table variable to the main table search.

In my scenario with 2000 main records and 1 million history records the left join with the latest select (no table variable) took 5+ minutes. Using the table variables took 2 seconds for everything.

Here is a snippet of the overall solution:

DECLARE @MyTable table (
ID int
,Amt float
,AmtDepositDate datetime
)

INSERT INTO @MyTable
SELECT
t.ID
,t.Amt
, t.AmtDepositDate
FROM
(
SELECT
ID
, Amt
, AmtDepositDate
, row_number() over (partition by ID order by ID, AmtDepositDate desc) as rn
FROM MoneyTable
) t
WHERE rn=1

SELECT
MT2.ID
, MT2.Name
, MT2.City
FROM
MyTable2 MT2
LEFT JOIN @MyTable MT
ON MT2.ID = MT.ID
WHERE
c = @SearchC AND d = @SearchD

Anyone experience similar or can offer up an reason for the optimization?

Comments