LAG’n’LEAD – New T-SQL Features in SQL Server Denali CTP3
Are you tired of doing self joins, just to get the value of a column in the
previus/subsequent row, or maybe even
worse you are doing multiple self joins. This is where you will start
smiling.
Let's test this functionality, how it is simple and gives better
performance than self JOIN.
CREATE TABLE MonthlyProfit
(
yearid int,
Monthid
int,
Profit bigint
)
--- Inserting multiple records through Row Constructor(New feature in
SQL 2008 R2)
INSERT INTO MonthlyProfit (yearid, Monthid,
Profit)
VALUES
(2010,1,1766), (2010,2,100), (2010,3,1500), (2010,4,15000),
(2010,5,900), (2010,6,45),
(2010,7,1766), (2010,8,9100), (2010,9,-100),
(2010,10,50), (2010,11,900000), (2010,12,6575)
What we would like to do now is to write a select statement that gives us
Yearid, MonthId, ProfitThisMonth and ProfitLastMonth, so that we would be able
to compare profit with last month. Before SQL Server Denali CTP3 my SQL
statement would look something like this:
SELECT
t1.yearid,
t1.Monthid,
t1.Profit as
ProfitThisMonth,
t2.Profit as ProfitLastMonth
FROM MonthlyProfit as
t1
LEFT JOIN MonthlyProfit as t2 on (t1.yearid = t2.yearid) AND (t1.Monthid =
t2.Monthid+1)
A pretty simple statement by looking at it, this is not possible without
the nasty self join.
SQL Server Denali CTP3 finally gives us LAG and
LEAD. Here is the description directly stolen from BOL:
Accesses data
from a previous row in the same result set without the use of a self-join in
Microsoft SQL Server
Code-Named “Denali”, Community Technology Preview 3 (CTP 3). LAG provides
access to a row at a given physical
offset that comes before the current row. Use this analytic function in a
SELECT statement to compare values in the
current row with values in a previous row.
The syntax for LAG and
LEAD is the same
LAG (scalar_expression [,offset] [,default]) OVER ( [
partition_by_clause ] order_by_clause)
Let’s have a look at the
statement now:
SELECT
yearid,
monthid,
Profit as
ProfitThisMonth,
LAG(Profit, 1, 0) OVER(Order by yearid, monthid) as
ProfitLastMonth
FROM MonthlyProfit
No more self joins, and a much more simple statement. Read the full
documentation to the LAG and LEAD in BOL.
What about performance?
I set IO statistics on(SET STATISTICS IO
ON;) and run both queries again. Here is the results:
(12 row(s)
affected)
Table 'MonthlyProfit'. Scan count 2, logical reads 13, physical
reads 0, read-ahead reads 0,…
(12 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0,
physical reads 0, read-ahead reads 0,…
Table 'MonthlyProfit'. Scan count 1,
logical reads 1, physical reads 0, read-ahead reads 0,
So the conclusion is,
we get a much simpler query that also does fewer reads, that’s what we call a
WIN-WIN situation.