Using table variables (or temp tables) to speed up… or slow down
In SQL Server you can use temporary tables to store intermediate results. This is a common used technique to speed up query processing. Recently I came across a problem where the temporary tables were causing the performance degradation. I'm not going into the different temporary tables types, there are good posts on these. I'm going to walk you through a real-life example and show that the most given answer in the SQL Server world holds true; "It Depends".
In this case the desired result was a single stored procedure that would return multiple results based on some or no search criteria. If a user navigated to the page the complete set was returned and only one filter was applied. In the result set there was a row count, a paged result and some distinct fields from the total results to provide additional filtering. Think of these as slicers and dicers you use in Excel. I will demonstrate the general idea using the AdventureWorks database. I'm using the 2008 version here.
DECLARE
@t
table (id
int
IDENTITY,
salesorderid int)
INSERT
INTO
@t
(salesorderid)
SELECT
soh.SalesOrderID
FROM
Sales.SalesOrderHeader
soh
INNER
JOIN
Purchasing.ShipMethod
sm
ON
soh.ShipMethodID = sm.ShipMethodID
WHERE
sm.Name =
'CARGO TRANSPORT 5'
ORDER
BY
soh.OrderDate
DESC;
-- 1 Get Rowcount
SELECT
COUNT(*)
AS
TotalRecords
FROM
@t;
-- 2 Get Current page
SELECT
soh.OrderDate, soh.SubTotal, st.Name, p.LastName, c.AccountNumber
FROM
Sales.SalesOrderHeader
soh
INNER
JOIN
@t
t
ON
soh.SalesOrderID = t.salesorderid
INNER
JOIN
Sales.SalesTerritory
st
ON
soh.TerritoryID = st.TerritoryID
INNER
JOIN
Sales.SalesPerson
sp
ON
soh.SalesPersonID = sp.BusinessEntityID
INNER
JOIN
Sales.Customer
c
ON
soh.CustomerID = c.CustomerID
INNER
JOIN
Person.Person
p
ON
sp.BusinessEntityID = p.BusinessEntityID
WHERE
t.id BETWEEN 1 AND 20
ORDER
BY
soh.OrderDate
DESC;
-- 3 Get SalesTerritory values
SELECT
DISTINCT
st.TerritoryID, st.Name
FROM
Sales.SalesTerritory
st
INNER
JOIN
Sales.SalesOrderHeader
soh
ON
st.TerritoryID = soh.TerritoryID
INNER
JOIN
@t
t
ON
soh.SalesOrderID = t.salesorderid
ORdER
BY
st.Name;
-- 4 Get OrderDate Year values
SELECT
DISTINCT
YEAR(soh.OrderDate)
as OrderDateYear
FROM
Sales.SalesOrderHeader
soh
INNER
JOIN
@t
t
ON
soh.SalesOrderID = t.salesorderid
ORDER
BY
OrderDateYear;
In preparation a table variable is filled with the filtered set of ID's and is sorted in the desired order. In the first query we retrieve the row count for the set. The second set is the actual result set the user will see in the application. In this case it's the first page with 20 results. In step 3 and 4 additional results are returned that are used in the application to provide quick filters for the user. When I run this query on my database I get 4 sets with a total of 35 rows. For this I used 25628 reads. That seemed a lot to me. The execution plan for this one showed some missing indexes and some clustered index scans here and there. I added a few indexes. If you are running along, here they are:
CREATE
NONCLUSTERED
INDEX
temp_SALESORDERHEADER_ShipMethodId
ON
Sales.SalesOrderHeader(ShipMethodId)
INCLUDE(SalesOrderID, OrderDate);
CREATE
NONCLUSTERED
INDEX
temp_SALESORDERHEADER_Select
ON
Sales.SalesOrderHeader(SalesOrderId, OrderDate)
INCLUDE(CustomerID, SalesPersonID, TerritoryID, SubTotal);
CREATE
NONCLUSTERED
INDEX
temp_SALESORDERHEADER_Territory
ON
Sales.SalesOrderHeader(SalesOrderID, TerritoryID)
CREATE
NONCLUSTERED
INDEX
temp_TERRITORY_Name
ON
Sales.SalesTerritory(TerritoryID, Name);
CREATE
NONCLUSTERED
INDEX
temp_PERSON_Lastname
ON
Person.Person(LastName)
This got my reads down to 900. That's to show you will need a proper indexing strategy. But still, more work could be done. One of the things I tried first was use a temporary table instead of a table variable. But this increased my reads. Although I could add indexes the index creation itself would cause even more reads. And in this example the set is not that big but when I was working with the real set things got ugly pretty quickly.
The last thing I tried was to remove the temporary tables altogether and use a view. My thought was to create a view with the search criteria in the view and maybe index that view. This view could easily be reused by the other sets. The definition of the view and the adjusted queries are below:
CREATE
VIEW
vSearch
AS
SELECT
soh.SalesOrderID, soh.TerritoryID, soh.OrderDate, soh.SalesPersonID, soh.CustomerID, soh.SubTotal, sm.Name
FROM
Sales.SalesOrderHeader
soh
INNER
JOIN
Purchasing.ShipMethod
sm
ON
soh.ShipMethodID = sm.ShipMethodID
GO
-- 1 Get Rowcount
SELECT
COUNT(*)
AS
TotalRecords
FROM
vSearch
WHERE
Name
=
'CARGO TRANSPORT 5'
-- 2 Get Current page
SELECT
v.OrderDate, v.SubTotal, st.Name, p.LastName, c.AccountNumber
FROM
(SELECT
ROW_NUMBER()
OVER (ORDER
BY
OrderDate
DESC) AS
row_id, SalesOrderID,
TerritoryID, SalesPersonID, CustomerID, OrderDate, SubTotal
FROM
vSearch
WHERE
Name
=
'CARGO TRANSPORT 5')
v
INNER
JOIN
Sales.SalesTerritory
st
ON
v.TerritoryID = st.TerritoryID
INNER
JOIN
Sales.Customer
c
ON
v.CustomerID = c.CustomerID
INNER
JOIN
Person.Person
p
ON
v.SalesPersonID = p.BusinessEntityID
WHERE
v.row_id BETWEEN 1 AND 20
ORDER
BY
v.OrderDate
DESC;
-- 3 Get SalesTerritory values
SELECT
DISTINCT
st.TerritoryID, st.Name
FROM
Sales.SalesTerritory
st
INNER
JOIN
vSearch
v
ON
v.TerritoryID = st.TerritoryID
WHERE
v.Name =
'CARGO TRANSPORT 5'
ORdER
BY
st.Name;
-- 4 Get OrderDate Year values
SELECT
DISTINCT
YEAR(v.OrderDate)
as OrderDateYear
FROM
vSearch
v
WHERE
v.Name =
'CARGO TRANSPORT 5'
ORDER
BY
OrderDateYear;
An additional index was needed:
CREATE
NONCLUSTERED
INDEX
temp_IX_SALESORDERHEADER_ShipMethodID
ON
Sales.SalesOrderHeader
(ShipMethodID)
INCLUDE (OrderDate,CustomerID,SalesPersonID,TerritoryID,SubTotal);
This resulted in 466 reads. More optimizations can be done bit a big win was gained by using a view and just retrieving the data directly instead of using temporary sets.
A big lesson lies in the creation of the temporary objects. If you create a simple table with two columns and just 10 records and you select the table you will only get a few reads. If you create a table variable, insert the real table into the variable and select it again you get a couple of hundred reads. On small sets you won't notice a big difference but on larger sets the difference on cpu, reads, writes and duration is really noticeable.