SQL Server: Unpivot vs Apply
Introduction
SQL Server 2005 introduced
UNPIVOT
to transform columns into rows. This article discussesUNPIVOT
alongside an alternative syntax with fewer limitations.
Example Data
I will start off with the following table holding details about fictitious job applicants. In reality this doubtless would hold a lot of additional information but the columns below are sufficient for my purposes. Note that this table violates first normal form by including a repeating group of applicant referees.
CREATE TABLE #Applicant
(
PK INT IDENTITY PRIMARY KEY,
FirstName NVARCHAR(100) NOT NULL,
LastName NVARCHAR(100) NOT NULL,
Email VARCHAR(100) NOT NULL,
HomeTelephone VARCHAR(20) NULL,
MobileTelephone VARCHAR(20) NOT NULL,
FaxNumber VARCHAR(20) NULL,
Referee1Name NVARCHAR(100) NOT NULL,
Referee1Email VARCHAR(100) NOT NULL,
Referee2Name NVARCHAR(100) NOT NULL,
Referee2Email VARCHAR(100) NOT NULL
)
Now insert a couple of rows of example data
INSERT INTO #Applicant
VALUES ('Bob','Jones','bob.jones@example.com',' +441617151234','+441632359678',NULL,'Frank Smith','frank.smith@example.com','Tim Green','tim.green@example.com'),
('Annette','Bell','annette@example.com',NULL,'+447980123412',NULL,'Jane Sparrow','sparrow.jane@example.com','Gordon Osier','ozzy@example.com');
Basic Unpivot
There are three different types of telephone number here.
Let's use the UNPIVOT
operator to expand these out into different rows.
SELECT *
FROM #Applicant A
UNPIVOT (value FOR column_name IN (HomeTelephone, MobileTelephone, FaxNumber) ) U
http://i.stack.imgur.com/tejFi.png
The below shows an alternative approach using APPLY
.
SELECT *
FROM #Applicant A
CROSS APPLY ( VALUES (A.HomeTelephone, 'HomeTelephone'),
(A.MobileTelephone, 'MobileTelephone'),
(A.FaxNumber, 'FaxNumber') ) U(value, column_name)
http://i.stack.imgur.com/rPIth.png
Comparing the resultsets above two points should be apparent.
- The
APPLY
method has three rows for each applicant - one for each phone number type. TheUNPIVOT
has just as many rows per applicant as they haveNOT NULL
phone numbers. Annette only supplied a Mobile number so just has one row in the unpivoted result. - The
APPLY
method returns more columns. With theUNPIVOT
method the original columns in the Unpivot list are no longer in scope and can not be accessed in theSELECT
list.
With the APPLY
method it is of course possible to get the same result as UNPIVOT
by being more explicit about the exact columns and rows required (code listing below). However the reverse is not true. UNPIVOT
can't return the additional rows and columns.
SELECT A.PK,
A.FirstName,
A.LastName,
A.Email,
A.Referee1Name,
A.Referee1Email,
A.Referee2Name,
A.Referee2Email,
U.value,
U.column_name
FROM #Applicant A
CROSS APPLY ( VALUES (A.HomeTelephone, 'HomeTelephone'),
(A.MobileTelephone, 'MobileTelephone'),
(A.FaxNumber, 'FaxNumber') ) U(value, column_name)
WHERE U.value IS NOT NULL
Execution Plans
The execution plans are nearly identical for both approaches above with no significant differences in efficiency.
http://i.stack.imgur.com/rtaps.png
Mismatched Datatypes
Now suppose the requirement is to view all contact details for an applicant including email address. I will use the APPLY
method first.
SELECT A.PK,
A.FirstName,
A.LastName,
U.value,
U.column_name
FROM #Applicant A
CROSS APPLY ( VALUES (A.HomeTelephone, 'HomeTelephone'),
(A.MobileTelephone, 'MobileTelephone'),
(A.FaxNumber, 'FaxNumber'),
(A.Email, 'Email')) U(value, column_name)
WHERE U.value IS NOT NULL
This returns the following result
http://i.stack.imgur.com/FO4mP.png
Now let's try UNPIVOT
SELECT U.*
FROM #Applicant A
UNPIVOT (value FOR column_name IN (HomeTelephone, MobileTelephone, FaxNumber, Email) ) U
An error!
Msg 8167, Level 16, State 1, Line 3
The type of column "Email" conflicts with the type of other columns specified in the UNPIVOT list.
The problem is that Email is VARCHAR(100)
but the others VARCHAR(20)
.
The VALUES
approach is happy to implicitly CAST
using the same rules as UNION ALL
but UNPIVOT
is more fussy.
This issue can be worked around with a CTE or derived table as below.
WITH A
AS (SELECT PK,
FirstName,
LastName,
CAST(HomeTelephone AS VARCHAR(100)) AS HomeTelephone,
CAST(MobileTelephone AS VARCHAR(100)) AS MobileTelephone,
CAST(FaxNumber AS VARCHAR(100)) AS FaxNumber,
Email
FROM #Applicant)
SELECT U.*
FROM A UNPIVOT (value FOR column_name IN (HomeTelephone,
MobileTelephone,
FaxNumber,
Email) ) U
Grouped Columns
Now let us return to the issue of the non normalised referees. In order to transform them to a more normalised structure this is trivial with the APPLY
SELECT A.PK,
A.FirstName,
A.LastName,
U.RefereeName,
U.RefereeEmail
FROM #Applicant A
CROSS APPLY ( VALUES (A.Referee1Name, A.Referee1Email),
(A.Referee2Name, A.Referee2Email)) U(RefereeName, RefereeEmail)
WHERE NOT (U.RefereeName IS NULL AND U.RefereeEmail IS NULL)
http://i.stack.imgur.com/bjndS.png
The UNPIVOT
operator has no support for groupings of multiple columns.
One possible way it might be done is below. The plan for this is less efficient than the APPLY
method.
WITH A
AS (SELECT PK,
FirstName,
LastName,
Referee1Name,
CAST(Referee1Email AS NVARCHAR(100)) AS Referee1Email,
Referee2Name,
CAST(Referee2Email AS NVARCHAR(100)) AS Referee2Email
FROM #Applicant)
SELECT PK,
FirstName,
LastName,
MAX(CASE
WHEN column_name LIKE 'Referee%Name'
THEN value
END) AS RefereeName,
MAX(CASE
WHEN column_name LIKE 'Referee%Email'
THEN value
END) AS RefereeEmail
FROM A UNPIVOT (value FOR column_name IN (Referee1Name, Referee1Email, Referee2Name, Referee2Email) ) U
GROUP BY PK,
FirstName,
LastName,
CASE
WHEN column_name IN ( 'Referee1Name', 'Referee1Email' )
THEN 1
ELSE 2
END
ORDER BY PK
SQL Server 2005
This article uses the Table Value Constructors introduced in SQL Server 2008. For anyone out there still stuck on 2005, fear not! The functionality is still available but just requires a slightly more verbose syntax.
SELECT *
FROM #Applicant A
CROSS APPLY (SELECT A.HomeTelephone, 'HomeTelephone'
UNION ALL
SELECT A.MobileTelephone, 'MobileTelephone'
UNION ALL
SELECT A.FaxNumber, 'FaxNumber') U(value, column_name)
Conclusion
The UNPIVOT
operator has a concise syntax that works well when trying to transform a single repeated column into rows. However mismatched datatypes or the need for dealing with grouped columns may lead one to prefer an alternative approach.