다음을 통해 공유


SQL Server: Unpivot vs Apply

Introduction

SQL Server 2005 introduced UNPIVOT to transform columns into rows. This article discusses UNPIVOT alongside an alternative syntax with fewer limitations.

Example Data

I will start off with the following table holding details about fictitious job applicants. In reality this doubtless would hold a lot of additional information but the columns below are sufficient for my purposes. Note that this table violates first normal form by including a repeating group of applicant referees.

CREATE TABLE #Applicant
  (
     PK              INT IDENTITY PRIMARY KEY,
     FirstName       NVARCHAR(100) NOT NULL,
     LastName        NVARCHAR(100) NOT NULL,
     Email           VARCHAR(100) NOT NULL,
     HomeTelephone   VARCHAR(20) NULL,
     MobileTelephone VARCHAR(20) NOT NULL,
     FaxNumber       VARCHAR(20) NULL,
     Referee1Name    NVARCHAR(100) NOT NULL,
     Referee1Email   VARCHAR(100) NOT NULL,
     Referee2Name    NVARCHAR(100) NOT NULL,
     Referee2Email   VARCHAR(100) NOT NULL
  )

Now insert a couple of rows of example data

INSERT INTO #Applicant
VALUES      ('Bob','Jones','bob.jones@example.com',' +441617151234','+441632359678',NULL,'Frank Smith','frank.smith@example.com','Tim Green','tim.green@example.com'),
            ('Annette','Bell','annette@example.com',NULL,'+447980123412',NULL,'Jane Sparrow','sparrow.jane@example.com','Gordon Osier','ozzy@example.com');

Basic Unpivot

There are three different types of telephone number here.

Let's use the UNPIVOT operator to expand these out into different rows.

SELECT *
FROM   #Applicant A
UNPIVOT (value FOR column_name IN (HomeTelephone, MobileTelephone, FaxNumber) ) U

http://i.stack.imgur.com/tejFi.png

The below shows an alternative approach using APPLY.

SELECT *
FROM   #Applicant A
       CROSS APPLY ( VALUES (A.HomeTelephone, 'HomeTelephone'),
                            (A.MobileTelephone, 'MobileTelephone'),
                            (A.FaxNumber, 'FaxNumber') ) U(value, column_name)

http://i.stack.imgur.com/rPIth.png

Comparing the resultsets above two points should be apparent.

  1. The APPLY method has three rows for each applicant - one for each phone number type. The UNPIVOT has just as many rows per applicant as they have NOT NULL phone numbers. Annette only supplied a Mobile number so just has one row in the unpivoted result.
  2. The APPLY method returns more columns. With the UNPIVOT method the original columns in the Unpivot list are no longer in scope and can not be accessed in the SELECT list.

With the APPLY method it is of course possible to get the same result as UNPIVOT by being more explicit about the exact columns and rows required (code listing below). However the reverse is not true. UNPIVOT can't return the additional rows and columns.

SELECT A.PK,
       A.FirstName,
       A.LastName,
       A.Email,
       A.Referee1Name,
       A.Referee1Email,
       A.Referee2Name,
       A.Referee2Email,
       U.value,
       U.column_name
FROM   #Applicant A
       CROSS APPLY ( VALUES (A.HomeTelephone, 'HomeTelephone'),
                            (A.MobileTelephone, 'MobileTelephone'),
                            (A.FaxNumber, 'FaxNumber') ) U(value, column_name)
WHERE  U.value IS NOT NULL 

Execution Plans

The execution plans are nearly identical for both approaches above with no significant differences in efficiency.

http://i.stack.imgur.com/rtaps.png

Mismatched Datatypes

Now suppose the requirement is to view all contact details for an applicant including email address. I will use the APPLY method first.

SELECT A.PK,
       A.FirstName,
       A.LastName,
       U.value,
       U.column_name
FROM   #Applicant A
       CROSS APPLY ( VALUES (A.HomeTelephone, 'HomeTelephone'),
                            (A.MobileTelephone, 'MobileTelephone'),
                            (A.FaxNumber, 'FaxNumber'),
                            (A.Email, 'Email')) U(value, column_name)
WHERE  U.value IS NOT NULL 

This returns the following result

http://i.stack.imgur.com/FO4mP.png

Now let's try UNPIVOT

SELECT U.*
FROM   #Applicant A
UNPIVOT (value FOR column_name IN (HomeTelephone, MobileTelephone, FaxNumber, Email) ) U

An error!

Msg 8167, Level 16, State 1, Line 3

The type of column "Email" conflicts with the type of other columns specified in the UNPIVOT list.

The problem is that Email is VARCHAR(100) but the others VARCHAR(20).

The VALUES approach is happy to implicitly CAST using the same rules as UNION ALL but UNPIVOTis more fussy.

This issue can be worked around with a CTE or derived table as below.

WITH A
     AS (SELECT PK,
                FirstName,
                LastName,
                CAST(HomeTelephone AS VARCHAR(100))   AS HomeTelephone,
                CAST(MobileTelephone AS VARCHAR(100)) AS MobileTelephone,
                CAST(FaxNumber AS VARCHAR(100))       AS FaxNumber,
                Email
         FROM   #Applicant)
SELECT U.*
FROM   A UNPIVOT (value FOR column_name IN (HomeTelephone, 
                                            MobileTelephone, 
                                            FaxNumber, 
                                            Email) ) U 

Grouped Columns

Now let us return to the issue of the non normalised referees. In order to transform them to a more normalised structure this is trivial with the APPLY

SELECT A.PK,
       A.FirstName,
       A.LastName,
       U.RefereeName,
       U.RefereeEmail
FROM   #Applicant A
       CROSS APPLY ( VALUES (A.Referee1Name, A.Referee1Email),
                            (A.Referee2Name, A.Referee2Email)) U(RefereeName, RefereeEmail)
WHERE  NOT (U.RefereeName IS NULL AND U.RefereeEmail IS NULL)

http://i.stack.imgur.com/bjndS.png

The UNPIVOT operator has no support for groupings of multiple columns.

One possible way it might be done is below. The plan for this is less efficient than the APPLY method.

WITH A
     AS (SELECT PK,
                FirstName,
                LastName,
                Referee1Name,
                CAST(Referee1Email AS NVARCHAR(100)) AS Referee1Email,
                Referee2Name,
                CAST(Referee2Email AS NVARCHAR(100)) AS Referee2Email
         FROM   #Applicant)
SELECT PK,
       FirstName,
       LastName,
       MAX(CASE
             WHEN column_name LIKE 'Referee%Name'
               THEN value
           END) AS RefereeName,
       MAX(CASE
             WHEN column_name LIKE 'Referee%Email'
               THEN value
           END) AS RefereeEmail
FROM   A UNPIVOT (value FOR column_name IN (Referee1Name, Referee1Email, Referee2Name, Referee2Email) ) U
GROUP  BY PK,
          FirstName,
          LastName,
          CASE
            WHEN column_name IN ( 'Referee1Name', 'Referee1Email' )
              THEN 1
            ELSE 2
          END 
ORDER BY PK

SQL Server 2005

This article uses the Table Value Constructors introduced in SQL Server 2008. For anyone out there still stuck on 2005, fear not! The functionality is still available but just requires a slightly more verbose syntax.

SELECT *
FROM   #Applicant A
       CROSS APPLY (SELECT A.HomeTelephone, 'HomeTelephone'
                    UNION ALL
                    SELECT A.MobileTelephone, 'MobileTelephone'
                    UNION ALL
                    SELECT A.FaxNumber, 'FaxNumber') U(value, column_name) 

Conclusion

The UNPIVOT operator has a concise syntax that works well when trying to transform a single repeated column into rows. However mismatched datatypes or the need for dealing with grouped columns may lead one to prefer an alternative approach.