T-SQL: How the Order of Elements in the ORDER BY Clause Implemented in the Output Result
Introduction
This article shows that how the order of the elements in the ORDER BY clause in an SQL query can affect the output result sort. We will see how changing the order of these elements can change the output result. All Code samples in this article are downloadable from this URL.
Case Study
The ORDER BY clause can have multiple elements (columns, expressions, etc). We will see how using these elements in the correct order is necessary to achieve the right output result. For simplicity, the term columns will be used instead of elements in this article.
Solution
When using multiple columns in the ORDER BY clause, SQL Server evaluates the order of the output result from left to right. The first column in the ORDER BY clause is used as the major key. SQL Server then evaluates the second column in the ORDER BY clause. It retains the sort order from the first column, then applies sort based on the second column within that. and so on through the list of columns.
Let's explore this behavior in a simple example before we move on to a more complex one.
Example 1
We can start this scenario with the following code which creates the Books table and populates it with sample data.
Code 001:
--drop sample table if exists!
IF OBJECT_ID('dbo.Books','U') IS NOT NULL
DROP TABLE dbo.Books;
GO
--create sample table
CREATE TABLE dbo.Books
(
BookId INT PRIMARY KEY,
Title NVARCHAR(500),
PublishDate DATE,
Author NVARCHAR(200)
);
GO
INSERT dbo.Books
( BookId, Title, PublishDate, Author)
VALUES
( 1, N'Title A', '2004-01-01', 'Author 01'),
( 2, N'Title B', '2003-01-01', 'Author 01'),
( 3, N'Title C', '2001-01-01', 'Author 02'),
( 4, N'Title D', '2000-01-01', 'Author 02'),
( 5, N'Title E', '2005-01-01', 'Author 03'),
( 6, N'Title F', '2003-01-01', 'Author 03');
GO
We can now explore the ORDER BY clause in the various states. We can start the exploration by specifying a single column in the order by clause and see how its sort will change when adding another column. The following script shows two queries:
Query_1 performs the SELECT operation with the ORDER BY clause that sorts the output result based on the PublishDate column.
Query_2 performs the same SELECT statement with a minor difference- adding the Author column to the ORDER BY clause. We added it as the second column in the ORDER by clause. As we see in the next picture, the sort of the first result changes slightly. The BookIds 6 and 2 supplant because the database engine finds two PublishDate(s) with the same value. While it keeps the first sort order it then applies the second sort rule based on the Author column. So, it transforms the positions based on PublishDate within the same Author.
Code 002:
--query_1
SELECT *
FROM dbo.Books
ORDER BY PublishDate;
--query_2
SELECT *
FROM dbo.Books
ORDER BY PublishDate, Author;
Now, we can change our test queries. We compare the former query_2 with a little difference; we change the positions of the ORDER BY columns in the query_3. The following code performs this test. As illustrated in the next picture, the sort results in two query changes completely.
Code 003
--query_2
SELECT *
FROM dbo.Books
ORDER BY PublishDate, Author;
--query_3
SELECT *
FROM dbo.Books
ORDER BY Author, PublishDate;
Example 2
ORDER BY clause is not just for the SELECT sort. One of the most important usages of the ORDER BY clause is in the window functions. These functions usually need an ORDER BY clause. This example is about generating a new id based on a custom sort. For exploring this example, we use the AdventureWorks database. The problem is generating a new id based on these rules:
1 – We need to show columns from these tables: SalesOrderHeader and SalesOrderDetail
2 – Each header item must be presented as a row right after its all details items.
In the first step, we can see the main query based on one single SalsOrderId. The following code and picture show this:
Code 004:
SELECT d.SalesOrderID, d.LineTotal as New_Total
FROM Sales.SalesOrderDetail AS d
WHERE d.SalesOrderID = 43660
UNION all
SELECT h.SalesOrderID, h.SubTotal
FROM Sales.SalesOrderHeader AS h
WHERE h.SalesOrderID = 43660
For simplicity, we perform and show the result of this example based on two SalesOrderIds: 43660 and 43667. In the second step, we can write the following code that uses the ROW_NUMBER window function to implement this query for multiple SalesOrderIds. The point of the article is still alive here. If we change the ORDER BY clause columns order, the output result will fail. The following code shows two queries; first is the desired query.
Code 005:
--correct query
;WITH c1 AS
(
SELECT h.SalesOrderID, h.SubTotal as New_Total
FROM Sales.SalesOrderHeader AS h
WHERE h.SalesOrderID IN (43660, 43667)
UNION all
SELECT d.SalesOrderID, d.LineTotal
FROM Sales.SalesOrderDetail AS d
WHERE d.SalesOrderID IN (43660, 43667)
)
SELECT SalesOrderID ,
New_Total,
ROW_NUMBER() OVER (PARTITION BY SalesOrderID ORDER BY New_Total) AS Prt_Rn
FROM c1
ORDER BY SalesOrderID, Prt_Rn;
-------------------------------------------
--wrong query
;WITH c1 AS
(
SELECT h.SalesOrderID, h.SubTotal as New_Total
FROM Sales.SalesOrderHeader AS h
WHERE h.SalesOrderID IN (43660, 43667)
UNION all
SELECT d.SalesOrderID, d.LineTotal
FROM Sales.SalesOrderDetail AS d
WHERE d.SalesOrderID IN (43660, 43667)
)
SELECT SalesOrderID ,
New_Total,
ROW_NUMBER() OVER (PARTITION BY SalesOrderID ORDER BY New_Total) AS Prt_Rn
FROM c1
ORDER BY Prt_Rn , SalesOrderID ;
The final solution will be like the following query, which the order of the ORDER BY clause in the second windows function is really important as we see in the above query results.
Code 006:
WITH c1 AS
(
SELECT h.SalesOrderID, h.SubTotal
FROM Sales.SalesOrderHeader AS h
WHERE h.SalesOrderID IN (43660, 43667)
UNION all
SELECT d.SalesOrderID, d.LineTotal
FROM Sales.SalesOrderDetail AS d
WHERE d.SalesOrderID IN (43660, 43667)
)
, c2 AS
(
SELECT SalesOrderID ,
SubTotal as New_Total,
ROW_NUMBER() OVER (PARTITION BY SalesOrderID ORDER BY SubTotal) AS Prt_Rn
FROM c1
)
SELECT
ROW_NUMBER() OVER (ORDER BY SalesOrderID, Prt_Rn) as New_Id ,
SalesOrderID ,
New_Total
FROM c2;
Conclusion
In this article, we showed that the elements sequence in the ORDER BY clause in one SQL query is very important. We saw how changing the order of these elements changes the output result - giving the wrong result.
See Also