แชร์ผ่าน


TOP 100 Percent ORDER BY Considered Harmful.

(Updated 2006-27-03 9:00am Pacfiic Time - at the bottom)

SQL is a declarative language.  That means that the language declares the form of the output but not the method used to generate those results.  There are cases, however, where the language is not quite rich enough to describe what customers want.  There are also cases where it is possible to over-infer guarantees that are not really in the language at all.

ORDER BY is an example that I'd like to discuss in this post.  In ANSI SQL, the language has the ability to specify the output order of a query (the "presentation order").   This applies to query results and to cursor results, if a cursor is being used.  This is exposed via ORDER BY, and this is really only legal in the ANSI spec on the outer-most query block of a query.  Microsoft SQL Server allows this in more places than the spec indicates (which is something we can do and still be in-line with the specification).  Specifically, we allow the use of ORDER BY in sub-selects or in view definitions to help define the set of rows that qualify with a TOP operation (TOP is not in ANSI SQL, by the way).

The default plan implementation for this code happens to sort the rows as part of performing the TOP operation.  Often this meant that the results happened to be returned in sorted order, and this led customers to believe that there was a guarantee that rows were sorted.  This is actually not the case.  If you want rows to be returned to the user in sorted order, you need to use an ORDER BY on the outermost query block (per ANSI) to guarantee the output presentation order. 

In SQL Server 2005, you can see how the output order is *not* guaranteed through the following example:

use tempdb

create table t1 (col1 int, col2 int)
declare @i int
set @i = 0
while @i < 20
begin
insert into t1(col1, col2) values (@i, rand()*1000)
set @i = @i + 1
end

create view v as (select top 100 percent * from t1 order by col1 desc)

set showplan_text on
select * from v

The output from this example is:

StmtText
----------------------------------------------
|--Table Scan(OBJECT:([tempdb].[dbo].[t1]))

col1 col2
----------- -----------
0 443
1 418
2 291
3 726
4 948
5 315
6 835
7 247
8 755
9 78
10 88
11 906
12 640
13 876
14 422
15 746
16 528
17 909
18 186
19 868

You'll notice that the original table was created as a heap (no clustered index) and that no secondary indexes are defined.  So, if there were an ordering guarantee for the query based on the order by in the view, the rows should be sorted in descending order and the query plan would need to have a sort in it to make that happen.  However, you'll notice that the query plan contains only a Table Scan, which will return the rows back in the order they happen to be read from disk.  In this particular case, the optimizer recognizes that TOP 100 PERCENT qualifies all rows and does not need to be computed at all.  It gets removed from the query plan, and there is no other reason to do an intermediate sorting operation.  As such, the output isn't returned in any particular order.

So, please do not assume that nested sub-selects will guarantee order.  The optimizer will consider rewrites that invalidates this assumption.  If you need rows returned in a particular order, please add that to the outermost block of your SELECT statement.

Thanks,

Conor Cunningham

PS: Update... One comment I received seeks a bit more detail on why the ORDER BY is "ignored" in this case.  I'll try to expand a bit more to see if that helps. 

From the semantics of the query, the optimizer only really honors the ORDER BY as part of the evaluation of the TOP in that same scope.  The syntax is a bit unfortunate because it causes people to believe that things "will be ordered".  However, it really only says "I want this set of rows".  Presentation orders only apply to the output of the query, not intermediate nodes.  Since we can reorder operations, you can't actually view this as a procedural guarantee "first I sort, then I do whatever is 'above' the sort".  You don't need an optimizer if that were the case, as you aren't asking declarative questions anymore. 

The bottom line is that even if we do the sort as part of the TOP operation in a sub-select, it does NOT guarantee anything about the output order of the query. 

SELECT TOP 99 PERCENT * FROM T ORDER BY col1

is not the same as:

SELECT * FROM (SELECT TOP 99 PERCENT * FROM T ORDER BY col1) AS A

The top query guarantees the output order of the query.  The bottom query does not (even if the rows happen to come back in sorted order)

Comments

  • Anonymous
    March 24, 2006
    Man, that explains a lot!  
    I've never seen this issue explained so well.
    ...Off to fix dubiously constructed views!

  • Anonymous
    March 24, 2006
    Excellent Post .....

    I get the part where TOP 100 was ignored. I am curious as to why the ORDER BY clause in the view definition was ignored.

  • Anonymous
    March 25, 2006
    Mi &#232; capitato di frequente di vedere l'uso della clausola TOP 100 PERCENT usato nella definizione di...

  • Anonymous
    July 26, 2006
    The comment has been removed

  • Anonymous
    September 25, 2006
    Since we’re starting a new blog for the whole query processing team, we think it’s appropriate to start...

  • Anonymous
    October 03, 2006
    When working with SQL Server 2000, I used to have this little trick I'd pull out after exhausting all...

  • Anonymous
    October 03, 2006
    When working with SQL Server 2000, I used to have this little trick I'd pull out after exhausting all...

  • Anonymous
    January 08, 2007
    When working with SQL Server 2000, I used to have this little trick I'd pull out after exhausting all

  • Anonymous
    April 20, 2007
    TOP 100 Percent ORDER BY Considered Harmful Straight to the subject about how it works in SQL 2000 &amp;

  • Anonymous
    August 01, 2007
    Last week I wrote about a special case of the TOP operator known as ROWCOUNT TOP . This week I'll take

  • Anonymous
    August 27, 2008
    One of the greatest lessons I've learned in building software is that every technical implementation

  • Anonymous
    January 20, 2009
    PingBack from http://www.hilpers.com/1005357-sortierung-in-view-nach-umstellung

  • Anonymous
    January 21, 2009
    PingBack from http://www.keyongtech.com/2150129-view-order-by-lost-on

  • Anonymous
    April 27, 2009
    PingBack from http://www.smallbis.com.au/cf_flotsam/2008/11/order-by-clause-not-working-in-sql-server-view/

  • Anonymous
    June 01, 2009
    PingBack from http://paidsurveyshub.info/story.php?id=62596

  • Anonymous
    June 09, 2009
    PingBack from http://insomniacuresite.info/story.php?id=6587

  • Anonymous
    June 15, 2009
    PingBack from http://edebtsettlementprogram.info/story.php?id=24591