Share via


SQL Server XML: Sorting Data in XML Fragments

Working with data sets made us all aware of the fact that a set has no order. XML documents are not data sets, thus they always have a natural order.

So what should we do when we have XML fragments in the wrong order?

Problem Definition

The original problem was to sort a XML document with two levels: Get double-sorted xml document from xml-document   

Approaches 

A - Using T-SQL

Using T-SQL means, that we need to deconstruct our data by parsing the necessary values. We use the nodes() method to extract the elements on the level we want to sort it. And we extract the order criteria with the value() method to sort it with the ORDER BY clause. Finally we can reconstruct the XML fragment by using FOR XML with the PATH mode.

Here is the trivial case. We are completely deconstructing a flat hierarchy and use only the data in T-SQL to reconstruct the XML:

DECLARE @Data XML = N'
<element name="c" />        
<element name="b" />        
<element name="a" />        
';
 
WITH Deconstructed AS (
        SELECT  Element.value('@name', 'NVARCHAR(255)') AS  ElementName
        FROM    @Data.nodes('/element') [Elements] ( Element )
    )
    SELECT  ElementName AS [@name]
    FROM    Deconstructed
    ORDER BY  ElementName
    FOR XML PATH('element');

And this is the result:

<element name="a" />
<element name="b" />
<element name="c" />

A more complex case is the following. We are still working with a flat hierarchy, but now our elements are no longer trivial. The deconstruction process must now provide us with the sort criteria and the rest of the XML fragments per element:

DECLARE @Data XML = N'
<element name="c" >     
    <subelement name="t" />
</element>
<element name="b">
    <subelement name="s" />
</element>      
<element name="a" />        
';
 
WITH Deconstructed AS (
        SELECT  Element.value('@name', 'NVARCHAR(255)') AS  ElementName,
                Element.query('.') AS  ElementContent
        FROM    @Data.nodes('/element') [Elements] ( Element )
    )
    SELECT  ElementContent AS '*'
    FROM    Deconstructed
    ORDER BY  ElementName
    FOR XML PATH('');

Here is the result:

<element name="a" />
<element name="b">
   <subelement name="s" />
</element>
<element name="c">
   <subelement name="t" />
</element> 

B - Using XQuery

Using XQuery means that we use the order clause of a FLWOR statement.

Here is the trivial case again:

DECLARE @Data XML = N'
<element name="c" />        
<element name="b" />        
<element name="a" />        
';
 
SELECT  Fragment.query('
            for $element in /element
            order by $element/@name ascending
            return $element
        ')
FROM    @Data.nodes('.') Fragment ( Fragment );

And the expected result:

<element name="a" />
<element name="b" />
<element name="c" />

As the XQuery FLWOR statement already works on nodes, we already have a solution for the more complex case:

DECLARE @Data XML = N'
<element name="c" >     
    <subelement name="t" />
</element>
<element name="b">
    <subelement name="s" />
</element>      
<element name="a" />        
';
 
SELECT  Fragment.query('
            for $element in /element
            order by $element/@name ascending
            return $element
        ')
FROM    @Data.nodes('.') Fragment ( Fragment );

And here is the result:

<element name="a" />
<element name="b">
   <subelement name="s" />
</element>
<element name="c">
   <subelement name="t" />
</element>

Problem Solution

Sorting the first level of the list:

DECLARE @Data XML = N'
<level1 name="3">       
    <level2 name="f" />
    <level2 name="e" />
    <level2 name="d" />
</level1>
<level1 name="2">
    <level2 name="c" />     
    <level2 name="b" />
</level1>
<level1 name="1">       
    <level2 name="a" />
</level1>
';
 
SELECT  Levels.query('
        for $level1 in /level1
        order by $level1/@name ascending
        return $level1
        ')
FROM    @Data.nodes('.') Levels ( Levels );

Here is the result, the list is only sorted on the top level:

<level1 name="1">
  <level2 name="a" />
</level1>
<level1 name="2">
  <level2 name="c" />
  <level2 name="b" />
</level1>
<level1 name="3">
  <level2 name="f" />
  <level2 name="e" />
  <level2 name="d" />
</level1>

 Here we already see that we need a kind of nested sort, because we have only sorted the outer levels. In a FLWOR statement we can use complex return expressions, especially we can use further FLWOR statements:

DECLARE @Data XML = N'
<level1 name="3">       
    <level2 name="f" />
    <level2 name="e" />
    <level2 name="d" />
</level1>
<level1 name="2">
    <level2 name="c" />     
    <level2 name="b" />
</level1>
<level1 name="1">       
    <level2 name="a" />
</level1>
';
 
SELECT  Levels.query('
        for $level1 in /level1
        order by $level1/@name ascending
        return 
            <level1 name="{$level1/@name}">{
                for $level2 in $level1/level2
                order by $level2/@name ascending
                return $level2
            }</level1>
        ')
FROM    @Data.nodes('.') Levels ( Levels );

Now we have our double-sorted list:

<level1 name="1">
  <level2 name="a" />
</level1>
<level1 name="2">
  <level2 name="b" />
  <level2 name="c" />
</level1>
<level1 name="3">
  <level2 name="d" />
  <level2 name="e" />
  <level2 name="f" />
</level1>

Conclusion

Using the T-SQL approach means that we need to handle the conversion from and to XML to overcome the barrier between XML and T-SQL. While this is only a small step, it simply means more code. And more code is more complex per se.

The XQuery FLWOR expression on the other hand allows us to use a more compact notation. And this kind of XQuery processing was exactly built for these kinds of manipulation. It is the better choice in our case.

Terminology

Fragment: Part of an XML document

A fragment is not a document, thus it is not well-formed

FLWOR: FOR, LET, WHERE, ORDER BY, RETURN (XQuery)


See Also